Comment by xdennis
How is Unicode in any way related to JSON? JSON should just encode whatever dumb data someone wants to transport.
Unicode validation/cleanup should be done separately because it's needed in multiple places, not just JSON.
How is Unicode in any way related to JSON? JSON should just encode whatever dumb data someone wants to transport.
Unicode validation/cleanup should be done separately because it's needed in multiple places, not just JSON.
JSON (unfortunately) requires strings to be Unicode. (JSON has other problems too, but Unicode is one of them.)
So?
All the letters in this string are “just text”:
"\u0000\u0089\uDEAD\uD9BF\uDFFF"
JSON itself allows putting sequences of escape characters in the string that don’t unescape to valid Unicode. That’s fine, because the strings aren’t required to represent any particular encoding: it’s up to a layer higher than JSON to be opinionated about that.I wouldn’t want my shell’s pipeline buffers to reject data it doesn’t like, why should a JSON serializer?
The contents of JSON strings doesn’t admit random binary data. You need to use an encoding like Base64 for that purpose.