Comment by xdennis 13 hours ago

6 replies

How is Unicode in any way related to JSON? JSON should just encode whatever dumb data someone wants to transport.

Unicode validation/cleanup should be done separately because it's needed in multiple places, not just JSON.

layer8 13 hours ago

The contents of JSON strings don’t admit arbitrary binary data. You need to use an encoding like Base64 for that purpose.
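
To make the Base64 suggestion concrete, here is a minimal sketch using Python's standard `json` and `base64` modules. The `"data"` key and the sample bytes are made up for illustration; nothing in the thread prescribes a particular field layout.

```python
import base64
import json

# Arbitrary bytes that are not valid UTF-8 (illustrative sample only).
raw = bytes([0x00, 0x89, 0xFF, 0xFE])

# Passing bytes to json.dumps directly raises TypeError,
# so we first turn them into Base64 ASCII text.
payload = json.dumps({"data": base64.b64encode(raw).decode("ascii")})

# The receiver reverses both steps to recover the original bytes.
decoded = base64.b64decode(json.loads(payload)["data"])
```

The trade-off is size: Base64 emits four output characters for every three input bytes, and both sides have to agree out of band that the field is Base64.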

zzo38computer 5 hours ago

JSON (unfortunately) requires strings to be Unicode. (JSON has other problems too, but Unicode is one of them.)

recursive 13 hours ago

JSON is text. If you're not going to use Unicode in the representation of your text, you'll need some other way.

  • dcrazy 12 hours ago

    The current JSON spec mandates UTF-8, but practically speaking encoding is a higher-level concept. I suspect there are many server implementations that will respect the charset parameter of the Content-Type header in a POST request containing JSON.

  • ninkendo 11 hours ago

    So?

    All the letters in this string are “just text”:

        "\u0000\u0089\uDEAD\uD9BF\uDFFF"
    
    JSON itself allows escape sequences in a string that don’t unescape to valid Unicode. That’s fine, because the strings aren’t required to represent any particular encoding: it’s up to a layer higher than JSON to be opinionated about that.

    I wouldn’t want my shell’s pipeline buffers to reject data it doesn’t like, why should a JSON serializer?

    • recursive 11 hours ago

      I actually agree, now that I understand what you're talking about.
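
As a rough illustration of ninkendo's point, the sketch below feeds that same string to Python's standard `json` module. This shows how one particular parser behaves, not what every JSON implementation does; others may reject the input or substitute replacement characters.

```python
import json

# The JSON document from the comment above, as raw text.
doc = r'"\u0000\u0089\uDEAD\uD9BF\uDFFF"'

# The parser accepts it: \uDEAD stays a lone surrogate, while
# \uD9BF\uDFFF is combined into a single code point.
s = json.loads(doc)
code_points = [hex(ord(ch)) for ch in s]

# The JSON layer round-trips the string without complaint...
round_trips = json.loads(json.dumps(s)) == s

# ...but a stricter layer (here, UTF-8 encoding) rejects the lone surrogate.
try:
    s.encode("utf-8")
    encodable = True
except UnicodeEncodeError:
    encodable = False
```

Here the JSON layer passes the data through unchanged, and the validity check only happens when some other layer tries to encode the string as UTF-8.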