Comment by Ygg2
Comment by Ygg2 4 days ago
Theoretically yes. Practically there is character escaping.
That kills any non-allocation dreams. Moment you have "Hi \uxxxx isn't the UTF nice?" you will probably have to allocate. If source is read-only you have to allocate. If source is mutable you have to waste CPU to rewrite the string.
I'm confused why this would be a problem. UTF-8 and UTF-16 (the only two common unicode subsets) are a maximum of 4 bytes wide (and, most commonly, 2 in English text). The ASCII representation you gave is 6-bytes wide. I don't know of many ASCII unicode representations that have less bytewidth than their native Unicode representation.
Same goes for other characters such as \n, \0, \t, \r, etc. All half in native byte representation.