Comment by mananaysiempre 8 hours ago
WTF-8 is more or less the obvious thing to use when NT/Java/JavaScript-style WTF-16 needs to fit into a UTF-8-shaped hole. And yes, it’s UTF-8 except that you can also encode surrogates, as long as those surrogates don’t form a valid pair (in that case, use the normal four-byte UTF-8 encoding of the code point designated by the pair).
(Some people instead encode each WTF-16 surrogate independently regardless of whether it participates in a valid pair or not, yielding a UTF-8-like thing that is incompatible with UTF-8 beyond U+FFFF, usually called CESU-8. We don’t talk about those people.)
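To make the difference concrete, here is a rough Python sketch of both schemes, operating on a list of 16-bit code units. The function names (`encode_code_point`, `wtf16_to_wtf8`, `wtf16_to_cesu8`) are made up for illustration and don't come from any library; this is just the rule described above written out, not a reference implementation.

```python
def encode_code_point(cp):
    """Generalized UTF-8: encodes any code point, surrogates included."""
    if cp < 0x80:
        return bytes([cp])
    if cp < 0x800:
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        return bytes([0xE0 | (cp >> 12),
                      0x80 | ((cp >> 6) & 0x3F),
                      0x80 | (cp & 0x3F)])
    return bytes([0xF0 | (cp >> 18),
                  0x80 | ((cp >> 12) & 0x3F),
                  0x80 | ((cp >> 6) & 0x3F),
                  0x80 | (cp & 0x3F)])


def wtf16_to_wtf8(units):
    """WTF-8: valid surrogate pairs are combined, lone surrogates pass through."""
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (0xD800 <= u <= 0xDBFF
                   and i + 1 < len(units)
                   and 0xDC00 <= units[i + 1] <= 0xDFFF)
        if is_pair:
            # High + low surrogate: one supplementary code point, four bytes.
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            # BMP code unit or lone surrogate: encoded as-is, up to three bytes.
            cp = u
            i += 1
        out += encode_code_point(cp)
    return bytes(out)


def wtf16_to_cesu8(units):
    """CESU-8 style: every code unit is encoded independently, pairs included."""
    return b"".join(encode_code_point(u) for u in units)


smiley = [0xD83D, 0xDE00]            # U+1F600 as a valid surrogate pair
print(wtf16_to_wtf8(smiley).hex())   # f09f9880, same as real UTF-8
print(wtf16_to_cesu8(smiley).hex())  # eda0bdedb880, not valid UTF-8
print(wtf16_to_wtf8([0xD83D]).hex()) # eda0bd, a lone surrogate
```

For well-formed input the WTF-8 output is byte-for-byte ordinary UTF-8; the two schemes only agree with each other when no valid surrogate pair is present.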
The parent’s point was that “potentially ill-formed UTF-16” and “WTF-8” are inherently different encodings (16-bit word sequence vs. byte sequence), and thus not “aka”.