Comment by 0xCE0 2 days ago

The original intent of Unicode was great: a standard that creates a mapping between a unique number (the codepoint) and a specific character of a language (here "character" means only the abstract, non-visual symbol and its meaning, not a visually rendered glyph in any particular font). Later Unicode versions added more languages, even dead ones, so it was basically a historical knowledge effort as well.
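To make the number-to-character mapping concrete, here is a minimal Python sketch using the standard `unicodedata` module. The helper name `describe` is just something made up for this illustration, and the sample characters are arbitrary picks (including one from a dead script).

```python
import unicodedata


def describe(ch: str) -> str:
    """Return the codepoint and the official character name for one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"


# A Latin letter, a currency sign, and an Egyptian hieroglyph (U+13000).
for ch in ["A", "\u20ac", "\U00013000"]:
    print(describe(ch))
```

Nothing in the output says anything about fonts or glyph shapes; the mapping is purely number to abstract character.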

Then came emojis, and now the Unicode Consortium's efforts in new Unicode versions seem to be about adding more kinds of poop emojis and shades of skin color. Well, maybe that accurately reflects the language and culture of our time.

UTF-8 is great because it is a superset of ASCII, but because its byte width varies, decoding and encoding it is more complex (similar to constant-width vs. variable-width ISAs in CPUs).
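A rough Python sketch of where that complexity comes from: the lead byte of each sequence tells the decoder how many bytes to consume. The function names `utf8_width` and `codepoint_widths` are hypothetical, and the sketch assumes well-formed input (it does not check continuation bytes or overlong forms, which a real decoder must do).

```python
def utf8_width(lead: int) -> int:
    """Return the sequence length implied by a UTF-8 lead byte."""
    if lead < 0x80:            # 0xxxxxxx: plain ASCII, one byte
        return 1
    if 0xC2 <= lead <= 0xDF:   # 110xxxxx: two-byte sequence
        return 2
    if 0xE0 <= lead <= 0xEF:   # 1110xxxx: three-byte sequence
        return 3
    if 0xF0 <= lead <= 0xF4:   # 11110xxx: four-byte sequence
        return 4
    raise ValueError(f"invalid UTF-8 lead byte: {lead:#04x}")


def codepoint_widths(data: bytes) -> list[int]:
    """Walk well-formed UTF-8 and return the byte width of each codepoint."""
    widths = []
    i = 0
    while i < len(data):
        width = utf8_width(data[i])
        widths.append(width)
        i += width
    return widths


sample = "a\u00e9\u20ac\U0001F4A9"  # a, e-acute, euro sign, pile of poo
print(codepoint_widths(sample.encode("utf-8")))  # [1, 2, 3, 4]

# ASCII text encodes to the same bytes in UTF-8.
print("abc".encode("utf-8") == "abc".encode("ascii"))  # True
```

With a fixed-width encoding such as UTF-32 the loop above would just be a division by four, which is the trade-off the ISA comparison is pointing at.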

Different languages have different concepts, e.g. text direction or flow (left/right, up/down, characters/logograms, different kinds of visual cues etc.). Humans create problems when they want to combine different languages at the same time. E.g. mathematical notation is in my opinion 2D graphics, and it usually cannot be inlined with text glyphs in an aesthetically pleasing way. The same kind of problem may come up when trying to inline languages with different flow directions. It's like trying to combine native GUI widgets in Win32 and Cocoa/SwiftUI and GTK/Qt/wxWidgets: the (visual) languages don't have the same concepts, or the concepts conflict.
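As a small illustration of the flow-direction point, Unicode assigns every character a bidirectional class, which Python exposes through `unicodedata.bidirectional`. This sketch only looks up those classes; it does not implement the actual reordering algorithm, and `bidi_classes` is a made-up helper name.

```python
import unicodedata


def bidi_classes(text: str) -> list[str]:
    """Return the Unicode bidirectional class of each character in text."""
    return [unicodedata.bidirectional(ch) for ch in text]


# Latin letters followed by Hebrew letters in one string.
mixed = "ab\u05d0\u05d1"
print(bidi_classes(mixed))  # ['L', 'L', 'R', 'R']
```

A renderer has to resolve runs of `L` and `R` characters into one visual order, which is where mixing directions starts to get complicated.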

Rendello a day ago

For what it's worth, the Unicode Consortium seems to have been trying to rein in the emoji explosion over the last few years. For example, they won't process any new proposals for flags [1][2]:

> The Unicode Consortium will no longer accept proposals for flags. Flags that correspond to officially assigned ISO 3166-1 alpha-2 region codes are automatically added, with no proposals necessary.
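The reason no proposal is needed is that a flag is encoded as a pair of Regional Indicator Symbol codepoints (U+1F1E6 to U+1F1FF), one per letter of the region code. A minimal Python sketch of that mapping follows; `flag_for_region` is a hypothetical helper, and whether the pair actually renders as a flag depends on the font and vendor.

```python
REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag_for_region(code: str) -> str:
    """Map a two-letter region code to its pair of regional indicator symbols."""
    code = code.upper()
    if len(code) != 2 or not (code.isascii() and code.isalpha()):
        raise ValueError(f"expected a two-letter region code, got {code!r}")
    return "".join(
        chr(REGIONAL_INDICATOR_A + ord(letter) - ord("A")) for letter in code
    )


flag = flag_for_region("FI")
print([f"U+{ord(ch):X}" for ch in flag])  # ['U+1F1EB', 'U+1F1EE']
```

The sketch does not check that the code is actually assigned in ISO 3166-1; it only shows the letter-to-codepoint arithmetic.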

And they decided against adding Multi-skintoned Families to the RGI [3]; that is, vendors can encode them if they really want to, but it's not recommended. Apple, for example, replaced their more complex family emoji with the recommended silhouettes afterwards [4].

1. https://blog.unicode.org/2022/03/the-past-and-future-of-flag...

2. https://unicode.org/emoji/proposals.html#Flags

3. https://www.unicode.org/L2/L2020/20114-family-emoji-explor.p...

4. https://blog.emojipedia.org/ios-17-4-emoji-changelog/

arp242 a day ago

The emoji horse bolted long before Unicode. Everything from MSN Messenger to web forums had their own implementation of it. And that was a continuation of various ASCII emojis ranging from the simple :-) to the more complex ¯\_(ツ)_/¯

To say nothing of the fact that various emojis have been part of Unicode since pretty much the start, and were part of other encoding schemes as well (notably in Japan, but also e.g. the "Outlook J").

And if you actually look at the changes in Unicode versions, you'll see there are tons of language-related changes in every one. To say Unicode updates are just about emoji updates is just silly. The reason you don't notice is that these changes are mostly for small languages, obscure features in larger languages, historical languages, and things like that.

0xCE0 14 hours ago

Great corrections from both of you.
