CorrectHorseBat a day ago

In German you have the same, only within one language. ß can be written as ss if it isn't available in a font, and a capital version was only officially adopted in 2017. So depending on the font and the Unicode version, the number of letters can differ.

  • kbelder a day ago

    "Traditionally, ⟨ß⟩ did not have a capital form, and was capitalized as ⟨SS⟩. Some type designers introduced capitalized variants. In 2017, the Council for German Orthography officially adopted a capital form ⟨ẞ⟩ as an acceptable variant, ending a long debate."

    Thanks, that is interesting!

  • guappa a day ago

    should "ß" == "ss" evaluate as true?

    • birn559 a day ago

      I don't see why it should. I also believe parent is wrong as there are unambiguous rules about when to use ß or ss.

      I never thought about it, but maybe there are rules that allow the code point for ß to be displayed as ss? At least (from experience as a user) there seems to be a single "ss" code point.

      • CorrectHorseBat a day ago

        >also believe parent is wrong as there are unambiguous rules about when to use ß or ss.

        I never said it was ambiguous; I said it depends on the Unicode version and the font you are using. How is that wrong? (It seems the capital of ß is still SS in the latest Unicode, but since ẞ is the preferred capital version now, this should change in the future.)

      • guappa a day ago

        Well, I don't speak German, I was asking.

        • birn559 a day ago

          I see, it wasn't clear to me at what level you were asking. The letter ß has never been generally equivalent to ss in the German language.

          From a user experience perspective though it might be beneficial to pretend that "ß" == "ss" holds when parsing user input.

int_19h 19 hours ago

That's not really any different from the distinction (or lack thereof) between "ae" and "æ". For that matter, in Russian there is a letter "ы" which is historically a digraph consisting of two separate letters, "ъ" and "i", and which has been treated as a single letter for so long that few people would even recognize it as a digraph. This kind of stuff is all language-specific, which is why for Wordle etc. you always need to be aware of the context, and this context will then unambiguously decide what constitutes a single letter.
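
For anyone who wants to poke at the ß cases from upthread, here is a minimal sketch in Python 3. It only uses the built-in `str` methods, so the results follow whatever Unicode data ships with the interpreter; the helper name `loosely_equal` is made up for illustration and is just one possible way to do lenient matching of user input.

```python
# Sketch of how Python 3 treats ß; results follow the interpreter's Unicode data.

def loosely_equal(a: str, b: str) -> bool:
    """Hypothetical helper: caseless comparison for lenient matching of user input."""
    return a.casefold() == b.casefold()

sharp_s = "ß"          # U+00DF LATIN SMALL LETTER SHARP S
capital_sharp_s = "ẞ"  # U+1E9E LATIN CAPITAL LETTER SHARP S

# Plain equality compares code points, so the two spellings differ.
strict = sharp_s == "ss"

# Default uppercasing still maps ß to the two-letter "SS",
# so the length of the string changes.
upper = sharp_s.upper()

# The capital form lowercases back to ß.
lower = capital_sharp_s.lower()

# Case folding maps ß to "ss", so a caseless comparison treats them as equal.
lenient = loosely_equal("Straße", "STRASSE")

# There is no comparable folding from æ to "ae".
ae = loosely_equal("æ", "ae")

print(strict, upper, len(upper), lower, lenient, ae)
```

So strict `==` says no, while `casefold()` gives the "pretend they are equal" behavior for input handling. None of this is locale-aware, so it says nothing about what counts as one letter in a given language.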