Comment by unbalancedevh 7 days ago
> The fact that dz is treated as a single letter in Hungarian means that if you search for “mad”, it should not match “madzag” (which means “string”) because the “dz” in “madzag” is a single letter and not a “d” followed by a “z”, no more than “lav” should match “law” just because the first part of the letter “w” looks like a “v”.
This doesn't seem right. If the individual letters "d" and "z" exist, then it should be possible to have them next to each other in a text file without them necessarily collapsing into a single letter -- especially if they're actually represented as separate characters, which they are in the example. Even if the letter "w" couldn't be represented directly and had to be typed as "uu", you wouldn't want the word "vacuum" to be interpreted as having a "w"!
Yes, I'm Hungarian, and I'm not even mad (pun intended) about "mad" matching "madzag". I find that we ourselves sometimes conflate characters and letters, so many people's first thought would be that "madzag" has six letters. I think most other digraphs, e.g. "sz" or "gy", are considered more tightly bound, so one would be unlikely to say that "szám" (= number) has four letters rather than three.
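For what it's worth, the two views discussed here (matching on characters vs. matching on letters) can be sketched in a few lines of Python. This is only a rough illustration, not how any real search or collation library does it: the names `HUNGARIAN_MULTIGRAPHS`, `to_letters`, `letter_search` and `char_search` are made up for this example, and the greedy splitting is a simplifying assumption that cannot tell a genuine digraph from two separate letters that happen to meet, for instance at a compound boundary.

```python
# Hypothetical sketch: names and approach are assumptions for illustration only.
# Hungarian multi-character letters, longest first so "dzs" wins over "dz".
HUNGARIAN_MULTIGRAPHS = ("dzs", "cs", "dz", "gy", "ly", "ny", "sz", "ty", "zs")


def to_letters(word):
    """Greedily split a word into Hungarian letters.

    Naive assumption: every occurrence of a multigraph is one letter.
    """
    lower = word.lower()
    letters = []
    i = 0
    while i < len(lower):
        for multigraph in HUNGARIAN_MULTIGRAPHS:
            if lower.startswith(multigraph, i):
                letters.append(multigraph)
                i += len(multigraph)
                break
        else:
            # No multigraph starts here, so take a single character.
            letters.append(lower[i])
            i += 1
    return letters


def letter_search(needle, haystack):
    """True if the needle's letter sequence occurs in the haystack's letter sequence."""
    n = to_letters(needle)
    h = to_letters(haystack)
    return any(h[k:k + len(n)] == n for k in range(len(h) - len(n) + 1))


def char_search(needle, haystack):
    """Plain character-level substring search, ignoring case."""
    return needle.lower() in haystack.lower()


print(to_letters("madzag"))            # letter-level view of the word
print(char_search("mad", "madzag"))    # character-level match
print(letter_search("mad", "madzag"))  # letter-level match
```

Under these assumptions the character-level search finds "mad" in "madzag", while the letter-level search does not, because the third letter of "madzag" is "dz" rather than "d". Which of the two behaviours a user actually expects is exactly the point being argued above.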