The strangest letter of the alphabet: The rise and fall of yogh
(deadlanguagesociety.com) 151 points by penetralium 9 hours ago
> one of the rarer sounds in the world's languages
Is that true? Seems like it's in every other word when I visit Spain...
I know Japanese does not have a th sound, and I don't think Chinese or most other Asian languages have it, but I am less sure about that. Unfortunately I lack the data needed to substantiate my claim.
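  -- Share of each language's sounds that are dental fricatives (or 'z').
  -- Assumes a table world_dictionary(lang, ipa). Splitting per character breaks
  -- multi-codepoint symbols such as 'θ̠', so that one will not match as written.
  with
  lang_sounds as (
    select
      lang,
      unnest(string_to_array(ipa, null)) as sound
    from world_dictionary
  ),
  totals as (
    select
      lang,
      count(sound) as sound_count
    from lang_sounds
    group by lang
  )
  select
    lang_sounds.lang,
    lang_sounds.sound,
    -- cast to numeric so the ratio is not truncated by integer division
    count(*)::numeric / totals.sound_count as share
  from
    lang_sounds join
    totals on
      lang_sounds.lang = totals.lang
  where lang_sounds.sound in ('θ', 'ð', 'θ̠', 'z')
  group by lang_sounds.lang, lang_sounds.sound, totals.sound_count
  order by share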
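For anyone without a database handy, the same per-language tally can be sketched in Python. The word lists below are a tiny made-up sample, not real data, and `TOY_DICTIONARY`, `split_sounds`, and `target_share` are hypothetical names. The point is only the shape of the computation, not the numbers it prints.

```python
import unicodedata
from collections import Counter

# Hypothetical toy sample: language -> a few IPA transcriptions (not a real dataset).
TOY_DICTIONARY = {
    "English": ["θɪŋk", "ðɪs", "kæt"],
    "Castilian Spanish": ["θiŋko", "kasa"],
    "Japanese": ["sakɯɾa", "neko"],
}

TARGETS = {"θ", "ð"}  # voiceless and voiced dental fricatives


def split_sounds(ipa):
    """Split an IPA string into symbols, keeping combining marks with their base."""
    sounds = []
    for ch in ipa:
        if unicodedata.combining(ch) and sounds:
            sounds[-1] += ch  # attach the diacritic to the previous symbol
        else:
            sounds.append(ch)
    return sounds


def target_share(words):
    """Fraction of all sounds in `words` that are in TARGETS."""
    counts = Counter(s for w in words for s in split_sounds(w))
    total = sum(counts.values())
    hits = sum(counts[t] for t in TARGETS)
    return hits / total if total else 0.0


shares = {lang: target_share(words) for lang, words in TOY_DICTIONARY.items()}
for lang, share in sorted(shares.items(), key=lambda kv: kv[1]):
    print(f"{lang}: {share:.3f}")
```

Treating one character as one sound is a simplification (affricates and long vowels are written with more than one symbol), so a real answer would need a proper phoneme inventory per language rather than a character count.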
This is just a feature of Castilian Spanish: https://en.wikipedia.org/wiki/Phonological_history_of_Spanis... The /th/ sound only occurs naturally in something like 5% of the thousands of human languages that have ever existed. Just because a few of those languages are among the most widely spoken worldwide does not make the sound a commonly occurring one in a meaningful phonological sense.
It’s true. English and the main variety of Spanish spoken in Spain are two of the few languages in the world which have the sound. Even most Latin American varieties of Spanish (maybe all?) do not have it.
Made me think of this SMBC comic[1], where there's a debate if being in English or Spanish, each with around a billion speakers, makes it rare or not.
> ‘ȝ’ was used to write two completely different sounds in Middle English
Was it, though? By comparing English and Dutch you can clearly see that one of the ways this harsh "gh" changed in English is that it became "y" as in "yesterday". "Weg" (Dutch) - "way" (English), "gister[en]" (Dutch) - "yester[day]" (English), etc. I wonder if at the time pronouncing it as "gh" was still common, which would make using the same letter in some words much more logical.
Maybe you didn't realize until now that it is indeed Wynn-DOS as in W-Disk Operating System
I like Apple's thinking - Option key
Super key could arguably apply to the shift keys, because you are using a super set of letters (or am I reaching too far)
"English spelling has a reputation. And it’s not a good one." - never have i ever agreed with anything more
different hill, but one I would die on is: as the letter "c" should make the "ch" sound, the letter "c" serves no purpose not already handled by "s" or "k" otherwise
https://guidetogrammar.org/grammar/twain.htm
For example, in Year 1 that useless letter "c" would be dropped to be replased either by "k" or "s", and likewise "x" would no longer be part of the alphabet.
The only kase in which "c" would be retained would be the "ch" formation, which will be dealt with later.
Year 2 might reform "w" spelling, so that "which" and "one" would take the same konsonant, wile Year 3 might well abolish "y" replasing it with "i" and iear 4 might fiks the "g/j" anomali wonse and for all.
Jenerally, then, the improvement would kontinue iear bai iear with iear 5 doing awai with useless double konsonants, and iears 6-12 or so modifaiing vowlz and the rimeining voist and unvoist konsonants.
Bai iear 15 or sou, it wud fainali bi posibl tu meik ius ov thi ridandant letez "c", "y" and "x" -- bai now jast a memori in the maindz ov ould doderez -- tu riplais "ch", "sh", and "th" rispektivli.
Fainali, xen, aafte sam 20 iers ov orxogrefkl riform, wi wud hev a lojikl, kohirnt speling in ius xrewawt xe Ingliy-spiking werld.
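The Year 1 rule in the passage is mechanical enough to sketch in a few lines of Python. This is a rough approximation, assuming "c" sounds like "s" before e, i, or y and like "k" elsewhere, and leaving "ch" alone as the text says. `year_one` is a made-up name, the "x" half of the rule is left out, and real English has plenty of exceptions (e.g. "ocean") that this ignores.

```python
import re


def year_one(text):
    """Replace 'c' with 's' before e/i/y and with 'k' otherwise; keep 'ch' untouched."""

    def swap(match):
        # Look at the letter right after the matched 'c' (empty at end of string).
        following = match.string[match.end():match.end() + 1].lower()
        repl = "s" if following in ("e", "i", "y") else "k"
        return repl.upper() if match.group().isupper() else repl

    # The negative lookahead skips any 'c' that starts the "ch" formation.
    return re.sub(r"[cC](?![hH])", swap, text)


print(year_one("case replaced consonant which"))
# prints: kase replased konsonant which
```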
I remember a version which ends with how we'll end up speaking German.
Recommend X for the ‘sh’ sound, as it is pronounced that way in languages like Portuguese. Y is a common typographical substitute for theta/thorn, as in “ye olde shoppe.”
The nice thing about this passage is it reflects the extent of Twain's non-rhotic dialect -- he keeps the R in "year"/"years", "orthographical", and "world" but drops it in "after", "letters", and "dodderers". So only dropped in final unstressed syllables of multi-syllable words.
English's spelling irregularities help with disambiguating homophones:
cent / sent / scent
ceiling / sealing
cite / sight / site
colonel / kernel
carrot / karat
cue / queue
Also read (present tense) and read (past tense) being pronounced differently despite the same spelling.
If you look up these words in the dictionary, the same word with the same spelling very often has several different definitions that are often unrelated because homographs (same spelling, but different meaning) are super-common in English. Dictionaries don't account for newer or more niche meanings of words either.
How is it that you can say these words without confusion?
Language is context sensitive and you understand the word based on the context around it. Likewise, you understand homographs based on the context. Because of this, spelling isn't as important as it might appear.
And cause confusion with needless heterographs?
practice / practise
licence / license
> It’s full of silent letters, as in numb, knee, and honour. A given sound can be spelled in multiple ways (farm, laugh, photo), and many letters make multiple sounds (get, gist, mirage).
that last one is hardly fair - gist and mirage are french words. might as well complain about the silent letters in rendezvous or faux pas.
Almost every English word is French, except for the most important ones.
Call me a douche, but the e in "touche" is silent, whereas that in "touché" is voiced.
I've played around with respelling quite a bit; one of the most difficult adaptations is forcing yourself to correctly use "dh" (few-but-common words, "thy", "either", "teethe") vs "th" (most words, "thigh", "ether", "teeth").
j -> dzh is more weird than anything.
Vowels, of course, are a cause of war between dialects; nobody can even agree how many there are.
Well, it would be a step backward in the right direction to go with spelling it 'kube' and pronouncing it 'koob'. That would hew to the original Greek. We'd also bring cybernetic back closer to kubernetes. And circle to kuklos. (Side note: It's another spelling "error" that we use 'y' in English to transliterate the Greek upsilon, which looks like 'Y' when capitalized, but is really a better match to 'u'. Hence, hyper and hypo instead of huper and hupo (like super and sub).)
kyube or kyoob would definitely be the way to go.
It's funny you use "tube" as an example though, as in my British accent I pronounce that as "chube", whereas I believe many Americans would use a "t" sound for that word. Not sure how you settle on a spelling in those cases.
Regional variations are available! I think the BBC would have had it pronounced tyoob. And don't Americans pronounce it "subway"?
I completely agree with you. I've taken an amateurish interest in linguistics over the past couple of years, and I've often thought that it might be a fun exercise to come up with a phonetic alphabet for the English language. Use the letter 'c' to represent /ch/, 'x' to represent /sh/, etc.
Maybe as a fun pet project someday!
> as the letter "c" should make the "ch" sound
What’s the ch sound? My intuition from German class is that ch represents a throaty hhhh. Somehow that got spoiled into k in most English words.
Every c in Pacific Ocean is pronounced differently. C is a silly letter.
"Ch" is a strange hill to die on. "Ch" has a mostly consistent pronunciation (eg chair, touch, chain, choke, recharge, etc) that no other letter combination does.
Exceptions to this are generally loan words, particularly from French (eg chaise, which sounds more like "sh"). Others are harder to explain. "Lichen" springs to mind. Yes it technically comes from Latin but we're beyond the time range to truly consider it a loan word.
There are also some "ch" words of Greek origin (IIRC) that could simply be replaced with "c" or "k" (eg chemistry, school).
"Kh" on the other hand I think is entirely loan words, particularly from Arabic. Even then we have names like "Achmed" that would more consistently be written as "Akhmed". "Khan" is obviously a loan word but I think time has largely reduced the pronunciation to "karn" rather than "kharn" if it ever was that.
But I can't think of a single "kh" word that is pronounced like "ch" in "chair".
"Sh" doesn't seem to cross over with any of these pronunciations.
my biggest TIL takeaway from that article was an "oh wow" moment:
The other sound that ‘ȝ’ once spelled is the “harsh” or “guttural” sound made in the back of the mouth, which you hear in Scots loch or German Bach. This sound is actually the reason for the most famous bit of English spelling chaos: the sometimes-silent, sometimes-not sequence ‘gh’ that you see in laugh, cough, night, and daughter. Maybe one day I’ll tell you that story too.
Took me a couple of seconds - Georgian .. as in the other country with a red cross on a white background (although you have a few more crosses than England)
I have no doubt that two very disparate languages and scripts will find a few similarities simply due to proximity. Georgia and England (UK) are close enough for a fair amount of cultural exchange.
If you start looking into it, you will surely be astonished at just how much “cultural exchange” there must have been going back even into Paleolithic times, and definitely during the period the OP article touches on.
People have an extremely distorted perspective on European history for many reasons, but the late industrial age nation state probably had the biggest impact on that mental model people still have today in many ways. By all evidence I’ve seen, the cultural exchange in the distant past was far more organic than most people can easily imagine today for many reasons. Trade and cultural communion, religious exchanges and defensive unions all made that possible in a world that was not at all as controlled and authoritarian as we even experience today. It all waxed and waned over the centuries and regions of course, in a rather organic manner; but due to practical limitations a lot of the authoritarian restrictions we are all subject to today simply did not exist.
In some ways the USA until about 1960 is probably the closest analogue of how Europe seems to have generally been for the longest time leading up to the Industrial Revolution. It was a land of general regions of self-regulating, cultural clustering with local levels of varying jurisdictions and power structures which to a large extent kept most people in their home region, if not their place of birth. By the latter part of that period the identity with one’s state and region and local culture had already largely succumbed to the oppressive force of the centralized dominating power of the federal and global power, but your region was still largely your cultural identity as a person and community.
That of course has all been totally razed and destroyed now and the USA effectively exists in name only today, which has been the case for an even longer time, but that’s a different topic altogether.
In the UK, yogh existed in Scotland a while longer than in England. You can observe it still in the name Menzies which is pronounced Ming-is there - the z in place of the yogh.
In the last week, our most famous Menzies passed away - politician Sir "Ming" Campbell.
Australia's longest serving prime minister https://en.wikipedia.org/wiki/Robert_Menzies also had the nickname 'ming'.
Some wag made a 'ming vase' of his face: https://www.portrait.gov.au/portraits/2004.176/ming-vase-sir...
Thanks for that. I always thought it was Greek or something like that. I wondered why we had a major political party led by a Greek guy.
But apparently, Macungie, PA has nothing to do with Mackenzie
Here's a curious thing I found when visiting government officials in Fiji a while back.
They all wrote the fancy 'g' rather than the simplified 'g' we use now. I assume they copied text from textbooks rather than (say) from teachers from England.
As a real young'un I used to attempt to do the same as an exercise but it's not easy to make it look good.
> "English spelling has a reputation. And it’s not a good one." - never have i ever agreed with anything more
Quick reminder that writing != language. Even the highest fidelity writing systems are lossy encoding systems. In fact, the more phonologically accurate a writing system is to its language, the more it obscures the history of its words, especially words borrowed from other languages.
So from the perspective of someone interested in etymology, English writing's tendency to preserve old and foreign spellings is a good thing.
Plus, a more phonetic writing system is also problematic for dialectal variation. I pronounce marry/Mary/merry identically, as well as bag/beg, but other dialects distinguish them. I don't think the written standard would benefit from spelling them identically. That's relevant for everyday use, not just upsetting etymology enthusiasts.
Of course it also depends on how conservative the language is, like Finnish orthography is practically IPA, and yet Finnish is a freaking time capsule for words like borrowed Proto-Germanic *kuningaz and *wīsaz, which became king and wise in English, but kuningas and viisas in modern Finnish. So you can have both phonemic writing as well as etymological transparency if your phonology doesn't change much.
Many Indian languages are written in scripts that mirror what is spoken. Silent letters don't exist and pronunciations that don't match the spelling are very rare. This does not preclude the existence of rich dialects and accents.
This increases the complexity of learning to write the language -- 56 letters in the alphabet, and each combination of consonant+vowel and consonant+consonant takes on a different letter form instead of just being a string of independent letters like English.
But reading / pronunciation is straightforward. (No we don't have spelling bees :) )
How does that play out for languages that use pictorial characters?
eg. Egyptian hieroglyphs, or Asian characters (esp. Korean, which has a relatively young alphabet that IIRC is phoneme-based, or Chinese, which has a very old character set that is used across multiple languages, eg. Mandarin/Cantonese/etc)
Time to get on the English/American spelling reform and alphabet reform soapbox. 54% of US citizens have a less than 6th grade reading ability and 21% are functionally illiterate. The cause of this is almost entirely non-phonetic/phonemic spelling.
We pretend phonics exists, but it's just a lie we tell little kids to kickstart their learning. In reality, English spelling is more like learning Kanji. The original meanings of the words are warped beyond belief and we tell the specific pronunciation of specific letter sequences based on the surrounding letter sequences (much like telling which Kanji reading to use based on the surrounding Kanji). Words aren't so much sounded out as memorized and because English has such a massive vocabulary, the memorization work needed to be proficient is very extensive.
The classic example of this is "ough" which has NINE different pronunciations for the same letters and no real rules to indicate which one should be used. Spelling reform would make such situations completely unnecessary.
Languages with more phonetic alphabets tend to have much higher literacy rates for the same education quality and literacy can be achieved much faster. This works because once you memorize the sounds the letters make, you can sound out any word or write any word (provided you pronounce it correctly). The memorization process slowly kicks in where common words are still sight-read, but that process can happen much sooner and the individual can start independent reading much earlier with a focus on comprehension rather than memorizing weird rules and exceptions.
English departments have done massive damage in this regard. English started finalizing how words would be spelled around the same time the great vowel shift happened and completely screwed up everything. We then mass-adopted words with foreign spellings that used completely different phonetic systems. Despite the issues, English departments insist that these bugs are actually features despite the great harm they cause students and not only codify them, but denigrate all attempts to fix the problems.
English departments aren't the only ones. Even 150 years ago when Webster was trying small spelling reforms (some stuck around and some did not), people complained that the writing was childish. When Teddy Roosevelt tried a further spelling reform of getting rid of unneeded letters, he was turned into a laughing stock for the same reason (again with a handful sticking around). Modern "text speak" is yet another unofficial attempt to simplify spelling so it is more consistent, but once again, better, shorter alternatives are derided as making someone look unintelligent.
This still doesn't deal with the more fundamental phoneme/alphabet mismatch though. English has 44 common phonemes and a bunch of less common and regional sounds (for example, the χ sound in "cloCK"). Our adopted Latin alphabet has 26 letters of which at least 3 are unneeded (C as K or S, Q as KW, and X as KS). This leads to a horrible situation where a lot of sounds no longer have letters (Futhorc didn't get all the sounds, but still did better with 33 letters of which something like 11 were vowels). Some English sounds like the S in "treaSure" seem to have no real, unique spelling at all. Others like th and th have no indicator if it is supposed to be voiced like "THen" or unvoiced like "THink" (we used to have thorn and eth for this). We have 18 unique consonants and 24 common consonant sounds.
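The letter arithmetic in that paragraph can be checked with a quick sketch. It takes the commenter's figures as given (24 common consonant sounds; C, Q, and X counted as redundant) and assumes a, e, i, o, u are the only vowel letters, with y counted as a consonant. The variable names are just for illustration.

```python
import string

letters = set(string.ascii_lowercase)   # the 26 letters of the Latin alphabet
vowel_letters = set("aeiou")            # assumption: y is counted as a consonant
redundant = set("cqx")                  # per the comment: C, Q, and X

consonant_letters = letters - vowel_letters          # 21 consonant letters
unique_consonants = consonant_letters - redundant    # 18 after dropping C, Q, X

CONSONANT_SOUNDS = 24  # figure quoted in the comment, not derived here

# Consonant sounds left without a letter of their own.
shortfall = CONSONANT_SOUNDS - len(unique_consonants)
print(len(consonant_letters), len(unique_consonants), shortfall)
# prints: 21 18 6
```

So on those figures at least six consonant sounds have to be written with digraphs or shared letters, which is where "sh", "ch", and the two "th" sounds come in.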
The vowel situation is even more dire. We have just 5 vowels and around 20 common vowels leaving each vowel desperately overloaded with all kinds of weird phonics "rules" and almost all of them having either multiple rules or different pronunciations for the same word (eg, "reed" vs "red" in "I read the book"). There needs to be massive vowel reform (either a ton of stable digraphs, diacritics, or more letters) so that sounds can be differentiated properly.
Spelling reform could all but eliminate our illiteracy problems and open a whole new world of possibilities to more than half of all Americans. In a world dominated by ever-increasing volumes of information, these people would have much better lives if we lowered the bar of learning to read to something more attainable.
I strongly object to the claim "In reality, English spelling is more like learning Kanji." as someone who had to learn both Chinese and English characters.
Both rely on groups of characters. Both are non-phonetic. Both rely on multiple memorized pronunciations for those character groups based on surrounding character group context. Both preserve symbol shape for reason of historical context.
There are certainly differences, but if you place current English spelling next to something like Shavian (or some other language with near-pure phonetic spelling), I'd say that Modern English learning patterns are closer to Kanji than the pure phonetic alphabets.
The trouble is that most people can read English effortlessly and are completely unconscious of its many, many inconsistencies. It's also not that hard for an average child to pick up. Also, I enjoy the sophistication of English because I've mastered it.
One thing that worries me is the widespread adoption of English words and nouns in many languages. The list is ever increasing, even though the words make absolutely no sense outside the context of English and cannot be adapted by a non-English speaker to have anything more than a single, rigid meaning. It's annoying enough for me when some books use French words. I don't know how everyone else copes.
As for literacy, I find it hard to believe the true statistics are as dire as you say but I'm prepared to accept that they are. Firstly, what are the statistics for contemporary societies with more sensible spellings? And can better education help? A final point: you clearly know far more about this topic than I do, but would adding half a dozen letters to the alphabet really help with increasing literacy?
> I find it hard to believe the true statistics are as dire as you say but I'm prepared to accept that they are.
https://www.thenationalliteracyinstitute.com/2024-2025-liter...
> It's annoying enough for me when some books use French words.
From around 1060 to 1360, French was the official language of England. It wasn't normal French though as William the Conqueror spoke Norman French. Both French dialects mixed in what can only be considered English style. For example Norman French said Warder while other French speakers said Guarder. English adopted both Warden and Guard, but gave them two different meanings. Overall, some 30% of our words are French though over 800 of the most common 1000 words are English in origin.
> would adding half a dozen letters to the alphabet really help with increasing literacy?
ITA (Initial Teaching Alphabet) shows the benefits and problems.
ITA students rocketed ahead the first couple of years and could read way more words than their traditional counterparts. The problem was the transition. Learning both systems seems to have evened things up or maybe even caused a net negative for ITA students. I believe this was because they had to learn two sets of spelling for everything. If you would like to see the difficulty in learning a new way to read/write and have a bit of fun, try learning Shavian script.
In an ideal world, they would have phonetic spelling only. I believe under those conditions that their advantage would continue to grow all the way through school. The problem is that this study is unethical to conduct because even if it is correct, the students would graduate and be unable to read traditional English which would permanently harm them.
This leaves the tricky problem of bridging the gap. This can't be done too quickly or the older generations get left behind. There's also an issue of transcribing everything into the new spelling. Technology has made that easier than ever, but it would still be a very hard proposition.
The first and easiest step is cleaning up the spellings using the letters we currently have. Stuff like all those -ough endings get rewritten in sane letters as an accepted alternative spelling. Silent letters start going away. We start moving toward consistent vowel and consonant digraphs. This will take time for older people to adapt to, but more consistent rules will mean they will have an easier time sounding them out.
After this, we start adding back letters. Maybe eth and thorn come back for the two "th" sounds. We certainly need a new letter for the S in "treaSure" and maybe bring back the elongated S to use for SH. At some point, we then start working on slowly adding new characters to stand in for the vowel digraphs.
I don't think you could convince adults to do more than a couple of steps at a time each generation. Such a plan would likely take decades to maybe even a century or two. Until the creation of the printing press, such slow changes were considered normal. Only in recent times have we attempted to gate-keep what "real English" is. If we allow the language to grow more organically, I think it could be guided into something far better than we have today.
OT, but I recently discovered that the Latin alphabet that Western languages use has its roots in the Semitic languages of the Middle East. It is of course obvious when you think about it; even the name alphabet is basically the same as aleph bet, the first two letters of Hebrew. It's even more obvious when you look at the similarities in the names of the letters of the Greek alphabet, which Latin is based on.
What happened, in short, was that the Greeks copied the ancient and now virtually defunct Phoenician script, varieties of which were used across the region, kept the names even though they made no sense in the context of Greek, added vowels, and wrote it from left to right.
The Russians adapted the script in one way, Latin in another, Hebrew and Arabic took entirely separate paths, and now the only thing they share in common is alphabets that follow vaguely the same ordering.
If this was a serious article they would have used gif as an example of the g sound /s :)
If we are voting on missing letters I want thorn (þ). My understanding is that thorn is one of the rarer sounds in the world's languages, and it deserves to get its own letter back.
On the topic of screwball spelling, there is this video essay on silent letters. The fun takeaway for me was that a lot of silent letters were never pronounced. It is just that when some of the first dictionaries were being produced and the spellings decided on, the compilers chose to introduce silent letters to indicate the origin of the word. The b in debt is there because it comes from the Latin debitum, but it was not spelled that way until the 1500s; prior to that it was dette.
https://www.youtube.com/watch?v=NXVqZpHY5R8 (RobWords: Why English is full of silent letters)