The Writing Is on the Wall for Handwriting Recognition
(newsletter.dancohen.org)52 points by speckx 7 days ago
I have a personal corpus of letters between my grandparents in WW2, my grandfather fighting in Europe and my grandmother in England. The ability of Claude and ChatGPT to transcribe them is extremely impressive, though I haven’t worked on them in months, so this reflects older models. At that time, neither system could properly organize pages, and ChatGPT would sometimes skip a paragraph.
Shhhhh no one cares about data contamination anymore.
> Here’s Transkribus’s best guess at George’s letter to Maryann, above:
Transkribus has a new model architecture around the corner, and the results look impressive, not only for trivial cases like text but also for table structures and layout.
Best of all, you can train it on your own corpus of text to support obscure languages and handwriting systems.
Really looking forward to it.
As always, this depends on the amount of training data available. Japanese is another success story: https://digitalorientalist.com/2020/02/18/cursive-japanese-a...
Seems to do an OK job:
https://g.co/gemini/share/e173d18d1d80
This is a random image from Twitter with no transcript or English translation provided, so it's not going to be in the training data.
No, the transcription has nothing to do with the written text. It guessed a few words here and there, but not even the general topic. That's a doctor's note about a patient visit, beginning with "Прием: состояние удовл., t*, но кашель / patient visit: condition is OK, t (temperature normal?) but coughing". But unreadable doctors' handwriting is a meme...
That's Gemini 2.5 Flash btw
The result from Gemini 3 Pro using the default media resolution (the medium one): "(Заголовок / Header): Арсеньев (Фамилия / Surname - likely "Arsenyev")
Состояние удовл-
t N, кожные
покровы чистые,
[л/у не увел.]
В зеве умерен. [умеренная]
гипер. [гиперемия]
В легких дыха-
ние жесткое, хрипов
нет. Тоны серд-
[ца] [ритм]ичные.
Живот мяг-
кий, б/б [безболезненный].
мочеисп. [мочеиспускание] своб. [свободное]
Ds: ОРЗ [или ОРВИ]" and with the translation: "Arsenyev
Condition satisfactory.
Temp normal, skin coverings [skin] are clean, lymph nodes not enlarged.
In the throat [pharynx], moderate hyperemia [redness].
In the lungs, breathing is rigid [hard], no rales [crackles/wheezing].
Heart tones are rhythmic.
Abdomen is soft, painless.
Urination is free [unhindered].
Diagnosis: ARD (Acute Respiratory Disease)."
If I went back in time to the 90s when I was doing my PhD I would absolutely blow my mind with how well handwriting OCR works now.
I became convinced of this after the release of KuroNet: https://arxiv.org/pdf/1910.09433 (High-quality OCR of Japanese manuscripts, which look almost impossible to read.)