Comment by demosthanos

tptacek 7 hours ago

OK, fair enough, but can you find one in this article that's hard for an LLM? The gnarliest one I saw, 4o handled instantly, and I went back and looked carefully at the image and the text and I'm sold.

Like if this is a crowdsourcing project, why not do a first pass with an LLM and present users with both the image and the best-effort LLM pass?

Later

I signed up, went to the current missions, and they all seem to be post-1900 and all typeset. They're blurry, but 4o cuts through them like a hot knife through butter.


defaultcompany 5 hours ago

My parents have saved letters from their parents which are written in cursive but in two perpendicular layers. Meaning the writing goes horizontally in rows and then when they got to the end of the page it was turned 90 degrees and continued right on top of what was already there for the whole page. This was apparently to save paper and postage. It looks like an unintelligible jumble but my mother can actually decipher it. Maybe that’s what the LLMs are having trouble with?
Edit: apparently it’s called cross writing [1]
1: https://highshrink.com/2018/01/02/criss-cross-letters/

- tptacek 5 hours ago
  
  Are they having trouble? You can sign up right now and get tasks from the archive that seem trivial for 4o (by which I mean: feed a screenshot to 4o, get a transcription, and spot check it).
  
varenc 6 hours ago

My guess is because it’s the Smithsonian, they’re just not willing to trust an LLM’s transcription enough to put their name on it. I imagine they’re rather conservative. And maybe some AI-skeptic protectionist sentiments from the professional archivists. Seems like it could change with time though.

- ugh123 4 hours ago
  
  > My guess is because it’s the Smithsonian, they’re just not willing to trust an LLM’s transcription enough to put their name on it. I imagine they’re rather conservative
  I expect that's a common theme from companies like that, yet I don't think they understand the issue they think they have there.
  Why not have the LLMs do as much work as possible and have humans review and put their own name on it? Do you think they need to just trust and publish the output of the LLM wholeheartedly?
  I think too many people saw what a few idiot lawyers did last year and closed the book on LLM usage.
  
ellen364 2 hours ago

> Like if this is a crowdsourcing project, why not do a first pass with an LLM and present users with both the image and the best-effort LLM pass?
Possibly for the reason that came up in your other post: you mentioned that you spot checked the result.
Back when I was in historical research, and occasionally involved in transcription projects, the standard was 2-3 independent transcriptions per document.
Maybe the National Archives will pass documents to an LLM and use the output as 1 of their 2-3 transcriptions. It could reduce how many duplicate transcriptions are done by humans. But I'll be surprised if they jump to accepting spot-checked LLM output anytime soon.
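A minimal sketch of how agreement between 2-3 independent transcriptions could be checked mechanically, assuming Python and only the standard library's `difflib`. The function names, the 0.9 threshold, and the idea of routing low-agreement documents to a human are illustrative assumptions, not anything the archive is known to do.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> list[str]:
    """Lowercase and split on whitespace so case and spacing differences are ignored."""
    return text.lower().split()


def pairwise_agreement(transcriptions: dict[str, str]) -> dict[tuple[str, str], float]:
    """Return a word-level similarity ratio (0.0 to 1.0) for every pair of sources."""
    scores = {}
    for (name_a, text_a), (name_b, text_b) in combinations(transcriptions.items(), 2):
        matcher = SequenceMatcher(None, normalize(text_a), normalize(text_b), autojunk=False)
        scores[(name_a, name_b)] = matcher.ratio()
    return scores


def needs_review(transcriptions: dict[str, str], threshold: float = 0.9) -> bool:
    """Flag the document if any two transcriptions agree less than the (hypothetical) threshold."""
    return any(score < threshold for score in pairwise_agreement(transcriptions).values())


if __name__ == "__main__":
    # Hypothetical inputs: two human passes plus one LLM pass over the same page.
    sample = {
        "human_1": "the said John Hopper is in reduced and indigent circumstances",
        "human_2": "the said John Hopper is in reduced and indigent circumstances",
        "llm": "John Hopper requires the aid of his country",
    }
    for pair, score in pairwise_agreement(sample).items():
        print(pair, round(score, 2))
    print("needs review:", needs_review(sample))
```

The point of treating the LLM as just one named source is that a summary passed off as a transcription shows up as a low score against the human passes instead of slipping through a spot check.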

rtkwe 3 hours ago

One that requires additional work beyond simply feeding the image into the model would be this example, which is a mix of barely legible handwritten cursive and an easy-to-read typed form. [0] Initially 4o just transcribes (successfully) the bottom half of the text and has to be prompted to attempt the top half, at which point it seems to at best summarize the text instead of giving a direct transcription. [1] In fact it seems to mix up some portions of the latter half of the typed text with the written text in the portion of its "transcription" about "reduced and indigent circumstances".
[0] https://catalog.archives.gov/id/54921817?objectPage=8&object...
[1] Reproducing here since I cannot share the chat because it has user-uploaded images: "The text in the top half of the image is handwritten and partially difficult to read due to its cursive style and some smudging. Here's my best transcription attempt for the top section:
...resident within four? years, swears and says that the name of the John Hopper mentioned in the foregoing declaration is the same person, and he verily believes the facts as stated in the declaration are true.
He further swears that the said John Hopper is in reduced and indigent circumstances and requires the aid of his country.
The declarant further swears he has no evidence now in his power of service, except the statement of Capt. (illegible name), as to his reduced circumstances ...
Sworn to before me, this day...
Some parts remain unclear due to the handwriting, but let me know if you'd like me to attempt further clarification on specific sections!"

doodlebugging 4 hours ago

I'm doing some genealogy work right now on my family's old papers covering the time period from recent years back to the late 17th century. Handwriting styles changed a lot over the centuries and individuals can definitely be identified by their personal cursive style of writing and you can see their handwriting change as they aged.
Then you have the problem that some of these ancestors not only had terrible penmanship but also spelled multi-syllabic words phonetically since they likely were barely educated kids who spent more time when they were young working on the farm or ranch instead of attending school where they would've learned how to spell correctly.
I don't know whether your LLM can handle English words spelled phonetically written in cursive by an individual who had no consistency in forming letters in the words. It is clear after reading a lot of correspondence from this person that they ignored things that didn't seem important in the moment like dotting i's or crossing t's or forming tails on g's, p's, j's, or even beginning letters consistently since they switched between cursive and block letters within a sentence, maybe while they paused to clarify their thoughts. I don't know but it is fascinating to take a walk through life with someone you'll never meet and to discover that many of the things that seemed awesome to you as a kid were also awesome to them and that their life had so many challenges that our generations will never need to endure.
Some of my people have the most beautiful flowing cursive handwriting that looks like the cursive that I was taught in grade school. Others have the most beautiful flowing cursive with custom flourishes and adornments that make their handwriting instantly recognizable and easy to read once you understand their style.
I think there are plenty of edge cases where LLMs will take a drunkard's walk through the scribble and spit out gibberish.
I'm reminded of an old joke though.
Ronald Reagan woke up one snowy Washington, DC morning and took a look out of the window to admire the new-fallen snow. He enjoys the beautiful scene laid out before him until he sees tracks in the snow below his window and a message obviously written in piss that said - "Reagan sucks".
He dispatched the Secret Service to the site where samples were taken of the affected snow and photos of the tracks of two people were made.
After an investigation he receives a call from the Secret Service agent in charge who tells him he has some good news and some bad news for him.
The good news is that they know who pissed the message. It was George HW Bush, his Vice President. The bad news is that it was Nancy's handwriting.


tedunangst 6 hours ago

Something about extraordinary claims and extraordinary evidence? The evidence presented, a seemingly easily transcribed image, is hardly persuasive.


rtkwe 3 hours ago

Some are significantly harder to read. I took the page below and tried to get GPT 4o to transcribe it and it basically couldn't do it. I'm not going to sit and prompt hack for ages to see if it can, but it seems unable to tackle the handwritten text at the top. When I first just fed it the image and asked for a transcription, it only (but successfully) read the bottom portion; when prompted for a transcription of the top, it dropped into more of a summary of the whole document, mainly pulling some phrases from the bottom text. (Sadly can't share it but I copied its reply out in a comment upthread) [0]
It was more successful at a few others I tried, but like a lot of LLM output it's still a task that requires manual checking for accuracy, and for some documents prompt modification to get it to output what you need.
https://catalog.archives.gov/id/54921817?objectPage=8&object...
[0] https://news.ycombinator.com/item?id=42746490
