Comment by BugsJustFindMe

BugsJustFindMe a year ago

> I don’t think I believe that OCR can’t do it but random humans can

I do.

> OCR is VERY good

Uh, my experience is extremely different.

jncfhnb a year ago

I would challenge you to find a picture of text that you think a human can read and OCR cannot. I’m happy to demonstrate. The text shown in this article is trivial.

demosthanos a year ago

The archivists themselves say that they run into such texts often enough that this program was needed:
> The agency uses artificial intelligence and a technology known as optical character recognition to extract text from historical documents. But these methods don’t always work, and they aren’t always accurate.
They are absolutely aware of the advances in these tools, so if they say they're not completely there yet I believe them. One likely reason is that the models probably have less 1800s-era cursive in their training set than they do modern cursive.
It's likely that with more human-tagged data they could improve on the state of the art for OCR, but it's pretty arrogant to doubt the agency in charge of this sort of thing when they say the tech isn't there yet.

- tedunangst a year ago
  
  Can someone please post a sample of one of these images that can only be read by a human for us naive OCR believers to see?
  
  CamperBob2 a year ago
  
  To be fair there was a similar discussion a few days ago in which an SME remained unconvinced: https://news.ycombinator.com/item?id=42566391
  I don't necessarily agree with her conclusion because she wasn't participating directly in the thread and wasn't completely responsive to some of the points raised, but still, it appears that there are a few instances of difficult-to-read handwriting where OCR is still coming in second to skilled human interpretation.
  
  jncfhnb a year ago
  
  That’s comprehension of English, not reading characters.
  
  BugsJustFindMe a year ago
  
  I've posted these above, but I'll give you your own copy because the bits are free. Does your OCR work on these? Mine sadly doesn't. But if yours does, then I'll switch to it.
  https://imgur.com/a/CDU6Lgs
  
- jncfhnb a year ago
  
  Then please provide a single example that we can’t instantly solve. Happy to prove them wrong.
  
AdieuToLogic a year ago

> I would challenge you to find a picture of text that you think a human can read and OCR cannot.
Are you aware of CAPTCHA[0] images?
0 - https://en.wikipedia.org/wiki/CAPTCHA

- jncfhnb a year ago
  
  Text that is _intentionally constructed_ to fool computers but not humans is obviously out of scope. But they’re generally easily solved with OCR these days anyway.
  
- jahewson a year ago
  
  Solvable with the right tools.
  https://github.com/noCaptchaAi/NoCaptcha-Ai-Browser-Extensio...
  
  AdieuToLogic a year ago
  
  > Solvable with the right tools.
  The original assertion was:
  > I would challenge you to find a picture of text that you think a human can read and OCR cannot.
  Not whether many CAPTCHA image challenges could be automated. A tool only meets that challenge if it guarantees 100% correct solutions for all manipulated text images.
  
  CamperBob2 a year ago
  
  The AI models are now better at CAPTCHAs than I am, for both text- and image-based questions. But when confronted with a CAPTCHA, humans work for free, and the models don't. :(
  As long as that's the case, CAPTCHAs probably won't be considered truly obsolete.
  
BugsJustFindMe a year ago

Yeah ok, but it might take me a few tries because I don't know what you're using. I hope that's agreeable?
What does your OCR say that these say? The first one isn't too hard for a human (assuming appropriate language skill). The second one is a bit more difficult.
https://imgur.com/a/CDU6Lgs

CamperBob2 a year ago

Your experience is obsolete.

BugsJustFindMe a year ago

Oh, ok then.

- CamperBob2 a year ago
  
  I mean, all you have to do is feed the image to ChatGPT, and it will read it basically as well as you can.
  Denying/downvoting reality is always an option, of course.
  
  BugsJustFindMe a year ago
  
  Can you feed these to ChatGPT and tell me what it says they say?
  https://imgur.com/a/CDU6Lgs
  It gets them wrong for me, but maybe it will get them right for you. Maybe you're better at prompting or have access to a better model or something.
  
  bigstrat2003 a year ago
  
  Not being rude was also an option, one you chose not to take for some reason. Seriously, all it would've taken was for you to say something like "there have been a lot of advancements so it's probably different than you remember". This conversation would've gone much smoother for you if you had.
  And BugsJustFindMe can't downvote you, because it was a reply to him. So don't bite his head off over it. You got downvoted because you were a jerk, plain and simple.
  
  CamperBob2 a year ago
  
  > Not being rude was also an option
  Refraining from reflexively pooh-poohing AI with uninformed and/or out-of-date opinions is also an option, but not one often exercised on HN.
  It gets old not being able to carry on a discussion without squinting at grayed-out text, simply because someone pointed out that humans aren't robots and should no longer have to emulate them.
  