Comment by AdieuToLogic 6 months ago

> I don’t think I believe that OCR can’t do it but random humans can

Considering the people involved are experts in their field, are certainly aware of OCR capabilities, and have publicized a need thusly:

  ... the National Archives is looking for volunteers who can 
  help transcribe and organize its many handwritten records ...

Perhaps "random humans" can perform tasks which could reshape your belief:

> OCR is VERY good

tptacek 6 months ago

No. Sign up and look at the current missions. A lot of what they want transcribed is totally straightforward to OCR --- not even LLM, OCR. Whatever's going on, and I'm not second-guessing them, a pretty big chunk of their problem appears to be well within the state of the art. The appeal to authority isn't going to play here, because you can just click through to the archives and see what they're trying to figure out.

  • AdieuToLogic 6 months ago

    > No. Sign up and look at the current missions. A lot of what they want transcribed is totally straightforward to OCR --- not even LLM, OCR. Whatever's going on, and I'm not second-guessing them, a pretty big chunk of their problem appears to be well within the state of the art.

    If it's that easy, then do it and be the hero they want.

    Or maybe, just maybe, "a pretty big chunk of their problem appears to be well within the state of the art" is a sweeping generalization lacking understanding of the difficulties involved.

    • tptacek 6 months ago

      Go ahead and find something hard, and relate back the steps you took to find it.

      • AdieuToLogic 6 months ago

        > Go ahead and find something hard, and relate back the steps you took to find it.

        This is a strawman[0] argument. You proclaimed:

          A lot of what they want transcribed is totally
          straightforward to OCR
        
        And I replied:

          If it's that easy, then do it and be the hero
          they want.
        
        So do it or do not. Nowhere does my finding "something hard" have any relevance to your proclamation.

        0 - https://en.wikipedia.org/wiki/Straw_man

jncfhnb 6 months ago

Also, you seem to have taken issue with the phrase “random humans” because you’re confused at what’s being done here. It is random humans. Non experts.

Experts are asking for the help of non experts.

> Anyone with an internet connection can volunteer to transcribe historical documents and help make the archives’ digital catalog more accessible

  • AdieuToLogic 6 months ago

    > Also, you seem to have taken issue with the phrase “random humans” because you’re confused at what’s being done here. It is random humans. Non experts.

    I'm largely aligned with your interpretation of "random humans", with a clarification below. The experts I was referencing are the ones you identified:

    > Experts are asking for the help of non experts.

    The call to action by the archivists (experts), IMHO, has the intent to engage people with interest in the topic. So not really random from a mathematical definition, but perhaps better thought of as "unknown interested parties."

    Granted, this is my unsubstantiated opinion.

jncfhnb 6 months ago

There are conceivable reasons why they may be telling a half truth here. Just engaging the public is a worthy goal here.

  • AdieuToLogic 6 months ago

    > There are conceivable reasons why they may be telling a half truth here. Just engaging the public is a worthy goal here.

    Asserting an ulterior motive without supporting proof is to engage in conspiracy theories.

    Sometimes a cigar is just a cigar.[0]

    0 - https://quoteinvestigator.com/2011/08/12/just-a-cigar/

    • jncfhnb 6 months ago

      The alternative is me saying that appealing to their “expertise” is an appeal to authority fallacy that flies in the face of general evidence that modern OCR is far better than humans at character recognition. Especially random non specialized humans.

    • Dylan16807 6 months ago

      It doesn't look like a cigar (very tricky documents) though. Hence the skepticism.