Comment by AdieuToLogic 6 hours ago

> No. Sign up and look at the current missions. A lot of what they want transcribed is totally straightforward to OCR --- not even LLM, OCR. Whatever's going on, and I'm not second-guessing them, a pretty big chunk of their problem appears to be well within the state of the art.

If it's that easy, then do it and be the hero they want.

Or maybe, just maybe, "a pretty big chunk of their problem appears to be well within the state of the art" is a sweeping generalization lacking understanding of the difficulties involved.

tptacek 6 hours ago

Go ahead and find something hard, and relate back the steps you took to find it.

  • AdieuToLogic 5 hours ago

    > Go ahead and find something hard, and relate back the steps you took to find it.

    This is a strawman[0] argument. You proclaimed:

      A lot of what they want transcribed is totally
      straightforward to OCR
    
    And I replied:

      If it's that easy, then do it and be the hero
      they want.
    
    So do it or do not. Nowhere does my finding "something hard" have any relevance to your proclamation.

    0 - https://en.wikipedia.org/wiki/Straw_man

    • Dylan16807 5 hours ago

      There are two claims. The main one is that all of these documents are easy to individually transcribe by machine. The other is that a whole lot can be OCR'd, which is pretty simple to check.

      That's not a claim that processing the entire archive would be trivial. And even if it was, whether that would make someone the "hero they want" is part of what's being called into question.

      So your silly demand going unmet proves nothing.

      Also, "give me an example please" is not a strawman!

      If you actually want to prove something, you need to show at least one document in the set that a human can transcribe but a machine cannot, or, to really make a good point, you need to show that a non-negligible fraction of the set fits that description.

      • AdieuToLogic 5 hours ago

        > So your silly demand going unmet proves nothing.

        I made demands of no one.

        > Also, "give me an example please" is not a strawman!

        I identified the strawman because it reframed my point as "find something hard" when what I had said was "be the hero they want", and because what is needed in this specific problem domain may be more difficult than a generalization can address.

        > If you actually want to prove something, you need to show at least one document in the set that a human can do but not a machine, or to really make a good point you need to show that a non-neglibile fraction fit that description.

        Maybe this is the proof you demand.

        LLMs are statistical prediction algorithms. As such, they are nondeterministic and, therefore, provide no guarantees as to the correctness of their output.

        The National Archives have specific artifacts requiring precise textual data extraction.

        Use of nondeterministic tools known to produce provably incorrect results eliminates their applicability in this workflow, because all of their output requires human review. This is an unnecessary step and can be eliminated by having a human read the original text themselves.
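
        As a toy illustration of that nondeterminism (plain standard-library Python, not any particular model; the vocabulary and probabilities below are made up):

          import random

          # Toy "next token" distribution. A sampling-based LLM decoder draws
          # from a predicted distribution in much the same way, so repeated
          # runs can disagree with each other.
          vocab = ["1945", "1943", "1948"]
          probs = [0.6, 0.3, 0.1]

          # Sampled decoding: the "transcription" can differ from run to run.
          print([random.choices(vocab, weights=probs, k=1)[0] for _ in range(5)])

          # Greedy decoding (argmax) is repeatable, but repeatable is not the
          # same as correct: the most probable token can still be wrong.
          print(max(zip(probs, vocab))[1])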

        Does that satisfy your demand?

        • Dylan16807 4 hours ago

          > I made demands of no one.

          Whatever you want to call "If it's that easy, then do it"

          > LLMs [...] Does that satisfy your demand?

          That's a different argument from the one above, where you were trying to contradict tptacek. And that argument is itself flawed. In particular, humans don't have guarantees either.

          > provably incorrect results

          This gets back to the actual request from earlier, which is showing an example where the machine performs below some human standard. Just pointing out that LLMs make mistakes is not enough proof of incorrectness in this specific use case.

    • tptacek 5 hours ago

      I did in fact do it, and what I got was much, much easier than the samples in the article, which 4o did fine with. I'm sorry, but I declare the burden of proof here to be switched. Can you find a hard one?
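
      (For the "straightforward to OCR" baseline, the check itself is only a few lines; here is a minimal sketch, assuming the Tesseract binary plus the pytesseract and Pillow packages are installed and that a page image has been saved locally as scan.jpg, a hypothetical filename. The run I described above used 4o rather than Tesseract.)

        # Hedged sketch: run plain OCR over one downloaded page image.
        # Assumes `tesseract` is on PATH and `pip install pytesseract pillow`;
        # "scan.jpg" is a hypothetical local filename standing in for a scan.
        from PIL import Image
        import pytesseract

        print(pytesseract.image_to_string(Image.open("scan.jpg")))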

      (I don't think you need to Wikipedia-cite "straw man" on HN).

      • AdieuToLogic 5 hours ago

        > I did in fact do it, and what I got was much, much easier than the samples in the article, which 4o did fine with.

        Awesome.

        Can you guarantee its results are completely accurate every time, with every document, and need no human review?

        > I'm sorry, but I declare the burden of proof here to be switched.

        If you are referring to my statement:

          If it's that easy, then do it and be the hero they want.
        
        Then I don't really know how to respond. Otherwise, if you are referring to my statement:

        > Perhaps "random humans" can perform tasks which could reshape your belief:

        >> OCR is VERY good

        To which I again ask, can you guarantee the correctness of OCR results will exceed what "random humans" can generally provide? What about "non-random motivated humans"?

        My point is that automated approaches to tasks such as what the National Archives have outlined here almost always require human review/approval, as accuracy is paramount.

        > (I don't think you need to Wikipedia-cite "straw man" on HN).

        I do so for two purposes. First, if I misuse a cited term, someone here will quickly correct me. Second, there is always a chance that someone new here is unaware of the cited term(s).