Comment by og_kalu
It's not irrelevant, because this is an argument about whether the machine can be said to be reasoning or not.
If Alice had concluded that this occasional-mistake NN calculator was 'not really performing algebra', then Bob would be well within his rights to ask Alice what on earth she was going on about.
> If Alice had concluded that this occasional-mistake NN calculator was 'not really performing algebra', then Bob would be well within his rights to ask Alice what on earth she was going on about.
No, your burden of proof here is totally bass-ackwards.
Bob's the one who asked for blind trust that his magical auto-learning black-box would be made to adhere to certain rules... but the rules and trust are broken. Bob's the one who has to start explaining the discrepancy, and whether the failure is (A) a fixable bug or (B) an unfixable limitation that can be reliably managed or (C) an unfixable problem with no good mitigation.
> It's not irrelevant, because this is an argument about whether the machine can be said to be reasoning or not.
Bringing up "b-b-but homo sapiens" is only "relevant" if you're equivocating on the meaning of "reasoning", using it in a broad, philosophical, and kinda-unprovable sense.
In contrast, the "reasoning" we actually wish LLMs would do involves capabilities like algebra, syllogisms, deduction, and the CS-classic boolean satisfiability.
However, the track record of LLMs on such things is long and clear: they fake it, albeit impressively.
The LLM will finish the popular 2+2=_, and we're amazed, but when we twiddle the operands too far, it gives nonsense. It answers "All men are mortal. Socrates is a man. Therefore, Socrates is ______", but reword the situation enough and it breaks again.
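For anyone who wants to poke at the "twiddle the operands" claim themselves, here's a rough sketch of the kind of probe I mean. Everything in it is made up for illustration: `addition_accuracy`, `lookup_table_adder`, and `real_adder` are hypothetical names, and the lookup-table adder is just a stand-in for "something that memorized the popular cases", not a model of how any actual LLM works. To test a real model you'd swap in a callable that sends the prompt and parses the reply.

```python
import random
from typing import Callable


def addition_accuracy(
    answer: Callable[[int, int], int],
    digits: int,
    trials: int = 200,
    seed: int = 0,
) -> float:
    """Fraction of random `digits`-digit additions that `answer` gets right."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    lo, hi = 10 ** (digits - 1), 10 ** digits - 1
    correct = 0
    for _ in range(trials):
        a, b = rng.randint(lo, hi), rng.randint(lo, hi)
        if answer(a, b) == a + b:
            correct += 1
    return correct / trials


def lookup_table_adder(a: int, b: int) -> int:
    # Hypothetical stand-in: only "knows" single-digit sums, returns junk otherwise.
    if a < 10 and b < 10:
        return a + b
    return 0


def real_adder(a: int, b: int) -> int:
    # Baseline that actually implements the rule.
    return a + b


if __name__ == "__main__":
    for d in (1, 3, 5):
        print(d, addition_accuracy(lookup_table_adder, d), addition_accuracy(real_adder, d))
```

The thing to look at is how accuracy moves as `digits` grows: something that actually follows the rule stays flat, while something that only covers the familiar cases falls off a cliff.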