Comment by andy12_ 7 hours ago

The problem is that, so far, SOTA generalist models are not excellent at just one particular task. They are good at a very wide range of tasks, and a good score on one particular benchmark correlates very strongly with good scores on almost all other benchmarks, even esoteric benchmarks that AI labs certainly didn't train against.

I'm certain that any generalist model able to do what Einstein did would be AGI, as in, that model would be able to perform any cognitive task that an intelligent human being could complete in a reasonable amount of time (here "reasonable" depends on the task at hand; it could be minutes, hours, days, years, etc.).

somenameforme 6 hours ago

I see things rather differently. Here are a few points, in no particular order:

(1) - A major part of the challenge is in not being directed towards something. There was no external guidance for Einstein - he wasn't even a formal researcher at the time of his breakthroughs. An LLM might be hand-held towards relativity, though I doubt it, but given the prompt 'hey, find something revolutionary' it's obviously never going to respond with anything relevant, even if you specify the field, subtopic, etc. with substantially greater precision.

(2) - Logical processing of natural language remains one small aspect of intelligence. For example, humanity invented natural language from nothing. The concept of an LLM doing this is a nonstarter, since LLMs depend on token prediction, yet we're speaking of starting with zero tokens.

(3) - LLMs are, in many ways, very much like calculators. They can indeed achieve some quite impressive feats in specific domains, yet they will also completely hallucinate nonsense on relatively trivial queries, particularly on topics where there isn't extensive data to drive their token prediction. I don't entirely understand your extreme optimism towards LLMs given this proclivity for hallucination. Their ability to produce compelling nonsense makes them particularly tedious to use for anything you don't already effectively know the answer to.

  • andy12_ 3 hours ago

    > I don't entirely understand your extreme optimism towards LLMs given this proclivity for hallucination

    Simply because I don't see hallucinations as a permanent problem. Models keep improving in this regard, and I don't see why the hallucination rate can't be arbitrarily reduced with further improvements to the architecture. When I ask Claude about obscure topics, it correctly replies "I don't know", where past models would have hallucinated an answer. When I use GPT 5.2-thinking for my ML research job, I pretty much never encounter hallucinations.

    • somenameforme 3 hours ago

      Hahah, well, you working in the field probably explains your optimism more than your words! If you pretty much never encounter hallucinations with GPT, then you're probably dealing with it on topics where there's less of a right or wrong answer. I encounter them literally every single time I start trying to work out a technical problem with it.

  • andai 2 hours ago

    Well the "prompt" in this case would be Einstein's neurotype and all his life experiences. Might a bit long for the current context windows though ;)