Comment by perrygeo 3 days ago

> Large language models (LLMs) are powerful but static; they lack mechanisms to adapt their weights in response to new tasks

The learning and inference processes are entirely separate, which is very confusing to people familiar with traditional notions of human intelligence. For humans, learning things and applying that knowledge in the real world is one integrated feedback process. Not so with LLMs: we train them, deploy them, and discard them for a new model that has "learned" slightly more. For an LLM, inference is the end of learning.

Probably the biggest misconception out there about AI. If you think LLMs are learning, it's easy to fantasize that AGI is right around the corner.
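
Concretely, here's a toy PyTorch sketch of that split (a tiny linear layer stands in for the LLM; nothing here is a real LLM API):

    import torch
    import torch.nn as nn

    model = nn.Linear(8, 8)                     # stand-in for an LLM
    opt = torch.optim.SGD(model.parameters(), lr=0.1)

    # "Learning" phase: a gradient step actually changes the weights.
    x, y = torch.randn(4, 8), torch.randn(4, 8)
    loss = nn.functional.mse_loss(model(x), y)
    loss.backward()
    opt.step()                                  # weights updated here, once, before deployment

    # "Inference" phase: weights are frozen; serving requests teaches it nothing.
    model.eval()
    with torch.no_grad():
        before = [p.clone() for p in model.parameters()]
        _ = model(torch.randn(4, 8))            # answer as many queries as you like
        assert all(torch.equal(a, b) for a, b in zip(before, model.parameters()))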

fspeech 3 days ago

Reinforcement learning can be used to refine LLMs, as shown by DeepSeek.
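
The gist, in a stripped-down REINFORCE-style sketch (this is not DeepSeek's actual GRPO pipeline; the toy policy and reward below are invented for illustration): sample an output, score it, and push the weights toward higher-reward samples.

    import torch
    import torch.nn as nn

    vocab, hidden = 32, 16
    policy = nn.Sequential(nn.Embedding(vocab, hidden), nn.Flatten(), nn.Linear(hidden, vocab))
    opt = torch.optim.Adam(policy.parameters(), lr=1e-3)

    def reward(token: int) -> float:
        return 1.0 if token % 2 == 0 else -1.0   # toy verifiable reward

    prompt = torch.randint(0, vocab, (1, 1))     # single-token "prompt"
    logits = policy(prompt)                      # (1, vocab)
    dist = torch.distributions.Categorical(logits=logits)
    action = dist.sample()                       # sample a "response" token
    loss = -(dist.log_prob(action) * reward(action.item())).mean()
    loss.backward()
    opt.step()                                   # unlike plain inference, the weights move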

  • perrygeo 3 days ago

    Everything I've read in the last 5 months says otherwise. Probably best described by the Apple ML group's paper called The Illusion of Thinking. It empirically works, but the explanation could just be that making the stochastic parrot squawk longer yields a better response.

    In any case, this is a far cry from what I was discussing. At best, this shows an ability for LLMs to "learn" within the context window, which should already be somewhat obvious (that's what the attention mechanism does). There is no global knowledge base or weight updates. Not until the content gets published, rescraped, and trained into the next version. This does demonstrate a learning feedback loop, albeit one that takes months or years, driven by external forces - the company that trains it. But it's way too slow to be considered intelligent, and it can't learn on its own without help.

    A system that truly learned, i.e., incorporated empirical data from its environment into its model of the world, would need to do this in millisecond time frames. Single-celled organisms can do this. Where you at, AGI?
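
    To be concrete about the "no weight updates" part, here's a quick check with a small open model (gpt2 via Hugging Face transformers is just a convenient stand-in; whether its continuation is any good isn't the point):

        import torch
        from transformers import AutoModelForCausalLM, AutoTokenizer

        tok = AutoTokenizer.from_pretrained("gpt2")
        model = AutoModelForCausalLM.from_pretrained("gpt2")

        # Few-shot examples in the prompt steer the output ("learning" in context)...
        prompt = "cat -> chat\ndog -> chien\nbird ->"
        inputs = tok(prompt, return_tensors="pt")
        before = {k: v.clone() for k, v in model.state_dict().items()}
        with torch.no_grad():
            out = model.generate(**inputs, max_new_tokens=5)
        print(tok.decode(out[0]))

        # ...but not a single parameter changed. Nothing persists beyond the context window.
        assert all(torch.equal(v, model.state_dict()[k]) for k, v in before.items())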

    • throwaway314155 2 days ago

      > explanation could just be that making the stochastic parrot squawk longer yields a better response

      No one in the research and science communities ever said anything contrary to this, and if they did they wouldn't last long (although I imagine many of them would take issue with your stochastic parrot reference).

      The Apple paper has a stronger title than its actual premise. Basically, they found that "thinking" definitely works but falls apart for problems past a certain difficulty, and that simply scaling "thinking" up doesn't help (for these harder problems).

      It never said "thinking" doesn't work. People are just combining the title with their existing prejudices to draw the conclusion they _want_ to see.

kovek 3 days ago

What if you could check whether the user responds positively or negatively to the output, and then train the LLM on the input it got and the output it produced?
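
Something like this rough sketch (the toy model and feedback data below are made up; real preference-tuning pipelines are far more involved): keep only the interactions the user rated positively and fine-tune on those.

    import torch
    import torch.nn as nn

    # Fake feedback log: (prompt tokens, response tokens, thumbs_up)
    feedback_log = [
        (torch.randint(0, 32, (6,)), torch.randint(0, 32, (6,)), True),
        (torch.randint(0, 32, (6,)), torch.randint(0, 32, (6,)), False),
    ]

    model = nn.Sequential(nn.Embedding(32, 16), nn.Linear(16, 32))   # toy stand-in for an LLM
    opt = torch.optim.Adam(model.parameters(), lr=1e-4)

    for prompt, response, thumbs_up in feedback_log:
        if not thumbs_up:
            continue                     # simplest policy: imitate only the well-received outputs
        logits = model(prompt)           # (seq, vocab)
        loss = nn.functional.cross_entropy(logits, response)
        opt.zero_grad()
        loss.backward()
        opt.step()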