daveguy 2 days ago

No. His argument is definitely closer to "LLMs can't generalize." I think you would benefit from re-reading the paper. The point is that a puzzle consisting of simple reasoning about simple priors should be a fairly low bar for "intelligence" (necessary but not sufficient). LLMs perform abysmally because they are trained toward a very specific goal that is different from solving the ARC puzzles. Humans solve these easily, and committees of humans do so perfectly. If LLMs were intelligent they would be able to construct algorithms consisting of simple applications of the priors.

Training to a specific task and getting better at it is completely orthogonal to generalized search and application of priors. Humans use a mix of both: searching over the operations, and pattern matching to recognize the difference between the start and stop states. That is because their "algorithm" is so general purpose. And we have very little idea how the two are combined efficiently.
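To make "simple applications of the priors" concrete, here is a toy sketch (mine, not from the paper; the primitives and the puzzle are made up, and real ARC priors are much richer): brute-force search over short compositions of a few primitive grid operations until the example input maps to the example output.

    import itertools
    import numpy as np

    # Toy "priors": a handful of primitive grid operations.
    PRIMITIVES = {
        "flip_lr": np.fliplr,
        "flip_ud": np.flipud,
        "rot90": np.rot90,
        "transpose": np.transpose,
    }

    def solve(example_in, example_out, max_depth=3):
        """Search over short compositions of the primitives."""
        for depth in range(1, max_depth + 1):
            for names in itertools.product(PRIMITIVES, repeat=depth):
                grid = example_in
                for name in names:
                    grid = PRIMITIVES[name](grid)
                if np.array_equal(grid, example_out):
                    return names  # a tiny "program" built from the priors
        return None

    inp = np.array([[1, 2], [3, 4]])
    out = np.rot90(inp, 2)  # the hidden rule: rotate 180 degrees
    print(solve(inp, out))  # -> ('flip_lr', 'flip_ud'), a composition of two priors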

At least this is how I interpreted the paper.

voidspark 2 days ago

He is setting a bar and saying that only that counts as "true" generalization.

Deep neural networks are definitely performing generalization at some level, enough to beat humans at translation or Go, just not at his ARC bar. He may not think it's good enough, but it's still generalization whether he likes it or not.

  • fc417fc802 a day ago

    I'm not convinced either of your examples is generalization. Consider Go. I don't consider a procedural chess engine to be "generalized" in any sense, yet a decent one can easily beat any human. Why then should Go be different?

    • voidspark a day ago

      A procedural chess engine does not perform generalization, in ML terms. That is an explicitly programmed algorithm.

      Generalization has a specific meaning in the context of machine learning.

      The AlphaGo Zero model learned advanced strategies of the game starting from only the basic rules, without being explicitly programmed with those strategies. That is generalization.
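      To be concrete about the ML sense of the word, a toy sketch (numpy only, made-up data): fit a model on a training sample and score it on a held-out sample it never saw. That held-out performance is what "generalization" refers to.

          import numpy as np

          rng = np.random.default_rng(0)

          # Toy data: y = 3x + noise, split into train and held-out test sets.
          x_train = rng.uniform(-1, 1, 100)
          y_train = 3 * x_train + rng.normal(0, 0.1, 100)
          x_test = rng.uniform(-1, 1, 100)
          y_test = 3 * x_test + rng.normal(0, 0.1, 100)

          # Fit a line using the training data only.
          w, b = np.polyfit(x_train, y_train, deg=1)

          # Generalization, in the ML sense: low error on data the model never saw.
          test_mse = np.mean((w * x_test + b - y_test) ** 2)
          print(f"held-out MSE: {test_mse:.4f}")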

      • fc417fc802 a day ago

        Perhaps I misunderstand your point, but it seems to me that by the same logic a simple gradient descent algorithm wired up to a variety of different models and simulations would qualify as generalization during the training phase.
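        Roughly what I have in mind (a schematic sketch with made-up objectives and numerical gradients, nothing like a real training run): the same generic descent loop "learns" whichever objective a person explicitly wires up.

            import numpy as np

            def minimize(objective, x0, lr=0.1, steps=500, eps=1e-6):
                # Generic descent loop: knows nothing about what it is optimizing.
                x = np.asarray(x0, dtype=float)
                for _ in range(steps):
                    grad = np.array([
                        (objective(x + eps * e) - objective(x - eps * e)) / (2 * eps)
                        for e in np.eye(len(x))
                    ])
                    x -= lr * grad
                return x

            # Two unrelated objectives that a person chose to wire up.
            target = np.array([2.0, -1.0])
            def reach_target(p):            # a toy "simulation"
                return np.sum((p - target) ** 2)

            xs = np.linspace(0, 1, 50)
            ys = 4 * xs + 1                 # toy data for a toy "model"
            def fit_line(wb):
                return np.mean((wb[0] * xs + wb[1] - ys) ** 2)

            print(minimize(reach_target, [0.0, 0.0]))  # -> approx. [ 2. -1.]
            print(minimize(fit_line, [0.0, 0.0]))      # -> approx. [ 4.  1.]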

        The trouble with this is that it only ever "generalizes" about as far as the person configuring the training run (and implementing the simulation, etc.) ensures that it does. In which case it seems analogous to an explicitly programmed algorithm to me.

        Even if we were to accept the training phase as a very limited form of generalization, it still wouldn't apply to the output of that process. The trained LLM as used for inference is no longer "learning".

        The point I was trying to make with the chess engine was that generalization doesn't seem to be required in order to perform that class of tasks (at least in isolation, i.e. post-training). Therefore, it should follow that we can't use "ability to perform the task" (i.e. beat a human at that type of board game) as a measure of whether or not generalization is occurring.

        Hypothetically, if you could explain a novel rule set to a model in natural language, play a series of games against it, and after that it could reliably beat humans at that game, that would indeed be a type of generalization. However, my next objection would then be: sure, it can learn a new turn-based board game, but if I explain five other tasks to it that aren't board games and vary widely, can it also learn all of those in the same way? Because that's really what we seem to mean when we say that humans or dogs or dolphins or whatever possess intelligence in a general sense.