Comment by falcor84 3 days ago

It's not clear to me what you're saying; isn't the whole deal here that, by performing RL on the CoT (given sufficient size and compute), it would converge to the right program?

HarHarVeryFunny 3 days ago

I was really saying two things:

1) The theoretical notion that a fixed-depth transformer + CoT can solve arbitrary problems involving sequential computation is rather like similar theoretical notions of a Turing machine as a universal computer, or of an ANN with a single hidden layer being able to represent arbitrary functions... it may be true, but at the same time not useful

2) The Turing machine, just like the LLM+CoT, is only as useful as the program it is running. If the LLM+CoT is incapable of runtime learning and is just trying to mimic some reasoning heuristics, then that is going to limit its function, even if theoretically such an "architecture" could do more if only it were running a universal AGI program

Using RL to encourage the LLM to predict continuations according to some set of reasoning heuristics is what it is. It's not going to make the model follow any specific reasoning logic, but it is presumably hoped to generate a variety of continuations that the CoT "search" will be able to utilize to arrive at a better response than it otherwise would have. More of an incremental improvement (as reflected in the benchmark scores it achieves) than "converging to the right program".

__loam 3 days ago

Sometimes reading Hacker News makes me want to slam my head on the table repeatedly. "Given sufficient size and compute" is one of the most load-bearing phrases I've ever seen.

  • falcor84 3 days ago

    But it is load-bearing. I mean, I personally can't stop being amazed at how, with each year that passes, things that were unimaginable with all the world's technology a decade ago are becoming straightforward to run on a reasonably priced laptop. And at this stage, I wouldn't bet even $100 against any particular computational problem being solved in some FAANG datacenter by the end of the decade.

    • HarHarVeryFunny 3 days ago

      That's an apples-and-oranges comparison.

      Technology advances, but it doesn't invent itself.

      CPUs didn't magically get faster by people scaling them up - they got faster by evolving the design to support things like multi-level caches, out-of-order execution, and branch prediction.

      Perhaps time fixes everything, but scale alone does not. It'll take time for people to design new ANN architectures capable of supporting AGI.

    • __loam 3 days ago

      There's unimaginable and there's physically and mathematically impossible.

      • falcor84 3 days ago

        Agreed - but would you place a bet on what in TFA (or the related discussion) is physically/mathematically impossible?