johnsmith1840 3 days ago

AGI is likely a combination of these two papers plus something new, probably along the lines of distillation.

1. Preventing collapse (catastrophic forgetting) -> the model gets "full": https://arxiv.org/pdf/1612.00796

2. Forgetting causes better generalization: https://arxiv.org/abs/2307.01163

3. An unknown paper that connects the two: allow a "forgetting" model that improves generalization over time. I tried for a long time to build this, but it's a bit difficult. (Rough sketch of how 1 and 2 might compose below.)
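A minimal sketch, assuming PyTorch, of how 1 and 2 might compose; nothing here is from the papers verbatim, and `fisher_diagonal`, `ewc_penalty`, and `active_forget` are names I made up:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def fisher_diagonal(model, loader, n_batches=10):
    """Diagonal Fisher estimate: mean squared gradient of the task loss."""
    fisher = {n: torch.zeros_like(p) for n, p in model.named_parameters()}
    for i, (x, y) in enumerate(loader):
        if i >= n_batches:
            break
        model.zero_grad()
        F.cross_entropy(model(x), y).backward()
        for n, p in model.named_parameters():
            if p.grad is not None:
                fisher[n] += p.grad.detach() ** 2 / n_batches
    return fisher

def ewc_penalty(model, fisher, old_params, lam=1000.0):
    """Paper 1 (EWC): quadratic pull toward the old task's weights,
    weighted by how important each weight was (prevents collapse)."""
    loss = torch.zeros(())
    for n, p in model.named_parameters():
        loss = loss + (fisher[n] * (p - old_params[n]) ** 2).sum()
    return lam / 2 * loss

def active_forget(model):
    """Paper 2 flavor: periodically re-initialize one layer so the
    network keeps its plasticity (the "forgetting" that helps)."""
    for m in model.modules():
        if isinstance(m, nn.Linear):
            m.reset_parameters()
            break  # reset only the first Linear, purely as an example
```

You'd snapshot `old_params = {n: p.detach().clone() for n, p in model.named_parameters()}` after task A, add `ewc_penalty` to the task-B loss, and call `active_forget` on some schedule. Point 3, the part nobody has, is the schedule: when and what to forget so it improves generalization instead of just undoing what EWC protects.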

A fun implication: if this is true, AGI will need "breaks" and will likely need to consume non-task content of high variety, much like a person does.

khalic 3 days ago

There is no sign that LLMs are capable of general reasoning (quite the contrary), so hold your horses. We have proven they can do basic composition (as a developer, I see proof of this every time I generate some code with an assistant), which is amazing already, but we're still far from anything like “general intelligence”.

  • johnsmith1840 3 days ago

    My argument is that we already have pseudo/static reasoners. Continual learning (CL) will turn our non-reasoners into reasoners.

    CL has been an open problem since the very beginnings of AI research, with basically no solution. Its pervasiveness points to a very deep gap in our understanding of reasoning.
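    To see how unsolved it is, here's a self-contained toy (PyTorch, synthetic tasks, every detail arbitrary): train a small net on task A, then on task B, and watch accuracy on A crater.

    ```python
    import torch
    import torch.nn as nn

    torch.manual_seed(0)

    def make_task(n=2000, d=20):
        # A random linear rule stands in for "a task"; each call makes a new one.
        x = torch.randn(n, d)
        w = torch.randn(d)
        return x, (x @ w > 0).long()

    def acc(model, x, y):
        return (model(x).argmax(1) == y).float().mean().item()

    model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    task_a, task_b = make_task(), make_task()

    for name, (x, y) in [("A", task_a), ("B", task_b)]:
        for _ in range(300):
            opt.zero_grad()
            nn.functional.cross_entropy(model(x), y).backward()
            opt.step()
        print(f"after task {name}: acc on A={acc(model, *task_a):.2f}, "
              f"on B={acc(model, *task_b):.2f}")
    ```

    Run it and task A typically ends up near chance after training on B.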

zelphirkalt 2 days ago

That's really reaching way too far. We have no idea whether that will lead to anything even close to AGI, and it seems more likely that it will just run into the next hurdle.

  • johnsmith1840 2 days ago

    Totally possible!

    I just like talking about it. I will say that learning out-of-distribution content while keeping previous knowledge in a "useful" state is a capability that would absolutely supercharge every AI method we currently have.

    It's at least an honest attempt at a research direction other than the "scale infinitely for everything" we currently do.

    Just think about it: natural brains do something incredible.

    1. They have fixed computation budgets per time step.

    2. They continuously learn entirely new tasks while still maintaining previous ones in a useful state.

    That's a capability I would very much like in my AI.

    Scaling laws are correct but they are also the reason we are nowhere near replacing humans.

    Take a simple job, say admin work. Every timestep depends on the previous timestep. It's not a complex job, and an AI could do it for a while, but over time the computation required to "look back" through its memory and connect it to the next step grows near exponentially.

    RAG is another perfect example of this problem.
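    Back-of-the-envelope version of that growth (all numbers made up): if every step has to re-read the full history, total work grows quadratically in the number of steps, while a fixed per-step budget, brain-style, stays linear.

    ```python
    STEP_TOKENS = 500  # new tokens of context produced per work step (made up)

    for days in (1, 30, 365, 3650):
        steps = days * 20  # ~20 work steps per day (made up)
        reread = STEP_TOKENS * steps * (steps + 1) // 2  # re-read whole history each step
        fixed = STEP_TOKENS * steps                      # constant per-step budget
        print(f"{days:>5} days: re-read={reread:>18,} tokens, fixed={fixed:>12,} tokens")
    ```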

    I do deeply believe AGI will be solved by a kid with a whiteboard, not a supercluster. CL is my best guess at what that means.

    Maybe it's some super-RL or energy-based method, but I've never seen it.