Comment by johnsmith1840 3 days ago

If it were pure compute, we'd have simple examples. We can't solve continual learning (CL) even on the smallest AI models.

There are tons of benchmarks around this that you can easily run with one GPU.

It's a compute problem only in the sense that the one known way to do it is retraining the model from scratch at every step.

If you solved CL with a CNN, you'd have just created AGI.
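The failure is easy to reproduce even far below "the smallest of AI models." A minimal sketch (my own toy, not from any named benchmark) using a tiny numpy logistic regression: Task B's labels conflict with Task A's, so sequential training on B overwrites everything learned on A — the extreme case of the interference that plagues continual learning:

```python
# Toy demo of catastrophic interference: a linear classifier trained on
# Task A, then on Task B (which reverses A's decision rule), loses Task A.
import numpy as np

rng = np.random.default_rng(0)

def make_task(center_pos, center_neg, n=200):
    """Two Gaussian blobs in 2-D: class 1 at center_pos, class 0 at center_neg."""
    X = np.vstack([rng.normal(center_pos, 0.3, (n, 2)),
                   rng.normal(center_neg, 0.3, (n, 2))])
    y = np.array([1] * n + [0] * n)
    return X, y

def train(w, X, y, lr=0.1, epochs=50):
    """Full-batch gradient descent on logistic loss; w = (w0, w1, bias)."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w[:2] + w[2])))   # sigmoid
        grad_w = X.T @ (p - y) / len(y)
        grad_b = np.mean(p - y)
        w = w - lr * np.concatenate([grad_w, [grad_b]])
    return w

def acc(w, X, y):
    return np.mean(((X @ w[:2] + w[2]) > 0) == y)

XA, yA = make_task([2, 0], [-2, 0])
XB, yB = make_task([-2, 0], [2, 0])   # same regions, labels flipped vs. A

w = np.zeros(3)
w = train(w, XA, yA)
acc_A_before = acc(w, XA, yA)   # near 1.0 right after training on A
w = train(w, XB, yB)
acc_A_after = acc(w, XB if False else XA, yA)  # collapses after training on B
print(acc_A_before, acc_A_after)
```

Retraining from scratch on a shuffled union of A and B avoids this, which is exactly the "only known solution" being discussed.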

Davidzheng 3 days ago

Yeah, but training from scratch is a valid solution, and if we can't find easier solutions we should just make it work. Compute is the main advantage we have in silico vs. biological computers, so we might as well push it. Ideally we'll soon have one large AI running on a datacenter-sized computer solving really hard problems, and it could easily be that most of the compute (>95%) goes to the training step, which is where AI really excels, more than inference-time techniques. Even AlphaProof, for example, spends most of its compute training on simpler problems, which is one implemented instance of continual training / training at test time.

  • johnsmith1840 2 days ago

    Retraining from scratch does technically solve it.

    But it doesn't solve the time aspect.

    You need to randomize the data order to train to the best quality, but then the model has no idea t0 came before t1000. If you don't randomize, you get model collapse or heavy bias.

    There have been some attempts at this, but nothing crazy effective.
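    A toy illustration of that trade-off (my sketch, nothing from the thread): SGD on a 1-D regression stream whose true slope drifts from 1.0 to 3.0 over time. Visiting samples in temporal order tracks the latest concept and forgets the old one (recency bias); shuffling learns roughly the average slope but erases any notion that t0 preceded t1000:

```python
# One pass of SGD on a drifting stream y_t = m_t * x_t, in two orders:
# temporal (recency-biased) vs. shuffled (drift averaged away).
import numpy as np

rng = np.random.default_rng(1)
T = 1000
x = rng.normal(0.0, 1.0, T)
m = np.linspace(1.0, 3.0, T)   # the true slope drifts from 1.0 to 3.0
y = m * x

def sgd_stream(order, lr=0.05):
    """Single-pass SGD fitting y ~ w * x, visiting samples in `order`."""
    w = 0.0
    for t in order:
        w -= lr * (w * x[t] - y[t]) * x[t]   # squared-error gradient step
    return w

w_seq = sgd_stream(range(T))             # temporal order: tracks the end (~3)
w_shuf = sgd_stream(rng.permutation(T))  # shuffled: averages the drift (~2)
print(w_seq, w_shuf)
```

    Neither answer is "right": the sequential pass forgot the early data, and the shuffled pass never saw the temporal structure at all.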

zelphirkalt 2 days ago

How do you make the mental jump from being able to train a model continuously to an "artificial general intelligence"?