Comment by johnsmith1840 3 days ago

If it were pure compute, we'd have simple examples. We can't solve continual learning (CL) even on the smallest AI models.

There are tons of benchmarks around this that you can easily run with one GPU.

It's a compute problem only in the sense that the one known way to do it is retraining the model from scratch at every step.

If you solved CL with a CNN, you'd have just created AGI.
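The failure is easy to reproduce even far below "the smallest of AI models." A minimal sketch (my own toy, not from any named benchmark) using a tiny numpy logistic regression: Task B's labels conflict with Task A's, so sequential training on B overwrites everything learned on A — the extreme case of the interference that plagues continual learning:

```python
# Toy demo of catastrophic interference: a linear classifier trained on
# Task A, then on Task B (which reverses A's decision rule), loses Task A.
import numpy as np

rng = np.random.default_rng(0)

def make_task(center_pos, center_neg, n=200):
    """Two Gaussian blobs in 2-D: class 1 at center_pos, class 0 at center_neg."""
    X = np.vstack([rng.normal(center_pos, 0.3, (n, 2)),
                   rng.normal(center_neg, 0.3, (n, 2))])
    y = np.array([1] * n + [0] * n)
    return X, y

def train(w, X, y, lr=0.1, epochs=50):
    """Full-batch gradient descent on logistic loss; w = (w0, w1, bias)."""
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w[:2] + w[2])))   # sigmoid
        grad_w = X.T @ (p - y) / len(y)
        grad_b = np.mean(p - y)
        w = w - lr * np.concatenate([grad_w, [grad_b]])
    return w

def acc(w, X, y):
    return np.mean(((X @ w[:2] + w[2]) > 0) == y)

XA, yA = make_task([2, 0], [-2, 0])
XB, yB = make_task([-2, 0], [2, 0])   # same regions, labels flipped vs. A

w = np.zeros(3)
w = train(w, XA, yA)
acc_A_before = acc(w, XA, yA)   # near 1.0 right after training on A
w = train(w, XB, yB)
acc_A_after = acc(w, XB if False else XA, yA)  # collapses after training on B
print(acc_A_before, acc_A_after)
```

Retraining from scratch on a shuffled union of A and B avoids this, which is exactly the "only known solution" being discussed.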

Davidzheng 3 days ago

Yeah, but training from scratch is a valid solution, and if we can't find easier solutions we should just make it work. Compute is the main advantage we have in silico vs. biological computers, so we might as well push it. Ideally we'll soon have one large AI running on a datacenter-sized computer solving really hard problems, and it could easily be that most of the compute (>95%) goes to the training step, which is where AI really excels, more than inference-time techniques. Even AlphaProof, for example, spends most of its compute training on simpler problems, which is one implemented instance of continual training / training at test time.

  • johnsmith1840 2 days ago

    Retraining from scratch does technically solve it.

    But it doesn't solve the time aspect.

    You need to randomize the data order to train to the best quality, but then the model has no idea t0 came before t1000. If you don't randomize, you get model collapse or heavy bias.

    There have been some attempts at this, but nothing crazy effective.
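    A toy illustration of that trade-off (my sketch, nothing from the thread): SGD on a 1-D regression stream whose true slope drifts from 1.0 to 3.0 over time. Visiting samples in temporal order tracks the latest concept and forgets the old one (recency bias); shuffling learns roughly the average slope but erases any notion that t0 preceded t1000:

```python
# One pass of SGD on a drifting stream y_t = m_t * x_t, in two orders:
# temporal (recency-biased) vs. shuffled (drift averaged away).
import numpy as np

rng = np.random.default_rng(1)
T = 1000
x = rng.normal(0.0, 1.0, T)
m = np.linspace(1.0, 3.0, T)   # the true slope drifts from 1.0 to 3.0
y = m * x

def sgd_stream(order, lr=0.05):
    """Single-pass SGD fitting y ~ w * x, visiting samples in `order`."""
    w = 0.0
    for t in order:
        w -= lr * (w * x[t] - y[t]) * x[t]   # squared-error gradient step
    return w

w_seq = sgd_stream(range(T))             # temporal order: tracks the end (~3)
w_shuf = sgd_stream(rng.permutation(T))  # shuffled: averages the drift (~2)
print(w_seq, w_shuf)
```

    Neither answer is "right": the sequential pass forgot the early data, and the shuffled pass never saw the temporal structure at all.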

zelphirkalt 2 days ago

How do you make the mental jump from being able to train a model continuously to an "artificial general intelligence"?