Comment by tim333 15 hours ago

There was something on a possible way around that from Google Research the other day, called Nested Learning: https://research.google/blog/introducing-nested-learning-a-n...

My understanding is that at the moment you train something like ChatGPT on the web, setting weights with backpropagation until it works well, but if you then feed it more info and do more backprop it can forget other stuff it has learned, which is called 'catastrophic forgetting'. The nested learning approach is to split things into a number of smaller models so you can retrain one without mucking up the others.
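
Roughly, the way I picture it in a toy PyTorch sketch (the two-module split and the names are just my illustration, not the paper's actual architecture): pretrain the whole thing, then freeze the outer part and only backprop into the inner part when new data shows up, so the old weights can't get clobbered.

    import torch
    import torch.nn as nn

    # Toy "nested" model: a slow outer backbone plus a small inner module
    # that keeps learning on new data. Names are illustrative only.
    class NestedModel(nn.Module):
        def __init__(self):
            super().__init__()
            self.outer = nn.Linear(32, 32)   # pretrained, protected part
            self.inner = nn.Linear(32, 1)    # fast-adapting part

        def forward(self, x):
            return self.inner(torch.relu(self.outer(x)))

    model = NestedModel()

    # Phase 1: pretrain everything (stand-in for the big backprop run).
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x, y = torch.randn(64, 32), torch.randn(64, 1)
    for _ in range(100):
        opt.zero_grad()
        nn.functional.mse_loss(model(x), y).backward()
        opt.step()

    # Phase 2: new data arrives. Freeze the outer module so further
    # backprop can't overwrite what it learned; only the inner part moves.
    for p in model.outer.parameters():
        p.requires_grad_(False)
    opt_inner = torch.optim.Adam(model.inner.parameters(), lr=1e-3)
    x_new, y_new = torch.randn(64, 32), torch.randn(64, 1)
    for _ in range(100):
        opt_inner.zero_grad()
        nn.functional.mse_loss(model(x_new), y_new).backward()
        opt_inner.step()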

imtringued an hour ago

That's pretty cool that you pointed it out.

>We introduce Nested Learning, a new approach to machine learning that views models as a set of smaller, nested optimization problems, each with its own internal workflow, in order to mitigate or even completely avoid the issue of “catastrophic forgetting”, where learning new tasks sacrifices proficiency on old tasks. [0].

It feels funny to be vindicated after rambling about something random a week before someone announces they did something incredibly similar, with great success:

>Here is my stupid and simple unproven idea: Nest the reinforcement learning algorithm. Each critic will add one more level of delay, thereby acting as a low pass filter on the supervised reward function. Since you have two critics now, you can essentially implement a hybrid pre-training + continual learning architecture. The most interesting aspect here is that you can continue training the inner-most critic without changing the outer critic, which now acts as a learned loss function. [1]
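
To make the "outer critic acts as a learned loss function" part concrete, here's a toy PyTorch sketch of roughly what I meant (the networks, names, and training loop are my own guesses, not anything from the Google paper): the outer critic is trained and then frozen, and the inner critic keeps getting regressed toward its value estimates.

    import torch
    import torch.nn as nn

    # Outer critic: already trained, now frozen. It scores states and
    # serves as a learned loss function for the inner critic.
    outer_critic = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
    for p in outer_critic.parameters():
        p.requires_grad_(False)

    # Inner critic: keeps training continually without touching the outer one.
    inner_critic = nn.Sequential(nn.Linear(8, 64), nn.ReLU(), nn.Linear(64, 1))
    opt = torch.optim.Adam(inner_critic.parameters(), lr=1e-3)

    states = torch.randn(256, 8)  # stand-in for observed states
    for _ in range(200):
        opt.zero_grad()
        # The frozen outer critic's value estimates act as the slowly
        # varying target the inner critic is regressed toward.
        target = outer_critic(states)
        loss = nn.functional.mse_loss(inner_critic(states), target)
        loss.backward()
        opt.step()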

[0] https://research.google/blog/introducing-nested-learning-a-n...

[1] https://news.ycombinator.com/item?id=45745402