Comment by eob
An aspect of these self-improvement thought experiments that I’m willing to tentatively believe, but want more resolution on, is the exact work involved in “improvement”.
E.g., today there are billions of dollars being spent just to create and label more data, which is a global act of recruiting, training, organization, etc.
When we imagine these models self-improving, are we imagining them “just” inventing better math, or conducting global-scale multi-company coordination operations? I can believe AI is capable of the latter, but that’s an awful lot of extra friction.
This is exactly what makes this scenario so absurd to me. The authors don't even attempt to describe how any of this could realistically play out. They describe sequence models and RLAIF, then claim this approach "pays off" in 2026. The paper they link to is from 2022. RLAIF also does not expand the information encoded in the model; it is used to align the output with a set of guidelines. How could this lead to meaningful improvement in a model's ability to do bleeding-edge AI research? Why wouldn't that have happened already?
I don't understand how anyone takes this seriously. Speculation like this is not only useless, but disingenuous. Especially when it's sold as "informed by trend extrapolations, wargames, expert feedback, experience at OpenAI, and previous forecasting successes". This is complete fiction which, at best, is "inspired by" the real world. I question the motives of the authors.