Comment by cs702
TL;DR: The OP believes that if we use RL to train large AI models to duplicate the behavior of existing software (for example, an existing spreadsheet, command-line tool, or application), those models will get good at:
* reading and understanding long, complicated, detailed instructions,
* executing those instructions meticulously and precisely, without errors,
* noticing their mistakes, if they make any along the way, and recovering from them,
* not settling prematurely for solutions that look "good enough" but aren't, and
* undertaking large, complicated projects which previously could be completed only by teams of human experts.
There's a good chance the OP is right, in my view.
We sure live in interesting times!
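To make "duplicate the behavior of existing software" a bit more concrete, here is a minimal sketch of what a reward signal for that kind of RL setup might look like. It is my own illustration, not the OP's method: the function names (`behavior_match_reward`, `reference_upper`, `buggy_upper`) are hypothetical, and I'm assuming the reward is simply the fraction of test inputs on which a candidate reproduces the reference program's output exactly.

```python
from typing import Callable, Iterable


def behavior_match_reward(
    reference: Callable[[str], str],
    candidate: Callable[[str], str],
    test_inputs: Iterable[str],
) -> float:
    """Return the fraction of inputs on which candidate matches reference exactly.

    Hypothetical reward function: an assumption for illustration only.
    """
    inputs = list(test_inputs)
    if not inputs:
        return 0.0  # nothing to compare, so give no credit

    matches = 0
    for text in inputs:
        expected = reference(text)
        try:
            actual = candidate(text)
        except Exception:
            continue  # a crash counts as a mismatch
        if actual == expected:
            matches += 1
    return matches / len(inputs)


# Toy stand-in for "existing software".
def reference_upper(text: str) -> str:
    return text.upper()


# Toy stand-in for a model-written clone that fails on inputs with digits.
def buggy_upper(text: str) -> str:
    if any(ch.isdigit() for ch in text):
        raise ValueError("digits not supported")
    return text.upper()


if __name__ == "__main__":
    print(behavior_match_reward(reference_upper, buggy_upper, ["a", "b1", "c", "d2"]))
```

The appeal of a signal like this is that "looks good enough" earns nothing extra: every edge case the reference handles and the clone doesn't shows up as lost reward.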