Comment by finnborge
At this extreme, I think we'd end up relying on backup snapshots. Faulty outcomes are not debugged. They, and the ecosystem that produced them, are just erased. The ecosystem is then returned to its previous state.
Kind of like saving a game before taking on a boss: if things go haywire, just reload. Or maybe like cooking? If something goes catastrophically wrong, just throw it out and start from the beginning (with the same tools!).
And I think the only way to even halfway mitigate the vulnerability concern is to stipulate that this hypothetical system can only serve a single user. Exactly one intent. Totally partitioned/sharded/isolated. Something like the rough sketch below.
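To be concrete about the "save before the boss fight" loop I'm imagining: snapshot the whole (single-user) ecosystem before each agent step, check the outcome, and if it's faulty just throw the run away and restore the snapshot, no debugging. This is only a sketch under those assumptions; run_agent_step and outcome_is_acceptable are hypothetical stand-ins, not any real API.

```python
import copy
import random


def run_agent_step(state: dict) -> dict:
    """Hypothetical agent action: mutates state, sometimes badly."""
    state = dict(state)
    state["steps"] = state.get("steps", 0) + 1
    state["corrupted"] = random.random() < 0.3  # stand-in for a faulty outcome
    return state


def outcome_is_acceptable(state: dict) -> bool:
    """Stand-in check for 'did things go haywire?'."""
    return not state.get("corrupted", False)


def run_with_rollback(state: dict, max_attempts: int = 5) -> dict:
    """Erase faulty outcomes and restore the pre-step snapshot; never debug them."""
    for _ in range(max_attempts):
        snapshot = copy.deepcopy(state)   # save point before the boss fight
        candidate = run_agent_step(snapshot)
        if outcome_is_acceptable(candidate):
            return candidate              # keep the good run
        state = snapshot                  # reload: discard the run entirely
    return state                          # give up, hand back the last save


if __name__ == "__main__":
    print(run_with_rollback({"user": "exactly-one-intent"}))
```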
Backup snapshots of what, though? The defects aren't being introduced through code changes; they're inherent in the model and its tooling. If you're using general models, there's very little you can do beyond prompt engineering (which won't be able to fix all the bugs).
If you were using your own model, you could maybe try to retrain/fine-tune the issues away with a new dataset and different techniques? But at that point you're just transmuting a difficult problem into a damn near impossible one.
LLMs can be miraculous and inappropriate at the same time. They are not the terminal technology for all computation.