Comment by jakozaur

Is it just me, or do LLM code assistants do catastrophically silly things (drop a DB, delete files, wipe a disk, etc.) far more often than humans?

It looks like the training data has plenty of those examples, but the models don’t have enough grounding or warnings before doing them. I wish there were a PleaseDontDoAnythingStupidEval for software engineering.