Comment by markb139
I tried code generation for the first time recently. The generated code looked great, was commented, and ran perfectly. The results were completely wrong. The code was meant to calculate the CPU temperature of the Raspberry Pi RP2350 in Python. The initial value looked about right, then I put my finger on the chip and the temperature went down! I assume the model had been trained on broken code. This led me to wonder: how do they validate that the code does what it says?
Nobody is saying that you don't have to read and check the code. Especially for things like numerical constants. Those are very frequently hallucinated (unless it's something super common like pi).
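For what it's worth, here is a rough sketch of the kind of check I mean. It is only a guess at what went wrong above: a flipped sign or a wrong constant in the conversion would give a temperature that falls when the chip warms up. The constants are the ones commonly quoted from the RP2040 datasheet, and I am assuming, without having verified it, that the RP2350 sensor uses the same values, so check them against the datasheet. The function names and the 16-bit reading scale (as returned by MicroPython's `read_u16()`) are my own choices for illustration.

```python
# Assumed constants: commonly quoted from the RP2040 datasheet.
# Whether they hold for the RP2350 must be checked against its datasheet.
V_REF = 3.3               # ADC reference voltage in volts (assumed)
ADC_MAX = 65535           # full scale of a 16-bit reading such as read_u16()
V_AT_27C = 0.706          # sensor voltage at 27 degrees C (assumed)
SLOPE_V_PER_C = 0.001721  # sensor voltage DROPS by this much per degree C (assumed)


def adc_to_celsius(raw_u16):
    """Convert a raw 16-bit ADC reading to degrees Celsius."""
    voltage = raw_u16 * V_REF / ADC_MAX
    # Negative slope: a lower voltage means a higher temperature.
    return 27 - (voltage - V_AT_27C) / SLOPE_V_PER_C


def direction_is_sane():
    """Return True if a lower raw reading maps to a higher temperature."""
    cooler = adc_to_celsius(14100)
    warmer = adc_to_celsius(14000)
    return warmer > cooler
```

The point is less the formula than the habit: pull the constants out where you can compare them to the datasheet, and test the direction of the result, not just whether the first number looks plausible. On the board itself, the finger test from the comment above is exactly the right kind of check.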