Comment by dingnuts
The difference is that humans learn. I got bit by this behavior of CombinedOutput once ten years ago, and no longer make this mistake.
There are definitely dumb errors that are hard for human reviewers to find because nobody expects them.
One concrete example is confusing value and pointer types in C. I've seen people try to cast a `uuid` variable into a `char` buffer, for example to memset it, by writing `(const char *)&uuid`. It turned out, however, that `uuid` was not a value type but a pointer, so instead of taking the address of the uuid storage, this takes the address of the pointer itself and ends up blasting the stack. If you're hundreds of lines deep and looking for more complex functional issues, it's very easy to overlook.
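A minimal sketch of that failure mode (the `uuid_t` type and variable names here are hypothetical, just to illustrate the bug described above):

```c
#include <string.h>

typedef struct { unsigned char bytes[16]; } uuid_t;

void clear_uuid(uuid_t *uuid) {
    /* BUG: uuid is already a pointer, so &uuid is the address of the local
     * pointer variable (8 bytes on a typical 64-bit stack). Writing
     * sizeof(uuid_t) == 16 zero bytes there tramples adjacent stack memory. */
    memset((char *)&uuid, 0, sizeof(uuid_t));

    /* Correct: pass the pointer itself, which points at the actual storage. */
    memset(uuid, 0, sizeof(*uuid));
}
```

The explicit cast silences the compiler, which is exactly why a reviewer scanning for higher-level logic errors can miss it.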
But my gripe with your first point is that by the time I've written an exact, detailed, step-by-step prompt for them, I could have written the code by hand. There is a reason we don't use fuzzy human language in math/coding: it is ambiguous. I always feel like I'm in one of those funny videos where you have to write exact instructions for making a peanut butter sandwich and they get deliberately misinterpreted. Except it is not fun at all when you are the one writing the instructions.
2. It's very questionable that they will get any smarter; we have hit the plateau of diminishing returns. They will get more optimized, and we can run them more times with more context (e.g. chain of thought), but they fundamentally won't get better at reasoning.
> Like there is a reason we are not using fuzzy human language in math/coding, it is ambiguous
On the foolishness of "natural language programming"
https://www.cs.utexas.edu/~EWD/transcriptions/EWD06xx/EWD667...
> by the time I write an exact detailed step-by-step prompt for them, I could have written the code by hand
The improved prompt or project documentation guides every future line of code written, whether by a human or an AI. It pays dividends for any long-term project.
> Like there is a reason we are not using fuzzy human language in math/coding
Math proofs are mostly in English.
And you can build automatic checks that enforce correct behavior for the cases where the lesson hasn't been learned, whether by bot or by human.
This applies to AI, too, albeit in different ways:
1. You can iteratively improve the rules and prompts you give to the AI when coding. I do this a lot. My process is constantly improving, and the AI makes fewer mistakes as a result.
2. AI models get smarter. Just in the past few months, the LLMs I use for coding have been making significantly fewer mistakes than they were.