Comment by xmprt 6 days ago

> Using apis I am familiar with but don't have memorized

I think you have to be careful here even with a typed language. For example, I generated some Go code recently which exec'd a shell command and captured the output. The generated code used CombinedOutput, which is easier to use but doesn't allow proper error handling. Everything ran fine until I tested a few error cases and then realized the problem. Other times I asked the agent to write test cases too, and while it scaffolded code to handle error cases, it didn't actually write any test cases to exercise them - so if you were only doing a cursory review, you would think it was properly tested when in reality it wasn't.

tptacek 6 days ago

You always have to be careful. But worth calling out that using CombinedOutput() like that is also a common flaw in human code.

  • dingnuts 6 days ago

    The difference is that humans learn. I got bit by this behavior of CombinedOutput once ten years ago, and no longer make this mistake.

    • csallen 6 days ago

      This applies to AI, too, albeit in different ways:

      1. You can iteratively improve the rules and prompts you give to the AI when coding. I do this a lot. My process is constantly improving, and the AI makes fewer mistakes as a result.

      2. AI models get smarter. Just in the past few months, the LLMs I use to code are making significantly fewer mistakes than they were.

      • th0ma5 6 days ago

        That you don't know when it will make a mistake and that it is getting harder to find them are not exactly encouraging signs to me.

      • gf000 5 days ago

But my gripe with your first point is that by the time I've written an exact, detailed, step-by-step prompt, I could have written the code by hand. There's a reason we don't use fuzzy human language in math/coding: it's ambiguous. I always feel like I'm in one of those funny videos where you have to write exact instructions for making a peanut butter sandwich and they get deliberately misinterpreted. Except it's not fun at all when you're the one writing the instructions.

2. It's very questionable that they will get any smarter; we have hit the plateau of diminishing returns. They will get more optimized, and we can run them more times with more context (e.g. chain of thought), but they fundamentally won't get better at reasoning.

      • kasey_junk 6 days ago

        And you can build automatic checks that reinforce correct behavior for when the lessons haven’t been learned, by bot or human.