Comment by hiq
> had to iterate with them for 4/5 times each. Gemini got it right but then used deprecated methods
How hard would it be to automate these iterations?
How hard would it be to automatically check and improve the code to avoid deprecated methods?
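For the deprecation part at least, a rough sketch of such a loop in Python might look like the following. It assumes the generated code is Python and that the deprecated calls emit a standard `DeprecationWarning` when run. `check_snippet`, `iterate` and the `generate` callback (a stand-in for whatever LLM call you use) are hypothetical names, not any product's API.

```python
import warnings
from typing import Callable, Optional


def check_snippet(source: str) -> Optional[str]:
    """Run `source`; return an error message, or None if it ran cleanly.

    Deprecation warnings are promoted to errors so they count as failures.
    Caveat: exec() on untrusted model output is unsafe outside a sandbox,
    and only deprecations on code paths that actually run are detected.
    """
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        warnings.simplefilter("error", PendingDeprecationWarning)
        try:
            code = compile(source, "<generated>", "exec")
            exec(code, {"__name__": "__generated__"})
        except Exception as exc:  # includes the promoted warnings
            return f"{type(exc).__name__}: {exc}"
    return None


def iterate(
    generate: Callable[[Optional[str]], str], max_rounds: int = 5
) -> Optional[str]:
    """Ask `generate` for code, feeding back the last failure each round.

    `generate` is a hypothetical wrapper around an LLM: it receives the
    previous error message (None on the first round) and returns source.
    Returns the first snippet that runs cleanly, or None if none does.
    """
    feedback: Optional[str] = None
    for _ in range(max_rounds):
        source = generate(feedback)
        feedback = check_snippet(source)
        if feedback is None:
            return source
    return None
```

This only automates "does it run without errors or deprecation warnings", which is much narrower than checking that the code does what was asked.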
I agree that most products are still underwhelming, but that doesn't mean the underlying tech isn't already good enough to deliver better LLM-based products. Lately I've been using LLMs more and more to get started with writing tests on components I'm not familiar with, and it really helps.
How hard can it be to create a universal "correctness" checker? Pretty damn hard!
Our notion of "correct" for most things is basically derived from a very long training run on reality, with the loss function being how long a gene propagated.