Comment by EnPissant 12 hours ago
My experience with Codex / GPT-5:
- The smartest model I have used. Solves problems better than Opus-4.1.
- It can be lazy. With Claude Code / Opus, once given a problem, it will generally work until completion. Codex will often perform only the first few steps and then ask if I want it to continue with the rest. It does this even if I tell it not to stop until completion.
- I have seen severe degradation near max context. For example, I have seen it just repeat the next steps every time I tell it to continue, and I have to compact manually.
I'm not sure whether the problems lie with GPT-5 or with Codex. I suspect a better Codex could resolve them.
Claude seems to have gotten worse for me, with both that kind of laziness and a new pattern where it will write the test, write the code, run the test, and then declare that the test is working perfectly but there are problems in the (new) code that need to be fixed.
Very frustrating, and happening more often.