Comment by 9rx
> You still gotta understand what you're doing.
Of course, but how do you begin to understand the "stochastic parrot"?
Yesterday I used LLMs all day long and everything worked perfectly. Productivity was great and I was happy. I was ready to embrace the future.
Now, today, no matter what I try, everything LLMs have produced has been a complete dumpster fire and a waste of my time. Not even Opus will follow basic instructions. My day is practically over now and I haven't accomplished anything other than pointlessly fighting LLMs. Yesterday's productivity gains are now gone, I'm frustrated and exhausted, and I wonder why I didn't just do it myself.
This is a recurring theme for me. Every time I think I've finally cracked the code, the next time it's like I'm back to using an LLM for the first time in my life. What is the formal approach that finds consistency?
You're experiencing throttling. Use the API instead and pay per token.
You also have to treat this as outsourcing labor to a savant with a very, very short memory, so:
1. Write every prompt like a government work contract in which you're required to select the lowest bidder, so put guardrails everywhere. Keep a text editor open with your work contract, edit the goal at the bottom, and then fire off your reply.
2. Instruct the model to keep a detailed log in a file and, after each context compaction, instruct it to read that log again.
3. Use models from different companies to review one another's work. If you're using Opus-4.5 for code generation, then consider using GPT-5.2-Codex for review.
4. Build a mental model for which models are good at which tasks. Mine is: