Comment by 9rx 3 days ago

> You still gotta understand what you're doing.

Of course, but how do you begin to understand the "stochastic parrot"?

Yesterday I used LLMs all day long and everything worked perfectly. Productivity was great and I was happy. I was ready to embrace the future.

Now, today, no matter what I try, everything the LLMs have produced has been a complete dumpster fire and a waste of my time. Not even Opus will follow basic instructions. My day is practically over now and I haven't accomplished anything other than pointlessly fighting LLMs. Yesterday's productivity gains are now gone; I'm frustrated, exhausted, and wondering why I didn't just do it myself.

This is a recurring theme for me. Every time I think I've finally cracked the code, the next time it's as if I'm using an LLM for the first time in my life. What is the formal approach that yields consistency?

acuozzo 3 days ago

You're experiencing throttling. Use the API instead and pay per token.

You also have to treat this as outsourcing labor to a savant with a very, very short memory, so:

1. Write every prompt like a government work contract in which you're required to select the lowest bidder, so put guardrails everywhere. Keep a text editor open with your work contract, edit the goal at the bottom, and then fire off your reply.

2. Instruct the model to keep a detailed log in a file and, after a context compaction, instruct it to read this again.

3. Use models from different companies to review one another's work. If you're using Opus-4.5 for code generation, then consider using GPT-5.2-Codex for review (a rough sketch follows the list below).

4. Build a mental model for which models are good at which tasks. Mine is:

  4a. Mathematical Thinking (proofs, et al.): Gemini DeepThink

  4b. Software Architectural Planning: GPT5-Pro (not 5.1 or 5.2)

  4c. Web Search & Deep Research: Gemini 3-Pro

  4d. Technical Writing: GPT-4.5

  4e. Code Generation & Refactoring: Opus-4.5

  4f. Image Generation: Nano Banana Pro
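
For (3), here is a minimal sketch of what cross-vendor review can look like with the Anthropic and OpenAI Python SDKs. The model id strings are placeholders taken from the names above (they may not match the actual API identifiers), and the task and prompts are illustrative only:

  # Hypothetical cross-vendor review loop: one model generates, another reviews.
  # Assumes ANTHROPIC_API_KEY and OPENAI_API_KEY are set in the environment.
  import anthropic
  from openai import OpenAI

  claude = anthropic.Anthropic()
  oai = OpenAI()

  task = "Write a small Python function that parses RFC 3339 timestamps."

  # Step 1: code generation (model id is a placeholder; substitute the real one).
  gen = claude.messages.create(
      model="claude-opus-4-5",
      max_tokens=2048,
      messages=[{"role": "user", "content": task}],
  )
  draft = gen.content[0].text

  # Step 2: independent review by a different vendor's model (also a placeholder id).
  review = oai.chat.completions.create(
      model="gpt-5.2-codex",
      messages=[{
          "role": "user",
          "content": "Review this code for bugs and unstated assumptions:\n\n" + draft,
      }],
  )
  print(review.choices[0].message.content)

The same pattern works in the other direction, and the review step can feed back into a second generation pass if you want a full loop.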

  • 9rx 3 days ago

    > You're experiencing throttling. Use the API instead and pay per token.

    That was using pay per token.

    > Write every prompt like a government work contract in which you're required to select the lowest bidder, so put guardrails everywhere.

    That is what I was doing yesterday. Worked fantastically. Today, I do the very same thing and... Nope. Can't even stick to the simplest instructions that have been perfectly fine in the past.

    > If you're using Opus-4.5 for code generation, then consider using GPT-5.2-Codex for review.

    As mentioned, I tried using Opus, but it didn't even get to the point of producing anything worth reviewing. I've had great luck with it before, but not today.

    > Instruct the model to keep a detailed log in a file and, after a context compaction

    No chance of getting anywhere close to needing compaction today. I had to abort long before that.

    > Build a mental model for which models are good at which tasks.

    See, like I mentioned before, I thought I had this figured out, but today it has all gone out the window.

    • toraway 3 days ago

      It drives me absolutely crazy how lately, any time I comment about an experience using LLMs for coding that isn’t gushing praise, I get the same predictable, condescending lecture about how I'm using them ever so slightly wrong (unlike them), which explains why I don't get perfect output literally 100% of the time.

      It’s like I need a sticky disclaimer:

        1. No, I didn’t form an outdated impression based on GPT-4 that I never updated; in fact, I use these tools *constantly every single day*
        2. Yes, I am using Opus 4.5
        3. Yes, I am using a CLAUDE.md file that documents my expectations in detail
        3a. No, it isn’t 20000 characters or anything
        3b. Yes, thank you, I have in fact already heard about the “pink elephant problem”
        4. Yes, I am routinely starting with fresh context
        4a. No, I don’t expect every solution to be one-shotable 
        5. Yes, I am still using Opus fucking 4.5 
        6. At no point did I actually ask for Unsolicited LLM Tips 101.
      
      Like, are people really suggesting they never, ever get a suboptimal or (god forbid) completely broken "solution" from Claude Code/Codex/etc?

      That doesn't mean these tools are useless! Or that I’m “afraid” or in denial or trying to hurt your feelings or something! I’m just trying to be objective about my own personal experience.

      It’s just impossible to have an honest, productive discussion if the other person can always just lob responses like “actually you need to use the API, not the $200/mo plan you pay for” or “Opus 4.5, unless you’re using it already, in which case GPT 5.2 XHigh, or vice versa” to invalidate your experience on the basis of “you’re holding it wrong” with an endlessly slippery standard of “right”.

      • acuozzo 3 days ago

        When I wrote my reply I was not familiar with the existing climate of LLM-advice-as-a-cudgel that you describe.

        > to invalidate your experience on the basis of “you’re holding it wrong”

        This was not my intent in replying to 9rx. I was just trying to help.

        • vldx 2 days ago

          GP didn’t, but I’ve found the tips that you’ve shared helpful, so thank you for taking the time.

  • cxvwbvb 3 days ago

    Nonsense. I ran an experiment today - trying to generate a particular kind of image.

    It's been 12 hours and all the image-gen tools have failed miserably. They are only good at producing surface-level stuff; anything beyond that? Nah.

    So sure, if what you do is surface level (and crap, in my opinion), ofc you will see some kind of benefit. But if you have any taste (which I presume you don't), you would readily admit it is not all that great and the amount invested makes zero sense.

    • acuozzo 3 days ago

      > if what you do is surface level (and crap in my opinion)

      I write embedded software in C for a telecommunications research laboratory. Is this sufficiently deep for you?

      FWIW, I don't use LLMs for this.

      > But if you have any taste (which I presume you dont)

      What value is there to you in an ad hominem attack here? Did you see any LLM evangelism in my post? I offered information based on my experience to help someone use a tool.