Comment by QuadrupleA

Comment by QuadrupleA 2 days ago

14 replies

Been unhappy with the GPT5 series, after daily driving 4.x for ages (I chat with them through the API) - very pedantic, goes off on too many side topics, stops following system instructions after a few turns (e.g. "you respond in 1-3 sentences" becomes long bulleted lists and multiple paragraphs very quickly.

Much better feel with the Claude 4.5 series, for both chat and coding.

Hard_Space 2 days ago

> you respond in 1-3 sentences" becomes long bulleted lists and multiple paragraphs very quickly

This is why my heart sank this morning. I have spent over a year training 4.0 to just about be helpful enough to get me an extra 1-2 hours a day of productivity. From experimentation, I can see no hope of reproducing that with 5x, and even 5x admits as much to me, when I discussed it with them today:

> Prolixity is a side effect of optimization goals, not billing strategy. Newer models are trained to maximize helpfulness, coverage, and safety, which biases toward explanation, hedging, and context expansion. GPT-4 was less aggressively optimized in those directions, so it felt terser by default.

Share and enjoy!

  • kouteiheika 2 days ago

    > This is why my heart sank this morning. I have spent over a year training 4.0 to just about be helpful enough to get me an extra 1-2 hours a day of productivity.

    Maybe you should consider basing your workflows on open-weight models instead? Unlike proprietary API-only models no one can take these away from you.

    • Hard_Space 2 days ago

      I have considered it, and it is still on the docket. I have a local 3090 dedicated to ML. Would be a fascinating and potentially really useful project, but as a freelancer, it would cost a lot to give it the time it needs.

  • Angostura 2 days ago

    And how would GPT 5.0 know that, I wonder. I bet it’s just making stuff up.

  • ComputerGuru 2 days ago

    You can’t ask GPT to assess the situation. That’s not the kind of question you can count on a an LLM to accurately answer.

    Playing with the system prompts, temperature, and max token output dials absolutely lets you make enough headway (with the 5 series) in this regard to demonstrably render its self-analysis incorrect.

dahcryn 2 days ago

4.1 is great for our stuff at work. It's quite stable (doesn't change personality every month, and one word difference doesn't change the behaviour). IT doesn't think, so it's still reasonably fast.

Is there anything as good in the 5 series? likely, but doing the full QA testing again for no added business value, just because the model disappears, is just a hard sell. But the ones we tested were just slower, or tried to have more personality, which is useless for automation projects.

  • QuadrupleA 2 days ago

    Yeah - agreed, the initial latency is annoying too, even with thinking allegedly turned off. Feels like AI companies are stapling more and more weird routing, summarization, safety layers, etc. that degrade the overall feel of things.

anarticle 2 days ago

I also found this disturbing, as I used to use GPT for small worked out theoretical problems. In 5.2, the long list of repeated bulleted lists and fortune cookies was a negative for my use case. I replaced some of that use with Claude and am experimenting with LM studio and gpt-oss. It seemed like an obvious regression to me, but maybe people weren't using it that way.

For instance something simple like: "If I put 10kw in solar on my roof when is the payback given xyz price / incentive / usage pattern."

Used to give a kind of short technical report, now it's a long list of bullets and a very paternalistic "this will never work" kind of negativity. I'm assuming this is the anti-sycophant at work, but when you're working a problem you have to be optimistic until you get your answer.

For me this usage was a few times a day for ideas, or working through small problems. For code I've been Claude for at least a year, it just works.

spprashant 2 days ago

I can never understand why it is so eager to generate walls of text. I have instructions to always keep the response precise and to the point. It almost seem like it wants to overwhelm you, so you give up and do your own research.

mhitza 2 days ago

I often use ChatGPT without an account and ChatGPT 5 mini (which you get while logged out) might as well be Mistral 7b + web search. Its that mediocre. Even the original 3.5 was way ahead.

  • accrual 2 days ago

    I kinda miss the original 3.5 model sometimes. Definitely not as smart as 4o but wow was it impressive when new. Apparently I have a very early ChatGPT account per the recent "wrapped" feature.

  • teaearlgraycold 2 days ago

    Really? I’ve found it useful for random little things.

    • mhitza 2 days ago

      It is useful for quick information lookup when you're lacking the precise search terms (which is what I've often do). But the way I was chatting with the original chatgpt were better.