Comment by jostmey

Comment by jostmey 3 days ago

18 replies

I noticed ChatGPT getting progressively worse at helping me with my research. I gave up on ChatGPT 5 and switched to Grok and Gemini. I couldn’t be happier that I switched.

azan_ 3 days ago

It's amazing how different people's experiences are. To me, every new version of ChatGPT has been an improvement, and Gemini is borderline unusable.

  • farcitizen 3 days ago

    I've had the same experience. I don't get how people are saying Gemini is so good.

    • 0xbadcafebee 3 days ago

      A lot of people still have a shallow understanding of how LLMs work. Each version of a model has different qualities than the last, each model is better or worse at some things than others, and each responds differently to different prompts and styles. Some smaller models perform better than larger ones. Sometimes you should use a system prompt, sometimes you shouldn't. Inference settings (temperature, top_p, penalties, etc.) significantly influence the outcome. (https://www.promptingguide.ai/introduction/settings, https://platform.openai.com/docs/guides/optimizing-llm-accur...)

      Most "big name" models' interfaces don't let you change settings, or not easily. Power users learn to use different interfaces and look up guides to tweak models to get better results. You don't have to just shrug your shoulders and switch models. OpenAI's power interface: https://platform.openai.com/playground Anthropic's power interface: https://platform.claude.com/ For self-hosted/platform-agnostic, OpenWebUI is great: https://openwebui.com/
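
      As a rough illustration of what two of those settings (temperature and top_p) do to the next-token distribution, here's a minimal sketch in plain Python. It isn't any vendor's actual implementation: the logits are made-up numbers and the function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Turn raw scores into probabilities; lower temperature sharpens the distribution."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p=1.0):
    """Keep the most likely tokens until their cumulative probability reaches top_p, then renormalize."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


# Hypothetical logits for four candidate tokens (illustrative values only).
logits = [2.0, 1.0, 0.5, -1.0]

cold = softmax_with_temperature(logits, temperature=0.2)  # nearly deterministic
hot = softmax_with_temperature(logits, temperature=2.0)   # much flatter
nucleus = top_p_filter(softmax_with_temperature(logits), top_p=0.9)  # drops the unlikely tail

print("temperature=0.2:", [round(p, 3) for p in cold])
print("temperature=2.0:", [round(p, 3) for p in hot])
print("top_p=0.9 keeps token indices:", sorted(nucleus))
```

      At low temperature almost all of the probability lands on the top token, while at high temperature the alternatives get a real chance, which is why the same prompt can feel "lazy" or "creative" depending on settings you often can't see in the consumer apps.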

    • europeanNyan 3 days ago

      Gemini has a great model, but it's a bad product. I feel much happier using ChatGPT because Gemini just seems so barebones and unpolished. It has this feeling of a tech demo.

  • tgtweak 3 days ago

    Very curious which use cases you're finding Gemini unusable for.

    • azan_ 3 days ago

      Scientific research and proof-reading. Gemini is the laziest LLM I've used. It will frequently claim that it searched for something and then just make stuff up, which basically never happens to me when I'm using GPT-5.2.

      • buu700 2 days ago

        The way I summed it up to a friend recently is that Gemini 3 is smarter but Grok 4 works harder. Very loose approximation, but roughly maps to my experience. Both are extremely useful (as is GPT-5.2), but I use them on different tasks and sometimes need to manage them a bit differently.

      • flexagoon 3 days ago

        Do you use it directly? I've only used it through Kagi Assistant, but it works better than any other model for me.

    • wltr 2 days ago

      With Gemini, any coding task produces some trash, whereas I can prototype quite a lot with ChatGPT, sometimes delivering an entire app almost entirely vibe-coded. It only takes a few prompts for Gemini to make me mad enough to close the tab. I use only the free web versions, never the agentic ‘mess with my files’ thing. Claude is even better than ChatGPT, but it's so good that I keep it for serious tasks only.

    • double0jimb0 3 days ago

      In my experience with Gemini, I find it incapable of not hallucinating.

    • subscribed 2 days ago

      Gemini loves to ignore Gemini.md instructions within the first few minutes, to replace half of a Python script with "# other code...", or to try to delete files OUTSIDE of the project directory, then apologise profusely and try it again.

      Utterly unreliable. I get better results, faster, editing parts of the code with Claude in a web ui, lol.

mmcwilliams 2 days ago

Odd, I've found that Gemini will completely fabricate the content of specific DOIs, even after being corrected and even when the link it provides itself shows that it has the title and subject of the cited paper wrong. This obviously makes me concerned about its effectiveness as a research aide.

amelius 3 days ago

Why not Claude?

  • esperent 3 days ago

    The limits on the $20 plan are too low compared to Gemini and ChatGPT. They're too low to do any serious work at all.

  • jostmey 3 days ago

    I personally find Claude the best at coding, but its usefulness doesn’t seem to extend to scientific research and writing.

  • 650REDHAIR 3 days ago

    Because I’m sick of paying $20 for an hour of Claude before it throttles me.