Comment by TechRemarker 18 hours ago

42 replies

Heard all the news about how Gemini 3 is passing everyone on benchmarks, so I quickly tested it and still find it a far cry from ChatGPT in real-world use when testing the same questions on both platforms. But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini. But I'm glad to see competition, since I certainly don't want only one winner in this race.

hodgehog11 18 hours ago

That's really fascinating. Every real world use case I've tried on Gemini (especially math-related) absolutely slaughtered the performance of ChatGPT in speed and quality, not even close. As an Android user, the Gemini app is also far superior, since the ChatGPT app still doesn't properly display math equations, among plenty of other bugs.

  • dudeinhawaii 17 hours ago

    I have to agree with you but I'll remain a skeptic until the preview tag is dropped. I found Gemini 2.5 Pro to be AMAZING during preview and then its performance and quality unceremoniously dropped month after month once it went live. Optimizations in favor of speed/costs no doubt, but it soured me on jumping ship during preview.

    Anthropic pulled something similar with 3.6 initially, with a preview that had massive token output and then a real release with barely half -- which significantly curtails certain use cases.

    That said, to date, Gemini has outperformed GPT-5 and GPT-5.1 on any task I've thrown at them together. Too bad Gemini CLI is still barely useful and prone to the same infinite loop issues that have plagued it for over a year.

    I think Google has genuinely released a preview of a model that leapfrogs all other models. I want to see if that is what actually makes it to production before I change anything major in my workflows.

  • verdverm 17 hours ago

    It's generally anecdotal and vibes when people make claims that some AI is better than another for things they do. There are too many variables and not enough eval for any of it to hold water imo. Personal preferences, experience, brand loyalty, and bias at play too

    it's contemporary vim vs emacs at this point

    • hodgehog11 16 hours ago

      I get what you're saying because this is typically true (this is a strong motivator for my current research) but I don't think it applies here and OpenAI seems to agree with me. Some cases are clear: GPT-5 is clearly better than Llama 3 for example. If there is a sizeable enough difference across virtually all evals, it is typically clear that one LLM is a stronger performer than another.

      Experiences aside, Gemini 3 beats GPT-5 on enough evals that it seems fair to say that it is a better model. This appears in line with public consensus, with a few exceptions. Those exceptions seem to be centered around search.

  • bdhtu 18 hours ago

    What do you mean? It renders LaTeX fine on Android.

    • hodgehog11 17 hours ago

      Some LaTeX, but not all, especially for larger equations. I will admit it has gotten a lot better in recent updates, since it seemed thoroughly broken for quite a while in its early days.

    • null_deref 17 hours ago

      I had a problem where ChatGPT rendered math to me from right to left. Sure thing YMMV

  • kristofferR 17 hours ago

    Try doing some more casual requests.

    When I asked both ChatGPT 5.1 Extended Thinking and Gemini 3 Pro Preview High for the best daily casual socks, both responses were okay and had a lot of the same options. But while the ChatGPT response included pictures, specs scraped from the product pages and working links, the Gemini response had no links. After asking for links, Gemini gave me ONLY dead links.

    That is a recurring experience, Gemini seems to be supremely lazy to its own detriment quite often.

    A minute ago I asked for best CR2032 deal for Aqara sensors in Norway, and Gemini recommended the long discontinued IKEA option, because it didn't bother to check for updated information. ChatGPT on the other hand actually checked prices and stock status for all the options it gave me.

  • croes 17 hours ago

    One might think that benchmarks do not say much about individual usage and that an objective assessment of the performance of AIs is difficult.

    At least, thanks to the hype, RAM and SSDs are becoming more expensive, which eats up all the savings from using AI and the profits from increased productivity /s?

BeetleB 17 hours ago

> But importantly the ChatGPT app experience at least for iPhone/Mac users is drastically superior vs Google which feels very Google still. So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini.

Yes, the ChatGPT experience is much better. No, Gemini doesn't need to make a better product to take market share.

I've never had the ChatGPT app. But my Android phone has the Gemini app. For free, I can do a lot with it. Granted, on my PC I do a lot more with all the models via paid API access - but on the phone the Gemini app is fine enough. I have nothing to gain by installing the ChatGPT app, even if it is objectively superior. Who wants to create another account?

And that'll be the case for most Android users. As a general hint: If someone uses ChatGPT but has no idea about gpt-4o vs gpt-5 vs gpt-5.1 etc, they'll do just fine with the Gemini app.

Now the Gemini app actually sucks in so many ways (it doesn't seem to save my chats). Google will fix all these issues, but can overtake ChatGPT even if they remain an inferior product.

It's Slack vs Teams all over again. Teams won by a large margin. And Teams still sucks!

karmasimida 17 hours ago

Well I have been using Gemini and ChatGPT side by side for over 6 months now.

My experience is that Gemini has significantly improved its UX and performs better on queries that require niche knowledge: think of some ancient gadgets that have been out of production for 4-5 decades. Gemini can produce reliable manuals, but ChatGPT hallucinates.

UX-wise ChatGPT is still superior, and for common queries it is still my go-to. But for hard queries, I am team Gemini and it hasn’t failed me once.

binarymax 18 hours ago

Benchmaxxing galore by lots of teams in this space.

  • emp17344 18 hours ago

    I think it’s entirely possible that AI actually has plateaued, or has reached a point where a jump in intelligence comes at the cost of reliability.

    • hugh-avherald 18 hours ago

      I suspect it's reached the point where the distinguishing quality of one model over the others is only observable by true experts -- and only in their respective fields. We are exhausting the well of frontier questions that can be programmatically asked and the answers checked.

      • hodgehog11 17 hours ago

        Absolutely this. Strong disagree that progress is plateauing; it's merely that gains are harder for the general public to perceive and typically come from more advanced means than simply scaling. Math performance in particular is improving at an uncomfortably rapid pace.

    • lukan 17 hours ago

      AI in general? Not at all. LLMs maybe a little bit, when even Sam Altman said the progress is logarithmic to the investment. Still, there is progress. And the potential of LLM-based agents, where many different models and other techniques are mixed together, is something we have only just started to explore.

pohl 17 hours ago

I had a similar experience, signing up for the first time to give Gemini a test drive on my side project after a long time using ChatGPT. The latter has a native macOS client which "just works" integrating with Xcode buffers. I couldn't figure out how to integrate Gemini with Xcode quickly enough so I'm resorting to pasting back & forth from the browser. A few of the exchanges I've had "felt smarter" — but, on the whole, it feels like maybe it wasn't as well trained on Swift/SwiftUI as the OpenAI model. I haven't decided one way or another yet, but those are my initial impressions.

doug_durham 16 hours ago

I've been a paying high volume user of ChatGPT for a while. I found the transition to Gemini to be seamless. I've been pleasantly surprised. I bounce between the two. I'm at about 60% Gemini, 40% ChatGPT.

kranke155 17 hours ago

Gemini comes with the $1.99 Google One plan. So I use that.

  • BeetleB 17 hours ago

    Actually, it comes with the free plan. The $1.99 plan doesn't give you any more AI capabilities. Only at the $19.99/mo plan do you get more.

    https://one.google.com/about/#compare-plans

    • kranke155 4 hours ago

      Well then the usage is already so useful in Free mode that I didn’t even notice it. “Thinking” has a meaningful cap. But I have not felt the need to pay for more. I pay for Claude.

xnx 18 hours ago

> So Gemini would have to be drastically better answer wise than ChatGPT to lure users from a better UI/UX experience to Gemini.

or cheaper/free

lanthissa 18 hours ago

they're deep into a redesign of the Gemini app, idk when it will be released or if it's going to be good, but at least they agree with you and are putting significant resources into fixing it.

  • tmaly 15 hours ago

    I did notice a bug on the iPhone: even with background app refresh enabled, if the phone shuts off the screen, a prompt that was processing stalls out.

tapoxi 18 hours ago

It's really hard to measure these things. Personally I switched to Gemini a few months ago since it was half the cost of ChatGPT (Verizon has a $10/month Google AI package). I feel like I've subconsciously learned to prompt it slightly differently and now using OpenAI products feels disappointing. Gemini tends to give me the answer I expect, Claude follows close behind, and I get "meh" results from OpenAI.

I am using Gemini 3 Pro, I rarely use Flash.

golfer 18 hours ago

I couldn't even get ChatGPT to let me download code it claimed to program for me. It kept saying the files were ready but refused to let me access or download anything. It was the most basic use case and it totally bombed. I gave up on ChatGPT right then and there.

It's amazing how different people have wildly varying experiences with the same product.

  • embedding-shape 18 hours ago

    It's because comparing their "ChatGPT" experience with your "ChatGPT" experience doesn't tell anyone anything. Unless people start saying which models and prompts they're using, the discussions back and forth about which platform is the best provide zero information to anyone.

    • jiggawatts 8 hours ago

      It’s the equivalent of the user that points at their workstation tower and exclaims that the “hard drive is broken!”

      Use the right words, get the right response.

      Ah… ahhh… I get now why they get such bad results from AI models.

  • dudeinhawaii 17 hours ago

    Did you wait a while before downloading? The links it provides for temporary projects have a surprisingly brief window where you can download them. I've had a similar experience even when waiting just 1 minute to download the file.

  • bdbdbdb 18 hours ago

    Since LLMs are non-deterministic it's not that amazing. You could ask it the same question as me and we could both get very different conversations and experiences.

  • _whiteCaps_ 17 hours ago

    The same thing happens to me in Claude occasionally. I have to tell it "Generate a tar.gz archive for me to download".

par 17 hours ago

Yeah, hate to say it, but for me a big thing is that I still couldn't separate my Gemini chats into folders. I had ChatGPT export some profiles and history and moved it into Gemini, and 1) when Gemini gave me answers I was more pleased, but 2) Gemini was a bit more rigorous on guard rails, which seems a bit overly cautious. I was asking some pretty basic non-controversial stuff.

r_lee 18 hours ago

I'm confused as well, it hallucinated like crazy

like it seems great, but then it's just bullshitting about what it can do or whatever

potsandpans 17 hours ago

What are your primary use cases? Are you mostly using it as a chatbot?

I find Gemini excels in multimodal areas over ChatGPT and Anthropic. For example, "identify and classify this image with metadata" or "OCR this document and output a similar structure in markdown".

j45 17 hours ago

Training and gaming for the benchmarks is different than actual use.

jiggawatts 18 hours ago

Curiously, I had the opposite experience, except for Deep Research mode where after the latest update the OpenAI offering has become genuinely amazing. This is doubly ironic because Gemini has direct API access to Google search!

  • threecheese 17 hours ago

    It is good, but Pro subscribers get only five per month. After that, it’s a limited version, and it’s not good (normal 5.1 gives more comprehensive answers than DR Limited).

  • observationist 18 hours ago

    Google search is awful. I don't think they can put lipstick on that particular pig and expect anyone to think it's beautiful.

    • coppsilgold 17 hours ago

      I'm sure they give their AI models a superior search than they give to us.

      Also if you prompt Google search the right way it's unfortunately still superior to most if not all other solutions in most cases.

mrcwinn 17 hours ago

This is exactly my experience. And it's funny -- this crowd is so skeptical of OpenAI... so they prefer _Google_ to not be evil? It's funny how heroes and villains are being re-cast.