Comment by firefoxd 4 days ago

18 replies

I must be using web browsers completely wrong. Like browsing a page isn't a problem for me. I can do it at the speed of my needs.

I'm having a hard time understanding why I would tell Gemini to create an account on some website for me or send an email. Those are usually just a tab away. That's why I feel like I'm missing something here.

jsnell 4 days ago

Basically none of their examples are just "browse a page"? They're multi-step tasks combining data from multiple pages.

Like the first example in the demo carousel (the Y2K party) starts from a photo and a prompt of roughly "buy the props needed for replicating this photo from Etsy". It first analyzes the image in the current tab, identifies a bunch of things to buy, searches for them on Etsy, customizes the orders, adds them to the shopping basket, and then asks for a confirmation to actually send an order.

The second one auto-fills a form with a couple of dozen fields from the data that's in a PDF in another tab. (And in the fiction of a demo, presumably a PDF that you already had around, not one that you made just for the purposes of using it to auto-fill the form.)

I'm not the target market for this: automating a browser with my credentials is just too scary, but I can certainly see the utility. There's a huge number of tasks that take a minute or two and aren't worth creating bespoke automation for, but that are also pretty mechanical processes.

  • coffeefirst 3 days ago

    Maybe I’m a curmudgeon who can’t imagine throwing an elaborate Y2K party because all my friends were alive and threw parties at the real Y2K, but… these all feel extremely contrived.

    It’s as if they used AI to generate use cases for their AI tool because they weren’t really sure what it’s for…

    • xnx 3 days ago

      Do you ever have a project that requires research and comparison? This can automate that.

      • coffeefirst 3 days ago

        Yeah, but that's what I'm already using regular AI-powered search for.

        I suppose by being in the browser it can access private and paywalled data, so maybe that's something.

        • xnx 3 days ago

          Exactly. I think I'd use it for hotel price search where you usually don't get the real price until deep in the checkout process.

bandrami 4 days ago

I feel that way about IDEs too, though. My text editor has snippets, my file manager shows me what files are where, and my terminal lets me run programs. Why it's important to people that these functions be grafted into a single window escapes me.

lmm 4 days ago

Maybe you're only using well-designed sites? Try making a booking with a Chinese airline and you'll quickly wish for an assistant to delegate it all to.

  • dzjkb 4 days ago

    Funny you say that, I was literally just booking a flight with Air China yesterday and the UX was 10x better than the average Wizz Air/Ryanair experience: a clear, readable UI (with a great table comparison of prices ±3 days from the selected dates), no ads, no random services getting pushed in your face, no booking tabs automatically opening in the background.

    • lmm 3 days ago

      Huh. Last time I tried with them (about a year ago), and more recently trying with China Eastern, I couldn't even get it to show me a flight that I knew was flying on a given day (just at a slightly higher price than the one it would show me).

  • shakna 4 days ago

    If you struggle, then an agent will probably fail.

    • lmm 3 days ago

      I know exactly what to do, it's just very tedious to actually do it. Which seems like the perfect use case for an agent.

      • shakna 3 days ago

        Tedium often means a large context window. Lots of personal information to be entered, in different formats, that must be exactly right.

        That's exactly what an agent regularly fails at.

    • ares623 4 days ago

      Will it matter if you can’t tell?

      • samrus 4 days ago

        Yeah. Because you'll think you have a flight to Beijing when you don't.

wolvoleo 4 days ago

Yes. I like it for deep research, that kind of thing where I'd be wading through clickbait search results for hours.

But for regular browsing? I don't see the point.