Comment by firefoxd

Comment by firefoxd 4 days ago

I must be using web browsers completely wrong. Like browsing a page isn't a problem for me. I can do it at the speed of my needs.

I'm having a hard time understanding why I would tell Gemini to create an account on some website for me or send an email. Those are usually just a tab away. That's why I feel like I'm missing something here.

jsnell 4 days ago

Basically none of their examples are just "browse a page"? They're multi-step tasks combining data from multiple pages.

Like the first example in the demo carousel (the Y2K party) starts from a photo and a prompt of roughly "buy the props needed for replicating this photo from Etsy". It first analyzes the image in the current tab, identifies a bunch of things to buy, searches for them on Etsy, customizes the orders, adds them to the shopping basket, and then asks for a confirmation to actually send an order.

The second one auto-fills a form with a couple of dozen fields from the data that's in a PDF in another tab. (And in the fiction of a demo, presumably a PDF that you already had around, not one that you made just for the purposes of using it to auto-fill the form.)

I'm not the target market for this: automating a browser with my credentials is just too scary, but I can certainly see the utility. There's a huge number of tasks that take a minute or two and are not worth creating bespoke automation for, but that are also pretty mechanical processes.

coffeefirst 3 days ago

Maybe I’m a curmudgeon who can’t imagine throwing an elaborate Y2K party because all my friends were alive and threw parties at the real Y2K, but… these all feel extremely contrived.
It’s as if they used AI to generate use cases for their AI tool because they weren’t really sure what it’s for…

- xnx 3 days ago
  
  Do you ever have a project that requires research and comparison? This can automate that.
  
  coffeefirst 3 days ago
  
  Yeah but that's what I'm already using regular AI powered search for.
  I suppose by being in the browser it can access private and paywalled data, so maybe that's something.
  
  xnx 3 days ago
  
  Exactly. I think I'd use it for hotel price search where you usually don't get the real price until deep in the checkout process.
  
bandrami 4 days ago

I feel that way about IDEs too, though. My text editor has snippets, my file manager shows me what files are where, and my terminal lets me run programs. Why it's important to people that these functions be grafted into a single window escapes me.

newdee 4 days ago

This is satire, right?

- bandrami 4 days ago
  
  No. Why would you think that's satire?
  
lmm 4 days ago

Maybe you're only using well-designed sites? Try making a booking with a Chinese airline and you'll quickly wish for an assistant to delegate it all to.

dzjkb 4 days ago

Funny you say that, I was literally just booking a flight with Air China yesterday and the UX was 10x better than the average Wizz Air/Ryanair experience: a clear, readable UI (with a great table comparison of prices ±3 days from the selected dates), no ads, no random services getting pushed in your face, no booking tabs automatically opening in the background.

- lmm 3 days ago
  
  Huh. Last time I tried with them (about a year ago), and more recently trying with China Eastern, I couldn't even get it to show me a flight that I knew was flying on a given day (just at a slightly higher price than the one it would show me).
  
shakna 4 days ago

If you struggle, then an agent will probably fail.

- lmm 3 days ago
  
  I know exactly what to do, it's just very tedious to actually do it. Which seems like the perfect use case for an agent.
  
  shakna 3 days ago
  
  Tedium often means a large context window. Lots of personal information to be entered, in different formats, that must be exactly right.
  That's exactly what an agent regularly fails at.
  
- ares623 4 days ago
  
  Will it matter if you can’t tell?
  
  samrus 4 days ago
  
  Yeah. Because you'll think you have a flight to Beijing when you don't.
  
  ares623 4 days ago
  
  Oh yeah that bit lol
  
wolvoleo 4 days ago

Yes. I like it for deep research, that kind of thing where I'd be wading through clickbait search results for hours.

But for regular browsing? I don't see the point.
