Comment by deadbabe

Comment by deadbabe 5 days ago

This is not that impressive; there are numerous examples of browsers in the training data to reference.

simonw 5 days ago

I don't buy this.

It implies that the agents could only do this because they could regurgitate previous browsers from their training data.

Anyone who's watched a coding agent work will see why that's unlikely to be what's happening. If that's all they were doing, why did it take three days and thousands of changes and tool calls to get to a working result?

I also know that AI labs treat regurgitation of training data as a bug and invest a lot of effort into making it unlikely to happen.

I recommend avoiding the temptation to look at things like this and say "yeah, that's not impressive, it saw that in the training data already". It's not a useful mental model to hold.

deadbabe 5 days ago

It took three days because... agents suck.
But yes, with enough prodding they will eventually build you something that's been built before. Don't see why that's particularly impressive. It's in the training data.

- simonw 5 days ago
  
  Not a useful mental model.
  
  deadbabe 5 days ago
  
  It is useful. If you can whip up something complex fairly quickly with an AI agent, it’s likely because it’s already been done before.
  But if even the AI agent seems to struggle, you may be doing something unprecedented.
  
embedding-shape 5 days ago

Damn, ok, what should I attempt instead, that could impress even you?

anonymous908213 5 days ago

Actually good software that is suitable for mass adoption would go a long way to convincing a lot of people. This is just, yet another, proof-of-concept. Something which LLMs obviously can do, and which never seems to translate to real-world software people use. Parsing and rendering text is really not the hard part of building a browser, and there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject.
That said, I think some credit is due. This is still a nice weekend project as far as LLMs go, and I respect that you had a specific goal in mind (showing a better approach than Cursor's nonsense, that gets better results in less time with less cost) and achieved it quickly and decisively. It has not really changed my priors on LLMs in any way, though. If anything it just confirms them, particularly that the "agent swarm" stuff is a complete non-starter and demonstrates how ridiculous that avenue of hype is.

- embedding-shape 5 days ago
  
  > Actually good software that is suitable for mass adoption would go a long way to convincing a lot of people.
  Yeah, that's obviously a lot harder, but doable. I've built it for clients, since they pay me, but I haven't launched or made public something of my own where I could share the code. I guess that might be a useful next project now.
  > This is just, yet another, proof-of-concept.
  It's not even a PoC, it's a demonstration of how far off the mark Cursor are with their "experiment" where they were amazed by what "hundreds of agents" built over week(s).
  > there's no telling how closely the code mirrors existing open-source implementations if you aren't versed on the subject
  This is absolutely true, I tried to get some better answers on how one could even figure that out here: https://news.ycombinator.com/item?id=46784990
  
usef- 5 days ago

What would be impressive to you?

deadbabe 5 days ago

A browser so unique and strange it is literally unlike anything we've ever seen to date, using entirely new UI patterns and paradigms.
