Comment by deadbabe 5 days ago

It took three days because... agents suck.

But yes, with enough prodding they will eventually build you something that's been built before. Don't see why that's particularly impressive. It's in the training data.

simonw 5 days ago

Not a useful mental model.

  • deadbabe 5 days ago

    It is useful. If you can whip up something complex fairly quickly with an AI agent, it’s likely because it’s already been done before.

    But if even the AI agent seems to struggle, you may be doing something unprecedented.

    • simonw 5 days ago

      Except if you spend quality time with coding agents you realize that's not actually true.

      They're equally useful for novel tasks because they don't work by copying large-scale patterns from their training data - the recent models can break down virtually any programming task into a bunch of functions and components and cobble together working code.

      If you can clearly define the task, they can work towards a solution with you.

      The main benefit of concepts already in the training data is that it lets you slack off on clearly defining the task. At that point it's not the model "cheating", it's you.

      • deadbabe 4 days ago

        Good, long-lived software is not a bunch of functions and components cobbled together.

        You need to see the big picture and a vision of the future state to ensure that what is being built can grow and breathe into it. This requires an engineer. An agent doesn’t think much about the future; it thinks about right now.

        This browser toy built by the agent has NO future. Once the code is written, the story is over.

      • aix1 5 days ago

        Simon, do you happen to have some concrete examples of a model doing a great job at a clearly novel, clearly non-trivial coding task?

        I'd find it very interesting to see some compelling examples along those lines.

      • keybored 5 days ago

        > Except if you spend quality time with coding agents you realize that's not actually true.

        Agent engineering seems (from the outside!) to be converging on quality lived experience. Compared to Stone Age manual coding, it’s less about technical arguments and more about intuition.

        Vibes in short.

        You can’t explain sex to someone who has not had sex.

        Any interaction with tools is partly about intuition. It’s a difference of degree.