Automatic Programming

211 points by dvrp 21 hours ago

dugmartin 20 hours ago

I have 30+ years of industry experience and I've been leaning heavily into spec driven development at work and it is a game changer. I love programming and now I get to program at one level higher: the spec.

I spend hours on a spec, working with Claude Code to first generate and iterate on all the requirements, then going over them with self-reviews, first in Claude using Opus 4.5 and then in Copilot using GPT-5.2. The self-reviews are prompts to review the spec using all the roles and perspectives the model thinks are appropriate. This self-review process is critical and really polishes the requirements (I normally run 7-8 rounds of self-review).

Once the requirements are polished and any questions answered by stakeholders I use Claude Code again to create an extremely detailed and phased implementation plan with full code, again all in the spec (using a new file is the requirements doc is so large is fills the context window). The implementation plan then goes through the same multi-round self-review using two models to polish (again, 7 or 8 rounds), finalized with a review by me.
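
A minimal sketch of the multi-round review loop described above, not the poster's actual tooling. Everything here is hypothetical: `polish_spec`, the reviewer callables, and the toy `toy_revise` step stand in for prompts sent to two different models, and the stopping rule (stop when no reviewer has findings, or after 8 rounds) is an assumption.

```python
from typing import Callable, List, Sequence, Tuple

Reviewer = Callable[[str], List[str]]      # spec text -> list of findings
Reviser = Callable[[str, List[str]], str]  # (spec, findings) -> revised spec


def polish_spec(spec: str, reviewers: Sequence[Reviewer], revise: Reviser,
                max_rounds: int = 8) -> Tuple[str, int]:
    """Run review rounds until every reviewer is quiet or the budget runs out.

    Returns the polished spec and the number of rounds that produced findings.
    """
    rounds_used = 0
    for _ in range(max_rounds):
        # Each round collects findings from every reviewer (e.g. two models).
        findings = [f for review in reviewers for f in review(spec)]
        if not findings:
            break  # nothing left to fix: the spec has converged
        spec = revise(spec, findings)
        rounds_used += 1
    return spec, rounds_used


# Toy stand-ins (assumptions) so the sketch runs without any model access.
def flags_placeholders(spec: str) -> List[str]:
    return ["resolve TBD"] if "TBD" in spec else []


def flags_missing_acceptance(spec: str) -> List[str]:
    return [] if "Acceptance:" in spec else ["add acceptance criteria"]


def toy_revise(spec: str, findings: List[str]) -> str:
    if "resolve TBD" in findings:
        spec = spec.replace("TBD", "30 seconds")
    if "add acceptance criteria" in findings:
        spec += "\nAcceptance: alarm fires within the timeout."
    return spec


final_spec, rounds = polish_spec(
    "Timeout: TBD",
    [flags_placeholders, flags_missing_acceptance],
    toy_revise,
)
```

In real use the reviewers and the revise step would each be a model call, and a human review would still come last, as the comment says.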

The result? I can then tell Claude Code to implement the plan and it is usually done in 20 minutes. I've delivered major features using this process with zero changes in acceptance testing.

What is funny is that everything old is new again. When I started in industry I worked in defense contracting, working on the project to build the "black box" for the F-22. When I joined the team they were already a year into the spec writing process with zero code produced and they had (iirc) another year on the schedule for the spec. At my third job I found a literal shelf containing multiple binders that laid out the spec for a mainframe hosted publishing application written in the 1970s.

Looking back I've come to realize the agile movement, which was a backlash against this kind of heavy waterfall process I experienced at the start of my career, was basically an attempt to "vibe code" the overall system design. At least for me AI assisted mini-waterfall ("augmented cascade"?) seems a path back to producing better quality software that doesn't suffer from the agile "oh, I didn't think of that".

AdamN 19 hours ago

Waterfall can work great when: 1/ the focus is long-term, both in terms of knowing that the company can take a few years to get the thing live and that it will be around for many more years, 2/ the people writing the spec and the code are largely the same people.
Agile was really pushing to make sure companies could get software live before they died (number 1) and to remedy the anti-pattern that appeared with number 2 where non-technical business people would write the (half-assed) spec and then technical people would be expected to do the monkey work of implementing it.

- aglavine 19 hours ago
  
  No.
  Agile core is the feedback loop. I can't believe people still don't get it. Feedback from reality is always faster than guessing on the air.
  Waterfall is never great. The only time when you need something else than Agile is when lives are at stake, you need there formal specifications and rigorous testing.
  SDD allows better output than traditional programming. It is similar to waterfall in the sense that the model helps you to write design docs in hours instead of days and take more into account as a result. But the feedback loop is there and it is still the key part in the process.
  
  0xbadcafebee 4 hours ago
  
  > Feedback from reality is always faster than guessing on the air
  Only if you have no idea what the results will be.
  Professional engineering takes parts with specific tolerances, tested for a specific application, using a tried-and-true design, combines them into the solution that other people have already made, and watches it work, exactly as predicted. That's how we can build a skyscraper "the first time" and have it not fall down. We don't need to build 20 tiny versions of a building until we get a working skyscraper.
  
  keyle 18 hours ago
  
  I've lived through both eras...
  Agile, hardly any planning, write 3 times.
  Waterfall, weeks of planning, write 3 times anyway.
  The point is, people don't know what they want or are asking for, until it's in front of them. No system is perfect, but waterfall leads to bigger disasters.
  
  AdamN 16 hours ago
  
  Any real software (that delivers value over time) is constantly rewritten and that's a good thing. The question is whether the same people are rewriting it that wrote it and what percentage of that rewriting is based off of a spec or based off of feedback from elsewhere in the system.
  
  jll29 18 hours ago
  
  > The only time when you need something else than Agile is when lives are at stake, you need there formal specifications and rigorous testing.
  Lives are always at stake, given that we use software everywhere, and often in unintended ways, even outside its spec (isn't that a definition of a "hack"?).
  People think of medical appliance software, space/air traffic software, defense systems or real-time embedded systems as the only environments where "lives are at stake", but actually, in subtle ways, a violation of user expectancy (in some software companies, UX issues count as serious bugs) in a word processor, Web browser or the sort command can kill a human.
  Two real-life examples:
  (1) a few years ago, a Chinese factory worker was killed by a robot. It was not in the spec that a human could ever walk in the robot's path (the first attested example of "AI" killing a human that I found at the time). This was way before deep learning entered the stage, and the factory was a closed and fully automated environment.
  (2) Also a few years back, the Dutch software for social benefits management screwed up, and thousands of families just did not get paid any money at all for an extended period. Allegedly, this led to starvation (I don't have details - but if any Dutch read this, please share), and eventually a whole Dutch government was forced to resign over the scandal.
  
  agentultra 16 hours ago
  
  That's a very narrow definition of engineering. What about property? Sensitive information?
  It's a fine "whoopsie-doodle," when your software erases the life savings of a few thousand people. "We'll fix that in the next release," is already too little, too late.
  
- user3939382 19 hours ago
  
  I spent my career building software for executives that wanted to know exactly what they were going to get and when because they have budgets and deadlines i.e. the real world.
  Mostly I’ve seen agile as, let’s do the same thing 3x we could have done once if we spent time on specs. The key phrase here is “requirements analysis” and if you’re not good at it either your software sucks or you’re going to iterate needlessly and waste massive time including on bad architecture. You don’t iterate the foundation of a house.
  I see scenarios where Agile makes sense (scoped, in house software, skunk works) but just like cloud, jwts, and several other things making it default is often a huge waste of $ for problems you/most don’t have.
  Talk to the stakeholders. Write the specs. Analyze. Then build. “Waterfall” became like a dirty word. Just because megacorps flubbed it doesn’t mean you switch to flying blind.
  
  catdog 19 hours ago
  
  > The key phrase here is “requirements analysis” and if you’re not good at it either your software sucks or you’re going to iterate needlessly and waste massive time including on bad architecture. You don’t iterate the foundation of a house.
  This depends heavily on the kind of problem you are trying to solve. In a lot of cases requirements are not fixed but evolve over time, either reacting to changes in the real-world environment or by simply realizing that things which are nice in theory are not working out in practice.
  You don’t iterate the foundation of a house because we have done it enough times and also the environment the house exists in (geography, climate, ...) is usually not expected to change much. If that were the case we would certainly build houses differently than we usually do.
  
  Mawr 18 hours ago
  
  > making it default is often a huge waste of $ for problems you/most don’t have.
  It's the opposite — knowing the exact spec of your program up front is vanishingly rare, probably <1% of all projects. Usually you have no clue what you're doing, just a vague goal. The only way to find out what to build is to build something, toss it over to the users and see what happens.
  No developer or, dear god, "stakeholder" can possibly know what the users need. Asking the users up front is better, but still doesn't help much — they don't know what they want either.
  No plan survives first contact with the enemy and there's no substitute for testing — reality is far too complex for you to be able to model it up front.
  > You don’t iterate the foundation of a house.
  You do, actually. Or rather, we have: over thousands of years we've iterated and written up what we've learned so that nobody has to iterate from scratch for every new house anymore. It's just that our physics, environment, and requirements for "a house" don't change constantly, like they do for software, and we've had thousands of years to perfect the craft, not some 50 years.
  Also, civil engineers mess up in exactly the same ways. Who needs testing? [1]. Who needs to iterate as they're building? [2].
  [1]: https://youtu.be/jxNM4DGBRMU?t=397
  [2]: https://youtu.be/jxNM4DGBRMU?t=837
  
  embedding-shape 18 hours ago
  
  > knowing the exact spec of your program up front is vanishingly rare, probably <1% of all projects
  I don't have anything useful to add, but both of you speak and write with conviction from your own experience and perspective, yet refuse to accept that the situation might be different for others.
  "Software engineering" is a really broad field, some people can spend their whole life working on projects where everything is known up front, others the straight opposite.
  Kind of feel like you both need to be clearer up front about your context and where you're coming from, otherwise you're probably both right, but just in your own contexts.
  
mentos 19 hours ago

I believe the future of programming will be specs so I’m curious to ask you as someone who operates this way already, are there any public specs you could point to worth learning from that you revere? I’m thinking the same way past generations were referred to John Carmack’s Quake code next generations will celebrate great specs.

manmal 19 hours ago

My experience is that such one-shotted projects never survive the collision with reality. Even with extremely detailed specs, the end result will not be what people had in mind, because human minds cannot fully anticipate the complexity of software, and all the edge cases it needs to handle. "Oh, I didn't think that this scheduled alarm is super annoying, I'd actually expect this other alarm to supersede it. It's great we've built this prototype, because this was hard to anticipate on paper."
I'm not saying I don't believe your report - maybe you are working in a domain where everything is super deterministic. Anyway, I don't.

- wenc 17 hours ago
  
  I've been doing spec-driven development for the past 2 months, and it's been a game changer (especially with Opus 4.5).
  Writing a spec is akin to "working backwards" (or future backwards thinking, if you like) -- this is the outcome I want, how do I get there?
  The process of writing the spec actually exposes the edge cases I didn't think of. It's very much in the same vein as "writing as a tool of thought". Just getting your thoughts and ideas onto a text file can be a powerful thing. Opus 4.5 is amazing at pointing out the blind spots and inconsistencies in a spec. The spec generator that I use also does some reasoning checks and adds property-based test generation (Python Hypothesis -- similar to Haskell's QuickCheck), which anchors the generated code to reality.
  Also, I took to heart Grant Slatton's "Write everything twice" [1] heuristic -- write your code once, solve the problem, then stash it in a branch and write the code all over again.
  > Slatton: A piece of advice I've given junior engineers is to write everything twice. Solve the problem. Stash your code onto a branch. Then write all the code again. I discovered this method by accident after the laptop containing a few days of work died. Rewriting the solution only took 25% the time as the initial implementation, and the result was much better. So you get maybe 2x higher quality code for 1.25x the time — this trade is usually a good one to make on projects you'll have to maintain for a long time.
  This is effective because initial mental models of a new problem are usually wrong.
  With a spec, I can get a version 1 out quickly and (mostly) correctly, poke around, and then see what I'm missing. Need a new feature? I tell Opus to first update the spec, then code it.
  And here's the thing -- if you don't like version 1 of your code, throw it away but keep the spec (those are your learnings and insights). Then generate a version 2 free of any sunk-cost bias, which, as humans, we're terrible at resisting.
  Spec-driven development lets you "write everything twice" (throwaway prototypes) faster, which improves the quality of your insights into the actual problem. I find this technique lets me 2x the quality of my code, through sheer mental model updating.
  And this applies not just to coding, but most knowledge work, including certain kinds of scientific research (s/code/LaTeX/).
  [1] https://grantslatton.com/software-pathfinding
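
To make the property-based testing idea above concrete, here is a small sketch using only the standard library. It is a stand-in, not Hypothesis itself and not the output of the spec generator mentioned in the comment: `dedupe` is a hypothetical function under test, and the random loop plays the role that Hypothesis's `@given` decorator and strategies would play (Hypothesis would additionally shrink failing inputs to a minimal case).

```python
import random
from typing import List


def dedupe(items: List[int]) -> List[int]:
    """Remove duplicates, keeping the first occurrence of each value."""
    seen = set()
    result = []
    for item in items:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


def check_dedupe_properties(trials: int = 200, seed: int = 0) -> int:
    """Check spec-level properties on random inputs; return trials passed."""
    rng = random.Random(seed)  # seeded so a failure can be reproduced
    for _ in range(trials):
        data = [rng.randint(-5, 5) for _ in range(rng.randint(0, 20))]
        out = dedupe(data)
        assert len(out) == len(set(out)), "output contains duplicates"
        assert set(out) == set(data), "output lost or invented a value"
        assert dedupe(out) == out, "dedupe is not idempotent"
        # Order property: values appear in order of first occurrence.
        assert out == sorted(set(data), key=data.index), "order changed"
    return trials
```

The point is that each assertion states a sentence from the spec ("no duplicates", "nothing lost", "order preserved") rather than a single hand-picked example, so regenerated code has to satisfy the same properties.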
  
  manmal 9 hours ago
  
  My experience with both Opus and GPT-codex is that they both just forget to implement big chunks of specs, unless you give them the means to self-validate their spec conformance. I’m finding myself sometimes spending more time coming up with tooling to enable this, than the actual work.
  
  wenc 8 hours ago
  
  The key is generating a task list from the spec. Kiro IDE (not cli) generates tasks.md automatically. This is a checklist that Opus has to check off.
  Try Kiro. It's just an all-round excellent spec-driven IDE.
  You can still use Claude Code to implement code from the spec, but Kiro is far better at generating the specs.
  p.s. if you don't use Kiro (though I recommend it), there’s a new way too — Yegge’s beads. After you install, prompt Claude Code to `write the plan in epics, stories and tasks in beads`. Opus will -- through tool use -- ensure every bead is implemented. But this is a more high variance approach -- whereas Kiro is much more systematic.
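
A rough sketch of the checklist idea, for anyone who wants a cheap conformance check without extra tooling. The exact format of Kiro's tasks.md is not shown in this thread, so the GitHub-style `- [ ]` / `- [x]` checkbox syntax below is an assumption, and `open_tasks` and the example text are hypothetical.

```python
import re
from typing import List

# Assumed format: "- [ ] task" for open items and "- [x] task" for done ones.
_TASK = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def open_tasks(markdown: str) -> List[str]:
    """Return the text of every checklist item that is still unchecked."""
    remaining = []
    for line in markdown.splitlines():
        match = _TASK.match(line)
        if match and match.group(1) == " ":
            remaining.append(match.group(2).strip())
    return remaining


example = """\
- [x] 1. Parse alarm schedule
- [ ] 2. Let a later alarm supersede an earlier one
  - [ ] 2.1 Add property test for overlapping alarms
- [x] 3. Persist settings
"""
```

Running `open_tasks(example)` lists the two unchecked items, which can be fed back to the agent or used to fail a CI step until the list is empty. A checked box only shows that the agent claims a task is done, so it complements tests rather than replacing them.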
  
- nl 19 hours ago
  
  I think there's a difference between people getting a system and realising it isn't actually what they wanted and "never survive collision with reality".
  They survive by being modified and I don't think that invalidates the process that got them in front of people faster than would otherwise have been possible.
  This isn't a defence of waterfall though. It's really about increasing the pace of agile and the size of the loop that is possible.
  
  manmal 10 hours ago
  
  I think I agree with what you’re saying? But that’s not the waterfall approach GP pitched.
  
yobbo 19 hours ago

Agile solves the problem of discovering a workable set of requirements while the environment is changing.
If you already know the requirements, it doesn't need to come into play.

- AnimalMuppet 16 hours ago
  
  While the environment is changing. That's the key.
  If you already know the requirements, and they aren't going to change for the duration of the project, then you don't need agile.
  And if you have the time. I recently was on a project with a compressed timeline. The general requirements were known, but not in perfect detail. We began implementation anyway, because the schedule did not permit a fully phased waterfall. We had to adjust somewhat to things not being as we expected, but only a little - say, 10%. We got our last change of requirements 3 or 4 weeks before the completion of implementation. The key to making this work was regular, detailed, technical conversations between the customer's engineers, the requirements writers, and our implementers.
  
orochimaaru 11 hours ago

Agile isn’t against spec writing. Specs can be a task in your story and so can automated tests. Both can be deliverables in your acceptance criteria. But that’s not how it went, because human nature is to look for the least effort.
With AI, the least effort is the specs, so that’s the “greatest thing to do” again.

WillAdams 18 hours ago

Isn't this just a new name for "Design by Contract"?
https://www.goodreads.com/book/show/15182720-design-by-contr...
but using a Large-Language-Model rather than a subordinate team?
c.f., https://se.inf.ethz.ch/~meyer/publications/old/dbc_chapter.p...

jll29 19 hours ago

Perhaps a better way than to view them as alternative choices is to view them as alternative modes of working, between which it is sometimes helpful to switch?
We know old-style classic waterfall lacks flexibility and agile lacks planning, but I don't see a reason why not to switch back and forth multiple times in the same project.

lII1lIlI11ll 15 hours ago

What does the resulting code look like though? I found that while <insert your favorite LLM> can spit out barely working C++ code fast, I then have to spend 10x time prodding it to refactor the code to look at least somewhat acceptable.
No matter how much I tell it that it is a "professional experienced 10x developer versed in modern C++, a second coming of Stroustrup" in per-project or global config files it still keeps spewing the same crap big (like manual memory management instead of RAII here and there, initializing fields in ctor body instead of initializer list, having manual init/cleanup methods in classes instead of a proper ctor/dtor design to ensure that objects are always in a consistent state, bunch of other anti-patterns, etc.) and small (checking for nullptr before passing the pointer to delete/free, manually instantiating objects as argument to shared_ptr ctor instead of make_shared, endlessly casting stuff around back and forth instead of designing data types properly, etc.).
Which makes sense I guess, because that is what average C++ code on GitHub looks like, unfortunately, and that is what all those models were trained on, but I keep feeling like my job is turning into performing endless code review for a not-very-bright junior developer that just refuses to learn...

- wenc 12 hours ago
  
  This could be a language specific failure mode. C++ is hard for humans too, and the training code out there is very uneven (most of it pre-C++11, much of it written by non-craftspeople to do very specific things).
  On the other hand, LLMs are great at Go because Go was designed for average engineers at scale, and LLMs behave like fast average engineers. Go as a language was designed to support minimal cleverness (there's only so many ways to do things, and abstractions are constrained). This kind of uniformity is catnip for LLM training.
  
rcarmo 19 hours ago

Yep. I've been into spec-driven development for a long time (when we had humans as agents) and it's never really failed me. We just have literally more attention (hah!) from LLMs than from humans.

chrisweekly 16 hours ago

> "using a new file is the requirements doc is so large is fills the context window"
using a new file IF the requirements doc is so large IT fills the context window

- dugmartin 16 hours ago
  
  I need Claude to review my HN comments.
  
catdog 19 hours ago

As is so often the case in life, extreme approaches are often bad. If you do pure waterfall you risk finding out very late that your plan might not work out, either because of unforeseen technical difficulties implementing it, the given requirements actually being wrong/incomplete or just simply missing the point in time where you planned enough. If you do extreme agile you often end up with a shit architecture which actually, among other things, hurts your future agility, but you get a result which you can validate against reality. The "oh, I didn't think of that" is definitely present in both extremes.

bitwize 16 hours ago

What's amusing to me is that PRIDE, the oldest generally available software methodology and perhaps the least appreciated, is basically just "spec driven development with human programmers". Most of the time, and personnel, involved in development is on elucidating the requirements and developing the spec; programmers only get involved at the end and their contribution is about 15%. For a few decades this was considered the "correct" way to develop software. But then PCs happened, mom-and-pop software vendors stuffing floppy disks into Ziploc happened, and the myth of the lone "genius programmer" took hold of the industry, and programmers experienced such prestige inflation that they thought they were able to call the shots, and by and large management acquiesced. And that's how we got Agile.
With the rise of AI, maybe programmers will be put back in their rightful place, as contributors of the final small piece of the development process: a translation from business terms to the language of the computer. Programming as a profession should, by all rights, be obsolete. We should be able to express the solution directly in business terms and have the translation take place automatically. Maybe that day will be here soon.

9rx 16 hours ago

Agile is really about removing managers. The twelve principles do encourage short development cycles, but that's to prevent someone from going off into the weeds, having no manager to tell them to stop.

jdjdjssh 11 hours ago

[dead]


cadamsdotcom 18 minutes ago

Or call it agentic coding..

Whatever you call it, for an experienced engineer to gain so much leverage in so little time while maintaining quality, it’s vibey and a ton of fun.