Comment by wenc

Comment by wenc 19 hours ago

I've been doing spec-driven development for the past 2 months, and it's been a game changer (especially with Opus 4.5).

Writing a spec is akin to "working backwards" (or future backwards thinking, if you like) -- this is the outcome I want, how do I get there?

The process of writing the spec actually exposes the edge cases I didn't think of. It's very much in the same vein as "writing as a tool of thought". Just getting your thoughts and ideas onto a text file can be a powerful thing. Opus 4.5 is amazing at pointing out the blind spots and inconsistencies in a spec. The spec generator that I use also does some reasoning checks and adds property-based test generation (Python Hypothesis -- similar to Haskell's Quickcheck), which anchors the generated code to reality.

Also, I took to heart Grant Slatton's "Write everything twice" [1] heuristic -- write your code once, solve the problem, then stash it in a branch and write the code all over again.

> Slatton: A piece of advice I've given junior engineers is to write everything twice. Solve the problem. Stash your code onto a branch. Then write all the code again. I discovered this method by accident after the laptop containing a few days of work died. Rewriting the solution only took 25% the time as the initial implementation, and the result was much better. So you get maybe 2x higher quality code for 1.25x the time — this trade is usually a good one to make on projects you'll have to maintain for a long time.

This is effective because initial mental models of a new problem are usually wrong.

With a spec, I can get a version 1 out quickly and (mostly) correctly, poke around, and then see what I'm missing. Need a new feature? I tell the Opus to first update the spec then code it.

And here's the thing -- if you don't like version 1 of your code, throw it away but keep the spec (those are your learnings and insights). Then generate a version 2 free of any sunk-cost bias, which, as humans, we're terrible at resisting.

Spec-driven development lets you "write everything twice" (throwaway prototypes) faster, which improves the quality of your insights into the actual problem. I find this technique lets me 2x the quality of my code, through sheer mental model updating.

And this applies not just to coding, but most knowledge work, including certain kinds of scientific research (s/code/LaTeX/).

[1] https://grantslatton.com/software-pathfinding

manmal 12 hours ago

My experience with both Opus and GPT-codex is that they both just forget to implement big chunks of specs, unless you give them the means to self-validate their spec conformance. I’m finding myself sometimes spending more time coming up with tooling to enable this, than the actual work.

Reply View 2 replies

wenc 10 hours ago

The key is generating a task list from the spec. Kiro IDE (not cli) generates tasks.md automatically. This is a checklist that Opus has to check off.
Try Kiro. It's just an all-round excellent spec-driven IDE.
You can still use Claude Code to implement code from the spec, but Kiro is far better at generating the specs.
p.s. if you don't use Kiro (though I recommend it), there’s a new way too — Yegge’s beads. After you install, prompt Claude Code to `write the plan in epics, stories and tasks in beads`. Opus will -- through tool use -- ensure every bead is implemented. But this is a more high variance approach -- whereas Kiro is much more systematic.

Reply View | 1 reply
- manmal an hour ago
  
  I’ve even built my own todo tool in zig, which is backed by SQLite and allows arbitrary levels of todo hierarchy. Those clankers just start ignoring tasks or checking them off with a wontfix comment the first time they hit adversity. Codex is better at this because it keeps going at hard problems. But then it compacts so many times over that it forgets the todo instructions.
  
  Reply View | 0 replies