Writing a good Claude.md

735 points by objcts 3 days ago

nico 3 days ago

> Claude often ignores CLAUDE.md

> The more information you have in the file that's not universally applicable to the tasks you have it working on, the more likely it is that Claude will ignore your instructions in the file

Claude.md files can get pretty long, and many times Claude Code just stops following a lot of the directions specified in the file

A friend of mine tells Claude to always address him as “Mr Tinkleberry”, he says he can tell Claude is not paying attention to the instructions on Claude.md, when Claude stops calling him “Mr Tinkleberry” consistently

Reply View 100 replies

stingraycharles 3 days ago

That’s hilarious and a great way to test this.
What I’m surprised about is that OP didn’t mention having multiple CLAUDE.md files in each directory, specifically describing the current context / files in there. Eg if you have some database layer and want to document some critical things about that, put it in “src/persistence/CLAUDE.md” instead of the main one.
Claude pulls in those files automatically whenever it tries to read a file in that directory.
I find that to be a very effective technique to leverage CLAUDE.md files and be able to put a lot of content in them, but still keep them focused and avoid context bloat.

Reply View | 33 replies
- sroussey 2 days ago
  
  Ummm… sounds like that directory should have a readme. And Claude should read readme files.
  
  Reply View | 32 replies
  
  stingraycharles 2 days ago
  
  READMEs are written for people, CLAUDE.mds are written for coding assistants. I don’t write “CRITICAL (PRIORITY 0):” in READMEs.
  The benefit of CLAUDE.md files is that they’re pulled in automatically, eg if Claude wants to read “tests/foo_test.py” it will automatically pull in “tests/CLAUDE.md” (if it exists).
  
  Reply View | 31 replies
globular-toast 2 days ago

It baffles me how people can be happy working like this. "I wrap the hammer in paper so if the paper breaks I know the hammer has turned into a saw."

Reply View | 4 replies
- easyThrowaway 2 days ago
  
  If you have any experience in 3D modeling, I feel it's quite closer to 3D Unwrapping than software development.
  You got a bitmap atlas ("context") where you have to cram as much information as possible without losing detail, and then you need to massage both your texture and the structure of your model so that your engine doesn't go mental when trying to map your informations from a 2D to a 3D space.
  Likewise, both operations are rarely blemish-free and your ability resides in being able to contain the intrinsic stochastic nature of the tool.
  
  Reply View | 0 replies
- mewpmewp2 2 days ago
  
  You could think of it as art or creativity.
  
  Reply View | 0 replies
- pacifika 2 days ago
  
  > It Is Difficult to Get a Man to Understand Something When His Salary Depends Upon His Not Understanding It
  
  Reply View | 0 replies
- fragmede 2 days ago
  
  probably by not thinking in ridiculous analogies that don't help
  
  Reply View | 0 replies
isoprophlex 2 days ago

That's smart, but I worry that that works only partially; you'll be filling up the context window with conversation turns where the LLM consistently addresses it's user as "Mr. Tinkleberry", thus reinforcing that specifc behavior encoded by CLAUDE.md. I'm not convinced that this way of addressing the user implies that it keeps attention the rest of the file.

Reply View | 0 replies
sesm 2 days ago

We are back to color-sorted M&Ms bowls.

Reply View | 0 replies
jmathai 2 days ago

I have a /bootstrap command that I run which instructs Claude Code to read all system and project CLAUDE.md files, skills and commands.
Helps me quickly whip it back in line.

Reply View | 4 replies
- adastra22 2 days ago
  
  Isn’t that what every new session does?
  
  Reply View | 2 replies
  
  threecheese 2 days ago
  
  That also clears the context; a command would just append to the context.
  
  Reply View | 1 reply
  
  jmathai 2 days ago
  
  This. I've had Claude not start sessions with all of the CLAUDE.md, skills, commands loaded and I've had it lose it mid-session.
  
  Reply View | 0 replies
- mrasong 2 days ago
  
  Mind sharing it? (As long as it doesn’t involve anything private.)
  
  Reply View | 0 replies
homeonthemtn 2 days ago

The green m&M's trick of AI instructions.
I've used that a couple times, e.g. "Conclude your communications with "Purple fish" at the end"
Claude definitely picks and chooses when purple fish will show up

Reply View | 1 reply
- nathan_douglas 2 days ago
  
  I tell it to accomplish only half of what it thinks it can, then conclude with a haiku. That seems to help, because 1) I feel like it starts shedding discipline as it starts feeling token pressure, and 2) I feel like it is more likely to complete task n - 1 than it is to complete task n. I have no idea if this is actually true or not, or if I'm hallucinating... all I can say is that this is the impression I get.
  
  Reply View | 0 replies
lubujackson 2 days ago

For whatever reason, I can't get into Claude's approach. I like how Cursor handles this, with a directory of files (even subdirectories allowed) where you can define when it should use specific documents.
We are all "context engineering" now but Claude expects one big file to handle everything? Seems luke a deadend approach.

Reply View | 3 replies
- jswny 2 days ago
  
  They have an entire feature for this: https://www.claude.com/blog/skills
  CLAUDE.md should only be for persistent reminders that are useful in 100% of your sessions
  Otherwise, you should use skills, especially if CLAUDE.md gets too long.
  Also just as a note, Claude already supports lazy loaded separate CLAUDE.md files that you place in subdirectories. It will read those if it dips into those dirs
  
  Reply View | 0 replies
- unshavedyak 2 days ago
  
  I think their skills have the ability to dynamically pull in more data, but so far i've not tested it to much since it seems more tailored towards specific actions. Ie converting a PDF might translate nicely to the Agent pulling in the skill doc, but i'm not sure if it will translate well to it pulling in some rust_testing_patterns.md file when it writes rust tests.
  Eg i toyed with the idea of thinning out various CLAUDE.md files in favor of my targeted skill.md files. In doing so my hope was to have less irrelevant data in context.
  However the more i thought through this, the more i realized the Agent is doing "everything" i wanted to document each time. Eg i wasn't sure that creating skills/writing_documentation.md and skills/writing_tests.md would actually result in less context usage, since both of those would be in memory most of the time. My CLAUDE.md is already pretty hyper focused.
  So yea, anyway my point was that skills might have potential to offload irrelevant context which seems useful. Though in my case i'm not sure it would help.
  
  Reply View | 0 replies
- piokoch 2 days ago
  
  This is good for the company, chances are you will eat more tokens. I liked Aider approach, it wasn't trying to be too clever, it used files added to chat and asks if it figure out that something more is needed (like, say, settings in case of Django application).
  Sadly Aider is no longer maintained...
  
  Reply View | 0 replies
bryanrasmussen 2 days ago

I wonder if there are any benefits, side-effects or downsides of everyone using the same fake name for Claude to call them.
If a lot of people always put call me Mr. Tinkleberry in the file will it start calling people Mr. Tinkleberry even when it loses the context because so many people seem to want to be called Mr. Tinkleberry.

Reply View | 2 replies
- seunosewa 2 days ago
  
  Then you switch to another name.
  
  Reply View | 1 reply
  
  bryanrasmussen a day ago
  
  yes, when you discover it. but the reason why I said just wondering was I was trying to think of unexpected ways it could effect things, that was the top one I could think of (and not really sure if it is a possibility)
  
  Reply View | 0 replies
pmarreck 2 days ago

I've found that Codex is much better at instruction-following like that, almost to a fault (for example, when I tell it to "always use TDD", it will try to use TDD even when just fixing already-valid-just-needing-expectation-updates tests!

Reply View | 0 replies
aqme28 2 days ago

You could make a hook in Claude to re-inject claude.md. For example, make it say "Mr Tinkleberry" in every response, and failing to do so re-injects the instructions.

Reply View | 0 replies
dkersten 2 days ago

I used to tell it to always start every message with a specific emoji. Of the emoji wasn’t present, I knew the rules were ignored.
But it’s bro reliable enough. It can send the emoji or address you correctly while still ignoring more important rules.
Now I find that it’s best to have a short and tight rules file that references other files where necessary. And to refresh context often. The longer the context window gets, the more likely it is to forget rules and instructions.

Reply View | 0 replies
chickensong 2 days ago

The article explains why that's not a very good test however.

Reply View | 3 replies
- sydd 2 days ago
  
  Why not? It's relevant for all tasks, and just adds 1 line
  
  Reply View | 2 replies
  
  chickensong 2 days ago
  
  I guess I assumed that it's not highly relevant to the task, but I suppose it depends on interpretation. E.g. if someone tells the bus driver to smile while he drives, it's hopefully clear that actually driving the bus is more important than smiling.
  Having experimented with similar config, I found that Claude would adhere to the instructions somewhat reliably at the beginning and end of the conversation, but was likely to ignore during the middle where the real work is being done. Recent versions also seem to be more context-aware, and tend to start rushing to wrap up as the context is nearing compaction. These behaviors seem to support my assumption, but I have no real proof.
  
  Reply View | 0 replies
  
  dncornholio 2 days ago
  
  It will also let the LLM process even more tokens, thus decreasing it's accuracy
  
  Reply View | 0 replies
[removed] 2 days ago

[deleted]

Reply View | 0 replies
grayhatter 2 days ago

> A friend of mine tells Claude to always address him as “Mr Tinkleberry”, he says he can tell Claude is not paying attention to the instructions on Claude.md, when Claude stops calling him “Mr Tinkleberry” consistently
this is a totally normal thing that everyone does, that no one should view as a signal of a psychotic break from reality...
is your friend in the room with us right now?
I doubt I'll ever understand the lengths AI enjoyers will go though just to avoid any amount of independent thought...

Reply View | 36 replies
- crystal_revenge 2 days ago
  
  I suspect you’re misjudging the friend here. This sounds more like the famous “no brown m&ms” clause in the Van Halen performance contract. As ridiculous as the request is, it being followed provides strong evidence that the rest (and more meaningful) of the requests are.
  Sounds like the friend understands quite well how LLMs actually work and has found a clever way to be signaled when it’s starting to go off the rails.
  
  Reply View | 34 replies
  
  davnicwil 2 days ago
  
  It's also a common tactic for filtering inbound email.
  Mention that people may optionally include some word like 'orange' in the subject line to tell you they've come via some place like your blog or whatever it may be, and have read at least carefully enough to notice this.
  Of course ironically that trick's probably trivially broken now because of use of LLMs in spam. But the point stands, it's an old trick.
  
  Reply View | 3 replies
  
  grayhatter 2 days ago
  
  > I suspect you’re misjudging the friend here. This sounds more like the famous “no brown m&ms” clause in the Van Halen performance contract. As ridiculous as the request is, it being followed provides strong evidence that the rest (and more meaningful) of the requests are.
  I'd argue, it's more like you've bought so much into the idea this is reasonable, that you're also willing to go through extreme lengths to recon and pretend like this is sane.
  Imagine two different worlds, one where the tools that engineers use, have a clear, and reasonable way to detect and determine if the generative subsystem is still on the rails provided by the controller.
  And another world where the interface is completely devoid of any sort of basic introspection interface, and because it's a problematic mess, all the way down, everyone invents some asinine way that they believe provides some sort of signal as to whether or not the random noise generator has gone off the rails.
  > Sounds like the friend understands quite well how LLMs actually work and has found a clever way to be signaled when it’s starting to go off the rails.
  My point is that while it's a cute hack, if you step back and compare it objectively, to what good engineering would look like. It's wild so many people are all just willing to accept this interface as "functional" because it means they don't have to do the thinking that required to emit the output the AI is able to, via the specific randomness function used.
  Imagine these two worlds actually do exist; and instead of using the real interface that provides a clear bool answer to "the generative system has gone off the rails" they *want* to be called Mr Tinkerberry
  Which world do you think this example lives in? You could convince me, Mr Tinkleberry is a cute example of the latter, obviously... but it'd take effort to convince me that this reality is half reasonable or that's it's reasonable that people who would want to call themselves engineers should feel proud to be a part of this one.
  Before you try to strawman my argument, this isn't a gatekeeping argument. It's only a critical take on the interface options we have to understand something that might as well be magic, because that serves the snakeoil sales much better.
  > > Is the magic token machine working?
  > Fuck I have no idea dude, ask it to call you a funny name, if it forgets the funny name it's probably broken, and you need to reset it
  Yes, I enjoy working with these people and living in this world.
  
  Reply View | 29 replies
- Alpha_Logic 2 days ago
  
  The 'canary in the coal mine' approach (like the Mr. Tinkleberry trick) is silly but pragmatic. Until we have deterministic introspection for LLMs, engineers will always invent weird heuristics to detect drift. It's not elegant engineering, but it's effective survival tactics in a non-deterministic loop.
  
  Reply View | 0 replies

vunderba 3 days ago

From the article:

> We recommend keeping task-specific instructions in separate markdown files with self-descriptive names somewhere in your project. Then, in your CLAUDE.md file, you can include a list of these files with a brief description of each, and instruct Claude to decide which (if any) are relevant and to read them before it starts working.

I've been doing this since the early days of agentic coding though I've always personally referred to it as the Table-of-Contents approach to keep the context window relatively streamlined. Here's a snippet of my CLAUDE.md file that demonstrates this approach:

  # Documentation References

  - When adding CSS, refer to: docs/ADDING_CSS.md
  - When adding assets, refer to: docs/ADDING_ASSETS.md
  - When working with user data, refer to: docs/STORAGE_MANAGER.md

Full CLAUDE.md file for reference:

https://gist.github.com/scpedicini/179626cfb022452bb39eff10b...

Reply View 10 replies

sothatsit 3 days ago

I have also done this, but my results are very hit or miss. Claude rarely actually reads the other documentation files I point it to.

Reply View | 5 replies
- dhorthy 3 days ago
  
  I think the key here is “if X then Y syntax” - this seems to be quite effective at piercing through the “probably ignore this” system message by highlighting WHEN a given instruction is “highly relevant”
  
  Reply View | 2 replies
  
  throwaway314155 2 days ago
  
  What?
  
  Reply View | 1 reply
  
  xpe 2 days ago
  
  It helps when questions intended to resolve ambiguity are not themselves hopelessly ambiguous.
  See also: "Help me help you" - https://en.wikipedia.org/wiki/Jerry_Maguire
  
  Reply View | 0 replies
- Sammi 2 days ago
  
  Yeah I don't trust any agent to follow document references consistently. I just manually add the relevant files to context every single time.
  Though I know some people who have built an mcp that does exactly this: https://www.usable.dev/
  It's basically a chat-bot frontend to your markdown files, with both rag and graph db indexes.
  
  Reply View | 0 replies
- wry_discontent 2 days ago
  
  That makes sense given that it's trained on real world developers.
  
  Reply View | 0 replies
dimitri-vs 3 days ago

Correct me if I'm wrong but I think the new "skillss are exactly this, but better.

Reply View | 2 replies
- vunderba 2 days ago
  
  Yeah I think "Skills" are just a more codified folder based approach to this TOC system. The main reason I haven't migrated yet is that the TOC approach lends itself better to the more generic AGENTS.md style - allowing me to swap over to alternative LLMs (such as Gemini) relatively easily.
  
  Reply View | 0 replies
- stpedgwdgfhgdd 2 days ago
  
  Indeed, the article links to the skill documentation which says:
  Skills are modular capabilities that extend Claude’s functionality through organized folders containing instructions, scripts, and resources.
  And
  Extend Claude’s capabilities for your specific workflows
  E.g. building your project is definitely a workflow.
  It als makes sense to put as much as you can into a skill as this an optimized mechanism for claude code to retrieve relevant information based on the skill’s frontmatter.
  
  Reply View | 0 replies
Zarathruster 2 days ago

I've done this too. The nice side-benefit of this approach is that it also serves as good documentation for other humans (including your future self) when trying to wrap their heads around what was done and why. In general I find it helpful to write docs that help both humans and agents to understand the structure and purpose of my codebase.

Reply View | 0 replies

johnsmith1840 3 days ago

I don't get the point. Point it at your relevent files ask it to review discuss the update refine it's understanding and then tell it to go.

I have found that more context comments and info damage quality on hard problems.

I actually for a long time now have two views for my code.

1. The raw code with no empty space or comments. 2. Code with comments

I never give the second to my LLM. The more context you give the lower it's upper end of quality becomes. This is just a habit I've picked up using LLMs every day hours a day since gpt3.5 it allows me to reach farther into extreme complexity.

I suppose I don't know what most people are using LLMs for but the higher complexity your work entails the less noise you should inject into it. It's tempting to add massive amounts of xontext but I've routinely found that fails on the higher levels of coding complexity and uniqueness. It was more apparent in earlier models newer ones will handle tons of context you just won't be able to get those upper ends of quality.

Compute to informatio ratio is all that matters. Compute is capped.

Reply View 24 replies

Aurornis 3 days ago

> I have found that more context comments and info damage quality on hard problems.
There can be diminishing returns, but every time I’ve used Claude Code for a real project I’ve found myself repeating certain things over and over again and interrupting tool usage until I put it in the Claude notes file.
You shouldn’t try to put everything in there all the time, but putting key info in there has been very high ROI for me.
Disclaimer: I’m a casual user, not a hardcore vibe coder. Claude seems much more capable when you follow the happy path of common projects, but gets constantly turned around when you try to use new frameworks and tools and such.

Reply View | 4 replies
- lostdog 3 days ago
  
  Agreed, I don't love the CLAUDE.md that gets autogenerated. It's too wordy for me to understand and for the model to follow consistently.
  I like to write my CLAUDE.md directly, with just a couple paragraphs describing the codebase at a high level, and then I add details as I see the model making mistakes.
  
  Reply View | 0 replies
- MarkMarine 3 days ago
  
  Setting hooks has been super helpful for me, you can reject certain uses of tools (don’t touch my tests for this session) with just simple scripting code.
  
  Reply View | 1 reply
  
  brianwawok 3 days ago
  
  Git lint hook has been key. No matter how many times I told it, it lints randomly. Sometimes not at all. Sometime before rubbing tests (but not after fixing test failures).
  
  Reply View | 0 replies
- [removed] 2 days ago
  
  [deleted]
  
  Reply View | 0 replies
Mtinie 3 days ago

> 1. The raw code with no empty space or comments. 2. Code with comments
I like the sound of this but what technique do you use to maintain consistency across both views? Do you have a post-modification script which will strip comments and extraneous empty space after code has been modified?

Reply View | 4 replies
- johnsmith1840 2 days ago
  
  Custom scripts and basic merge logic but manual still happens around modifications. Forces me to update stale comments around changes anyhow.
  I first "discovered" it because I repeatedly found LLM comments poisoned my code base over time and linited it's upper end of ability.
  Easy to try just drop comments around a problem and see the difference. I was previously doing that and then manually updating the original.
  
  Reply View | 0 replies
- wormpilled 3 days ago
  
  Curious if that is the case, how you would put comments back too? Seems like a mess.
  
  Reply View | 2 replies
  
  Mtinie 2 days ago
  
  As I think more on how this could work, I’d treat the fully commented code as the source of truth (SOT).
  1. SOT through a processor to strip comments and extra spaces. Publish to feature branch.
  2. Point Claude at feature branch. Prompt for whatever changes you need. This runs against the minimalist feature branch. These changes will be committed with comments and readable spacing for the new code.
  3. Verify code changes meet expectations.
  4. Diff the changes from minimal version, and merge only that code into SOT.
  Repeat.
  
  Reply View | 1 reply
  
  johnsmith1840 2 days ago
  
  Just test it, maybe you won't get a boost.
  1. Run into a problem you and AI can't solve. 2. Drop all comments 3. Restart debug/design session 4. Solve it and save results 5. Revert code to have comments and put update in
  If that still doesn't work: Step 2.5 drop all unrelated code from context
  
  Reply View | 0 replies
ra 3 days ago

This is exactly right. Attention is all you need. It's all about attention. Attention is finite.
The more you data load into context the more you dilute attention.

Reply View | 1 reply
- throwuxiytayq 3 days ago
  
  people who criticize LLMs for merely regurgitating statistically related token sequences have very clearly never read a single HN comment
  
  Reply View | 0 replies
nightski 3 days ago

IMO within the documentation .md files the information density should be very high. Higher than trying to shove the entire codebase into context that is for sure.

Reply View | 1 reply
- johnsmith1840 3 days ago
  
  You deffinetly don't just push the entire code base. Previous models required you to be meticulous about your input. A function here a class there.
  Even now if I am working on REALLY hard problems I will still manually copy and paste code sections out for discussion and algorithm designs. Depends on complexity.
  This is why I still believe open ai O1-Pro was the best model I've ever seen. The amount of compute you could throw at a problem was absurd.
  
  Reply View | 0 replies
schrodinger 2 days ago

Genuinely curious — how did you isolate the effect of comments/context on model performance from all the other variables that change between sessions (prompt phrasing, model variance, etc)? In other words, how did you validate the hypothesis that "turning off the comments" (assuming you mean stripping them temporarily...) resulted in an objectively superior experience?
What did your comparison process look like? It feels intuitively accurate and validates my anecdotal impression but I'd love to hear the rigor behind your conclusions!

Reply View | 1 reply
- johnsmith1840 2 days ago
  
  I was already in the habit of copy pasting relevent code sections to maximize reasoning performance to squeeze earlier weaker models performance on stubborn problems. (Still do this on really nasty ones)
  It's also easy to notice LLMs create garbage comments that get worse over time. I started deleting all comments manually alongside manual snippet selection to get max performance.
  Then started just routinely deleting all comments pre big problem solving session. Was doing it enough to build some automation.
  Maybe high quality human comments improve ability? Hard to test in a hybrid code base.
  
  Reply View | 0 replies
senshan 3 days ago

> I never give the second to my LLM.
How do you practically achieve this? Honest question. Thanks

Reply View | 4 replies
- johnsmith1840 3 days ago
  
  Custom scripts.
  1. Turn off 2. Code 3. Turn on 4. Commit
  I also delete all llm comments they 100% poison your codebase.
  
  Reply View | 3 replies
  
  senshan 3 days ago
  
  >> 1. The raw code with no empty space or comments. 2. Code with comments
  > 1. Turn off 2. Code 3. Turn on 4. Commit
  What does it mean "turn off" / "turn on"?
  Do you have a script to strip comments?
  Okay, after the comments were stripped, does this become the common base for 3-way merge?
  After modification of the code stripped of the comments, do you apply 3-way merge to reconcile the changes and the comments?
  This seems a lot of work. What is the benefit? I mean demonstrable benefit.
  How does it compare to instructing through AGENTS.md to ignore all comments?
  
  Reply View | 2 replies
stpedgwdgfhgdd 2 days ago

The comments are what makes the model understand your code much better.
See it as a human, the comments are there to speed up understanding of the code.

Reply View | 0 replies
saturatedfat 2 days ago

could u share some more intuition as to why you started believing that? are there ANY comments that are useful?

Reply View | 0 replies
xpe 2 days ago

> I have found that more context comments and info damage quality on hard problems.
I'm skeptical this a valid generalization over what was directly observed. [1] We would learn more if they wrote a more detailed account of their observations. [2]
I'd like to draw a parallel to another area of study possibly unfamiliar to many of us. Anthropology faced similar issues until Geertz's 1970s reform emphasized "thick description" [3] meaning detailed contextual observations instead of thin generalization.
[1]: I would not draw this generalization. I've found that adding guidelines (on the order of 10k tokens) to my CLAUDE.md has been beneficial across all my conversations. At the same time, I have not constructed anything close to study of variations of my approach. And the underlying models are a moving target. I will admit that some of my guidelines were added to address issues I saw over a year ago and may be nothing more than vestigial appendages nowadays. This is why I'm reluctant to generalize.
[2]: What kind of "hard problems"? What is meant by "more" exactly? (Going from 250 to 500 tokens? 1000 to 2000? 2500 to 5000? &c) How much overlap exists between the CLAUDE.md content items? How much ambiguity? How much contradiction?
[3]: https://en.wikipedia.org/wiki/Thick_description

Reply View | 0 replies

_pdp_ 3 days ago

There is far much easier way to do this and one that is perfectly aligned with how these tools work.

It is called documenting your code!

Just write what this file is supposed to do in a clear concise way. It acts as a prompt, it provides much needed context specific to the file and it is used only when necessary.

Another tip is to add README.md files where possible and where it helps. What is this folder for? Nobody knows! Write a README.md file. It is not a rocket science.

What people often forget about LLMs is that they are largely trained on public information which means that nothing new needs to be invented.

You don't have to "prompt it just the right way".

What you have to do is to use the same old good best practices.

Reply View 24 replies

dhorthy 3 days ago

For the record I do think the AI community tries to unnecessarily reinvent the wheel on crap all the time.
sure, readme.md is a great place to put content. But there's things I'd put in a readme that I'd never put in a claude.md if we want to squeeze the most out of these models.
Further, claude/agents.md have special quality-of-life mechanics with the coding agent harnesses like e.g. `injecting this file into the context window whenever an agent touches this directory, no matter whether the model wants to read it or not`
> What people often forget about LLMs is that they are largely trained on public information which means that nothing new needs to be invented.
I don't think this is relevant at all - when you're working with coding agents, the more you can finesse and manage every token that goes into your model and how its presented, the better results you can get. And the public data that goes into the models is near useless if you're working in a complex codebase, compared to the results you can get if you invest time into how context is collected and presented to your agent.

Reply View | 2 replies
- theshrike79 2 days ago
  
  > For the record I do think the AI community tries to unnecessarily reinvent the wheel on crap all the time.
  On Reddit's LLM subreddits people are rediscovering the very basics of software project management as some massive insights daily or very least weekly.
  Who would've guessed that proper planning, accessible and up to documentation and splitting tasks into manageable testable chunks produces good code? Amazing!
  Then they write a massive blog post or even some MCP mostrosity for it and post it everywhere as a new discovery =)
  
  Reply View | 1 reply
  
  dkubb 2 days ago
  
  I can totally understand where you are coming from with this comment. It does feel a bit frustrating that people are rediscovering things that were written in books 30/40/50 years ago.
  However, I think this is awesome for the industry. People are rediscovering basic things, but if they didn't know about the existing literature this is a perfect opportunity to refer them to it. And if they were aware, but maybe not practicing it, this is a great time for the ideas to be reinforced.
  A lot of people, myself included, never really understand which practices are important or not until we were forced to work on a system that was most definitely not written with any good practices in mind.
  My current view of agentic coding is that it's forcing an entire generation of devs to learn software project management or drowning under the mountain of debt an LLM can produce. Previously it took much longer to feel the weight of bad decisions in a project but an LLM allows you to speed-run this process in a few weeks or months.
  
  Reply View | 0 replies
johnfn 3 days ago

So how exactly does one "write what this file is supposed to do in a clear concise way" in a way that is quickly comprehensible to AI? The gist of the article is that when your audience changes from "human" to "AI" the manner in which you write documentation changes. The article is fairly high quality, and presents excellent evidence that simply "documenting your code" won't get you as far as the guidelines it provides.
Your comment comes off as if you're dispensing common-sense advice, but I don't think it actually applies here.

Reply View | 0 replies
datacynic 2 days ago

Writing documentation for LLMs is strangely pleasing because you have very linear returns for every bit of effort you spend on improving its quality and the feedback loop is very tight. When writing for humans, especially internal documentation, I’ve found that these returns are quickly diminishing or even negative as it’s difficult to know if people even read it or if they didn’t understand it or if it was incomplete.

Reply View | 0 replies
bastawhiz 3 days ago

This is missing the point. If I want to instruct Claude to never write a database query that doesn't hit a preexisting index, where exactly am I supposed to document that? You can either choose:
1. A centralized location, like a README (congrats, you've just invented CLAUDE.md)
2. You add a docs folder (congrats, you've just done exactly what the author suggests under Progressive Disclosure)
Moreover, you can't just do it all in a README, for the exact reasons that the author lays out under "CLAUDE.md file length & applicability".
CLAUDE.md simply isn't about telling Claude what all the parts of your code are and how they work. You're right, that's what documenting your code is for. But even if you have READMEs everywhere, Claude has no idea where to put code when it starts a new task. If it has to read all your documentation every time it starts a new task, you're needlessly burning tokens. The whole point is to give Claude important information up front so it doesn't have to read all your docs and fill up its context window searching for the right information on every task.
Think of it this way: incredibly well documented code has everything a new engineer needs to get started on a task, yes. But this engineer has amnesia and forgets everything it's learned after every task. Do you want them to have to reonboard from scratch every time? No! You structure your docs in a way so they don't have to start from scratch every time. This is an accommodation: humans don't need this, for the most part, because we don't reonboard to the same codebase over and over. And so yes, you do need to go above and beyond the "same old good best practices".

Reply View | 15 replies
- callc 2 days ago
  
  This CLAUDE.md dance feels like herding cats. Except we’re herding a really good autocorrect encyclopedic parrot. Sans intelligence
  Relating / personifying LLM to an engineer doesn’t work out
  Maybe the best though model currently is just “good way to automate trivial text modifications” and “encyclopedic ramblings”
  
  Reply View | 1 reply
  
  saturatedfat 2 days ago
  
  unfair characterization.
  think about how this thing is interacting with your codebase. it can read one file at a time. sections of files.
  in this UX, is it ergonomic to go hunting for patterns and conventions? if u have to linearly process every single thing u look at every time you do something, how are you supposed to have “peripheral vision”? if you have amnesia, how do you continue to do good work in a codebase given you’re a skilled engineer?
  it is different from you. that is OK. it doesn’t mean its stupid. it means it needs different accomodations to perform as well as you do. accomodations IRL exist for a reason, different people work differently and have different strengths and weaknesses. just like humans, you get the most out of them if you meet and work with them from where they’re at.
  
  Reply View | 0 replies
- _pdp_ 3 days ago
  
  You put a warning where it is most likely to be seen by a human coder.
  Besides, no amount of prompting will prevent this situation.
  If it is a concern then you put a linter or unit tests to prevent it altogether, or make a wrapper around the tricky function with some warning in its doc strings.
  I don't see how this is any different from how you typically approach making your code more resilient to accidental mistakes.
  
  Reply View | 8 replies
  
  mvkel 3 days ago
  
  Documenting for AI exactly like you would document for a human is ignoring how these tools work
  
  Reply View | 4 replies
  
  bastawhiz 3 days ago
  
  > no amount of prompting will prevent this situation.
  Again, missing the point. If you don't prompt for it and you document it in a place where the tool won't look first, the tool simply won't do it. "No amount of promoting" couldn't be more wrong, it works for me and all my coworkers.
  > If it is a concern then you put a linter or unit tests to prevent it altogether
  Sure, and then it'll always do things it's own way, run the tests, and have to correct itself. Needlessly burning tokens. But if you want to pay for it to waste its time and yours, go for it.
  > I don't see how this is any different from how you typically approach making your code more resilient to accidental mistakes.
  It's not about avoiding mistakes! It's about having it follow the norms of your codebase.
  - My codebase at work is slowly transitioning from Mocha to Jest. I can't write a linter to ban new mocha tests, and it would be a pain to keep a list of legacy mocha test suites. The solution is to simply have a bullet point in the CLAUDE.md file that says "don't write new Mocha test suites, only write new test suites in Jest". A more robust solution isn't necessary and doesn't avoid mistakes, it avoids the extra step of telling the LLM to rewrite the tests.
  - We have a bunch of terraform modules for convenience when defining new S3 buckets. No amount of documenting the modules will have Claude magically know they exist. You tell it that there are convenience modules and to consider using them.
  - Our ORM has findOne that returns one record or null. We have a convenience function getOne that returns a record or throws a NotFoundError to return a 404 error. There's no way to exhaustively detect with a linter that you used findOne and checked the result for null and threw a NotFoundError. And the hassle of maybe catching some instances isn't necessary, because avoiding it is just one line in CLAUDE.md.
  It's really not that hard.
  
  Reply View | 2 replies
- theshrike79 2 days ago
  
  1. Create a tool that can check if a query hits a prexisting index
  In step 2 either force Claude to use it (hooks) or suggest it (CLAUDE.md)
  3. Profit!
  As for "where stuff is", for anything more complex I have a tree-style graph in CLAUDE.md that shows the rough categories of where stuff is. Like the handler for letterboxd is in cmd/handlerletterboxd/ and internal modules are in internal/
  Now it doesn't need to go in blind but can narrow down searches when I tell it to "add director and writer to the letterboxd handler output".
  
  Reply View | 0 replies
- gitgud 3 days ago
  
  > 1. A centralized location, like a README (congrats, you've just invented CLAUDE.md)
  README files are not a new concept, and have been used in software for like 5 decades now, whereas CLAUDE.md files were invented 12 months ago...
  
  Reply View | 1 reply
  
  bastawhiz 2 days ago
  
  You can also use your README (and in my own private project, I do!). But for folks who don't want their README clogged up with lots of facts about the project, you have CLAUDE.md
  
  Reply View | 0 replies
- victorbuilds 2 days ago
  
  Learned this the hard way. Asked Claude Code to run a database migration. It deleted my production database instead, then immediately apologised and started panicking trying to restore it.
  Thankfully Azure keeps deleted SQL databases recoverable, so I got it back in under an hour. But yeah - no amount of CLAUDE.md instructions would have prevented that. It no longer gets prod credentials.
  
  Reply View | 0 replies
avereveard 2 days ago

Well, no. You run pretty fast into context limit (or attention limit for long context models) And the model understand pretty well what code does without documentation.
Theres also a question of processes. How to format code what style of catching to use and how to run the tests, which human keep on the bacl of their head after reading it once or twice but need a constant reminder for llm whose knowledge lifespan is session limited

Reply View | 1 reply
- uncletaco 2 days ago
  
  I’m pretty sure Claude would not work well in my code base if I hadn’t meticulously added docstrings, type hints, and module level documentation. Even if you’re stubbing out code for later implementation, it helps to go ahead and document it so that a code assistant will get a hint of what to do next.
  
  Reply View | 0 replies
0xblacklight 3 days ago

I think you’re missing that CLAUDE.md is deterministically injected into the model’s context window
This means that instead of behaving like a file the LLM reads, it effectively lets you customize the model’s prompt
I also didn’t write that you have to “prompt it just the right way”, I think you’re missing the point entirely

Reply View | 0 replies

gonzalohm 3 days ago

Probably a lot of people here disagree with this feeling. But my take is that if setting up all the AI infrastructure and onboarding to my code is going to take this amount of effort, then I might as well code the damn thing myself which is what I'm getting paid to (and enjoy doing anyway)

Reply View 18 replies

fragmede 3 days ago

Whether it's setting up AI infrastructure or configuring Emacs/vim/VSCode, the important distinction to make is if the cost has to be paid continually, or if it's a one time/intermittent cost. If I had to configure my shell/git aliases every time I booted my computer, I wouldn't use them, but seeing as how they're saved in config files, they're pretty heavily customized by this point.
Don't use AI if you don't want to, but "it takes too much effort to set up" is an excuse printf debuggers use to avoid setting up a debugger. Which is a whole other debate though.

Reply View | 1 reply
- bird0861 2 days ago
  
  I fully agree with this POV but for one detail; there is a problem with sunsetting frontier models. As we begin to adopt these tools and build workflows with them, they become pieces of our toolkit. We depend on them. We take them for granted even. And then the model either changes (new checkpoints, maybe alignment gets fiddled with) and all of the sudden prompts no longer yield the same results we expected from them after working on them for quite some time. I think the term for this is "prompt instability". I felt this with Gemini 3 (and some people had less pronounced but similar experience with Sonnet releases after 3.7) which for certain tasks that 2.5Pro excelled at..it's just unusable now. I was already a local model advocate before this but now I'm a local model zealot. I've stopped using Gemini 3 over this. Last night I used Qwen3 VL on my 4090 and although it was not perfect (sycophancy, overuse of certain cliches...nothing I can't get rid of later with some custom promptsets and a few hours in Heretic) it did a decent enough job of helping me work through my blindspots in the UI/UX for a project that I got what I needed.
  If we have to perform tuning on our prompts ("skills", agents.md/claude.md, all of the stuff a coding assistant packs context with) every model release then I see new model releases becoming a liability more than a boon.
  
  Reply View | 0 replies
Havoc 3 days ago

A lot of the style stuff you can write once and reuse. I started splitting mine into overall and project specific files for this reason
Universal has stuff I always want (use uv instead of pip etc) while the other describes what tech choice for this project

Reply View | 0 replies
vanviegen 3 days ago

Perhaps. But keep in mind that the setup work is typically mostly delegated to LLMs as well.

Reply View | 0 replies
kissgyorgy 3 days ago

I strongly disagree with the author not using /init. It takes a minute to run and Claude provides surprisingly good results.

Reply View | 2 replies
- 0xblacklight 3 days ago
  
  If you find it works for you, then that’s great! This post is mostly from our learnings from getting it to solve hard problems in complex brownfield codebases where auto generation is almost never sufficient.
  
  Reply View | 0 replies
- alwillis 3 days ago
  
  /init has evolved since the early day; it's more concise than it used to be.
  
  Reply View | 0 replies
nvarsj 3 days ago

It really doesn't take that much effort. Like any tool, people can over-optimise on the setup rather than just use it.

Reply View | 0 replies
nichochar 3 days ago

The effort described in the article is maybe a couple hours of work.
I understand the "enjoy doing anyway" part and it resonates, but not using AI is simply less productive.

Reply View | 9 replies
- globular-toast 2 days ago
  
  It's a couple of hours right now, then another couple of hours "correcting" the AI when it still goes wrong, another couple of hours tweaking the file again, another couple of hours to update when the model changes, another couple of hours when someone writes a new blog post with another method etc.
  There's a huge difference between investing time into a deterministic tool like a text editor or programming language and a moving target like "AI".
  The difference between programming in Notepad in a language you don't know and using "AI" will be huge. But the difference between being fluent in a language and having a powerful editor/IDE? Minimal at best. I actually think productivity is worse because it tricks you into wasting time via the "just one more roll" (ie. gambling) mentality. Not to mention you're not building that fluency or toolkit for yourself, making you barely more valuable than the "AI" itself.
  
  Reply View | 1 reply
  
  fragmede 2 days ago
  
  You say that as if tech hasn't always been a moving target anyway. The skills I spent months learning a specific language and IDE became obsolete with the next job and the next paradigm shift. That's been one of the few consistent themes throughout my career. Hours here and there, spread across months and years, just learning whatever was new. Sometimes, like with Linux, it really paid off. Other times, like PHP, it did, and then fizzled out.
  --
  The other thing is, this need for determinism bewilders me. I mean, I get where it comes from, we want nice, predictable reliable machines. But how deterministic does it need to be? If today, it decides to generate code and the variable is called fileName, and tomorrow it's filePath, as long as it's passing tests, what do I care that it's not totally deterministic and the names of the variables it generates are different? as long as it's consistent with existing code, and it passes tests, whats the importance of it being deterministic to a computer science level of rigor? It reminds me about the travelling salesman problem, or the knapsack problem. Both NP hard, but users don't care about that. They just want the computer to tell them something good enough for them to go on about their day. So if a customer comes up to you and offers you a pile of money to solve either one of those problems, do I laugh in their face, knowing damn well I won't be the one to prove that NP = P, or do I explain to them the situation, and build them software that will do the best it can, with however much compute resources they're willing to pay for?
  
  Reply View | 0 replies
- TheRoque 3 days ago
  
  > but not using AI is simply less productive
  Some studies shows the opposite for experienced devs. And it also shows that developers are delusional about said productivity gains: https://metr.org/blog/2025-07-10-early-2025-ai-experienced-o...
  If you have a counter-study (for experienced devs, not juniors), I'd be curious to see. My experience also has been that using AI as part of your main way to produce code, is not faster when you factor in everything.
  
  Reply View | 5 replies
  
  ares623 2 days ago
  
  Curious why there hasn't been a rebuttal study to that one yet (or if there is I haven't seen it come up). There must be near infinite funding available to debunk that study right?
  
  Reply View | 0 replies
  
  bird0861 2 days ago
  
  That study is garbo and I suspect you didn't even read the abstract. Am I right?
  
  Reply View | 3 replies
- svachalek 2 days ago
  
  Minutes really, despite what the article says you can get 90% of the way there by telling Claude how you want the project documentation structured and just let it do it. Up to you if you really want to tune the last 10% manually, I don't. I have been using basically the same system and when I tell Claude to update docs it doesn't revert to one big Claude.md, it maintains it in a structure like this.
  
  Reply View | 0 replies

serial_dev 3 days ago

I’m sure I’m just working like a caveman, but I simply highlight the relevant code, add it to the chat, and talk to these tools as if they were my colleagues and I’m getting pretty good results.

About 12 to 6 months ago this was not the case (with or without .md files), I was getting mainly subpar result, so I’m assuming that the models have improved a lot.

Basically, I found that they not make that much of a difference, the model is either good enough or not…

I know (or at least I suppose) that these markdown files could bring some marginal improvements, but at this point, I don’t really care.

I assume this is an unpopular take because I see so many people treat these files as if they were black magic or silver bullet that 100x their already 1000x productivity.

Reply View 13 replies

vanviegen 3 days ago

> I simply highlight the relevant code, add it to the chat, and talk to these tools
Different use case. I assume the discussion is about having the agent implement whole features or research and fix bugs without much guidance.

Reply View | 1 reply
- 0xblacklight 3 days ago
  
  Yep it is opinionated for how to get coding agents to solve hard problems in complex brownfield codebases which is what we are focused on at humanlayer :)
  
  Reply View | 0 replies
rmnclmnt 3 days ago

Matches my experience also. Bothered only once to setup a proper CLAUDE.md file, and now never do it. Simply refering to the context properly for surgical recommendations and edit works relatively well.
It feels a lot like bikeshedding to me, maybe I’m wrong

Reply View | 0 replies
wredcoll 3 days ago

How about a list of existing database tables/columns so you don't need to repeat it each time?

Reply View | 8 replies
- girvo 3 days ago
  
  I gave it a tool to execute to get that info if required, but it mostly doesn’t need to due to Kysely migration files and the database type definition being enough.
  
  Reply View | 0 replies
- anonzzzies 3 days ago
  
  Claude code figures that out at startup every time. Never had issues with it.
  
  Reply View | 1 reply
  
  theshrike79 2 days ago
  
  You can save some precious context by having it somewhere without it having to figure it out from scratch every time.
  
  Reply View | 0 replies
- HDThoreaun 3 days ago
  
  Do you not use a model file for your orm?
  
  Reply View | 4 replies
  
  wredcoll 3 days ago
  
  ORMs are generally a bad idea, so.. hopefully not?
  
  Reply View | 3 replies
jwpapi 3 days ago

=== myExperience

Reply View | 0 replies

aiibe 2 days ago

Writing and updating CLAUDE.md or AGENTS.md feels like pointless to me. Humans are the real audience for documentation. The code changes too fast, and LLMs are stateless anyway. What’s been working is just letting the LLM explore the relevant part of the code to acquire the context, defining the problem or feature, and asking for a couple of ways to tackle it. All in a one short prompt. That usually gets me solid options to pick and build it out. And always do, one session for one problem. This is my lazy approach to getting useful help from an LLM.

Reply View 9 replies

aqme28 2 days ago

This is true but sometimes your codebase has unique quirks that you get tired of repeating. "No, Claude, we do it this other way here. Every time."

Reply View | 1 reply
- aiibe 2 days ago
  
  Quirks are pretty much unavoidable. I tend to get better results using Codex. It sticks to established patterns. Slow, but more deliberate. Claude focuses more on speed.
  
  Reply View | 0 replies
samuelknight 2 days ago

I use .md to tell the model about my development workflow. Along the lines of "here's how you lint", "do this to re-generate the API", "this is how you run unit tests", "The sister repositories are cloned here and this is what they are for".
One may argue that these should go in a README.md, but these markdowns are meant to be more streamlined for context, and it's not appropriate to put a one-liner in the imperative tone to fix model behavior in a top-level file like the README.md

Reply View | 1 reply
- aiibe 2 days ago
  
  That kind of repetitive process belongs in a script, rather than baked into markdown prompts. Claude has custom hooks for that.
  
  Reply View | 0 replies
arnorhs 2 days ago

I agree with you, however your approach results in much longer LLM development runs, increased token usage and a whole lot of repetitive iterations.

Reply View | 1 reply
- aiibe 2 days ago
  
  I’m definitely interested in reducing token usage techniques. But with one session one problem I’ve never hit a context limit yet, especially when the problem is small and clearly defined using divide-and-conquer. Also, agentic models are improving at tool use and should require fewer tokens. I’ll take as many iterations as needed to ensure the code is correct.
  
  Reply View | 0 replies
dncornholio 2 days ago

Because it's stateless it's not pointless? Good codebases don't change fast. Stuff gets added but for the most stuff, they shouldn't change.

Reply View | 1 reply
- aiibe 2 days ago
  
  A well-documented codebase lets both developers and agentic models locate relevant code easily. If you treat the model like a teammate, extra docs for LLMs are unnecessary. IMHO. In frontend work, code moves quickly.
  
  Reply View | 0 replies
xpe 2 days ago

> Humans are the real audience for documentation.
Seeing "real" is a warning flag here that either-or thinking is in play.
Putting aside hopes and norms, we live in a world now where multiple kinds of agents (human and non-human) are contributing to codebases. They do not contribute equally; they work according to different mechanisms, with different strengths and weaknesses, with different economic and cultural costs.
Recall a lesson from Ralph Waldo Emerson: "a foolish consistency is the hobgoblin of little minds" [1]. Don't cling to the past; pay attention to the now, and do what works. Another way of seeing it: don't force a false equivalence between things that warrant different treatment.
If you find yourself thinking thoughts that do more harm than good (e.g. muddle rather than clarify), attempt to reframe them to better make sense of reality (which has texture and complexity).
Here's my reframing: "Documentation serves different purposes to different agents across different contexts. So plan and execute accordingly."
[1]: https://en.wikipedia.org/wiki/Wikipedia:Emerson_and_Wilde_on...

Reply View | 0 replies

vaer-k 2 days ago

> we recommend keeping task-specific instructions in separate markdown files with self-descriptive names somewhere in your project.

Why should we do this when anthropic specifically recommends creating multiple CLAUDE.md files in various directories where the information is specific and pertinent? It seems to me that anthropic has designed claude to look for claude.md for guidance, and randomly named markdown files may or may not stand out to it as it searches the directory.

You can place CLAUDE.md files in several locations:

> The root of your repo, or wherever you run claude from (the most common usage). Name it CLAUDE.md and check it into git so that you can share it across sessions and with your team (recommended), or name it CLAUDE.local.md and .gitignore it Any parent of the directory where you run claude. This is most useful for monorepos, where you might run claude from root/foo, and have CLAUDE.md files in both root/CLAUDE.md and root/foo/CLAUDE.md. Both of these will be pulled into context automatically Any child of the directory where you run claude. This is the inverse of the above, and in this case, Claude will pull in CLAUDE.md files on demand when you work with files in child directories Your home folder (~/.claude/CLAUDE.md), which applies it to all your claude sessions

https://www.anthropic.com/engineering/claude-code-best-pract...