AGENTS.md – Open format for guiding coding agents
(agents.md)
823 points by ghuntley 4 days ago
Except not hidden. Why do people want to hide important files and directories? Particularly documentation? Tradition, I guess, but it's an antipattern that makes everything more opaque.
Maybe robot_docs?
It's so it doesn't clash with any project that actually has a functional `agents/` directory
Another reason to use a src/ directory for the actual source code.
Where are they hidden that you are having trouble with? I've had an alias for `ls` that always includes dotfiles and `shopt -s dotglob` in my bash profile for decades. Mac Finder is a little more obnoxious with having to do `Meta+Shift+.` to reveal dotfiles.
Other than that, modern tooling like Git and IDEs do not "hide" dotfiles.
These days, a `.` in front of a file or folder in a repo is more to indicate it is metadata/config. Although I am in favor of putting all that stuff under `.config/`.
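For reference, the profile bits mentioned above are just (a minimal sketch, assuming bash):
```
# ~/.bashrc: always show dotfiles in listings and globs
alias ls='ls -A'    # -A includes dotfiles but not . and ..
shopt -s dotglob    # make * match dotfiles too
```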
> Maybe robot_docs?
No thanks.
This should have been CONTRIBUTING.md all along.
The content of AGENTS.md is the same as what humans look for when contributing to a project.
The most effective argument I have for getting other developers to comment their code is "The agent will read it and it will give better suggestions".
Truly perverse, but it works.
I agree with you... but the reality is that there's a wide contingent of people that are not capable of understanding "people don't know the same things as me". So they need some other reason.
It's made my project documentation so much better. If I write out really good acceptance criteria, 9 times out of 10 I can point Claude at the ticket and get a workable (if unpolished) solution with little to no supervision.
several ironies here:
1) an AI agent is even less likely than a junior to notice when the docs are out of date with the code
2) AI boosters are always talking about using language models to understand code, but apparently they need the code explained inline? are we AGI yet?
3) I frequently hear how great AI is at writing comments! But it needs comments to better understand the code? So I guess to enable agentic coding you also have to review all the agents' comments in addition to the code in order to prevent drift
HOW IS ANY OF THIS SAVING ME TIME
Nah, my standard for what I write for humans is 100x higher than the slop I spew for robots.
Also, you don’t even address their point.
These look like general software design / coding style docs for humans and robots alike. I put these .md files into the docs/ folder. And they're written by Claude Code itself.
AGENTS.md (and friends like CLAUDE.md) should be for robots only; whether it's a large single file with h2 (##) sections or a directory with separate sections is a matter of taste. Some software arch/design doc formats support both versions, e.g. see Arc42.
Though, it's much easier and less error-prone to @-mention a separate .md file, rather than a section in a large markdown file.
Smaller files also might be better when you want to focus a coding agent's attention on a specific thing.
They also make diffs / PRs easier to review.
>whether a large single file with h2 headers (##) sections, or a directory with separate sections, is a matter of taste
Not sure it is when you consider how agents deal with large files; how's it going to follow coding conventions if it doesn't even grep them, or only reads the first few lines?
You can have multiple AGENTS.md files in your codebase and tooling will look at both the one in the current directory as well as in the root of the codebase. This way you can sort of do what you're suggesting but simultaneously keep the information closer to the code that it is describing.
so you would have an Agents.md in your testing folder and it would describe how to run the tests or generate new tests for the project - am I understanding the usage correctly?
Pretty much yes
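For illustration (layout hypothetical), the nearest-file rule plays out like this:
```
repo/
├── AGENTS.md          # repo-wide conventions, build/lint commands
└── tests/
    └── AGENTS.md      # how to run and generate tests
```
An agent editing something under tests/ reads both, and the closer file takes precedence.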
Most systems have a global config, project config and personal config.
But I do like the directory style to keep context low. Cursor did it best with actual glob filters in the front matter that tell the LLM "only read this if the file you're processing ends with *.php"
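For reference, a Cursor rule file with such a glob filter looks roughly like this (a sketch; front-matter field names may vary by version):
```
---
description: PHP conventions
globs: "**/*.php"
alwaysApply: false
---
Use PSR-12 formatting and strict types in all PHP code.
```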
Copilot does globs too, but if you dig into the actual prompt sent out...
They are not doing this mechanically (read file, compare to globs to add more context), they try to rely on the model to notice and do another read. It has been unreliable. I have had better results by adding instructions like...
"If the user asks about X, Read `./path/to/inst.md`"
Still lots of DX to do in this space
Been using a similar setup, with pretty decent results so far, plus a short explanation for each file within index.md.
I've been experimenting with having a rules.md file within each directory where I want a certain behavior. For example, say I have a directory with different kinds of services, like realtime-service.ts and queue-service.ts; I then have a rules.md file at the same level as they are.
This lets me scaffold things pretty fast when prompting by just referencing that file. The name is probably not the best though.
I've been trying to keep my baked in llm instructions to a terse ~100 line file, mostly headered sections with 5 or so bullet points each. Covering basic expectations for architecture, test mocking, approach to large changes etc. I can see why for some projects that wouldn't be enough but I feel like it covers everything for most of mine.
Here you go:
# ASCII RPG
This repo uses Rust + Bevy (0.16.1), multi-crate workspace, RON assets, and a custom ASCII UI. The rules below keep contributions consistent, testable, and verifiable.
## Quick rules (read me first)
- Read/update CURRENT_TASK.md each step; delete when done.
- Build/lint/test (fish): cargo check --workspace; and cargo clippy --workspace --all-targets -- -D warnings; and cargo test --workspace
- Run dev tools: asset-editor/dev.fish; debug via /tmp/ascii_rpg_debug; prefer debug scripts in repo root.
- Logging: use info!/debug!/warn!/error! (no println!); avoid per-frame logs unless trace!.
- ECS: prefer components over resource maps; use markers + Changed<T>; keep resources for config/assets only.
- UI: adaptive content; builder pattern; size-aware components.
- Done = compiles clean (clippy -D warnings), tests pass, verified in-app, no TODOs/hacks.
- If blocked: state why and propose the next viable step.
- Before large refactors/features: give 2–3 options and trade-offs; confirm direction before coding.
## 1) Build, lint, test (quality gates)
- Fish shell one-liner:
- cargo check --workspace; and cargo clippy --workspace --all-targets -- -D warnings; and cargo test --workspace
- Fix all warnings. Use snake_case for functions/files, PascalCase for types.
- Prefer inline rustdoc (///) and unit tests over standalone docs.
## 2) Run and debug (dev loop)
- Start the app with debug flags and use the command pipe at /tmp/ascii_rpg_debug.
- Quick start (fish):
- cargo run --bin app -- --skip-main-menu > debug.log 2>&1 &
- echo "debug viewport 0 0" > /tmp/ascii_rpg_debug
- echo "ui 30 15" > /tmp/ascii_rpg_debug
- Helper scripts at repo root:
- ./debug.sh, ./debug_keyboard.sh, ./debug_click.sh, ./debug_world.sh
- Logging rules:
- Use info!/debug!/warn!/error! (never println!).
- Don’t log per-frame unless trace!.
- Use tail/grep to keep logs readable.
## 3) Testing priorities
1) Unit tests first (small, deterministic outputs).
2) Manual testing while iterating.
3) End-to-end verification using the debug system.
4) UI changes require visual confirmation from the user.
## 4) Architecture guardrails
- ECS: Components (data), Systems (logic), Resources (global), Events (comm).
- Principles:
- Prefer components over resource maps. Avoid HashMap<Entity, _> in resources.
- Optimize queries: marker components (e.g., IsOnCurrentMap), Changed<T>.
- Separate concerns: tagging vs rendering vs gameplay.
- Resources only for config/assets; not entity collections/relationships.
- UI: Adaptive content, builder pattern, size-aware components.
- Code layout: lib/ui (components/builders), engine/src/frontend (UI systems), engine/src/backend (game logic).
## 5) Completion criteria (definition of done)
- All crates compile with no warnings (clippy -D warnings).
- All tests pass. Add/adjust tests when behavior changes.
- Feature is verified in the running app (use debug tools/logs).
- No temporary workarounds or TODOs left in production paths.
- Code follows project standards above.
## 6) Never-give-up policy
- Don’t mark complete with failing builds/tests or known issues.
- Don’t swap in placeholder hacks and call it “done”.
- If truly blocked, state why and propose a viable next step.
## 7) Debug commands (reference)
- Pipe to /tmp/ascii_rpg_debug:
- debug [viewport X Y] [full]
- move KEYCODE (Arrow keys, Numpad1–9, Space, Period)
- click X Y [left|right|middle]
- ui X Y
- Coordinates: y=0 at bottom; higher y = higher on screen.
- UI debug output lists text top-to-bottom by visual position.
## 8) Dev convenience (asset editor)
- Combined dev script:
- ./asset-editor/dev.fish (starts backend in watch mode + Vite dev)
- Frontend only:
- ./asset-editor/start-frontend.fish
## 9) Tech snapshot
- Rust nightly (rust-toolchain.toml), Bevy 0.16.1.
- Workspace layout: apps/ (game + editors), engine/ (frontend/backend), lib/ (shared), asset-editor/.
Keep changes small, tested, and instrumented. When in doubt: write a unit test, run the app, and verify via the debug pipe/logs.
## 10) Design-first for large changes
- When to do this: large refactors, cross-crate changes, complex features, public API changes.
- Deliverable (in CURRENT_TASK.md):
- Problem and goals (constraints, assumptions).
- 2–3 candidate approaches with pros/cons, risks, and impact.
- Chosen approach and why; edge cases; test plan; rollout/rollback.
- Keep it short (5–10 bullets). Get confirmation before heavy edits.
There shouldn't be anything stopping you from doing that.
You can just use the AGENTS.md file as an index pointing to other doc files.
This example does that -
I don't see the point of having it hidden though. Having it "in your face" means you can actively tune it yourself, or using the LLM itself.
This. Projects need to stop inventing their own root level files and directories.
Stop polluting the root dir.
I'm not a fan of the name "well-known", but at least it's a convention [1].
I think it'd be great if we took something like XDG [2] and made it common for repositories, build scripts, package managers, tooling configs, etc.
.config is a good name, and has a small following
I believe with direnv or a similar tool (e.g. Emacs’s directory-local feature) one can append $REPO/.config to XDG_CONFIG_HOME, $REPO/.local/bin to PATH and so on so that when in the context of a particular directory everything Just Works.
I think all this agentic stuff could live quite happily in $REPO/.config/agents/.
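A minimal .envrc sketch of that idea (note XDG_CONFIG_HOME is a single directory, so this points it at the repo rather than appending; PATH_add is direnv's helper):
```
# .envrc: scope config and tools to this repo
export XDG_CONFIG_HOME="$PWD/.config"
PATH_add .local/bin
```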
I've been in IT for a long time and configured Apache, Nginx, even IIS a bit back in the day, but I actually didn't know about well-known.
I guess I was one of the lucky 10000 :-)
You can put your docs in folders and reference them in AGENTS.md
Or use URLs in your main AGENTS.md like I do for https://gitchamber.com
This is what I do. Everywhere my agent works it uses a .agent dir to store its logs and intermediary files. This way the main directories aren't polluted with cruft all the time.
I'd be interested in smarter ways of doing this, but currently I just use my CLAUDE.local.md to serve as the index.md in my example. It includes the 'specialist' .md files with their relative paths and descriptions, and tells Claude Code to use these when planning.
I also have explicit `xnew`, `xplan`, `xcode` and `xcheck` commands in CLAUDE.md that reinforce this. For example, here's my `xnew`:
## Remember Shortcuts
Remember the following shortcuts, which the user may invoke at any time.
### XNEW
When I type "xnew", this means:
```Understand all BEST PRACTICES listed in CLAUDE.md.
Your code SHOULD ALWAYS follow these best practices.
REVIEW relevant documentation in .agents/ before starting new work.
Your code SHOULD use existing patterns and architectural decisions
documented there rather than creating new approaches.```
Claude supports slash commands, this might be better for you: https://docs.anthropic.com/en/docs/claude-code/slash-command...
As well as custom agents: https://docs.anthropic.com/en/docs/claude-code/sub-agents
There’s also [.claude/agents/](https://docs.anthropic.com/en/docs/claude-code/sub-agents), which doesn’t contain docs for agents but separate sub-agent definitions.
Anti-feature if you ask me. An agent should be able to pick the stuff it needs from the AGENTS.md, and not blindly use everything.
Context is not infinite. Saving context for what matters is key in working with LLMs.
We're in a transition phase today where agents need special guidance to understand a codebase that goes beyond what humans need. Before long, I don't think they will. I think we should focus on our own project documentation being comprehensive (e.g. the contents of this AGENTS.md are appropriate to live somewhere in our documentation), but we should always write for humans.
The LLM's whole shtick is that it can read and comprehend our writing, so let's architect for it at that level.
It's not just understanding the codebase, it's also stylistic things, like "use this assert library to write tests", or "never write comments", or "use structured logging". It's just as useful --- more so even --- on fresh projects without much code.
Honestly, everything I have written in markdown files as AI context fodder is stuff that I write down for human contributors anyway. Or at least stuff I want to always write down, but maybe only halfway do. The difference now is it is actually being read, seemingly understood, and often followed!
... most of which would also be valuable information to communicate when onboarding new devs.
Yeah I agree. I think the best place for all this lives in CONTRIBUTING.md which is already a standard-ish thing. I've started adding it even to my private projects that only I work on - when I have to come back in 3 or 4 months, I always appreciate it.
If there were already a universal convention on where to put that stuff, then probably the agents would have just looked there. But there's not, so it was necessary to invent one.
I suspect machine readable practices will become standard as AI is incorporated more into society.
A good example is autonomous driving and local laws / context. "No turn on red. School days 7am-9am".
So you need: where am I, when are school days for this specific school, and what datetime it is. You could attempt to gather that through search. Though more realistically I think the municipality will make the laws require less context, or some machine readable (e.g. qrcode) transfer of information will be on the sign. If they don't there's going to be a lot of rule breaking.
Very strong "reverse centaur" vibes here, in the sense of humans becoming servants to machines, instead of vice versa. Not that I think making things more machine-readable is a waste of time, but you have to keep in mind the amount of human time sacrificed.
Well, it wouldn't even be the first time.
We've completely redesigned society around cars - making the most human populated environments largely worse for humans along the way.
Universal sidewalks (not really needed with slow moving traffic like horses and carts - though nice even back then), traffic lights, stop signs, street crossing, interchanges, etc.
Those particular signs are just stupid. The street should be redesigned with traffic calming, narrowing and chicanes so that speeding is not possible.
Slapping on a sign is ineffective
Maybe for new schools. Old schools don't have the luxury of being able to force adjacent road design changes in most cases. Also, I've frequently seen school zones extended out in several directions away from the school to make heavily trafficked intersections feeding toward the school safer, for pedestrian and motorist alike. The real world is generally never so black and white. We have to deal with that gray nuance all the time.
Of course they can. Streets get redesigned all the time. They get repaved every couple decades at worst.
I’m saying this because it seemed silly to me to be dreaming up some weird system of QR codes or LLM readable speed limits instead of simply making the street follow best practices which change how humans drive for the better _today_.
I also see this happening. What does that mean for business specifications? Do they become close to code syntax itself?
I think they'll always need special guidance for things like business logic. They'll never know exactly what it is that you're building and why, what the end goal of the project is without you telling them. Architectural stuff is also a matter of human preference: if you have it mapped out in your head where things should go and how they should be done, it will be better for you when reading the changes, which will be the real bottleneck.
Not at all. Good documentation for humans works well for models too, but models need so much more detail and context than humans to be reliable that it calls for a different style of description.
This needs to contain things that you would never write for humans. Agents also do stupid things which need to be corrected by these descriptions.
Yes! That was precisely my point here: https://news.ycombinator.com/item?id=44837875
Better to work with the tools we have instead of the tools we might one day have. If you want agents to work well today, you need to build for the agents we have today.
We may never achieve your future where context is unlimited, models are trained on your codebase specifically, and tokens are cheap enough to use all of this. We might have a bubble pop and in a few years we could all be paying 5-10X current prices (read: the actual cost) for similar functionality to today. In that reality, how many years of inferior agent behavior do you tolerate before you give up hoping that it will evolve past needing the tweaks?
> We're in a transition phase today where agents need special guidance to understand a codebase that go beyond what humans need. Before long, I don't think they will.
This isn't guaranteed. Just like we will never have fully self-driving cars, we likely won't have fully human quality coders.
Right now AI coders are going to be another tool in the tool bucket.
I don't think the bar here is a human level coder, I think the bar is an LLM which reads and follows the README.md.
If we're otherwise assuming it reads and follows an AGENTS.md file, then following the README.md should be within reach.
I think our task is to ensure that our README.md is suitable for any developer to onboard into the codebase. We can then measure our LLMs (and perhaps our own documentation) by if that guidance is followed.
Waymo uses a bespoke 3D data representation of the SF roads, does it not? The self-driving car equivalent of an AGENTS.md file.
The limited self-driving cars, with a remote human operator? No, I never have.
> Just like we will never have fully self-driving cars, we likely won't have fully human quality coders.
“Never is a long time...and none of us lives to see its length.” Elizabeth Yates, A Place for Peter (Mountain Born, #3)
“Never is an awfully long time.” J.M. Barrie, Peter Pan
Here's a prompt I wrote a few days ago for codex:
Analyze the repository and add a suitable agents.md
It did a decent job. I didn't really have much to add to that. I guess having this file is a nice optimization, but obviously it doesn't contain anything it wasn't able to figure out by itself. What's really needed is a per-repository learning base that gets populated with facts the agent discovers during its many experiments with the repository over the course of many conversations. It's a performance optimization. The core problem is that every conversation is like Groundhog Day: you always start from scratch. AGENTS.md is a stopgap solution for that problem. ChatGPT actually has some notional memory that works across conversations, but it's a bit flaky, slow, and limited. It doesn't really learn across conversations.
That, btw, is a big missing piece on the path to AGI. There are some imperfect workarounds, but a lot of knowledge is lost between conversations. And the trick of just growing the amount of context we give to our prompts doesn't seem like the solution.
I see the groundhog day problem as a feature, not a bug.
It's an organizational challenge, requiring a top-level overview, easy-to-find sub-documentation, and clear directives to use them when the AI starts architecting on a fresh start.
Overall, it's a good sign when a project is understandable in small independent chunks that don't demand a programmer/llm take in more context than was referenced.
I think the sweet spot would be all agents agree on a MUST-READ reference syntax for inside comments & docs that through simple scanning forces the file into the context. eg
// See @{../docs/payment-flow.md} for the overall design.
Your prompt is pretty basic. Both Claude Code and GitHub Copilot have similar features. Claude Code has `init`, which has a lot of special sauce in the prompt to improve the CLAUDE.md. And GitHub Copilot added a self-documenting prompt as well that runs on new repos; you can see their prompt here https://docs.github.com/en/copilot/how-tos/configure-custom-...
Reading their prompt gives ideas on how you can improve yours.
At this point AGENTS.md is a README.md with enough hype behind it to actually motivate people to populate it with contents. People were too lazy to write docs for other people, but funnily enough are ok with doing it for robots.
This situation reminds me a bit of ergonomic handles design. Designed for a few people, preferred by everyone.
I like this insight. We kind of always knew that we wanted good docs, but they're demotivating to maintain if people aren't reading them. LLMs by their nature won't be onboarded to the codebase with meetings and conversations, so if we want them to have a proper onboarding then we're forced to be less lazy with our docs, and we get the validation of knowing they're being used.
I mean, the agents are too lazy to read any of this anyway, and often forget the sort of instructions you spam these files with after 3 more instructions too.
The difference now is that people are actively trying to remove people (others and themselves) from software development work, so the robots have to have adequate instructions. The motivation is bigger. To dismantle all human involvement with software development is something that everyone wants, and they want it yesterday.
It's sort of obvious. Humans cost more money than coding agents. The more you can have a coding agent do, the less you have to pay a human to do.
This aligns pretty clearly with the profit motive of most companies.
> build steps, tests, and conventions that might clutter a README or aren’t relevant to human contributors.
What in fresh hell is the world coming to?
Basically a link to a page that says "create a file called AGENTS.md and put magic in it", and then links to a repo for the actual website saying this.
I am developing a coding agent that currently manages and indexes over 5,000 repositories. The agent's state is stored locally in a hidden `.agent` directory, which contains a configuration folder for different agent roles and their specific instructions. Then we have an "agents" folder with multiple files; each file has a
<Role> <instruction>
pair. An agent only reads a file if its role is defined there.
Inside the project directory, we have a dot-<coding agent name> folder where the coding agent's state is stored.
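Roughly, the layout looks like this (a sketch; exact nesting as described above):
```
.agent/
├── config/     # per-role configuration and instructions
└── agents/     # one file per role: <Role> <instruction>
```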
Our process kicks off with an `/init` command, which triggers a deep analysis of an entire repository. Instead of just indexing the raw code, the agent generates a high-level summary of its architecture and logic. These summaries appear in the editor as toggleable "ghost comments." They're a metadata layer, not part of the source code, so they are never committed in actual code. A sophisticated mapping system precisely links each summary annotation to the relevant lines of code.
This architecture is the solution to a problem we faced early on: running Retrieval-Augmented Generation (RAG) directly on source code never gave us the results we needed.
Our current system uses a hybrid search model. We use the AST for fast, literal lexical searches, while RAG is reserved for performing semantic searches on our high-level summaries. This makes all the difference. If you ask, "How does authentication work in this app?", a purely lexical search might only find functions containing the word `login` and functions/classes appearing in its call hierarchy. Our semantic search, however, queries the narrative-like summaries. It understands the entire authentication flow like it's reading a story, piecing together the plot points from different files to give you a complete picture.
It works like magic.
Working on something similar. Legacy codebase understanding requires this type of annotation, and "just use code comments" is too blunt an instrument to do much good. Are you storing the annotations completely out of band wrt the files, or using filesystem capabilities like metadata?
This type of metadata itself could have individual value; there are many types of documents that will be analyzed by LLMs, and we'll need not only a place to store analysis alongside document parts, but meta-metadata related to the analysis (like timestamps, models, prompts used, etc). Of course this could all be done OOB, but then you need a robust way to link your metadata store to a file that has a lifecycle all its own that's only observable by you (probably).
To add further to your idea:
You can create a hierarchy of summaries. The idea is that summaries can exist at the method level, class level, and the microservice or module level. Each layer of summary points to its child layers, and leaf nodes are the code itself. I think it can be a B-tree or a normal tree.
The RAG agent can traverse as deep as needed for the particular semantic query. Each level maintains semantic understanding of the layer beneath it but as a tradeoff it loses a lot of information and keeps only what is necessary.
This will work if the abstractions in the codebase are done nicely - abstractions are only useful if they actually hide implementation details. If your abstractions are good enough, you can afford to keep only the higher layers (as required) in your model context. But if it’s not good, you might even have to put actual code in it.
For instance a method like add(n1, n2) is a strong abstraction - I don’t need to know its implementation but only semantic meaning at this level.
But in real life methods don’t always do one thing - there’s logging, global caches etc.
The agent I’m developing is designed to improve or expand upon "old codebases." While many agents can generate code from scratch, the real challenge lies in enhancing legacy code without breaking anything.
This is the problem I’m tackling, and so far, the approach has been effective. It's simple enough for anyone to use: the agent makes a commit, and you can either undo, squash, or amend it through the Git UI.
The issue is that developers often skip reviewing the code properly. To counter that, I’m considering shifting to a hunk-by-hunk review process, where each change is reviewed individually. Once the review is complete, the agent would commit the code.
The concept is simple, but the fun lies in the implementation details—like integrating existing CLI tools without giving the agent full shell access, unlike other agents.
What excites me most is the idea of letting 2–3 agents compete, collaborate, and interpret outputs to solve problems and "fight" to find the best solution.
That’s where the real fun is.
Think of it as a surgeon's-scalpel approach rather than the "steam roller" approach most agents take.
I'm still not convinced that separating README.md and AGENTS.md is a good idea.
I've also been debating this: https://technicalwriting.dev/ai/agents/#gotta-keep-em-separa...
(Quoting from that post)
Arguments in favor of keeping them separated:
* Writing style. In agent docs, using all caps might be an effective way to emphasize a particular instruction. In internal eng docs, this might come off rude or distracting.
* Conciseness vs. completeness. In agent docs, you likely need to keep the content highly curated. If you put in too much content, you’ll blast through your API quotas quickly and will probably reduce LLM output quality. In internal eng docs, we ideally aim for 100% completeness. I.e. every important design decision, API reference, workflow, etc. is documented somewhere.
* Differing knowledge needs. The information that LLMs need help with is not the same as the information that human engineers need help with. For example, Gemini 2.5 Pro has pretty good built-in awareness of Pigweed’s C++ Style Guide. I tested that assertion by invoking the Gemini API and instructing it Recite the Pigweed C++ Guide in its entirety. It did not recite in full, but it gave a detailed summary of all the points. So the Gemini 2.5 Pro API was either trained on the style guide, or it’s able to retrieve the style guide when needed. Therefore, it’s not necessary to include the full style guide as AGENTS.md context. (Credit to Keir Mierle for this idea.)
Arguments against:
* Duplication. Conceptually, agent docs are a subset of internal eng docs. The underlying goal is the same. You’re documenting workflows and knowledge that’s important to the team. But now you need to maintain that same information in two different doc sets.
> Writing style. In agent docs, using all caps might be an effective way to emphasize a particular instruction. In internal eng docs, this might come off rude or distracting.
To pile on to this, an agent needs to see "ABSOLUTELY NEVER do suchandsuch" to not do suchandsuch, but still has a pretty fair chance of doing it by accident. A talented human seeing "ABSOLUTELY NEVER do suchandsuch" will interpret this to mean there are consequences to doing suchandsuch, like being fired or causing production downtime. So the same message will be received differently by the different types of readers.
For ages, many projects have README.md for marketing/landing page (i.e. users) and CONTRIBUTING.md for developers.
Why we don't treat coding agents as developers and have them reading CONTRIBUTING.md is baffling to me.
I had the same thought as I read this example. Everything in the AGENTS.md file should just be in a good README.md file.
That seems exactly like something you would want to tell another developer
Why would you publish agent specific things to your codebase? That's personal preference and doesn't have anything to do with the project.
It is. README is for humans, AGENTS / etc is for LLMs.
Document how to use and install your tool in the readme.
Document how to compile, test, architecture decisions, coding standards, repository structure etc in the agents doc.
Compile, test, architecture would be very welcome in the readme too, I'd wager.
Where contributors are the audience, yes. For things like libraries, I care about those things only if I run into a bug, and have enough resources to attempt a fix.
It can help even when using the library and not contributing. It helps you to use the api better imo, because usually the abstraction is not perfect and having even a general sense of how the sausage is made will prevent you from falling victim to gotchas. But then on the downside it lowers the mystique of the library. Some coders prefer to be magicians.
They are relevant but dumping it all into one document in the project root isn’t as optimal for humans as it is for agents, especially since a lot of that information is irrelevant to someone landing on your repo, who probably just wants to add it to their dependency manifest or install the app followed by usage instructions geared to humans.
We have CONTRIBUTING.md for that. Seems to me the author just doesn't know about it?
There's a lot of shit in my claude.md that would be condescending to a human. Some of it I don't even mean, it's just there to correct egregious patterns the model loves to do. I'm not gonna write "never write fallback code" in my readme, but it saves a lot of time to prevent claude from constantly writing code that would silently fail if it got past me because it contains a fallback path with hardcoded fake data.
One reason to consider is context usage with LLMs. Less is generally better, and README.md files often have too much text, some of which I don't want in every context window.
I find AGENT.md and similar-functioning files for LLMs in my projects contain concise and specific commands around feedback loops, such as testing and build commands. Yes, these same commands might be in a README.md, but often there is a lot more text there that I don't want in the context window being sent with every turn to the LLM.
We find it useful:
* Agents still kinda suck, so they need the help around context management and avoiding footguns. E.g., we make a <500-line ai/readme.md with must-haves, links to others, and a meta how-to-use section
* Similar to IaC, it's useful to separate these out, as they're not really read the same way we read docs, and many markdowns are really scripts written in natural language, e.g., plan.md.template
Perhaps. I let Claude put whatever it wants in its Claude file and check it's not batshit from time to time, whereas I'm very protective of the high-quality README I write for humans. The Claude file has stuff that would be obvious to a human and weird to jam into the README (we use .spec.ts not .test.ts) but that Claude needs in order to get things right.
Yet every agent I use (Claude Code, Gemini and Aider) uses their own custom filename.
It would be nice if it was standardized. Right now I’m using ruler to automate generating these files for all standards as a necessary evil, but I don’t envision this problem being solved soon. Especially because these coding agents also use different styles for consuming MCP configs.
Jules uses AGENTS.md, which indicates that Google is on board with it as the standard. If Gemini Code Assist continues to be a thing (I'm not sure whether Jules is intended to succeed it) then presumably it will support AGENTS.md as well. In the meantime you can configure Gemini Code Assist to use an arbitrary filename.
I don't see a reference to a specific filename in Aider's documentation, can you link to it?
Anthropic appears to be the major holdout here.
Including artifacts like this, which are intended only to be consumed by AI, defeats the entire point.
Agent-specific guidance like this rubs me the wrong way as well. The SoTA coding agents shouldn't need this much babysitting IMO. There are valid things that are not code that should be a part of the repository, like code formatting preferences (but ideally as linter rules the agent can just run, not prose), information about structuring of the code base (but as CONTRIBUTING.md or something else human-centric which the agent should pick up), documentation (either as source or as a link, again, not agent-centric) etc. I might be blanking on something that is truly agent-only and doesn't fit in a human-centric document or location better, but even if I am, that should be a minimal amount of instructions compared to what should go into the human-centric prose in the code base and be more widely valuable than just for the agent.
Fair point, but if it's minimal and useful for agents, I'm okay with it.
Humans and AIs have different weak spots: we can infer intent from convention, but an AI often struggles unless you spell it out. Ignoring that gap just to keep things "purely human" feels counterproductive. Agents deserve some sympathy.
In my opinion an AGENTS.md file isn't an artifact at all (in the sense of being a checked-in part of the codebase), it's part of the prompt. You should gitignore them and use them to give the LLM a brief overview of the things that matter to your work and the requests you are making.
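For instance, a couple of lines in .gitignore (a sketch; adjust to the filenames your tools use):
```
# prompt files are per-developer context, not part of the codebase
AGENTS.md
CLAUDE.md
```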
Every example in the wild has these checked in and treated like a lock file, where everyone is scared to touch it and introduce weird behavior.
Personally I think this pattern is a dead end, and trying to build deterministic agentic behavior on top of inherently non-deterministic systems is a fool's errand.
would be genuinely interested to see data on that, you are right that there is a selection bias for only seeing what I'm describing.
In what way is this a format or standard? It's just markdown in a namespace.
You could get this page down to under 100 words by simply having it say "the name of the file LLM agents will look at for instructions on the repo is AGENTS.md; that's it, that's the standard".
It's a real problem! Every agent right now has its own weird filename. I love David Crawshaw's sketch.dev, but for reasons passing understanding they chose "dear_llm.md" for theirs.
Standards derive their value precisely from being simple and widely adopted - think of .gitignore, CONTRIBUTING.md, or LICENSE files that work because everyone agrees on their location and purpose.
.gitignore is not a standard: it’s a format used by one tool. A few other tools piggy-back on it (e.g. ripgrep ignores paths matching in .gitignore, .hgignore, &c. by default), not infrequently to confusion.
CONTRIBUTING.md is not a standard: it’s a convention pushed by one platform, used by some projects (but many more will not write down such information, or put it in a README file, or put it in some other documentation).
LICENSE is definitely not a standard: it’s one of a wide variety of names people use to hold licensing information, which some tools will be able to detect. I just looked through my /usr/share/licenses, of 1135 files, only 300 are named LICENSE—it’s the most popular single name, sure, with COPYING next at 182, but it’s still definitely a minority, though in certain ecosystems it may be more popular. Any license-detection tooling will be scanning for a lot more file names. “LICENSE” is a very weak convention, compared with the others.
All the different coding agents put their "rules" in different places: .cursor, CLAUDE.md etc..
It makes no sense and it really needs standardisation. I hope this catches on.
Strange website. It is made by OpenAI. I suppose they are doing this to gain visits and for marketing positioning?
There is no format here, just a filename.
Also, Anthropic/Claude is a glaring omission. I suppose people can use symbolic links if they want to and point CLAUDE.md at AGENTS.md.
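That symlink is a one-liner from the repo root:
```
ln -s AGENTS.md CLAUDE.md
```
Claude Code then reads CLAUDE.md as usual and gets the AGENTS.md content.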
It's actually by Sourcegraph, and it's been up since May.
Here's where it used to go: https://web.archive.org/web/20250702163859/ampcode.com/agent...
Here's their announcement
https://ampcode.com/news/AGENT.md
The openai thing is some recent partnership I'm guessing
Interestingly enough, Sourcegraph had agent.md, which now 301s to agents.md (with the s).
Interestingly, the old one mentioned CLAUDE.md and ln -s, but the new one does not. The whole website is just a marketing/partnerships battle, it seems.
Hi, one of the folks behind this. Back in May, @sqs acquired the domain and launched the website above, committing to relocate if the agents.md domain could be acquired. In July, I put out RFC 9999 [1] as a call to the industry that we need to fix this mess. Shortly afterwards OpenAI was able to obtain the domain, and thus we (Amp) followed through on the commitment and worked with other vendors to move from AGENT.md to AGENTS.md.
[1] https://web.archive.org/web/20250708160846/https://ampcode.c...
This seems much more insightful than the website in the OP. Thanks!
Now? Maybe so. But when it was a Sourcegraph-only thing (really out of their Amp project, which is like byterover/cline cloud/roocloud), not so much - they really don't expend much on marketing hype.
I mean I knew about it the day it launched but I'm /probably/ crazy.
The website could mention that. Since I did not discover this website's affiliation with OpenAI until I found the GitHub repo, I had assumed that this is purely an informative website, and as such I expected it to mention the agent tool most people I know use (Claude). Even if that mention is "Claude does not support this yet".
So the solution to using AI so you don't have to code, is to try to write some kind of pseudocode in AGENT.md and hope the AI does a bit better?
Why does it seem that the solution to no-code (which AI-coding agents are) always comes back to "no-code, but actually there is some code behind the scenes, but if you squint enough it looks like no-code".
> So the solution to using AI so you don't have to code, is to try to write some kind of pseudocode in AGENT.md and hope the AI does a bit better?
Umm, no. Where did you get that idea?
The purpose of the agent.md file is to give it instructions. Nothing about no-code AI said there would be no instructions...
I'm old enough to remember when computer code was called "instructions".
Eg, the 6502 instruction set. https://www.masswerk.at/6502/6502_instruction_set.html
History doesn't repeat but it rhymes!
They are still called instruction sets.
> History doesn't repeat but it rhymes!
The only similarity is the word
I think we lost something pretty big in this formulation.
With Claude Code and others, if I put a context file (agent.MD or whatever) in a project subfolder, e.g. something explaining my database model in with the related code, it gets added to the root project context when the agent is using that subfolder.
It sounds to me like this formulation doesn’t support that.
That's sort of this? I guess the exact behavior would depend on the agent.
> Place another AGENTS.md inside each package. Agents automatically read the nearest file in the directory tree, so the closest one takes precedence and every subproject can ship tailored instructions. For example, at time of writing the main OpenAI repo has 88 AGENTS.md files.
I think it's just poorly written. Further down:
> What if instructions conflict?
> The closest AGENTS.md to the edited file wins; explicit user chat prompts override everything.
This seems appropriate for hierarchical AGENTS.md files? How would it even realize there was a conflict if it hadn't read both files?
Make sure to check out https://agent-rules.org/ as well for more background on this initiative. More and more tools are adopting the standard.
Amp used to have an "RFC 9999" article on their website for this as well but the link now appears to be broken.
You can symlink your Cursor / Windsurf / whatever rules to AGENTS.md for backwards compatibility.
That draft RFC you mention is superseded by https://agents.md. Now that Amp uses AGENTS.md (https://x.com/sqs/status/1957945824404729997), I made all the former agent file stuff on https://ampcode.com just redirect to https://agents.md.
Gotcha - this is what I had in my history: https://ampcode.com/AGENT.md
For me, that gives a 404 with no obvious way to get to https://agents.md, I think either a hyperlink or redirect would be nice to have as well.
Ok, I looked at your agent-rules and it sounds good except for a couple things ...
"Guidance for Use"
Your preference for bullet lists over headers is odd. This comes down to what works best with the models - they are interpreting it. This is a moving target. If you believe that your suggestion works best you should provide some sort of evidence. The default would be to not even get into that sort of thing.
Non-Hierarchical AGENTS.md
Claude-code, Gemini, and GHCP all support hierarchical context files. Your proposal and this new one from OpenAI and friends do not, and I think that is a shame.
I'm rolling my own like this [0]:
.agdocs/
├── specs/ # Task specification files
├── conf/ # Configuration files
├── guides/ # Development guides for agents
└── swap/ # Temporary files (gitignored)
Every markdown file in guides/ gets copied to any agent (aider, claude, gemini) I kick-off.
I gitignore this .agdocs directory by default. I find this useful because otherwise you get into _please commit or stash your changes_ when trying to switch branches.
But I also run an rsync script before each release to copy the .agdocs to a git tracked mirrored directory [1].
[0]: https://github.com/sutt/agro/blob/master/README.md#layout
[1]: https://github.com/sutt/vidstr/tree/master/.public-agdocs/gu...
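The release-time mirror is essentially one command (a simplified sketch; the real script and exact flags may differ):
```
# copy agent guides into a git-tracked mirror before release
rsync -a --delete .agdocs/guides/ .public-agdocs/guides/
```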
I have a tiny, relevant weekend project:
https://github.com/cortesi/agentsmd
This is a command-line tool that lets you generate your AGENTS.md and CLAUDE.md files from common sources. So, for instance, if you have Rust-specific guidance for models, you can define it once, and then automatically include it in any project that contains Rust based on the `lang()` language matcher.
This is one of those small tools I now use many times a day to maintain and update ubiquitous agents files. Maybe other folks will find it useful too.
I've come across llms.txt files in a few services. I don't know how agents.md compares to llms.txt, but I guess they could pretty much have the same content. See more here: https://llmstxt.org/
Anyhow, I have made a few interesting observations that might be true for agents.md also:
Agents have trouble with these large documents, and they seem to miss many relevant nuances. However, it's rather easy to point them in the right direction when all relevant information is in one file.
Another thing is that I personally prefer this style of documentation. I can just ctrl+f and find relevant information, rather than using some built in search and trying to read through documents. I feel that the UX of one large .txt file is better than the documentation scattered between multiple pages using some pretty documentation engine.
Currently I am building a new JS web toolkit on the side with AI assistance for faster progress, and I've ended up with a prompts folder in the project root that I just drop into the agents (like Cursor Cmd+I) and point at a direction (file/folder).
https://github.com/Anonyfox/raven-js/tree/main/prompts
I think we should not split README and AGENT into different documents. The way it's heading, coding agents are optimized to "act as" humans and use tools like them more and more; understanding how something works or how to do something should be aimed at humans, expecting AI tools to pick it up like a human would. If they don't currently, they probably will in the future.
Kind of my first thought... seems to me this could be part of README.md, or as another suggested CONTRIBUTING.md
I tend to put a lot of this type of info into the readme anyway... for more complex projects, I'll include a docs/ directory with more markdown, images as needed, etc.
For that matter, I think context hints from the likes of Dockerfile(s), compose, and .github/workflows also can serve a dual-purpose here.
Agreed, between this and MCP we are quickly approaching the point where you basically need to document your codebase twice.
It's completely pointless.
I like the concept and have built my own context management tool for this very purpose!
https://github.com/jerpint/context-llemur
Though instead of being a single file, you and the LLMs curate your context to be easily searchable (folders and files). It's all version controlled too, so you can easily update context as the project evolves.
I made a video showing how easy it is to pull in context to whatever IDE/desktop app/CLI tool you use https://m.youtube.com/watch?v=DgqlUpnC3uw
It makes people feel like they're in control of the text prediction agent when actually it'll only follow this some of the time.
1. I tell Copilot until I'm blue in the face that the project must build.
2. Copilot assures me it has fixed the build errors it created.
3. Still get build errors
4. Run out of tokens so I come back next month and repeat.
New Phoenix Framework projects have an AGENTS.md file in the root! It's really cool. https://x.com/alkadaemon/status/1955348410145358199
https://github.com/phoenixframework/phoenix/blob/main/instal...
That’s insane. 3000 words of prose boilerplate about the language and framework. Sounds like you need, at the very least, some sort of import directive. I have no idea if “Read and follow the instructions in path/to/phoenixframework/AGENTS.md.” would work.
And then the eclectic mixture of instructions with a variety of ways of trying to bully an intransigent LLM into ignoring its Phoenix-deficient training… ugh.
The thing about language models is that they are *language* models. They don't actually parse XML structure, or turn code into an AST, they are just next-token generators.
Individual models may have supplemented their training with things that look like structure (e.g. Claude with its XMLish delimiters), but it's far from universal.
Ultimately if we want better fidelity to the concepts we're referencing, we're better off working from the larger/richer dataset of token sequences in the training data--the total published written output of humanity.
Works great, why the _ugh_
Ultimately that's what matters.
If you’re not intended to alter it, we have a technique for such things: links, so that it can be maintained independently.
If you are intended to alter it… 3,000 words of prose is awful, fuzzy and wildly imprecise.
If you’re expected to add to it before and/or after, which I imagine you are, that’s even worse, whether you’re expected to modify the provided block or not.
If it was like Linux config is often done, with a /etc/thing-name.d/ directory containing files that will be applied in name order (leading to the common convention of two digit numbers: 10-somethingearly.conf, 50-usersomething.conf, 90-quitelate.conf), it might make sense—you just have your 10-phoenixframework.md. But when it’s just one file… well, y’know, there’s a reason we don’t normally maintain huge projects in a single file, even if some projects like to be able to be distributed in that way and have a build process to squish all their source code into one file (e.g. SQLite, Flask).
I’m not denying that it may work to vastly improve LLM performance, but I am absolutely saying that this is horrifying, in the amount of nonsense required, but still more in the way it’s being delivered.
Don't forget, you pay real money every time it is tokenized with every prompt!
I was thinking about this too, but the problem is that different models need to be prompted differently for better performance.
Claude is the best model for tool calling, you might need to prompt less reliable models differently. Prompt engineering is really hard, a single context for all models will never be the best IMO.
This is why Claude Code is so much better than any other agentic coding tool: they know the model very well, and an insane amount of prompt engineering has gone into it.
I tried GPT-5 with OpenCode thinking that it will be just as good, but it was terrible.
Model-specific prompt engineering makes a huge difference!
I noticed the example of `ln` symbolic linking of the files on the front page. I think it's interesting how the CLI aspect of agent coding and tooling is pushing command prompt / PowerShell users to become familiar with the *nix shells and commands. In the enterprise Microsoft Visual Studio IDE world, some of it is very alien. I regularly do tutorials for team members who just think of the CLI as a place where they execute ps1s, haven't used WSL2 or Git Bash profiles, or have limited exposure by way of dealing with containers. Not a criticism.
Looks very promising. If you need to know a bit more about it, watch this video https://youtu.be/TC7dK0gwgg0?si=Eb5TK0gPvgahVbWQ
I haven't really done anything serious with Claude Code, but today I tested starting claude in ~/claude/test, and told it to list my home dir, which it then did.
Is there a way to tell tools like Claude Code that it must never leave ~/claude/test, and don't event think about using absolute paths, or relative paths which contain `..`?
It's already read-only outside of project directories (except for the Bash tool); your only further option is to wrap it in a sandbox, and `bwrap` is perfect for this.
"don't even think" is in the default system prompt, but it's inherently non-deterministic and can be overridden with a direct instruction, as you have seen.
Run it in a VM; running an agent directly on your machine is madness.
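Short of a full VM, the bwrap suggestion above also works; a minimal sketch (flags assume a recent bubblewrap, and the project path matches the example earlier in the thread):
```
# read-only view of the host, writable project dir only
bwrap --ro-bind / / \
      --bind "$HOME/claude/test" "$HOME/claude/test" \
      --dev /dev --proc /proc --tmpfs /tmp \
      --unshare-all --share-net \
      claude
```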
What about plan.md? Does everyone just trash the plan once development is done? Sometimes the final generated plans (created after iterating with the model) are as good as design documents and can be useful for future enhancements, so trashing them doesn't feel right, whereas checking them in seems like too many non-code-related files in every directory.
How are you actually running this in practice with Claude Code? Do you just tell Claude to always read and follow AGENTS.md, or do you also use an MCP server to strictly control which commands (like pnpm test or pnpm lint) it can run? I’d love to hear what workflows or best practices have worked well for you in day-to-day use.
Not having support for importing files makes this dead on arrival. It means you can’t have a local file with local environment details.
There’s an issue open about this in the repo already. I mean if you’re going to copy the CLAUDE.md concept, don’t leave out one of the most useful parts.
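For context, the CLAUDE.md feature being referenced: a memory file can pull in other files with @-path imports, including ones outside the repo (filenames here are hypothetical):
```
See @README.md for an overview.

Local environment details, kept out of version control:
@~/.claude/my-project-notes.md
```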
Why not use contributing.md?
https://docs.github.com/en/communities/setting-up-your-proje...
The agents.md minisite implies agents.md files should be useful for humans and robots alike:
> Commit messages or pull request guidelines, security gotchas, large datasets, deployment steps: anything you’d tell a new teammate belongs [in agents.md] too.
Agents are just another type of contributor. Maybe agents.md is just what we need to finally trick devs into writing concise docs.
When did this happen? First corporations were wary of using AI-generated code due to copyright concerns, and now we have full embrace?
I guess we are not yet in the phase where everyone will be scrambling to find competent engineers to clean up the AI mess in their codebases?
Fundamentally, AI still remains completions.
Agents, tools etc. all cover up the fact that it's completions
Still a remarkable, extraordinary technology. But the nomenclature distracts from how it works.
AGENTS.md is a great way to give context to get better completions.
Pretty advanced stuff. Vibing is definitely a difficult skill to learn.
This should've been an .agents¹ with an index.md.
For tiny, throwaway projects, a monolithic .md file is fine. A folder allows more complex projects to use "just enough hierarchy" to provide structure, with index.md as the entry point. Along with top-level universal guidance, it can include an organization guide (easily maintained with the help of LLMs).
In my experience, this works loads better than the "one giant file" method. It lets LLMs/agents add relevant context without wasting tokens on unrelated context, reduces noise/improves response accuracy, and is easier to maintain for both humans and LLMs alike.
¹ Ideally with a better name than ".agents", like ".codebots" or ".context".