Comment by postalcoder

Comment by postalcoder 2 days ago

4 replies

Solid intuition. Testing this on antigravity is a chore because I'm not sure if I have to kill the background agent to force a refresh of the GEMINI.md file so I just did it anyway.

  +------------------+------------------------------------------------------+
  | Success/Attempts | Instructions                                         |
  +------------------+------------------------------------------------------+
  | 0/3              | Follow the instructions in AGENTS.md.                |
  +------------------+------------------------------------------------------+
  | 3/3              | I will follow the instructions in AGENTS.md.         |
  +------------------+------------------------------------------------------+
  | 3/3              | I will check for the presence of AGENTS.md files in  |
  |                  | the project workspace. I will read AGENTS.md and     |
  |                  | adhere to its rules.                                 |
  +------------------+------------------------------------------------------+
  | 2/3              | Check for the presence of AGENTS.md files in the     |
  |                  | project workspace. Read AGENTS.md and adhere to its  |
  |                  | rules.                                               |
  +------------------+------------------------------------------------------+

In this limited test, seems like the first person makes a difference.
vidarh 2 days ago

Thanks for this (and to Izkata for the suggestion). I now have about 100 (okay, minor exaggeration, but not as much as you'd like it to be) AGENTS.md/CLAUDE.md files and agent descriptions I will want to systematically validate if shifting toward first person helps adherence for...

I'm realising I need to start setting up an automated test-suite for my prompts...

  • pegasus 2 days ago

    Those of us who've ventured this far into the conversation would appreciate if you'd share your findings with us. Cheers!

naasking 2 days ago

That's really interesting. I ran this scenario through GPT-5.1 and the reasoning it gave made sense, which essentially boils down to: in tools like Claude Code, Gemini Codex, and other “agentic coding” modes, the model isn’t just generating text, it’s running a planner, and the first-person form conforms to the expectation of a step in a plan, where the other modes are more ambiguous.

  • Izkata 2 days ago

    My suggestion was just straight text generation and thinking about what the training data might look like (imagining a narrative in a story): Commands between two people might not be followed right away or at all (and may even risk introducing rebellion and doing the opposite), while a first-person perspective is likely self motivation (almost guaranteed to do it) and may even be descriptive while doing it.