Comment by 3kkdd
I'm sick and tired of these empty posts.
SHOW AN EXAMPLE OF YOU ACTUALLY DOING WHAT YOU SAY!
What? People do this all the time. Sometimes manually, by invoking another agent with a different model and asking it to review the changes against the original spec. I just set up some reviewer/verifier subagents in Cursor that I can invoke with a slash command. I use Opus 4.5 as my daily driver, but I have reviewer subagents running Gemini 3 Pro and GPT-5.2-codex; each of them reviews the plan, and then the final implementation against that plan. Both sometimes identify issues, and Opus then integrates that feedback.
It's not perfect, so I still review the code myself, but it cuts down on the number of defects I then have to send back to the AI to fix.
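Here's a rough sketch of that two-pass review, if it helps. `call_model` is a hypothetical stand-in for whichever provider SDKs you wire up, and the model names just mirror my setup:

```python
# Sketch of a cross-model review pass. call_model() is a hypothetical
# stand-in for each provider's SDK; model names are illustrative.

DRIVER = "opus-4.5"
REVIEWERS = ["gemini-3-pro", "gpt-5.2-codex"]

def call_model(model: str, prompt: str) -> str:
    """Hypothetical helper: send prompt to model, return its reply text."""
    raise NotImplementedError("wire up the relevant provider SDK here")

def collect_reviews(spec: str, artifact: str, kind: str) -> list[str]:
    """Ask each reviewer model to check a plan or diff against the spec."""
    prompt = (
        f"Review this {kind} against the spec. List concrete defects, "
        f"or reply DONE if there are none.\n\n"
        f"Spec:\n{spec}\n\n{kind}:\n{artifact}"
    )
    reviews = [(m, call_model(m, prompt)) for m in REVIEWERS]
    return [f"[{m}] {r}" for m, r in reviews if r.strip() != "DONE"]

def integrate_feedback(artifact: str, findings: list[str]) -> str:
    """Hand reviewer findings back to the driver model to revise."""
    prompt = ("Revise this to address the feedback:\n\n" + artifact
              + "\n\nFeedback:\n" + "\n".join(findings))
    return call_model(DRIVER, prompt)

# Pass 1 runs on the plan; pass 2 runs the same calls on the final diff.
```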
these two posts (the parent and then the OP) seem equally empty?
by increasing compute spend, it might look like:
- ask an LLM in the same query/thread to write code AND tests (not good)
- ask the LLM to write the code and the tests in separate threads (meh)
- ask the LLM in a separate thread to critique said tests (too brittle, violating testing guidelines, testing implementation rather than behavior, etc.), then fix those (decent)
- ask the LLM to spawn multiple agents to review the code and tests, fix the findings, spawn agents to critique again, fix again
- do the same as above, but spawn the reviewer agents from different model families (so Claude calls Gemini and Codex); a sketch of this loop follows the list
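a rough sketch of those last two rungs, with `spawn_reviewers` / `apply_fixes` as hypothetical stand-ins for the actual agent plumbing:

```python
# Illustrative critique/fix loop for the last two rungs of the list.
# spawn_reviewers() and apply_fixes() are hypothetical stand-ins.

MAX_ROUNDS = 3  # cap the rounds so the loop can't burn compute forever

def spawn_reviewers(code: str, tests: str, cross_family: bool) -> list[str]:
    """Hypothetical: fan out to reviewer agents (optionally from other
    model families) and collect findings; an empty list means clean."""
    raise NotImplementedError

def apply_fixes(code: str, tests: str,
                findings: list[str]) -> tuple[str, str]:
    """Hypothetical: have the driver model address the findings."""
    raise NotImplementedError

def review_loop(code: str, tests: str,
                cross_family: bool = True) -> tuple[str, str]:
    for _ in range(MAX_ROUNDS):
        findings = spawn_reviewers(code, tests, cross_family)
        if not findings:  # reviewers signed off
            break
        code, tests = apply_fixes(code, tests, findings)
    return code, tests
```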
---
these are usually set up as slash commands like /tests or /review so you aren't running the loop by hand. since a full pass can take a while, people often work on multiple features in parallel.
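for concreteness, a guess at what such a command file could look like, assuming Cursor's markdown-based custom commands (the path and format here are assumptions and may vary by version):

```markdown
<!-- .cursor/commands/review.md (assumed location and format) -->
Review the current changes against the original spec:
1. Spawn reviewer subagents on different model families.
2. Have each review the plan, then the final diff, against the spec.
3. Collect findings and have the driver model apply fixes.
4. Repeat until reviewers report no defects, up to three rounds.
```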
There's no example because OP has never done this, and never will. People lie on the internet.