Comment by simonw 20 hours ago
"It’s becoming clear that real-world agentic systems work best when multiple agents collaborate, rather than having one agent attempt to do everything."
I'll be honest: I don't buy that premise (yet). It's clearly a popular idea and I see a lot of excitement about it (see Google's A2A thing) but it feels to me like a pattern that, in many cases, will make LLM-based systems even harder to get reliable results from.
I worry it's the AI equivalent of microservices: useful for a small set of hyper-complex systems, while the vast majority of applications that adopt it would have been better off without it.
If there are strong arguments counter to what I've said here, I'd love to hear them!
A few concrete examples of multi-agent collaboration being useful in my project Plandex[1]:
- While it uses Sonnet 3.7 by default for creating the edit snippet when writing code, the calls that apply the snippet and validate the result (falling back to a whole-file write if needed) use o3-mini (soon to be o4-mini), which is a third of the cost, much faster, and actually more accurate and reliable than Sonnet for this particular narrow task (first sketch below).
- If Sonnet 3.7's context limit is exceeded in the planning stages, it can switch to a Gemini model for planning, then go back to Sonnet for the implementation steps, since those only need the files relevant to each step (second sketch below).
- It eagerly summarizes the conversation after each response so that the summary is already on hand if the conversation later gets too long. This is only practical because models much smaller than the main planning/coding models are sufficient for a good summary; otherwise it would be far too expensive (third sketch below).
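A minimal sketch of the first pattern in Go (the language Plandex is written in). This is not Plandex's actual code: callModel, the prompts, and the validation check are all hypothetical stand-ins for routing a narrow apply/validate task to a cheaper model with a whole-file fallback.

    package main

    import (
        "errors"
        "fmt"
        "strings"
    )

    // callModel stands in for a real provider client.
    func callModel(model, prompt string) (string, error) {
        return "", errors.New("not wired to a provider")
    }

    // applySnippet uses a small, cheap model for the narrow apply/validate
    // task, falling back to a whole-file rewrite when validation fails.
    func applySnippet(file, snippet string) (string, error) {
        out, err := callModel("o3-mini", "Apply this edit snippet:\n"+snippet+"\n\nFile:\n"+file)
        if err == nil && validates(out) {
            return out, nil
        }
        // Fallback: ask for the entire updated file instead of a patch.
        return callModel("o3-mini", "Rewrite the full file with the edit applied:\n"+file+"\n\nEdit:\n"+snippet)
    }

    func validates(out string) bool {
        // Placeholder: a real check might parse or compile the result.
        return strings.TrimSpace(out) != ""
    }

    func main() {
        if _, err := applySnippet("package demo", "// add a comment"); err != nil {
            fmt.Println("apply failed:", err)
        }
    }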
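The planner fallback reduces to picking a model by estimated prompt size. The model names and context-window numbers below are rough assumptions for illustration, not exact figures.

    package main

    import "fmt"

    // Approximate per-model context windows (tokens), for the sketch only.
    const (
        sonnetWindow = 200_000
        geminiWindow = 1_000_000
    )

    // pickPlanner chooses the planning model: Sonnet by default, a
    // long-context Gemini model when Sonnet's window would overflow.
    func pickPlanner(estTokens int) string {
        if estTokens <= sonnetWindow {
            return "claude-3.7-sonnet"
        }
        if estTokens <= geminiWindow {
            return "gemini-pro"
        }
        return "" // too large even for the long-context model
    }

    func main() {
        fmt.Println(pickPlanner(150_000)) // claude-3.7-sonnet
        fmt.Println(pickPlanner(400_000)) // gemini-pro
    }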
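And the eager-summarization idea, again as a toy sketch: summarize is a stub for a call to a much smaller model, run after every turn so a compact transcript already exists if the full conversation stops fitting.

    package main

    import "fmt"

    // summarize stands in for a call to a small, cheap model.
    func summarize(text string) string {
        return fmt.Sprintf("summary of %d chars", len(text))
    }

    type convo struct {
        turns     []string
        summaries []string
    }

    // addTurn records a turn and eagerly summarizes it; the small model
    // keeps this cheap enough to do on every response.
    func (c *convo) addTurn(turn string) {
        c.turns = append(c.turns, turn)
        c.summaries = append(c.summaries, summarize(turn))
    }

    // contextFor returns the full turns if they fit the budget,
    // otherwise the already-built running summaries.
    func (c *convo) contextFor(budget int) []string {
        total := 0
        for _, t := range c.turns {
            total += len(t)
        }
        if total <= budget {
            return c.turns
        }
        return c.summaries
    }

    func main() {
        c := &convo{}
        c.addTurn("long model response ...")
        fmt.Println(c.contextFor(10)) // over budget, falls back to summaries
    }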
It's definitely more complex, but I think in these cases at least, there's a real payoff for the trouble.
1 - https://github.com/plandex-ai/plandex