Comment by blainm
One of the key limitations of even state-of-the-art LLMs is that their coherence and usefulness tend to degrade as the context window grows. For complex workflows such as customer support automation or code review pipelines, breaking the process into smaller, well-defined tasks lets the model operate with more relevant, focused context at each step, improving reliability.
Additionally, in self-hosted environments, an agent-based approach can be more cost-effective: simpler or less computationally intensive subtasks can be offloaded to smaller models, which reduces cost and improves response times.
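As a rough sketch of that routing idea (the model names and the complexity heuristic below are hypothetical placeholders, not any particular framework's API):

```python
# Sketch: route each subtask to a model tier based on a simple
# complexity heuristic. Model names are placeholders.

SMALL_MODEL = "small-7b"    # hypothetical cheap, fast model
LARGE_MODEL = "large-70b"   # hypothetical more capable model

def estimate_complexity(task: str) -> int:
    # Toy heuristic: longer prompts and review-style keywords count
    # as harder. In practice you might use a learned classifier.
    keywords = ("refactor", "architecture", "security", "design")
    score = len(task.split()) // 50
    score += sum(1 for k in keywords if k in task.lower())
    return score

def route(task: str) -> str:
    # Cheap subtasks go to the small model; everything else escalates.
    return SMALL_MODEL if estimate_complexity(task) < 2 else LARGE_MODEL

pipeline = [
    "Classify this support ticket as billing, bug, or feature request.",
    "Review this diff for security issues and suggest a refactor.",
]
assignments = {task: route(task) for task in pipeline}
```

The first subtask above would land on the small model and the second on the large one; the decomposition is what makes a per-step routing decision like this possible at all.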
That said, this approach works best for structured workflows that can be logically decomposed. For more open-ended tasks, such as "build me an app," results can be inconsistent unless the task is well-scoped or has extensive precedent (e.g., generating a simple Pong clone); in those cases, additional oversight and iterative refinement are usually necessary.