Comment by eclipsetheworld
Comment by eclipsetheworld 10 hours ago
Interestingly, sticking to the "Agent = REPL" mental model is actually what helped me solve those specific scaling problems (sub-agents and shared data) without the SDK bloat.
1. Sub-agents are just stack frames. When the main loop encounters a complex task, it "pushes" a new scope (a sub-agent with a fresh, empty context). That sub-agent runs its own REPL loop, returns only the clean result with out any context pollution and is then "popped".
2. Shared Data is the heap. Instead of stuffing "shared data" into the context window (which is expensive and confusing), I pass a shared state object by reference. Agents read/write to the heap via tools, but they only pass "pointers" in the conversation history. In the beginning this was just a Python dictionary and the "pointers" were keys.
My issue with the heavy SDKs isn't that they try to solve these problems, but that they often abstract away the state management. I’ve found that explicitly managing the "stack" (context) and "heap" (artifacts) makes the system much easier to debug.
Indeed. So in addition to your chat loop, you have built a way to spawn sub-agents, and to share memory objects between them (or tools) and the main agent; also (I suppose) a standard way to define tools and their actions; to define sub-agents with their separate tools and actions and (if needed) separate memory objects; to inject ephemeral context in the chat (the current state of the UI, or the last user action); to introduce reinforcement messages when needed; etc. Maybe context packing if/ when the context gets too big. Then you've probably have built something for evals, so that you can run batches of tasks and score the results. Etc.
So that's my point (and that of the article): it's not "just a loop", it quickly gets much more complicated than that. I haven't used any framework, so I can't tell if they're good or not; but for sure I ended up building my own. Calling tools in a loop is enough for a cool demo but doesn't work well enough for production.