Comment by eclipsetheworld 10 hours ago

We're repeating the same overengineering cycle we saw with early LangChain/RAG stacks. Just a couple of months ago the term "agent" was hard to define, but I've realized the best mental model is just a standard REPL:

Read: Gather context (user input + tool outputs).
Eval: LLM inference (decides: do I need a tool, or am I done?).
Print: Execute the tool (the side effect) or return the answer.
Loop: Feed the result back into the context window.

Rolling a lightweight implementation around this concept has been significantly more robust for me than fighting with the abstractions in the heavyweight SDKs.
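A minimal sketch of that loop in Python; `call_model` and the tool registry are hypothetical stubs standing in for a real LLM client:

```python
# Sketch of the "agent = REPL" loop. `call_model` is a stub: a real
# implementation would send `context` to an LLM and parse its reply.
TOOLS = {
    "add": lambda args: str(args["a"] + args["b"]),
}

def call_model(context):
    # Stub decision logic: ask for one tool call, then finish.
    if not any(m["role"] == "tool" for m in context):
        return {"tool": "add", "args": {"a": 2, "b": 3}}
    return {"answer": f"The result is {context[-1]['content']}"}

def agent_loop(user_input, max_steps=5):
    context = [{"role": "user", "content": user_input}]      # Read
    for _ in range(max_steps):
        decision = call_model(context)                       # Eval
        if "answer" in decision:
            return decision["answer"]                        # done: return answer
        result = TOOLS[decision["tool"]](decision["args"])   # Print (side effect)
        context.append({"role": "tool", "content": result})  # Loop
    raise RuntimeError("step budget exhausted")
```

The whole control flow fits in one function, which is the point: every decision about what enters the context window is visible in your own code.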

throw310822 9 hours ago

I don't think this has much to do with SDKs. I've developed my own agent code from scratch (starting from the simple loop) and eventually, unless your use case is really simple, you always have to deal with the need for subagents specialised for certain tasks that share part of their data (but not all) with the main agent, with internal reasoning and reinforcement messages, etc.

  • eclipsetheworld 9 hours ago

    Interestingly, sticking to the "Agent = REPL" mental model is actually what helped me solve those specific scaling problems (sub-agents and shared data) without the SDK bloat.

    1. Sub-agents are just stack frames. When the main loop encounters a complex task, it "pushes" a new scope (a sub-agent with a fresh, empty context). That sub-agent runs its own REPL loop, returns only the clean result without any context pollution, and is then "popped".

    2. Shared Data is the heap. Instead of stuffing "shared data" into the context window (which is expensive and confusing), I pass a shared state object by reference. Agents read/write to the heap via tools, but they only pass "pointers" in the conversation history. In the beginning this was just a Python dictionary and the "pointers" were keys.
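A toy sketch of both ideas together, assuming (as above) a plain Python dict as the heap and string keys as pointers; `write_artifact`, `read_artifact`, and `sub_agent` are illustrative names, not anything from a real SDK:

```python
# "Shared data is the heap": agents pass only keys ("pointers") in
# conversation history and dereference them via tools.
HEAP = {}

def write_artifact(key, value):
    HEAP[key] = value
    return key  # only this pointer travels through the context

def read_artifact(key):
    return HEAP[key]

# "Sub-agents are stack frames": fresh context on push, only the
# result survives the pop.
def sub_agent(task, pointer):
    context = [{"role": "user", "content": task}]  # empty scope, no pollution
    data = read_artifact(pointer)                  # dereference via a tool
    result_key = write_artifact(pointer + "/summary", f"summary of {data!r}")
    return result_key                              # pop: return a clean pointer

def main_agent():
    ptr = write_artifact("doc1", "a long source document")
    summary_ptr = sub_agent("summarise doc1", ptr)  # push/pop a stack frame
    return read_artifact(summary_ptr)
```

The main agent's context never grows by more than a key per sub-task, which is what keeps long-running loops debuggable.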

    My issue with the heavy SDKs isn't that they try to solve these problems, but that they often abstract away the state management. I’ve found that explicitly managing the "stack" (context) and "heap" (artifacts) makes the system much easier to debug.

    • throw310822 5 hours ago

      Indeed. So in addition to your chat loop, you have built a way to spawn sub-agents, and to share memory objects between them (or tools) and the main agent; also (I suppose) a standard way to define tools and their actions; to define sub-agents with their separate tools, actions, and (if needed) separate memory objects; to inject ephemeral context into the chat (the current state of the UI, or the last user action); to introduce reinforcement messages when needed; etc. Maybe context packing if/when the context gets too big. Then you've probably built something for evals, so that you can run batches of tasks and score the results. Etc.
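The evals piece alone is its own small harness. A minimal sketch, with `run_agent` as a hypothetical stub for the real agent loop and exact-match scoring as the simplest possible metric:

```python
# Run a batch of tasks through the agent and score each result.
def run_agent(task):
    return task["input"].upper()  # stand-in for the real agent loop

def run_evals(tasks):
    scores = []
    for task in tasks:
        output = run_agent(task)
        scores.append(1.0 if output == task["expected"] else 0.0)
    return sum(scores) / len(scores)  # mean score across the batch

tasks = [
    {"input": "hello", "expected": "HELLO"},
    {"input": "world", "expected": "world"},  # deliberately failing case
]
```

In practice the scoring function is the hard part (LLM-as-judge, regex checks, etc.), but even this skeleton is yet another component you end up writing beyond "just a loop".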

      So that's my point (and that of the article): it's not "just a loop", it quickly gets much more complicated than that. I haven't used any framework, so I can't tell if they're good or not; but for sure I ended up building my own. Calling tools in a loop is enough for a cool demo but doesn't work well enough for production.