Comment by moinism

Comment by moinism 12 hours ago

33 replies

Amen. Been seeing these agent SDKs coming out left and right for a couple of years and thought it'd be a breeze to build an agent. Now I'm trying to build one for ~3 weeks, and I've tried three different SDKs and a couple of architectures.

Here's what I found:

- Claude Code SDK (now called Agent SDK) is amazing, but I think they are still in the process of decoupling it from the Claude Code, and that's why a few things are weird. e.g, You can define a subagent programmatically, but not skills. Skills have to be placed in the filesystem and then referenced in the plugin. Also, only Anthoripic models are supported :(

- OpenAI's SDK's tight coupling with their platform is a plus point. i.e, you get agents and tool-use traces by default in your dashboard. Which you can later use for evaluation, distillation, or fine-tuning. But: 2. They have agent handoffs (which works in some cases), but not subagents. You can use tools as subagents, though. 1. Not easy to use a third-party model provider. Their docs provide sample codes, but it's not as easy as that.

- Google Agent Kit doesn't provide any Typescript SDK yet. So didn't try.

- Mastra, even though it looks pretty sweet, spins up a server for your agent, which you can then use via REST API. umm.. why?

- SmythOS SDK is the one I'm currently testing because it provides flexibility in terms of choosing the model provider and defining your own architecture (handoffs or subagents, etc.). It has its quirks, but I think it'll work for now.

Question: If you don't mind sharing, what is your current architecture? Agent -> SubAgents -> SubSubAgents? Linear? or a Planner-Executor?

I'll write a detailed post about my learnings from architectures (fingers crossed) soon.

copypaper 4 hours ago

Every single SDK I've used was a nightmare once you get past the basics. I ended up just using an OpenRouter client library [1] and writing agents by hand without an abstraction layer. Is it a little more boilerplatey? Yea. Does it take more LoC to write? Yea. Is it worth it? 100%. Despite writing more code, the mental model is much easier (personally) to follow and understand.

As for the actual agent I just do the following:

- Get metadata from initial query

- Pass relevant metadata to agent

- Agent is a reasoning model with tools and output

- Agent runs in a loop (max of n times). It will reason which tool calls to use

- If there is a tool call, execute it and continue the loop

- Once the agent outputs content, the loop is effectively finished and you have your output

This is effectively a ReAct agent. Thanks to the reasoning being built in, you don't need an additional evaluator step.

Tools can be anything. It can be a subagent with subagents, a database query, etc. Need to do an agent handoff? Just output the result of the agent into a different agent. You don't need an sdk to do a workflow.

I've tried some other SDKs/frameworks (Eino and langchaingo), and personally found it quicker to do it manually (as described above) than fight against the framework.

[1]: https://github.com/reVrost/go-openrouter

peab 10 hours ago

I think the term sub-agent is almost entirely useless. An agent is an LLM loop that has reasoning and access to tools.

A "sub agent" is just a tool. It's implantation should be abstracted away from the main agent loop. Whether the tool call is deterministic, has human input, etc, is meaningless outside of the main tool contract (i.e Params in Params out, SLA, etc)

  • nostrebored 2 hours ago

    Nah, when working on anything sufficiently complicated you will have many parallel subagents that need their own context window, ability to mutate shared state, sandboxing differences, durability considerations, etc.

    If you want to rewrite the behavior per instance you totally can, but there is a definite concept here that is different than “get_weather”.

    I think that existing tools don’t work very well here or leave much of this as an exercise for the user. We have tasks that can take a few days to finish (just a huge volume of data and many non deterministic paths). Most people are doing way too much or way too little. Having subagents with traits that can be vended at runtime feels really nice.

  • moinism 9 hours ago

    I agree, technically, "sub agent" is also another tool. But I think it's important to differentiate tools with deterministic input/output from those with reasoning ability. A simple 'Tool' will take the input and try to execute, but the 'subagent' might reason that the action is unnecessary and that the required output already exists in the shared context. Or it can ask a clarifying question from the main agent before using its tools.

  • the_mitsuhiko 6 hours ago

    > It's implantation should be abstracted away from the main agent loop. Whether the tool call is deterministic, has human input, etc, is meaningless outside of the main tool contract (i.e Params in Params out, SLA, etc)

    Up to a point. You're obviously right in principle, but if that task itself has the ability to call into "adjacent" tools then the behavior changes quite a bit. You can see this a bit with how the Oracle in Amp surfaces itself to the user. The oracle as sub-agent has access to the same tools as the main agent, and the state changes (rare!) that it performs are visible to itself as well as the main agent. The tools that it invokes are displayed similarly to the main agent loop, but they are visualized as calls within the tool.

  • verdverm 8 hours ago

    ADK differentiates between tools and subagents based on the ability to escalate or transfer control (subagents), where as tools are more basic

    I think this is a meaningful distinction, because it impacts control flow, regardless what they are called. The lexicon are quite varied vendor-to-vendor

    • peab 7 hours ago

      Are there any examples of implementations of this that actually work, and/or are useful? I've seen people write about this, but I haven't seen it anywhere

      • verdverm 7 hours ago

        I think in ADK, the most likely place to find them actually used is the Workflow agent interfaces (sequential, parallel, loop). Perhaps looping, where it looks like they suggest you have an agent that determines if the loop is done and escalates with that message to the Looper.

        https://google.github.io/adk-docs/agents/workflow-agents/

        I haven't gotten there yet, still building out the basics like showing diffs instead of blind writing and supporting rewind in a session

  • Vinnl 8 hours ago

    What does "has reasoning" mean? Isn't that just a system prompt that says something like "make a plan" and includes that in the loop?

    • peab 7 hours ago

      You actually probably don't need reasoning, as the old non reasoning models like 4o can do this too.

      In the past, the agent type flows would work better if you prompted the LLM to write down a plan, or reasoning steps on how to accomplish the task with the available tools. These days, the new models are trained to do this without promoting

  • ColinEberhardt 10 hours ago

    Oh, so _that_ is what a sub-agent is. I have been wondering about that for a while now!

verdverm 8 hours ago

Google's ADK is pretty nice, I'm using the Go version, which is less mature than the python on. Been at it a bit over a week and progress is great. This weekend I'm aiming for tracking file changes in the session history to allow rewinding / forking

It has a ton of day 2 features, really nice abstractions, and positioned itself well in terms of the building blocks and constructing workflows.

ADK supports working with all the vendors and local LLMs

  • dragonwriter 8 hours ago

    I really wish ADK had a local persistent memory implementation, though.

    • verdverm 7 hours ago

      w.r.t. Go, it's probably not that big a lift. I was looking at that yesterday, made a small change to lift the Gorm stuff a bit so the DB conn can be shared between the services

      I thought the same thing about the artifact service, which could have a nice local FS option.

      I'm pretty new to ADK, so we'll see how long the honeymoon phase lasts. Generally very optimistic that I found a solid foundation and framework

      edit: opened an issue to track it

      https://github.com/google/adk-go/issues/339

blancm 12 hours ago

Hello, about Claude Code where only Anthoripic models are supported, in reality you can use Claude Code router (https://github.com/musistudio/claude-code-router) to use other models in Claude Code. I use it since some weeks with opensource models and it works pretty well. You can even use "free" models from openrouter

mountainriver 12 hours ago

The frameworks are all pointless, just use AI assist to create agents in python or ideally a language with concurrency.

You will be happy you did

  • moduspol an hour ago

    You will undoubtedly be recreating what already exists in LangGraph. And you'll probably be doing it worse.

  • moinism 12 hours ago

    How do you deal with the different APIs/Tooluse schema in a custom build? As other people have mentioned, it's a bigger undertaking than it sounds.

    • koakuma-chan 10 hours ago

      You can just tell the AI which format you want the input in, in natural language.

      • verdverm 8 hours ago

        you're wasting valuable context with approaches like that

        save it for more interesting tasks

otterley 11 hours ago

Have you tried AWS’s Strands Agents SDK? I’ve found it to be a very fluent and ergonomic API. And it doesn’t require you to use Bedrock; most major vendor native APIs are supported.

(Disclaimer: I work for AWS, but not for any team involved. Opinions are my own.)

  • moinism 10 hours ago

    This looks good. Even though it's only in Python, I think its worth a try. Thanks.

ph4rsikal 10 hours ago

My favourite is Smolagents from Huggingface. You can easily mix and match their models in your agents.

  • moinism 9 hours ago

    Dude, it looks great, but I just spent half an hour learning about its 'CodeAgents' feature. Which essentially is 'actions written as code'.

    This idea has been floating around in my head, but it wasn't refined enough to implement. It's so wild that what you're thinking of may have already been done by someone else on the internet.

    https://huggingface.co/docs/smolagents/conceptual_guides/int...

    For those who are wondering, it's kind of similar to the 'Code Mode' idea implemented by Cloudflare and now being explored by Anthropic; Write code to discover and call MCPs instead of stuffing context window with their definations.

thewhitetulip 9 hours ago

Did you try langchain/langgraph? Am I confusing what the OP means aa agents?