Comment by verdverm 7 hours ago

ADK (Google's Agent Development Kit) has a few docs pages and some APIs for evaluating agentic systems:

https://google.github.io/adk-docs/evaluate/

tl;dr: evaluation is challenging because different runs produce different output, and it's not obvious how to decide pass/fail (in practice, people use another LLM/agent as the judge)
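
To make the two problems concrete, here is a rough sketch in plain Python. It does not use ADK's API; `pass_rate`, `evaluate`, `make_flaky_agent`, and `keyword_judge` are all hypothetical names made up for illustration. The idea is to run the agent several times and score the fraction of runs a judge accepts, instead of asserting on a single output. The judge here is a trivial keyword check standing in for the LLM/agent judge you would call in practice.

```python
import random
from typing import Callable

Agent = Callable[[str], str]          # prompt -> output
Judge = Callable[[str, str], bool]    # (prompt, output) -> accept?


def pass_rate(agent: Agent, judge: Judge, prompt: str, runs: int = 10) -> float:
    """Run the agent `runs` times and return the fraction the judge accepts."""
    passed = sum(1 for _ in range(runs) if judge(prompt, agent(prompt)))
    return passed / runs


def evaluate(agent: Agent, judge: Judge, prompt: str,
             runs: int = 10, threshold: float = 0.8) -> bool:
    """Pass/fail on the pass rate over many runs, not on one run."""
    return pass_rate(agent, judge, prompt, runs) >= threshold


def make_flaky_agent(seed: int) -> Agent:
    """Hypothetical stand-in for a non-deterministic agent."""
    rng = random.Random(seed)
    answers = ["The answer is 4.", "2 + 2 = 4", "I think it's 5."]

    def agent(prompt: str) -> str:
        # Different calls can return different outputs for the same prompt.
        return rng.choice(answers)

    return agent


def keyword_judge(prompt: str, output: str) -> bool:
    """Stand-in for an LLM judge: accepts any output that mentions '4'."""
    return "4" in output


if __name__ == "__main__":
    agent = make_flaky_agent(seed=0)
    rate = pass_rate(agent, keyword_judge, "What is 2 + 2?", runs=20)
    print(f"pass rate over 20 runs: {rate:.2f}")
    print("passed:", evaluate(agent, keyword_judge, "What is 2 + 2?", runs=20))
```

The threshold (0.8 here, an arbitrary choice) is what turns a noisy signal into a pass/fail decision. Swapping `keyword_judge` for a call to a real model keeps the same shape, but then the judge itself becomes non-deterministic too.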