Comment by onnimonni
Would someone know if their eval tests are open source and where I could find them? Seems useful for iterating on Claude Code behaviour.
I was also looking for specific info on the evals, because I wanted to see if they were separately confirming that shoving the skills into the main context didn't degrade the non-skills evals. That's the other side of skills, beyond the ability to do the thing: they don't pollute the main context window with unnecessary information.