Comment by libraryofbabel 12 hours ago
This has been a problem for us too. Sometimes the models reach for skills, sometimes they don’t and just try to do the thing on their own. It’s annoying.
I think this is (mostly) a solvable problem. The current generation of SotA models wasn’t RLVR-trained on skills (they didn’t exist at that time) and probably gets slightly confused by the way the little descriptions are all packed into the same tool call schema. (At least that’s how it works with Claude Code.) The next generation will likely have been RLVR-trained on a lot of tasks where skills are available, and will use them much more reliably. Basically, wait until the next Opus release and you should hopefully see major improvements. (Of course, all this stuff is non-deterministic blah blah, but I think it’s reasonable to expect going from “misses the skill 30% of the time” to “misses it 2% of the time”.)
Same, I have a bunch of skills defined with proper YAML headers and semantic triggers installed. I make a point of not listing too many and of keeping each one quite specific.
Even with that, I have to be very explicit to trigger a skill, and it's hit or miss whether it gets picked up. Usually I have to say that there is a skill for this and tell it to go and use it.