Comment by rfoo

I don't think it's the weights being different or some special inference technique. More likely they aren't yet able to train the model to follow the tool schema perfectly, so both Moonshot and Groq decided to use something like https://github.com/noamgat/lm-format-enforcer to make sure that at least the output format is correct.
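
To illustrate what I mean by enforcing the format at decode time, here is a rough toy sketch of the general idea: at every step, mask out any token that would make the output an invalid prefix, then pick the best remaining one. This is not lm-format-enforcer's actual API, and I'm not claiming it is what Moonshot or Groq run. ALLOWED_OUTPUTS, VOCAB, toy_scores, is_allowed and constrained_decode are all made-up names, and the "model" is a fixed score table that deliberately prefers malformed output.

```python
# Toy sketch of format-constrained greedy decoding.
# Everything here is hypothetical: a real enforcer works on tokenizer
# vocabularies and a JSON schema or regex, not on a list of whole strings.

ALLOWED_OUTPUTS = [
    '{"tool":"search"}',
    '{"tool":"calc"}',
]

VOCAB = ['Sure, ', 'search', '"search"', '{"tool":', '"calc"', '}', '<eos>']


def toy_scores(prefix):
    """Stand-in for model logits; ignores the prefix and ranks VOCAB in order.

    The ranking puts malformed tokens ('Sure, ', unquoted 'search') first,
    mimicking a model that does not follow the tool schema reliably.
    """
    return {tok: float(len(VOCAB) - i) for i, tok in enumerate(VOCAB)}


def is_allowed(prefix, token):
    """True if appending `token` keeps the output a prefix of a valid string."""
    if token == '<eos>':
        # Stopping is only legal once the output is complete and valid.
        return prefix in ALLOWED_OUTPUTS
    candidate = prefix + token
    return any(out.startswith(candidate) for out in ALLOWED_OUTPUTS)


def constrained_decode(max_steps=16):
    """Greedy decoding restricted to tokens that pass is_allowed."""
    prefix = ''
    for _ in range(max_steps):
        scores = toy_scores(prefix)
        allowed = {t: s for t, s in scores.items() if is_allowed(prefix, t)}
        if not allowed:
            raise ValueError('no token keeps the output valid: ' + repr(prefix))
        token = max(allowed, key=allowed.get)
        if token == '<eos>':
            return prefix
        prefix += token
    raise ValueError('did not finish within max_steps')


if __name__ == '__main__':
    print(constrained_decode())
```

The point is that the mask only guarantees a well-formed output. Which tool or arguments get picked still comes from the model's own scores, so the format can be correct while the content is wrong.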