codazoda 5 hours ago

I can't get Codex CLI or Claude Code to do tool calls with small local models. Those CLIs use XML-style tool calls, while the small local models have JSON tool use baked into them, and no amount of prompting can fix it.
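
To make this concrete, here's roughly the mismatch I mean (both shapes below are illustrative placeholders; the exact formats vary by model and by CLI):

    # JSON tool call, the OpenAI-style shape most small local models are tuned to emit:
    {"tool_calls": [{"function": {"name": "read_file",
                                  "arguments": "{\"path\": \"main.py\"}"}}]}

    # XML-style tool call of the kind an agent CLI might parse out of plain text:
    <tool_call name="read_file"><parameter name="path">main.py</parameter></tool_call>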

In a day or two I'll release my answer to this problem. But, I'm curious, have you had a different experience where tool use works in one of these CLIs with a small local model?

zackify 5 hours ago

I'm using this model right now in Claude Code with LM Studio, perfectly, on a MacBook Pro.

  • codazoda 5 hours ago

    You mean Qwen3-Coder-Next? I haven't tried that model yet, because I assume it's too big for me. I have a modest 16GB MacBook Air, so I'm restricted to really small stuff. I'm thinking about buying a machine with a GPU to run some of these.

    Anyway, maybe I should try some other models. The ones that haven't worked for tool calling, for me, are:

    Llama3.1

    Llama3.2

    Qwen2.5-coder

    Qwen3-coder

    All of these at 7B, 8B, or sometimes 30B (painfully).

    I should also note that I'm typically using Ollama. Maybe LM Studio or llama.cpp somehow improves on this?

regularfry 5 hours ago

Surely the answer is a very small proxy server between the two?
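
A rough sketch of what I mean, assuming the local model sits behind an OpenAI-style /v1/chat/completions endpoint (Ollama's default shown; LM Studio exposes the same API) and that the CLI scans assistant text for XML-ish tool calls. The tag names are hypothetical placeholders, not any CLI's actual format:

    # Rough sketch of a translating proxy. All of this is illustrative:
    # Ollama's default URL, and made-up XML tags on the output side.
    import json
    import urllib.request
    from http.server import BaseHTTPRequestHandler, HTTPServer

    MODEL_URL = "http://localhost:11434/v1/chat/completions"  # Ollama default

    def to_xml(call):
        # Render one OpenAI-style JSON tool call as an XML-ish text block.
        fn = call["function"]
        args = json.loads(fn.get("arguments") or "{}")
        params = "".join(f'<parameter name="{k}">{v}</parameter>'
                         for k, v in args.items())
        return f'<tool_call name="{fn["name"]}">{params}</tool_call>'

    class Proxy(BaseHTTPRequestHandler):
        def do_POST(self):
            # Forward the CLI's request to the local model unchanged.
            body = self.rfile.read(int(self.headers["Content-Length"]))
            req = urllib.request.Request(
                MODEL_URL, data=body,
                headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req) as resp:
                reply = json.loads(resp.read())
            # Rewrite JSON tool calls into XML-ish text in the reply.
            msg = reply["choices"][0]["message"]
            for call in msg.pop("tool_calls", None) or []:
                msg["content"] = (msg.get("content") or "") + to_xml(call)
            out = json.dumps(reply).encode()
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(out)))
            self.end_headers()
            self.wfile.write(out)

    HTTPServer(("127.0.0.1", 8080), Proxy).serve_forever()

A real version would also have to translate streamed responses (server-sent events), which is where most of the fiddly work would be.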

  • codazoda 5 hours ago

    That might work, but I keep seeing people talk about this, so there must be a simple solution that I'm overlooking. My solution is to write my own minimal, experimental CLI that speaks JSON tool calls.
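
    The core of such a CLI is just a small loop. Here's a minimal sketch, assuming an OpenAI-style endpoint with JSON tool calling (Ollama's default URL and a Qwen model name as examples) and a single made-up run_shell tool:

        # Bare-bones agent loop: OpenAI-style endpoint, JSON tool calls.
        # The URL, model name, and run_shell tool are all just examples.
        import json
        import subprocess
        import urllib.request

        URL = "http://localhost:11434/v1/chat/completions"
        TOOLS = [{"type": "function", "function": {
            "name": "run_shell",
            "description": "Run a shell command and return its output.",
            "parameters": {"type": "object",
                           "properties": {"cmd": {"type": "string"}},
                           "required": ["cmd"]}}}]

        def chat(messages):
            # One round trip to the model, tools included in the request.
            body = json.dumps({"model": "qwen2.5-coder", "messages": messages,
                               "tools": TOOLS}).encode()
            req = urllib.request.Request(
                URL, data=body, headers={"Content-Type": "application/json"})
            with urllib.request.urlopen(req) as resp:
                return json.loads(resp.read())["choices"][0]["message"]

        messages = [{"role": "user", "content": input("> ")}]
        while True:
            msg = chat(messages)
            messages.append(msg)
            if not msg.get("tool_calls"):
                print(msg["content"])  # no more tool use: final answer
                break
            for call in msg["tool_calls"]:
                # Run the requested command and feed the output back.
                args = json.loads(call["function"]["arguments"])
                done = subprocess.run(args["cmd"], shell=True,
                                      capture_output=True, text=True)
                messages.append({"role": "tool",
                                 "tool_call_id": call.get("id"),
                                 "content": done.stdout + done.stderr})

    (Executing arbitrary shell commands like this is obviously unsafe outside of a sandboxed experiment.)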