Comment by danielhanchen 11 hours ago
For those interested, I made some Dynamic Unsloth GGUFs for local deployment at https://huggingface.co/unsloth/Qwen3-Coder-Next-GGUF, and wrote a guide on using Claude Code / Codex locally: https://unsloth.ai/docs/models/qwen3-coder-next
Nice! Getting ~39 tok/s at ~60% GPU utilization (~170 W out of 303 W per nvtop).
System info:
llama.cpp command-line:
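(The original command was not captured here. For reference, a typical llama.cpp server invocation for a GGUF from this repo might look like the sketch below; the model filename and all parameter values are illustrative assumptions, not the poster's actual settings.)

```bash
# Illustrative sketch only: filename and values are assumptions.
# -ngl 99 offloads all layers to the GPU; -c sets the context size.
./llama-server \
  -m Qwen3-Coder-Next-UD-Q4_K_XL.gguf \
  -ngl 99 \
  -c 32768 \
  --jinja \
  --port 8080
```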