RossBencina 13 hours ago

Last I checked Ollama inference is based on llama.cpp, so either Ollama has not caught up yet or the answer is no.

EDIT: Looks like Granite 4 hybrid architecture support was added to llama.cpp back in May: https://github.com/ggml-org/llama.cpp/pull/13550
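
A minimal sketch of what that llama.cpp support enables, assuming you have a Granite 4 GGUF on disk and a llama-cpp-python wheel that bundles a llama.cpp build recent enough to include PR #13550; the model filename below is just a placeholder:

```python
# Sketch: loading a Granite 4 GGUF via llama-cpp-python, which wraps llama.cpp.
# The model path is hypothetical; substitute whatever quantization you downloaded.
from llama_cpp import Llama

llm = Llama(model_path="granite-4.0-h-tiny-Q4_K_M.gguf", n_ctx=4096)

out = llm("Summarize what a hybrid Mamba/transformer architecture is.",
          max_tokens=128)
print(out["choices"][0]["text"])
```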

  • magicalhippo 12 hours ago

    > Last I checked Ollama inference is based on llama.cpp

    Yes and no. They've written their own "engine" using the GGML libraries directly, but they fall back to llama.cpp for models the new engine doesn't yet support.