altcognito, 3 days ago:

What sorts of tokens/sec are you getting with each model?
jetsnoc, 3 days ago:

Model performance summary:

- **openai/gpt-oss-120b** — MLX (MXFP4), ~66 tokens/sec (Hugging Face: `lmstudio-community/gpt-oss-120b-MLX-8bit`)
- **google/gemma-3-27b** — MLX (4-bit), ~27 tokens/sec (Hugging Face: `mlx-community/gemma-3-27b-it-qat-4bit`)
- **qwen/qwen3-coder-30b** — MLX (8-bit), ~78 tokens/sec (Hugging Face: `Qwen/Qwen3-Coder-30B-A3B-Instruct`)

Will reply back and add Meta Llama performance shortly.
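For anyone wanting to collect comparable numbers on their own hardware, a minimal sketch of how tokens/sec can be measured: time the generation call and divide the token count by wall-clock time, averaging over a few runs after a warmup. The `fake_generate` stub below is hypothetical and stands in for a real model call (e.g. `mlx_lm`'s `generate`, which also reports its own tokens/sec when run verbosely):

```python
import time

def tokens_per_second(generate, prompt, *, warmup=1, runs=3):
    """Time a text-generation callable and return mean tokens/sec.

    `generate` must accept a prompt and return the number of
    tokens it produced.
    """
    for _ in range(warmup):
        # Warmup run: lets caches fill / kernels compile before timing.
        generate(prompt)
    rates = []
    for _ in range(runs):
        start = time.perf_counter()
        n_tokens = generate(prompt)
        elapsed = time.perf_counter() - start
        rates.append(n_tokens / elapsed)
    return sum(rates) / len(rates)

# Hypothetical stub: pretends to emit 10 tokens in ~10 ms.
def fake_generate(prompt):
    time.sleep(0.01)
    return 10

print(f"{tokens_per_second(fake_generate, 'hello'):.1f} tokens/sec")
```

Averaging several timed runs (and discarding the warmup) matters in practice, because the first generation after loading a model is typically much slower than steady state.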