Comment by jasonsb 2 days ago
It's all about the hardware and infrastructure. If you check OpenRouter, no provider offers a SOTA Chinese model matching the speed of Claude, GPT or Gemini. The Chinese models may benchmark close on paper, but real-world deployment is different. So you either buy your own hardware in order to run a Chinese model at 150-200tps or give up and use one of the Big 3.
The US labs aren't just selling models, they're selling globally distributed, low-latency infrastructure at massive scale. That's what justifies the valuation gap.
Edit: It looks like Cerebras is offering a very fast GLM 4.6
Gemini 3 = ~70tps https://openrouter.ai/google/gemini-3-pro-preview
Opus 4.5 = ~60-80tps https://openrouter.ai/anthropic/claude-opus-4.5
Kimi-k2-think = ~60-180tps https://openrouter.ai/moonshotai/kimi-k2-thinking
Deepseek-v3.2 = ~30-110tps (only 2 providers rn) https://openrouter.ai/deepseek/deepseek-v3.2