Comment by CamperBob2 2 days ago
As far as I'm aware, they all are. There are only five important foundation models in play -- Gemini, GPT, X.ai, Claude, and DeepSeek. (edit: forgot Claude)
Everything from China is downstream of DeepSeek, which some have argued is basically a protégé of ChatGPT.
Not true. Qwen from Alibaba tries lots of different architectures.
Qwen3-Next, for example, has lots of unusual pieces, like Gated DeltaNet layers and all kinds of weird bypasses.
https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d...