Comment by kingstnap
Not true, Qwen from Alibaba does lots of random architectures.
Qwen3-Next, for example, has lots of weird things like Gated DeltaNet layers and all kinds of weird bypasses.
https://qwen.ai/blog?id=4074cca80393150c248e508aa62983f9cb7d...
Agree with you over OP - as well as Qwen, there are others like Mistral and Meta's Llama, and from China there are the likes of Baidu ERNIE, ByteDance Doubao, and Zhipu GLM. Probably others too.
Even if all of these were considered worse than the "only 5" on OP's list (which I don't believe to be the case), the scene is still far too young and volatile to look at a ranking at any one point in time and say that if X is better than Y today then it definitely will be in 3 months' time, let alone in a year or two.