andreyf 18 hours ago

Rumor has it that they weren't trained "from scratch" the way US labs would, i.e. Chinese labs benefited from government-"procured" IP (the US $B models) in order to train their $M models. I also understand there to be real innovation in the many-expert MoE architecture on top of that. Would love to hear a more technical understanding from someone who does more than repeat rumors, though.
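For anyone unfamiliar, the "MoE" being referenced is a sparse mixture-of-experts layer: a router sends each token to only a few of many expert networks, so total parameters can be huge while per-token compute stays small. A minimal numpy sketch of top-k routing (all names and shapes here are illustrative, not any lab's actual implementation):

```python
import numpy as np

def moe_forward(x, experts, gate_w, k=2):
    """Route one token through the top-k of many experts (sparse MoE).

    x: (d,) token vector; experts: list of (d, d) matrices standing in
    for full expert FFNs; gate_w: (d, n_experts) router weights.
    """
    logits = x @ gate_w                       # router score per expert
    top = np.argsort(logits)[-k:]             # indices of the k best experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                  # softmax over the chosen k only
    # Only k experts actually run, so compute per token is ~k/n of a
    # dense layer with the same total parameter count.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

# usage: 8 tiny experts, only 2 active for this token
rng = np.random.default_rng(0)
d, n = 4, 8
out = moe_forward(rng.normal(size=d),
                  [rng.normal(size=(d, d)) for _ in range(n)],
                  rng.normal(size=(d, n)))
```

The claimed cost savings come partly from exactly this sparsity: a model can have hundreds of billions of parameters while activating only a small fraction per token.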

usef- 13 hours ago

We don't really know how much it cost them. There are plenty of reasons to doubt the numbers being passed around, and to ask what they weren't counting.

(And even if you do believe the numbers, they also aren't licensing the IP they're training on, unlike American firms, who are now paying quite a lot for it.)

4fterd4rk 17 hours ago

A lot of HN commenters are high on their own supply with regard to the AI bubble... once you realize this stuff isn't actually that expensive, the whole thing begins to unravel quickly.