Comment by mgraczyk
As others have pointed out, this is false. Google has made their models and hardware more efficient; you can read the linked report. Most of the efficiency comes from quantization, MoE, new attention techniques, and distillation (making smaller models usable in place of bigger ones).
sure, but the issue is that if you make the model 30x more efficient yet use it 300x more often (mostly for stuff nobody wants), it's still a net loss. the per-query savings get swamped by the growth in queries.
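to put numbers on it, a quick back-of-the-envelope sketch (the 30x/300x figures are the hypothetical ones from above, not from the report):

    # hypothetical numbers from the comment above, not from Google's report
    energy_per_query_before = 1.0      # arbitrary baseline unit
    efficiency_gain = 30               # model becomes 30x more efficient
    usage_growth = 300                 # queries grow 300x

    energy_per_query_after = energy_per_query_before / efficiency_gain
    total_before = 1 * energy_per_query_before
    total_after = usage_growth * energy_per_query_after

    print(total_after / total_before)  # 10.0 -> total energy still grows 10x

classic Jevons paradox: efficiency gains get eaten by induced demand unless usage grows slower than efficiency improves.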