Comment by maeil
The section on training feels weak, and that's what the discussion is mainly about.
Many companies are now trying to train models as large as GPT-4, and OpenAI is training models (o1 and o3) that may well be much larger still. Framing training as a one-time cost doesn't seem accurate: the big companies show no sign of stopping any time soon, so a given model might only be in service for half a year, and many models may never be deployed at all. This might stop at some point, but that's hypothetical.
The article briefly touches on training, but relies on a statistic that seems misleading, since it was derived from models vastly smaller than GPT-4.
The cited article [1] says that training one AI model is comparable to 300 [round-trip] flights. But its reference point for "an AI model" is a study of five-year-old models like BERT (110M parameters), Transformer (213M parameters), and GPT-2. Considering that today's models may be more than a thousand times larger, the comparison isn't credible.
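A rough back-of-envelope (Python sketch; it takes the parameter counts above at face value, uses 1.5B for GPT-2's largest released variant, and the ~1.8T figure for GPT-4 is only a widely circulated estimate, not an official number):

    # Parameter-count ratios between an estimated GPT-4 and the study's models.
    # GPT-4's size is unconfirmed; ~1.8T is an assumption used here only for scale.
    study_models = {"BERT": 110e6, "Transformer": 213e6, "GPT-2": 1.5e9}
    gpt4_estimate = 1.8e12  # assumption, not an official figure
    for name, params in study_models.items():
        print(f"GPT-4 est. / {name}: ~{gpt4_estimate / params:,.0f}x")
    # => BERT ~16,364x, Transformer ~8,451x, GPT-2 ~1,200x

Even with a much smaller assumed size for GPT-4, the "more than a thousand times larger" point holds against BERT-scale models.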
This is reminiscent of the "1 mile versus 60 miles in a massive cruise ship" logic... ironically, the article seems to be making much the same mistake.
[1] https://icecat.com/blog/is-ai-truly-a-sustainable-choice/#:~....