Comment by maeil

Comment by maeil 18 hours ago

The section on training feels weak, and that's what the discussion is mainly about.

Many companies are now trying to train models as big as GPT-4. OpenAI is training models that may well be much larger than GPT-4 (o1 and o3). Framing training as a one-time cost doesn't seem accurate: the big companies show no sign of stopping, and they'll keep training new models. So one model might only be used for half a year, and many models may never be used at all. This might stop at some point, but that's hypothetical.

blharr 17 hours ago

It briefly touches on training, but uses a seemingly misleading statistic that comes from models far smaller than GPT-4.

This article [1] says that 300 [round-trip] flights are similar to training one AI model. Its reference for "an AI model" is a study of 5-year-old models like BERT (110M parameters), Transformer (213M parameters), and GPT-2. Considering that models today may be more than a thousand times larger, this comparison isn't credible.

Similar to the logic of "1 mile versus 60 miles in a massive cruise ship"... ironically, the article seems to be making a very similar mistake.

[1] https://icecat.com/blog/is-ai-truly-a-sustainable-choice/#:~....

  • mmoskal 16 hours ago

    A 737-800 burns about 3 t of fuel per hour. NYC-SFO is about 6 h, so 18 t of fuel. Jet fuel energy density is 43 MJ/kg, so 774,000 MJ per flight, which is 215 MWh. Assuming the 60 GWh figure is true (seems widely cited on the internets), it comes down to 279 one-way flights.
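For anyone who wants to check the arithmetic above, here is a minimal sketch. It only uses the rough inputs quoted in the comment (3 t/h fuel burn, 6 h flight, 43 MJ/kg, 60 GWh for training). The function names are made up for illustration, and the 60 GWh figure is taken as given, not verified.

```python
# Back-of-the-envelope check of the flight comparison.
# All inputs are the rough estimates quoted in the comment above.
FUEL_BURN_T_PER_H = 3.0      # 737-800 fuel burn, tonnes per hour
FLIGHT_HOURS = 6.0           # NYC-SFO, one way
JET_FUEL_MJ_PER_KG = 43.0    # energy density of jet fuel
TRAINING_GWH = 60.0          # widely cited training figure (assumed, unverified)

MJ_PER_MWH = 3600.0          # 1 MWh = 3600 MJ


def flight_energy_mwh(burn_t_per_h, hours, mj_per_kg):
    """Energy content of the fuel burned on one flight, in MWh."""
    fuel_kg = burn_t_per_h * hours * 1000.0   # tonnes -> kg
    return fuel_kg * mj_per_kg / MJ_PER_MWH   # MJ -> MWh


def equivalent_flights(training_gwh, per_flight_mwh):
    """Number of one-way flights with the same energy as the training run."""
    return training_gwh * 1000.0 / per_flight_mwh  # GWh -> MWh


if __name__ == "__main__":
    per_flight = flight_energy_mwh(FUEL_BURN_T_PER_H, FLIGHT_HOURS, JET_FUEL_MJ_PER_KG)
    flights = equivalent_flights(TRAINING_GWH, per_flight)
    print(f"Energy per flight: {per_flight:.0f} MWh")
    print(f"Equivalent one-way flights: {flights:.0f}")
```

Note that this compares the chemical energy of the fuel with the electrical energy of training, which is the same simplification the comment makes.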

    • blharr 16 hours ago

      Thanks, I missed that 60 GWh figure. I got confused by the quotes around the statement, so I looked it up and couldn't find a quote. I realize now that he's quoting himself making that statement (and it's quite accurate).

      I am surprised that, somehow, the statistic didn't change from the GPT-2 era to GPT-4. Did GPUs really get that much more efficient? Or does that study have some problems?

devmor 17 hours ago

I am sure that’s intentional, because this article is the same thing we see from e/acc personalities any time the environmental impact is brought up.

It deflects away from what actually uses power and pretends the entire system is just an API like anything else.