Comment by blharr
Thanks, I missed that 60 GWh figure. I got confused because of the quotes around the statement, so I looked it up and couldn't find a quote. I realize now that he's quoting himself making that statement (and it's quite accurate).
I am surprised that, somehow, the statistic didn't change from the GPT-2 era to GPT-4. Did GPUs really get that much more efficient? Or maybe that study has some problems.