keeda 20 hours ago

Wait, hold on, let's put some numbers on this. Please correct my calculations if I'm wrong.

1. The human brain draws 12-20 watts [1, 2]. So, taking the lower end, a task that takes one hour of our time costs 12 Wh.

2. An average ChatGPT query uses between 0.34 Wh and 3 Wh, and a long-input query (10K tokens) can go up to 10 Wh [3]. I get the best results by carefully curating the context to be very tight, so optimal usage would be in the average range.

3. I have had cases where a single prompt has saved me at least an hour of work (e.g. https://news.ycombinator.com/item?id=44892576). Let's be pessimistic and say it takes 3 prompts at 3 Wh each (9 Wh) plus 10 minutes of my time prompting and reviewing (2 Wh of brain power) to complete a task. That is 11 Wh for the same task, which still beats out the unassisted human brain! (I've sketched this arithmetic in code below.)

And that's leaving aside the recent case where I vibecoded and deployed a fully-tested endpoint on a cloud platform I had no prior experience with, over the course of 2-3 hours. I estimate it would have taken me a whole day just to catch up on the documentation and another 2 days tinkering with the tools, commands and code. That's at least an 8x energy savings, assuming an 8-hour workday!

4. But let's talk data instead of anecdotes. If you do a wide search, there is a ton of empirical evidence that AI assistance improves programmer productivity by 5-30% (with a lot of nuance). I've cited some here: https://news.ycombinator.com/item?id=45379452 -- those studies don't measure prompt usage, so we can't estimate energy consumption from them, but those are significant productivity boosts.

Even the METR study that appeared to show AI coding lowering productivity also showed that AI usage broadly increased users' idle time. That is, calendar time for task completion may have gone up, but it included a lot of idle time where people were doing no cognitive work at all. Someone should run the numbers, but maybe it resulted in lower energy consumption!
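
Here is that arithmetic as a minimal Python sketch, so anyone can poke at the numbers. Every constant (12 W brain, 3 Wh per prompt) is just the assumption stated above, and the function names are mine, purely for illustration:

    # Back-of-the-envelope sketch of points 1-3 above.
    # All figures are the assumptions from this comment, not measurements.

    BRAIN_WATTS = 12.0     # low end of the 12-20 W brain estimates [1, 2]
    WH_PER_PROMPT = 3.0    # pessimistic per-query energy estimate [3]

    def human_only_wh(hours):
        # Energy to do the task unassisted: brain power times time on task.
        return BRAIN_WATTS * hours

    def assisted_wh(prompts, review_minutes):
        # Prompt energy plus the brain time spent prompting and reviewing.
        return prompts * WH_PER_PROMPT + BRAIN_WATTS * (review_minutes / 60.0)

    # Point 3: one hour unassisted vs. 3 prompts plus 10 minutes of review.
    print(human_only_wh(1.0))   # 12.0 Wh
    print(assisted_wh(3, 10))   # 11.0 Wh

    # Vibecoding anecdote: ~3 h assisted vs. ~3 workdays (24 h) unassisted.
    # The ~8x claim is the ratio of brain time; the uncounted prompt energy
    # would shave it down somewhat.
    print(human_only_wh(24.0) / human_only_wh(3.0))   # 8.0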

---

But what about the training costs? Sure, we've burned gazillions of GWh on training already, and the usual counterpoint is "what about the cost involved in evolution?", but let's assume we stopped training all models today. The existing models will still serve all future prompts at the same per-query energy rates discussed above.

However, every new human takes 15-20 years of education just to become a novice in a single domain, followed by many more years of experience to become proficient. We're comparing apples and blueberries here, but that's a LOT of energy before a human even starts being productive, whereas a trained LLM is instantly productive across multiple domains, indefinitely.

My hunch is that if we do a critical analysis of amortized energy consumption, LLMs will probably beat out humans. If not already, then soon, with token costs plummeting all the time.
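
For anyone who wants to attempt that analysis, here is the rough shape of the calculation. The training-energy and lifetime-query constants are placeholder orders of magnitude I made up for illustration; only the 3 Wh per-query figure comes from [3]:

    # Amortization sketch. TRAINING_WH and LIFETIME_QUERIES are made-up
    # placeholders that only show the shape of the calculation; substitute
    # real estimates to do this properly.

    TRAINING_WH = 50e9         # hypothetical: 50 GWh spent on training
    LIFETIME_QUERIES = 1e12    # hypothetical: queries served over the model's lifetime
    WH_PER_QUERY = 3.0         # per-query inference estimate from [3]

    def amortized_wh_per_query(training_wh, queries, inference_wh):
        # Training energy spread over every query the model ever serves,
        # plus the marginal inference energy.
        return training_wh / queries + inference_wh

    print(amortized_wh_per_query(TRAINING_WH, LIFETIME_QUERIES, WH_PER_QUERY))  # ~3.05 Wh

    # The human-side analogue would amortize 15-20 years of education.
    # Brain energy alone over 18 years is already on the order of:
    print(12 * 24 * 365 * 18 / 1e6)   # ~1.9 MWh (and that ignores everything but the brain)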

[1] https://psychology.stackexchange.com/questions/12385/how-muc...

[2] https://press.princeton.edu/ideas/is-the-human-brain-a-biolo...

[3] https://epoch.ai/gradient-updates/how-much-energy-does-chatg...

runarberg 3 hours ago

In my go example we have a human and an AI model competing at the same task. A good AI model will perform much, much better and probably win the game, but if we measure the energy input into either player, the AI model will consume a lot more energy. However, a game of go is not automation; it won’t save us any time. The benefit of the AI model is that it helps human go players improve their own game: finding new moves, new patterns, new proverbs, etc. Because of go-playing AI models, human go players now play their games better, but not more efficiently, nor faster.

In your LLM coding example you have a human and an AI model collaborating on a single task; both spend some amount of energy (taking your assumptions at face value, a comparable amount of energy) and produce a single outcome. In the go example it is easy to compare energy usage, and the quality of the outcome is also easy to measure (simply who won the game). In your coding example the quality of the outcome is impossible to measure, and because the effort is collaborative, splitting the energy usage between the two is complicated.

When talking about automation, my game of go example falls apart. Much better examples would be something like a loom, or a digital calculator. These tools help the human arrive at a particular outcome much faster and with much less effort than a human performing the task without the help of the machines. The time saved by using these tools is measured in several orders of magnitude, and the energy spent is on par with a human. It is easy to see how a loom or a digital calculator is more efficient than a human.
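
To make the calculator case concrete, here is a rough sketch in the same back-of-the-envelope spirit as above; the power draw and timing figures are my own guesses, not measurements:

    # Rough sketch: long division by hand vs. on a pocket calculator.
    # All figures are guesses for illustration only.

    BRAIN_WATTS = 12.0        # same low-end brain estimate used above
    CALCULATOR_WATTS = 0.001  # assume ~1 mW for a basic pocket calculator

    human_wh = BRAIN_WATTS * (30 / 3600)        # assume ~30 s of focused brain time
    machine_wh = CALCULATOR_WATTS * (1 / 3600)  # call it 1 s of calculator time, to be generous

    print(human_wh)               # ~0.1 Wh
    print(human_wh / machine_wh)  # ~360,000x, i.e. several orders of magnitude less energy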

I guess if we take into account the training cost of an LLM, we should also take into account the production costs of looms and digital calculators. I don’t know how to do that, but I can’t imagine it would be anywhere close to that of an LLM.

And with an LLM we have increased productivity not 5000x[1], but by 5%-30%. To me this does not sound like a revolutionary technology. But I have my doubts about even the 5%-30% figure. We have preliminary research ranging anywhere from a negative productivity effect to your cited 5%-30%. We will have to wait for more research, and possibly some meta-analysis, before we can accurately assess the productivity boost of LLMs. But we will have to do a whole lot better than 5%-30% to justify the huge energy consumption of AI[2].

Personally, I am not convinced by your back-of-the-envelope calculations. It fails my sniff test that 9 Wh of matrix multiplication will consistently save you an hour of using your brain to perform the same task adequately. I know our brains are not super good at the logic required for coding (but neither are LLMs), but I know for a fact that they are very efficient at it.

That said, I refuse to accept your framing that we can simply ignore the energy used in training, on the basis that counting it is as invalid as counting the energy spent on our species evolving, or that we can simply stop training new models and use the ones we have. That is simply not how things work. New models will get trained (unless the AI bubble bursts and the market loses interest), and the energy consumed by training is the bulk of the energy cost. Omitting it makes the case for AI comically easy to justify. I reject this framing.

Instead of calculating, I’m gonna do a thought experiment. Imagine a late 19th century where iron and steel production took an entire 2% of the world’s energy consumption[3] (maybe an alternate reality where iron working is simply that challenging and requires much higher temperatures). But the steam train could only carry the same load as a 20-mule team, and would only do it 5%-30% faster on average than the state-of-the-art cargo carriages of the time without steam power. Would you accept the argument that we should simply ignore the fact that rail production takes a whopping 2% of global energy consumption when factoring in the energy consumption of the steam train, even when it only provides you with a 5%-30% productivity boost? I don’t think so.

---

1: I don’t know how much the loom increased productivity; 5000x is just my guess, and I have no idea how I would even find out.

2: That is, if you are only interested in the increased productivity. If you are interested in LLMs for some other reason, those reasons will have to be measured differently.

3: https://www.allaboutai.com/resources/ai-statistics/ai-enviro...