Comment by jonas21 13 hours ago

What exactly are you basing this assertion on (other than your feelings)? Are you accusing Google of lying when they say in the technical report [1]:

> This impact results from: A 33x reduction in per-prompt energy consumption driven by software efficiencies—including a 23x reduction from model improvements, and a 1.4x reduction from improved machine utilization.

followed by a list of specific improvements they've made?

[1] https://services.google.com/fh/files/misc/measuring_the_envi...

esperent 13 hours ago

Unless a company's marketing blog specifically says which model it's talking about, we should always assume they're hiding/conflating/mislabeling/misleading in every way possible. This is corporate media literacy 101.

The burden of proof is on Google here. If they've reduced Gemini 2.5 energy use by 33x, they need to state that clearly. Otherwise we should assume they're fudging the numbers, for example:

A) they've chosen one particular tiny model for this number

or

B) it's a median across all models including the tiny one they use for all search queries

EDIT: I've read over the report and it's B) as far as I can see

Without more info, any other reading of this is a failing on the reader's part, or wishful thinking if they want to feel good about their AI usage.

We should also be ready to change these assumptions if Google or another reputable party does confirm this applies to large models like Gemini 2.5, but should assume the least impressive possible reading until that missing info arrives.

Even more useful info would be how much electricity Google uses per month, and whether that has gone down or continued to grow in the period following this announcement, because total energy use across their whole AI product range, including training, is the only number that really matters.

  • mquander 12 hours ago

    You should not assume that "they've chosen one particular tiny model", or "it's a median across all models including the tiny one they use for all search queries" because those are totally made up assumptions that have nothing to do with what they say they measured. They measured the Gemini Apps product that completes text prompts. They also provided a chart showing that the thing they are measuring scores comparably to GPT-4o on LM Arena.

    • penteract 11 hours ago

      From the report:

      > To calculate the energy consumption for the median Gemini Apps text prompt on a given day, we first determine the average energy/prompt for each model, and then rank these models by their energy/prompt values. We then construct a cumulative distribution of text prompts along this energy-ranked list to identify the model that serves the 50-th percentile prompt.

      They are measuring more than one model. I assume this statement describes how they chose which model to report the LM Arena score for, and it's a ridiculous way to do so: the LM Arena score calculated this way could change dramatically from day to day.
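
      A minimal sketch of the procedure they describe (all model names and numbers here are invented) shows how the reported median model can flip from one day to the next:

        # Energy-weighted median model, following the report's description.
        # All names and figures below are hypothetical, for illustration only.
        def median_model(models):
            """models: list of (name, energy_per_prompt_joules, prompt_count)."""
            ranked = sorted(models, key=lambda m: m[1])   # rank by energy/prompt
            total = sum(count for _, _, count in ranked)
            seen = 0
            for name, energy, count in ranked:            # walk the CDF of prompts
                seen += count
                if seen >= total / 2:                     # model serving the 50th-percentile prompt
                    return name, energy

        # Day 1: a tiny model serves most prompts, so it is the median.
        print(median_model([("tiny", 0.1, 900), ("large", 3.0, 100)]))  # ('tiny', 0.1)
        # Day 2: traffic shifts, and the reported median flips to the large model.
        print(median_model([("tiny", 0.1, 450), ("large", 3.0, 550)]))  # ('large', 3.0)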

  • mgraczyk 12 hours ago

    > total energy use across their whole AI product range, including training, is the only number that really matters.

    What if they are serving more requests?
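
    A toy calculation (all numbers invented) shows why per-prompt efficiency alone doesn't settle this:

      # Hypothetical: energy/prompt drops 33x while prompt volume grows 50x.
      old_total = 1_000_000 * 33.0   # prompts/day * joules/prompt
      new_total = 50_000_000 * 1.0
      print(new_total / old_total)   # ~1.52: total energy still goes up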

  • mgraczyk 12 hours ago

    They did specifically say in the linked report

    • esperent 12 hours ago

      Here's the report. Could you tell me where in it you found a 33x reduction (or any large reduction) claimed for any specific non-tiny model? Because all I can find is lots of references to "median Gemini". In fact, I would say they're being extremely careful in this paper not to mention any particular Google model with regard to energy reduction.

      https://services.google.com/fh/files/misc/measuring_the_envi...

      • mgraczyk 12 hours ago

        Figure 4

        I think you are assuming we are talking about swapping API usage from one model to another. That is not what happened. A specific product doing a specific thing uses less energy now.

        To clarify: the way models become more efficient is usually by training a new one with a new architecture, quantization, and so on (see the sketch at the end of this comment).

        This is analogous to making a computer more efficient by putting a new CPU in it. It would be completely normal to say that you made the computer more efficient, even though you've actually swapped out the hardware.
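
        As one illustration of the quantization point (a generic sketch, not Google's actual method), serving 8-bit integer weights instead of 32-bit floats cuts weight memory traffic roughly 4x, and moving bytes is a large share of per-prompt inference energy:

          import numpy as np

          # Toy int8 weight quantization; the weight matrix here is made up.
          w = np.random.randn(4, 4).astype(np.float32)
          scale = np.abs(w).max() / 127.0
          q = np.round(w / scale).astype(np.int8)     # store/serve 8-bit weights
          w_hat = q.astype(np.float32) * scale        # dequantize on the fly
          print(np.abs(w - w_hat).max())              # small reconstruction error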