Expect more overestimates of AI energy consumption
We’ve started to see AI doomerism spread to predictions of the vast quantity of energy AI is undoubtedly going to consume.
The launch of ChatGPT on November 30, 2022¹ marks the beginning of the new AI era. We’ve had many AI cycles in the past going all the way back to 1966. However, the last year has been particularly exciting. Today, everyone can have a compelling experience for free with ChatGPT 3.5 or Google Bard, with added amazement if you pay OpenAI to get ChatGPT 4 and DALL-E.
I expect this to continue in 2024. The AI train is just getting going. ChatGPT 5 is on the way, Facebook has been pushing Llama, Google is trying to compete with Gemini, Apple has quietly released Ferret, and startups like Mistral are also on board.
AI is the number one thing everyone² is talking about. The doomerism is almost as prevalent as the hype, and we’ve also started to see that doomerism spread to predictions of the vast quantity of energy (and water) AI is undoubtedly going to consume.
Remember how a peer-reviewed article from 2015 claimed that by 2020 data centers were going to consume 1,200 TWh of energy? The actual figure was 200-220 TWh (prediction, outcome). And there were many more like it. Just like data center energy before it, AI energy consumption doomerism is the perfect horror story for the media, e.g. BBC, Scientific American, New York Times.
AI definitely consumes energy. Google has reported that machine learning represents around 15% of its annual energy consumption, and we know that GPUs are very energy intensive. The question is: how much?
What to watch out for
The big red flag is extrapolation from current public data. Something like:
A Google search consumes x energy.
Google has said that an AI query will cost 10 times more than a normal search query.
Therefore AI energy will be (Current Search Volume) x 10.
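The flawed arithmetic behind this kind of claim is easy to sketch. Every number below is an illustrative placeholder, not a real measurement:

```python
# Naive extrapolation: assumes per-query energy, the 10x multiplier, and
# search volume all stay fixed forever.
# All numbers are illustrative placeholders, not real measurements.

SEARCH_ENERGY_WH = 0.3          # assumed energy per standard search (Wh)
AI_MULTIPLIER = 10              # "an AI query costs 10x a normal search"
DAILY_SEARCHES = 8_500_000_000  # assumed daily search volume

ai_energy_wh = SEARCH_ENERGY_WH * AI_MULTIPLIER * DAILY_SEARCHES
ai_energy_gwh_per_day = ai_energy_wh / 1e9  # Wh -> GWh

print(f"Projected AI search energy: {ai_energy_gwh_per_day:.1f} GWh/day")
```

The headline-grabbing result depends entirely on three constants that are anything but constant: per-query energy falls as models and hardware improve, the multiplier was an off-the-cuff estimate, and the query mix changes.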
Or you might see:
OpenAI consumes x energy today.
OpenAI has 100 million users.
Allocate x energy across 100 million users.
If OpenAI grows to a billion users, that will be (Per user energy allocation) x 1 billion.
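The per-user version of the argument has the same shape. Again, all figures are placeholders for illustration only:

```python
# Naive per-user allocation: divide today's total energy evenly across
# today's users, then multiply by a projected user count.
# All numbers are illustrative placeholders, not real measurements.

TOTAL_ENERGY_MWH = 50_000        # assumed annual energy consumption today
CURRENT_USERS = 100_000_000
PROJECTED_USERS = 1_000_000_000

per_user_kwh = TOTAL_ENERGY_MWH * 1000 / CURRENT_USERS  # MWh -> kWh per user
projected_mwh = per_user_kwh * PROJECTED_USERS / 1000   # kWh -> MWh

print(f"{per_user_kwh:.2f} kWh/user -> {projected_mwh:,.0f} MWh projected")
```

The hidden assumption is that energy per user is a fixed property of the service, when in reality it shifts with model size, hardware generation, caching, and how much of each user's traffic is even served by a large model.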
These are arguments from extrapolation and they are always wrong. You can’t trust any prediction about a complex system more than a few months out. Technology changes too rapidly.
That Google quote is also deliberately misleading – the full quote is:
In an interview, Alphabet’s Chairman John Hennessy told Reuters that having an exchange with AI known as a large language model likely cost 10 times more than a standard keyword search, though fine-tuning will help reduce the expense quickly.
The incentives to reduce the cost are very strong, particularly when services are offered for free and/or have business models indirectly linked to search volume e.g. ads or fixed subscriptions.
What might change?
I’m excited by the coming technological changes. Lots of new products and APIs to play with! What might we see that will complicate accurate calculations of the energy consumption of AI systems?
New models with fewer parameters, but higher quality. For example, the Mixtral of Experts model “outperforms Llama 2 70B on most benchmarks with 6x faster inference”.
More energy efficient models. Google reported the choice of model can impact the amount of computing power required by a factor of 5-10. Different tasks (even different search query types) will be given to different models.
Different data center hardware. NVIDIA has a near-monopoly on GPUs, and a monopoly is the strongest incentive a market can offer to would-be competitors. Google Gemini was trained entirely on TPUs which “compared to the unoptimized P100s from 2017, the ML-optimized TPU v2 in 2019 and TPU v4 in 2021 reduced energy consumption by 5.7x and 13.7x, respectively.”
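Taking Google’s quoted reduction factors at face value shows why a fixed per-task energy figure goes stale so quickly. The baseline figure here is an illustrative placeholder; only the 5.7x and 13.7x factors come from Google’s report:

```python
# Illustrative: how successive hardware generations erode a "fixed"
# energy-per-task figure. Baseline is a placeholder; the reduction
# factors (5.7x for TPU v2, 13.7x for TPU v4 vs. 2017 P100s) are
# the ones Google reported.

BASELINE_KWH = 1000.0  # assumed energy for some ML task on 2017 P100s

tpu_v2_kwh = BASELINE_KWH / 5.7    # same task on TPU v2 (2019)
tpu_v4_kwh = BASELINE_KWH / 13.7   # same task on TPU v4 (2021)

print(f"P100: {BASELINE_KWH:.0f} kWh, "
      f"TPU v2: {tpu_v2_kwh:.0f} kWh, "
      f"TPU v4: {tpu_v4_kwh:.0f} kWh")
```

An extrapolation anchored to the 2017 figure would overstate the 2021 energy cost of the same work by more than an order of magnitude.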
Different client hardware. Apple has neural cores built into all their current computers and mobile devices. Transformers are already running on macOS 14 to give you predictions as you type. This happens locally. The M-series chips are probably the most power efficient chips in the world and “the M3 GPU is able to deliver the same performance as M1 using nearly half the power, and up to 65 percent more performance at its peak” (Apple).
What to measure? Measuring “AI” is not the same as measuring the energy consumption of a network switch or a server because it’s all software. GPUs (and TPUs, etc) are a more easily measurable component, but AI also uses parts of other systems in the data center. How to account for training and/or inference on client devices will also be difficult.
This makes me optimistic. What I expect to see is AI initially consuming more energy as new technology emerges³. Just as with data centers, efficiency will improve quickly and then we’ll see AI energy consumption decoupling from demand. Eventually, it will plateau even as AI usage grows massively.
However, that will take some time. So in the meantime, keep an eye out for claims about the planetary damage AI is causing. These will almost certainly be bogus.
¹ Or, arguably, a few months prior with the launch of DALL-E 2 in April 2022.
² Definitely in tech, but also in general.
³ Insofar as we can actually calculate it accurately, which we have only started to see for data center energy more recently.