Influencing the carbon emissions of AI
There is a correlation between training time and energy consumption, but that doesn’t necessarily mean there is a correlation between training time and carbon emissions.
Machine learning is not new, but all the hype around generative AI and ChatGPT is indicative of its rapidly growing importance. As usage grows, so does the importance of minimizing its environmental impact.
Use of AI breaks down into two main stages - 1) training, i.e. producing the model; 2) inference (or prediction), i.e. getting something out of the model. Training is what takes the most time and resources, partly because training datasets can be very large, but also because the process involves iterating on the model, training and retraining it to improve its accuracy.
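A minimal sketch of the two stages (my own illustration using PyTorch with a toy model and made-up data, not anything from the publications discussed here) shows why training dominates: it loops over the data many times, while inference is a single forward pass.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 1)                  # toy model
data = torch.randn(256, 10)               # made-up training set
targets = torch.randn(256, 1)
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# 1) Training: repeated forward passes, backward passes, and weight
#    updates - and in practice this whole loop is re-run many times
#    while iterating on the model.
for epoch in range(100):
    optimizer.zero_grad()
    loss = loss_fn(model(data), targets)
    loss.backward()
    optimizer.step()

# 2) Inference: a single forward pass per request, no gradients.
with torch.no_grad():
    prediction = model(torch.randn(1, 10))
```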
A publication from last year (also freely available), which included previously unavailable numbers on the energy and carbon impact of Google’s AI efforts, provided some interesting insights:
For the last 3 years, machine learning has represented around 15% of Google’s overall energy consumption, but 70-80% of Google’s total floating point operations (FLOPs). Google’s total energy consumption has increased, but the proportion attributed to machine learning has remained the same.
The choice of model can change the amount of computing power required by a factor of 5-10. Improvements to algorithms and reducing model density (i.e. using sparser models) result in significant improvements in energy efficiency and a reduction in training time, even as the number of parameters increases.
Using specialist hardware (like Google’s TPUs) and modern GPUs specifically optimized for machine learning rather than graphics (like the NVIDIA V100 and A100) can improve performance per watt by 2-5 times compared to general-purpose processors. Specifically, “compared to the unoptimized P100s from 2017, the ML-optimized TPU v2 in 2019 and TPU v4 in 2021 reduced energy consumption by 5.7x and 13.7x, respectively.”
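To put those multipliers in context, here is what they would mean for a fixed workload. The 1,000 kWh baseline is a number I’ve invented for illustration; only the 5.7x and 13.7x factors come from the paper.

```python
# A training job that needed 1,000 kWh on 2017-era P100s (hypothetical
# baseline) would need far less on the ML-optimized TPUs:
baseline_kwh = 1000
print(f"TPU v2: {baseline_kwh / 5.7:,.0f} kWh")   # ~175 kWh
print(f"TPU v4: {baseline_kwh / 13.7:,.0f} kWh")  # ~73 kWh
```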
This can be simplified to: energy consumption is related to the compute operations needed for training (i.e. training time), and that compute is shrinking through improved model implementations. At the same time, hardware is also becoming more efficient. Together, these gains are offsetting the increase in usage.
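A common back-of-the-envelope way to express that relationship (my own sketch, not a formula from the Google paper, and every number below is made up) is hardware power x training time x datacenter overhead:

```python
# Rough energy estimate for a hypothetical training run.
chips = 64               # accelerators used
hours = 120              # wall-clock training time
watts_per_chip = 300     # average power draw per accelerator
pue = 1.1                # datacenter overhead (Power Usage Effectiveness)

energy_kwh = chips * hours * watts_per_chip * pue / 1000
print(f"~{energy_kwh:,.0f} kWh")  # ~2,534 kWh
```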
What about the carbon emissions of AI?
As with all software, neither AI nor machine learning emits carbon by itself, but generating the energy used to power the infrastructure does. There is a correlation between training time and energy consumption, but that doesn’t necessarily mean there is a close relationship between training time and carbon emissions. When and where the training happens is crucial to understanding the carbon impact.
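Taking the hypothetical ~2,534 kWh run from the sketch above, the same job produces very different emissions depending on the grid powering it. The intensity figures here are illustrative placeholders, not real grid data:

```python
energy_kwh = 2534  # from the hypothetical estimate above
intensity_g_per_kwh = {
    "hydro-heavy grid": 30,
    "mixed grid": 350,
    "coal-heavy grid": 800,
}
for grid, g in intensity_g_per_kwh.items():
    print(f"{grid}: {energy_kwh * g / 1000:,.0f} kgCO2e")
# hydro-heavy grid: 76 kgCO2e
# mixed grid: 887 kgCO2e
# coal-heavy grid: 2,027 kgCO2e
```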
High-performance computing clusters typically associated with scientific computing tend to be run at very high utilization. They’re often co-located near universities, and jobs are booked to try to ensure there is always something scheduled to run. This is good when considering the high fixed cost of purchasing the equipment, but doesn’t help optimize use-stage carbon emissions.
The grid electricity mix changes continually, so one minute there might be an abundance of clean energy, but the next the grid might switch to being powered by fossil fuels. The ability to delay processing by a few hours can have a significant impact on the carbon footprint. Especially so if the processing can also be moved to another location (although this is very difficult and rarely done).
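A toy carbon-aware scheduler illustrates the idea: given an hourly carbon-intensity forecast, pick the start hour that minimizes emissions for a fixed-length job. The forecast values below are invented; a real implementation would pull them from a forecast provider.

```python
# Made-up hourly forecast, gCO2e/kWh for the next 8 hours.
forecast = [420, 390, 310, 180, 150, 160, 240, 380]
job_hours = 3

def best_start(forecast, job_hours):
    # Total intensity over each possible contiguous window.
    windows = [sum(forecast[i:i + job_hours])
               for i in range(len(forecast) - job_hours + 1)]
    return min(range(len(windows)), key=windows.__getitem__)

print(f"Delay the job by {best_start(forecast, job_hours)} hours")
# Delay the job by 3 hours
```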
This means you can’t use the average carbon intensity to accurately calculate the carbon footprint of a workload that runs for just a few hours or days. The average smooths out the grid mix fluctuations, so any estimate based on it will either under- or over-estimate the total carbon. This is a limitation of a study that was just published as a preprint considering the factors influencing the emissions of machine learning.
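A quick example shows the gap: a job that happens to run during a clean overnight window looks much worse when estimated with the annual average. All figures below are made up for illustration.

```python
hourly_kwh = [300] * 8  # steady 300 kWh draw over an 8-hour run
hourly_intensity = [150, 140, 130, 160, 120, 110, 130, 140]  # gCO2e/kWh
annual_average = 320  # gCO2e/kWh

actual = sum(k * g for k, g in zip(hourly_kwh, hourly_intensity)) / 1000
averaged = sum(hourly_kwh) * annual_average / 1000
print(f"hourly-based: {actual:,.0f} kgCO2e")     # hourly-based: 324 kgCO2e
print(f"average-based: {averaged:,.0f} kgCO2e")  # average-based: 768 kgCO2e
```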
Better transparency is a running theme of my posts because without at least hourly tracking of emissions it’s difficult to produce an accurate estimate. This is why projects like EnergyTag exist and why companies like Google are pushing for 100% clean energy 24/7.
The energy consumption of AI can be calculated, but translating that to carbon emissions is much more challenging - there currently isn’t enough data.
Thanks for your article. I am going to annoy you one more time, I hope that's ok :)
1. You seem to be focusing your articles on electricity consumption. That is only a tiny part of the carbon footprint of the digital industry. For instance, you mention that new HW is more efficient. That implies acquiring new HW, which implies its manufacture (and transport). Manufacturing IT HW is very greedy in terms of minerals and energy, and is actually most of the carbon footprint. The kind of energy mix you use to power the HW has little influence on the total footprint.
2. As you point out, the fact that HW and algorithms are becoming more efficient has absolutely no impact on the amount of electricity Google uses to power its AI (and other services). You can halve the power needed to run a service, but if you triple the number of users or services, you end up with larger carbon emissions in total (which is what we are concerned about).
3. Now, regarding the energy mix, you need to take into account that Google is not the only user of the power grid during periods of high wind and sunny days... When Google schedules its heavy computations during peaks of decarbonized energy, it prevents other users from using this mix. This is because the amount of 'green' energy is finite. It is actually so finite that 100% of what is produced is consumed at all times. So, assuming Google could use exclusively decarbonized energy, it would simply push other users towards carbonized sources, but the global picture would not change: the same amount of carbon is emitted at the scale of the country, while only Google's carbon footprint would appear "greener" as an artefact.