Why I Use Generative AI to Illustrate My Essays, Part I: Environmental Costs Are Low
I have used images created using generative AI — specifically, OpenAI’s DALL-E — to illustrate my essays here on Medium. Some of my readers have criticized this choice, and I thought it would be useful to provide an explanation of why I use generative AI for these purposes and why I think that’s consistent with morality. My arguments primarily address this context, where I’m using AI-generated images to illustrate essays that I write for free and make available for free — the context of using AI-generated images to illustrate commercial work is different, and some of my arguments don’t apply to that context.
First, I want to lay out the affirmative reasons for me to use AI-generated images before turning to why I reject the arguments advanced against using AI to generate images. Including images makes my essays more accessible, makes them look better when they are linked or shared, and may help increase their readership. While I write these essays for me, because I want to write them, I find it gratifying when people read them, and I want the essays to find a wide readership. Generating images with DALL-E to use with each essay is quick and fun — I can think about what sort of image I would like to see, type in a prompt, and quickly get a result. If the first image doesn’t work for me, I can try a few modifications to the prompt until I’m happy. I enjoy the process, both the creative aspect of defining the prompt and the amusement of seeing the results — sometimes good, sometimes hilariously bad. I also find it a useful way to keep monitoring the capabilities of generative AI on real-world tasks, which makes it a good fit with my other interests. These aren’t tremendously strong reasons to use generative AI, but they are more than sufficient to justify it absent compelling counter-arguments.
People advance three main arguments against using AI-generated images. The first is an environmental argument, asserting that because AI uses large amounts of electricity (and large amounts of water for cooling), it is irresponsible and destructive to use AI to generate images. The second is an argument that AI-generated images are destructive of human art and creativity, harm artists’ commercial opportunities, and infringe copyrights. The third is a meta-argument, not so much claiming that AI-generated art is itself bad as arguing that the use of AI-generated art will cause push-back from others or will turn off readers. In today’s essay, I will address the environmental argument, leaving the others for future essays.
Environmental Concerns: Does AI Consume Too Much Energy?
People frequently argue that using AI is enormously destructive of the environment. The argument is based on an assertion that using AI — particularly for image generation — uses large amounts of energy and requires vast quantities of water for cooling purposes. If that assertion were fully true, it would be a compelling reason to avoid using generative AI at all. Global warming and climate change are very serious problems, generated by human actions, and we need to act quickly and aggressively to address them. I was an early adopter of fully electric vehicles because transitioning away from the internal combustion engine is a necessary part of reducing and ultimately stopping global warming, and I take environmental costs and the externalities of energy generation and consumption seriously. However, AI does not currently require an unreasonable amount of energy.
When I started researching and writing this essay, I expected to conclude that individual images took a meaningful but small amount of energy, that training models took a large amount of energy and inflicted substantial environmental costs, that the AI industry as a whole was a serious problem environmentally, and that major reforms were vital to prevent almost inevitable global warming problems in the future. As I researched the issue, however, the evidence surprised me: at every level and every step, AI uses less energy, relatively speaking, than I expected. The current level is in fact quite sustainable, and while that may change in the future, it may not.
When people claim that generative AI is a threat to the environment, their claims can be broken down into four basic types. The first claim is that each individual use of AI — each image generated, each query responded to, each “inference,” to use the AI jargon — consumes an undue amount of energy. The second is that even if the energy consumed in generating images isn’t unacceptable, the initial training of the AI requires much more energy, and we should view the overall cost of training and use combined as unacceptable. The third is that even if any particular AI system consumes a reasonable amount of energy, for training and use combined, the AI industry as a whole uses too much energy. All three of those arguments are basically just false. The fourth assertion is that while today’s AI uses reasonable amounts of energy, the ever-increasing demand for more computational power for more advanced AI will create problems in the future. That last assertion may be true, although like any projection it remains uncertain. We need to keep an eye on changes in AI’s energy demands, to work hard on efficiency, and to make sure we roll out large amounts of renewable power generation rapidly — something we should be (and to some extent are) doing anyway. It does not, however, provide any strong basis for not using generative AI today. I’m going to examine each of these assertions in detail.
The Energy Costs of Generating an Image Are Negligible
It’s commonplace to find articles asserting that generating an image using a tool like DALL-E consumes vastly more energy than generating text using a generative Large Language Model like ChatGPT, which in turn uses much more energy than a simple search query. That’s all true. The question, however, is how much energy generating an image takes in absolute terms, not how it compares to other computing tasks. And the answer is: surprisingly little.
Before I get into the details, a little reminder about what we mean by power and energy, for people whose physics is a little rusty. “Energy” is the total amount of work done over a period of time. “Power” is the rate at which we’re using energy — how hard we’re working at a particular moment. For electricity, power is typically measured in watts (and the related kilowatts, megawatts, gigawatts, and terawatts). Energy is usually measured in units like kilowatt-hours (kWh); a one-kilowatt device uses one kilowatt-hour of energy each hour. We can use different time units in the units of energy; the watt-second (one watt of power used for one second) is the same as the joule, another standard unit of energy. Both power and energy matter, but for different purposes. For example, power is relevant to whether a circuit breaker will trip (or a fuse will blow), whether the wiring is robust enough to run a device, and whether the grid has enough capacity to meet the additional demand created by building a new data center. Energy is relevant to questions like “what’s our electric bill at the end of the month?” and “how much coal and natural gas were burned to give us power?” While both matter for environmental concerns, total energy usage is generally more relevant to global warming than peak power demand, although unusually high power demand may result in less efficient “peaker” plants being used in addition to more efficient base-load power plants.
Most of the articles that I found online asserting that AI image generation consumes a large amount of energy cite back to “Power Hungry Processing: Watts Driving the Cost of AI Deployment?” (November 28, 2023), a pre-print article by Alexandra Sasha Luccioni, Yacine Jernite, and Emma Strubell. Luccioni, Jernite, and Strubell estimate that generating 1,000 images uses an average of 2.907 kWh of energy. What does that actually mean, though? Is that a lot? First, we can convert that directly to an energy cost for one image by simply dropping the kilo-: each image uses an average of about 2.9 Wh. To put that in perspective, let’s convert that into watt-seconds (or joules) instead of watt-hours, and then compare it to some household appliances. 2.9 Wh is equal to 10,440 Ws, and a typical toaster or microwave draws about 1200 W of power. That means that generating an image using generative AI is about equivalent to running a toaster or a microwave for 8.7 seconds. For a different comparison, a laptop computer is fairly low power, requiring only about 50–300 W, depending on the model. Generating an image uses the same amount of energy as running a laptop for somewhere between 30 seconds and 3 or 4 minutes. These amounts of energy aren’t nothing, and they are higher than the energy used by a Google query or the like, but they are very, very small — if we don’t worry about the energy costs of running a microwave or toasting some bread, we should be similarly unconcerned about the marginal energy costs of generating the occasional image. If we want that in dollars and cents, using the average cost of electricity in the United States in March 2024 ($0.174 per kWh), generating an image costs about five-hundredths of a penny worth of electricity. That’s a negligible cost.
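For anyone who wants to check the arithmetic, here is a quick back-of-the-envelope sketch in Python. It uses the figures cited above; the toaster and laptop wattages and the $0.174/kWh price are typical values supplied for illustration, not measurements.

```python
# Back-of-the-envelope arithmetic for the energy cost of generating one image.
# Figures are the ones cited in the text; appliance wattages are typical values.

energy_per_1000_images_kwh = 2.907                 # Luccioni, Jernite & Strubell estimate
energy_per_image_wh = energy_per_1000_images_kwh   # kWh per 1,000 images = Wh per image
energy_per_image_ws = energy_per_image_wh * 3600   # watt-seconds (joules)

toaster_watts = 1200                               # typical toaster or microwave
laptop_watts_low, laptop_watts_high = 50, 300      # typical laptop range

print(f"Energy per image: {energy_per_image_wh:.1f} Wh ({energy_per_image_ws:,.0f} J)")
print(f"Toaster-equivalent: {energy_per_image_ws / toaster_watts:.1f} seconds")
print(f"Laptop-equivalent: {energy_per_image_ws / laptop_watts_high:.0f}"
      f" to {energy_per_image_ws / laptop_watts_low:.0f} seconds")

price_per_kwh = 0.174                              # average US retail price, March 2024
cost_per_image = energy_per_1000_images_kwh / 1000 * price_per_kwh
print(f"Electricity cost per image: ${cost_per_image:.5f} (about 0.05 cents)")
```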
This also makes sense in light of the costs charged to use generative AI. OpenAI charges $20/month for access to ChatGPT Plus, which includes DALL-E access. Google allows free image generation on Gemini, subject to limitations on creating images with people in them, as does Meta for people willing to log in with their Facebook accounts. If the energy costs for individual image generation were high, Google and Meta would not be willing to allow users to generate images for free, and OpenAI would need to charge much more. Yes, they are competing for brand value and market-share, and may be willing to run at a loss temporarily in the hopes of profit down the road. But if it were actually true that each individual image generated involved an objectively large energy cost, they would not be willing to allow people to use their services as inexpensively as they do.
The energy costs for each image generated are trivial. Even if the estimated energy to generate an image is low by an order of magnitude, ten times the Luccioni, Jernite, and Strubell estimate would only be enough energy to make a bowl of instant oatmeal in the microwave.
Energy Costs for Training Are Substantial, But Not Highly Destructive
While the marginal costs of generating an individual image are low, training a model requires an enormous amount of computational resources, which do consume a substantial amount of energy. We can get estimates of how much in a couple of different ways. First, some published statements of training energy costs are available. Meta released its current-generation base model, Llama 3, on April 18, 2024, and published statements about the resources used to train it. For the mid-sized 70 billion parameter version — what Meta currently uses to generate images for the public — Meta used 6.4 million GPU-hours of processor time on chips that draw 700 W each, for a total of about 4.48 gigawatt-hours of energy. While Llama 3 is a state-of-the-art model, the mid-sized version is less capable than the currently-under-training 400 billion parameter version, which is likely similar in size and training complexity to other top-of-the-line current AI models (e.g., Anthropic’s Claude 3 Opus, Google’s Gemini 1.0 Ultra, and OpenAI’s GPT-4, or maybe an as-yet-unreleased GPT-4.5). Scaling from the energy used to train Llama 3’s 8B and 70B parameter versions, we can estimate that the 400B parameter version will take roughly 35 million hours of processor time, requiring roughly 25 gigawatt-hours of energy.
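Here is a minimal sketch of that extrapolation, assuming GPU-hours scale roughly linearly with parameter count (that scaling is my assumption; Meta has published GPU-hour figures only for the smaller versions). It lands close to the roughly 35 million hours and 25 GWh figures above.

```python
# Rough extrapolation of Llama 3 training energy. The 70B GPU-hour figure is from
# Meta's published statements; the 400B estimate assumes GPU-hours scale roughly
# linearly with parameter count (an assumption, not a published figure).

GPU_POWER_W = 700                       # H100 maximum power draw per GPU

gpu_hours_70b = 6.4e6                   # published GPU-hours for the 70B model
energy_70b_gwh = gpu_hours_70b * GPU_POWER_W / 1e9
print(f"Llama 3 70B training energy: {energy_70b_gwh:.2f} GWh")          # ~4.48 GWh

scale = 400 / 70                        # linear scaling from 70B to 400B parameters
gpu_hours_400b = gpu_hours_70b * scale
energy_400b_gwh = gpu_hours_400b * GPU_POWER_W / 1e9
print(f"Estimated 400B GPU-hours: {gpu_hours_400b / 1e6:.0f} million")   # ~37 million
print(f"Estimated 400B training energy: {energy_400b_gwh:.0f} GWh")      # ~26 GWh
```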
We can also approach this from another direction by estimating the energy required to power enough GPUs to train a new, state-of-the-art “frontier” model. President Biden’s executive order about AI defined the level of sophistication needing additional oversight as models trained on clusters capable of more than 10²⁰ floating point operations per second, or trained with a total of more than 10²⁶ floating point operations. (A lower threshold applies to models trained primarily on biological sequence data.) The current state-of-the-art publicly available AI-optimized GPU is Nvidia’s H100. Many of the biggest players in AI use their own custom-designed chips, not publicly available chips like the H100; however, using the H100, whose technical specs and power demands are publicly released, as a benchmark lets us make a rough estimate that we can apply to chips that are not available for sale, like Google’s Tensor Processing Units. The executive order’s cluster definition would roughly cover clusters with more than about 50,000 to 100,000 H100 GPUs. Each H100 can pull 700 W of power, meaning that the entire cluster for training a new frontier model might pull 70 MW of power. 70 MW is a lot of power — a cluster running at that power level constantly for a year would use about 614 GWh of energy, about the same electricity consumption as the US Virgin Islands. In practice, clusters won’t run at full power all the time — a frontier model trained on 100,000 H100 GPUs might require less than a month of computational time — but let’s continue to compare to the annual energy usage. The average person in the US uses about 4.5 MWh of electricity per year, so a single AI training cluster running year-round might use as much electricity as roughly 136,000 people. And of course, all that power usage generates a great deal of heat, requiring a lot of cooling.
Even so, AI training isn’t a principal driver of electricity use at a company like Google. Google used 21.78 TWh of electrical energy in 2022. Our hypothetical top-end H100 computation cluster, run at 100% utilization year-round, would represent only about 2.8% of Google’s electricity usage. The consumption of energy for AI training is still relatively small compared to the use of energy for computing in general.
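To tie those numbers together, here is the same kind of sketch for the hypothetical cluster, using the 700 W per H100, the roughly 4.5 MWh per-person residential figure, and Google’s reported 21.78 TWh; the 100,000-GPU cluster size is, as above, a hypothetical.

```python
# Power and annual energy for a hypothetical 100,000-GPU H100 training cluster,
# using the figures cited in the text.

H100_POWER_W = 700
N_GPUS = 100_000

cluster_power_mw = N_GPUS * H100_POWER_W / 1e6
print(f"Cluster power draw: {cluster_power_mw:.0f} MW")                 # 70 MW

hours_per_year = 24 * 365
annual_energy_gwh = cluster_power_mw * hours_per_year / 1000            # MWh -> GWh
print(f"Energy at 100% utilization for a year: {annual_energy_gwh:.0f} GWh")  # ~613 GWh

per_capita_mwh = 4.5      # rough US residential electricity use per person per year
people_equiv = annual_energy_gwh * 1000 / per_capita_mwh
print(f"People whose annual electricity use this matches: {people_equiv:,.0f}")  # ~136,000

google_2022_twh = 21.78   # Google's reported 2022 electricity consumption
share = annual_energy_gwh / (google_2022_twh * 1000)
print(f"Share of Google's 2022 electricity: {share:.1%}")               # ~2.8%
```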
Another way to contextualize that is to compare the cost of the energy needed to run such a cluster with the capital cost of building it. Energy is underpriced relative to its social cost, because the pollution from generating energy is an externality that’s not fully incorporated into the price of energy. Still, it gives us a back-of-the-envelope estimate of relative costs. A single H100 costs about $30,000, with a cluster of 8 and supporting infrastructure costing around $300,000. (Information on the costs of larger clusters of H100s is not publicly available, but I’m going to assume they scale roughly linearly.) Our hypothetical 100,000-H100 cluster, enough to train models at or above the level of the current best in the world, would cost about $3 billion in GPUs alone, and closer to $3.75 billion including supporting infrastructure. That’s real money, even by the standards of Alphabet/Google and Meta — Google’s entire capital expenditures for 2023 were $32.3 billion. In comparison, the 614 GWh to run 100,000 H100s for a full year would cost $107 million at market rates. That’s a substantial cost, but it shows that, currently, energy costs are not the binding constraint on AI training. Google’s overall electricity usage would cost something like $3.5 billion to $4 billion at market rates — a vastly higher number, because the energy to train a frontier AI model is small compared to the total energy used for computation by a major cloud company. It’s significant — unlike the energy cost of a single image generation — and we should care about whether the companies involved are making a point of sourcing renewable power and working on ameliorating the costs of water cooling (if they use water for cooling, as Google does). At the same time, it doesn’t represent a compelling reason to avoid using AI-generated art — although it might be a good reason to prefer systems from companies like Google, which make a major point of their sustainability efforts and provide substantial information about their progress toward their goals, over companies like OpenAI, which do not.
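And the capital-versus-energy comparison, with the stated caveat that I am assuming costs scale linearly from the published price of an 8-GPU H100 system:

```python
# Capital cost vs. one year of electricity for the hypothetical 100,000-H100 cluster.
# Assumes costs scale linearly from the ~$300,000 price of an 8-GPU H100 system.

N_GPUS = 100_000
cost_per_8gpu_system = 300_000
capital_cost = N_GPUS / 8 * cost_per_8gpu_system
print(f"Capital cost: ${capital_cost / 1e9:.2f} billion")               # ~$3.75 billion

annual_energy_kwh = 613_200_000          # ~613 GWh from the sketch above
price_per_kwh = 0.174                    # average US retail electricity price
energy_cost = annual_energy_kwh * price_per_kwh
print(f"One year of electricity: ${energy_cost / 1e6:.0f} million")     # ~$107 million

google_2022_kwh = 21.78e9                # Google's 2022 consumption, in kWh
print(f"Google's electricity at market rates: "
      f"${google_2022_kwh * price_per_kwh / 1e9:.1f} billion")          # ~$3.8 billion
```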
Even Taken as a Whole, AI Computation Is Not Likely a Major Source of Emissions
The next question is whether the aggregate costs of training and of all the queries across the whole industry represent an environmental catastrophe. Each individual query doesn’t take a lot of energy, but in sufficient numbers, they add up. The Luccioni, Jernite, and Strubell paper estimated that ChatGPT might handle queries from 5 million to 10 million people per day. Those won’t all be image generation, but some people will make many queries or generate multiple images, so if we back-of-the-envelope assume the equivalent of one image per person, that represents something like 15-30 million Wh — 15-30 MWh — of energy per day. That’s a substantial amount of energy, although again, not an inordinate amount for a substantial industry. If a frontier model is trained in a month on a high-power cluster using 50 GWh of energy, and then handles millions of queries per day using an additional 30 MWh per day, that adds up to about 61 GWh over the course of a year.
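Putting that arithmetic in one place, using the high ends of the ranges above:

```python
# Rough annual energy for one heavily-used frontier model: training plus a year
# of ChatGPT-scale usage, using the high-end figures from the text.

training_gwh = 50                      # rough training cost for a frontier model
users_per_day = 10e6                   # high end of the usage estimate
wh_per_user = 2.9                      # treat each user as one image-equivalent

daily_inference_mwh = users_per_day * wh_per_user / 1e6
annual_inference_gwh = daily_inference_mwh * 365 / 1000
total_gwh = training_gwh + annual_inference_gwh

print(f"Inference: {daily_inference_mwh:.0f} MWh/day, {annual_inference_gwh:.1f} GWh/year")
print(f"Training plus a year of use: about {total_gwh:.0f} GWh")   # ~61 GWh
```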
Those are real energy costs, especially when many different companies are competing and training different frontier models. Google, Meta, OpenAI, and Anthropic have all made state-of-the-art models available to the public, and some other companies have done major AI work that is either not publicly available or limited to special applications — Apple’s MM1 and Tesla’s work on self-driving cars leap to mind. Moreover, each company builds multiple foundation models — Google developed 18 different foundation models in 2023 alone, although not all of them required state-of-the-art amounts of computation. Epoch AI estimated that industry, academia, and governments produced 86 different “notable” foundation models in 2023. In addition, there are numerous smaller companies and organizations that are not producing significant foundation models but are still using meaningful computational power to fine-tune and apply existing models.
Even so, the total amount of energy used for AI can’t currently be exorbitant. Take the 61 GWh estimate I derived above for the total energy needed to train a state-of-the-art foundation model and use it to handle a number of queries comparable to the number handled by ChatGPT, multiply that by 100 (rounding Epoch AI’s count of notable 2023 models up), and we get about 6 TWh. That’s almost certainly an overestimate — very few models are used as heavily as ChatGPT, and my estimates use the high ends of every possible range. Six TWh is a mind-bogglingly huge amount of electricity, about as much as running a toaster for 5 billion hours, or enough to fully charge a long-range electric vehicle roughly 60 million times. On the other hand, the United States used about 4,000 TWh of electricity in 2023, so that high-end estimate of 6 TWh spent on AI represents only 0.15% of the electrical energy used in the United States. Again, these are big numbers. It matters that we work on increasing the amount of renewable electricity generation to reduce the environmental footprint of AI. But while these numbers are large, using 0.15% of the U.S.’s electricity for an industry that represents something like 0.1% of the U.S.’s GDP is not unreasonable.
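The industry-wide version of the same sketch, with the comparisons spelled out (the 1.2 kW toaster and the roughly 100 kWh long-range EV battery are typical figures supplied for illustration):

```python
# Deliberately high-end estimate of industry-wide AI electricity use in 2023.

per_model_gwh = 61                 # training plus a year of heavy use, from above
notable_models = 100               # Epoch AI's 86 notable models, rounded up
industry_twh = per_model_gwh * notable_models / 1000
print(f"Industry-wide estimate: about {industry_twh:.1f} TWh")               # ~6 TWh

industry_kwh = industry_twh * 1e9
toaster_kw = 1.2                   # typical toaster
ev_battery_kwh = 100               # roughly a long-range EV battery
print(f"Toaster-hours: {industry_kwh / toaster_kw / 1e9:.1f} billion")       # ~5 billion
print(f"Full EV charges: {industry_kwh / ev_battery_kwh / 1e6:.0f} million") # ~60 million

us_2023_twh = 4000                 # approximate US electricity consumption
print(f"Share of US electricity: {industry_twh / us_2023_twh:.2%}")          # ~0.15%
```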
Future Generations of Generative AI May Require Actually Problematic Amounts of Energy
While generative AI has not, up to now, required an unreasonable amount of electrical energy, that may change quickly. The computational resources required to train a state-of-the-art generative AI model have increased by roughly a factor of 10 each year for the last half decade or so. If that trend continues, then to the extent that it’s not counterbalanced by much more energy-efficient chips, the amount of energy required to train AI models will rapidly go from substantial but entirely manageable, at present, to unsustainable within a few years. If we continue with the (likely slightly high) estimate of 6 TWh of energy for AI in 2023, and assume a factor of 10 growth each year, by 2026 AI would be using more electrical energy than the entire present-day electrical consumption of the United States. By mid-2027, AI would use more electrical energy than all other uses on the planet combined. That would, of course, not be possible — it’s completely implausible that the world could double its power generation infrastructure in three or four years.
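Purely as an illustration of how quickly a factor-of-ten curve blows up, here is that projection run forward; the 2023 base and the growth rate are the assumptions discussed above, and the US and world consumption figures are rough current values, not forecasts.

```python
# What a 10x-per-year growth curve does to the ~6 TWh 2023 estimate.
# This illustrates the trend discussed in the text; it is not a prediction.

ai_twh = 6.0                   # rough 2023 estimate from above
us_twh = 4_000                 # approximate current US electricity consumption
world_twh = 27_000             # approximate current world electricity consumption

for year in range(2024, 2028):
    ai_twh *= 10
    flags = []
    if ai_twh > us_twh:
        flags.append("exceeds current US consumption")
    if ai_twh > world_twh:
        flags.append("exceeds current world consumption")
    print(f"{year}: {ai_twh:,.0f} TWh " + ("(" + "; ".join(flags) + ")" if flags else ""))
```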
Beyond estimates based on trends, we can also look at what some of the companies involved say about their future plans. Mark Zuckerberg has said that by the end of 2024, Meta will have the equivalent of 600,000 H100 GPUs. That’s roughly ten times what people estimate was used to train Llama 3. The major players all say that the computational requirements for future generations of AI models will continue to increase rapidly, and that this means they will need to work actively to develop and secure additional generating capacity.
In practice, some of the increased need for computational power is offset by more efficient chips. Nvidia’s H100 chips are currently the market standard, but Nvidia has announced that its next-generation B200 chips will offer four times the training power (and sixteen times the inference power) while using 1/25th of the energy. That means that if a current top-of-the-line AI training cluster uses 100,000 H100s and next year’s top-of-the-line cluster needs 10 times the processing power, the new cluster could be built with 250,000 B200s and end up using about a tenth of the energy. If that turns out to be true — and representative of technological advance in general — then even with increasing demands for computational power, the energy requirements might actually go down.
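Here is the offsetting arithmetic, taking the headline per-chip claims at face value (whether they hold in practice is exactly the open question):

```python
# How claimed per-chip efficiency gains could offset a 10x jump in compute demand.
# Takes the headline per-chip claims at face value; real-world figures may differ.

h100_cluster = 100_000
h100_power_w = 700
current_power_mw = h100_cluster * h100_power_w / 1e6              # 70 MW

compute_growth = 10          # next generation needs ~10x the processing power
b200_training_speedup = 4    # claimed training performance per chip vs. H100
b200_energy_fraction = 1/25  # claimed energy use per chip vs. H100

b200_cluster = h100_cluster * compute_growth / b200_training_speedup   # 250,000 chips
b200_power_mw = b200_cluster * h100_power_w * b200_energy_fraction / 1e6

print(f"Current cluster: {current_power_mw:.0f} MW")
print(f"Next-gen cluster: {b200_cluster:,.0f} B200s drawing about {b200_power_mw:.0f} MW")
# -> 250,000 chips at ~7 MW: ten times the compute for about a tenth of the power
```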
As the Danish aphorism says, “Predictions are very difficult, especially about the future.” We can be confident that computational requirements for cutting-edge AI will continue to grow. It is likely, though not certain, that energy usage for AI will also grow. If energy usage grows in the same dramatic way that computational requirements have grown, it would rapidly become a major environmental concern. That’s a strong reason to carefully watch what continues to happen, to invest in new renewable energy generation, and to be prepared to respond with regulation as necessary.
For the time being, we should recognize that the current energy usage of generative AI is also small potatoes compared to energy used for transportation, or for heating, or for electricity for the more than 130 million U.S. households and many other industrial and commercial uses. It may increase to be a major factor in the future, and controlling the growth of energy usage for AI will be important, especially if AI usage increases dramatically and the power demands of each use increase. At present, the claim that generating an image now and again requires energy usage (or water usage) that we should view as unacceptable is simply fear-mongering, akin to arguing that reheating your leftovers in the microwave is unacceptable because of the energy costs.
I’ll address concerns about artists’ livelihoods and copyright interests in a subsequent post.