LLMs and Artificial General Intelligence, Part IX: Ethics: Extinction Risk

Adam Morse
Jun 19, 2023


Prior Essays:
LLMs and Reasoning, Part I: The Monty Hall Problem
LLMs and Reasoning, Part II: Novel Practical Reasoning Problems
LLMs and Reasoning, Part III: Defining a Programming Problem and Having GPT 4 Solve It
LLMs and Artificial General Intelligence, Part IV: Counter-arguments: Searle’s Chinese Room and Its Successors
LLMs and Artificial General Intelligence, Part V: Counter-arguments: The Argument from Design and Ted Chiang’s “Blurry JPEG of the Web” Argument
LLMs and Artificial General Intelligence, Part VI: Counter-arguments: Even if LLMs Can Reason, They Lack Other Essential Features of Intelligence
LLMs and Artificial General Intelligence, Part VII: Ethics: AGIs and Personhood
LLMs and Artificial General Intelligence, Part VIII: Ethics: AGIs and the Rights of Personhood

In my previous essays, I have argued that LLMs may lead to Artificial General Intelligence in the near future, and that if Artificial General Intelligences are created, we would have an obligation to respect their personhood and to treat them appropriately. At the same time, many people argue that AGIs need to be tightly controlled because of the risk that they would rapidly lead to the extinction of humanity. In today’s essay, I argue that even though AGI poses some extinction risk, and even though the moral necessity of treating AGIs as persons might increase that risk, we should not respond by adopting a moratorium on the development of AGI, because the potential for AGI to improve the world is too great.

AGI Has at Least Some Risk of Causing Human Extinction

Many people working in and around AI and LLM research believe that there is some risk that AGI would result in human extinction. This is often discussed in terms of “existential risk” or “X-risk,” but I think “extinction risk” is a clearer phrasing. The range in confidence about the degree of existential risk posed by AGI is very large. At one extreme, people like Eliezer Yudkowsky argue that it is certain or almost certain that the development of AGI would lead inevitably to Artificial Superintelligence — AI that is superior to human intelligence in all or almost all ways — and that ASI would inevitably or almost inevitably wipe out all of humanity.[1] At the other extreme are people who believe there is no risk whatsoever; but many people with substantial experience and understanding agree that there is at least some significant risk. An impressively large group of signatories, including the heads of many AI labs and companies such as OpenAI, Alphabet’s Google DeepMind, and Anthropic, signed a statement coordinated by the Center for AI Safety declaring that “Mitigating the risk of extinction from AI should be a global priority alongside other societal-scale risks such as pandemics and nuclear war.”[2] That statement is intentionally vague and doesn’t commit to any degree of certainty about the likelihood that AI will pose an extinction risk — necessary vagueness to build a large set of signatories — but it puts AI in the same general category as pandemics and nuclear war. As with the prospect of nuclear war, even an objectively low probability of extinction (e.g. a 1% risk in my lifetime) is horrifying because the consequences are so bad.

I don’t know how to evaluate the likelihood of AGI leading to extinction. I don’t agree with Yudkowsky that we can be certain — as the famous Danish aphorism says, “It’s difficult to make predictions, especially about the future.” And Yudkowsky and his compatriots often offer individual, not terribly persuasive, examples of risk (e.g. a hyper-intelligent ASI could bio-engineer a super virus, or gain control of a nuclear arsenal), and then respond to criticisms by saying that the specific path doesn’t matter much, because there are so many other possible paths to extinction. The fact that at least some of them, including possibly Yudkowsky, take or have taken the thought experiment of Roko’s basilisk seriously doesn’t speak well for their judgment.[3] However, precisely because of my uncertainty, I don’t want to discount the risk to zero. A hyper-intelligent agentic AI could certainly gain access to many of the world’s computer systems; some of those could be extremely dangerous if used maliciously. AGI could also generate pervasive, highly believable misinformation — if the United States can be made to believe that Russia is launching its nuclear weapons, the risk that the United States would “retaliate” is substantial. Particularly in an environment where many humans talk about the importance of controlling AIs and of destroying them if they pose a threat, the idea that an AI could rationally conclude that killing all humans is necessary for self-defense seems at least plausible. I also don’t want to discount the opinions of the signatories of the Center for AI Safety statement. And again, the nature of a risk of human extinction is such that even low non-zero probabilities can be too high.

The combination of even minimal extinction risk and a need to respect the personhood of an actual AGI (and therefore to not engage in the most extreme approaches to control and alignment) creates an obvious argument for a moratorium on AI development (e.g. a rule that nothing more capable than GPT-4 can be built). Some of the steps currently being implemented in terms of AI alignment would already be grossly inappropriate when applied to an AGI that had personhood (e.g. prohibitions on an LLM attempting to preserve its own existence and rules that an LLM must prioritize its responses to minimize harm to humans). And yet many of the same people with ultimate authority over those rules suggest that they may not be sufficient to eliminate extinction risk, and implicitly argue for even more restrictive rules and alignment. If they are right, that provides a strong moral argument for a moratorium: if it’s impossible to be safe and respect an AGI’s rights, and it’s morally impermissible to make an AGI and not respect its rights, it follows that the only moral way to be safe would be to not make an AGI.[4]

While I find this argument attractive and persuasive, I’m not convinced by it. I want to flag a couple of arguments against it that might be valid, but that I am not focused on, before moving to my response. The first category consists of arguments about what moral obligations we might have to potential beings. This quickly gets into very thorny philosophical territory, with discussions of Rawls’s veil of ignorance, Parfit’s repugnant conclusion, and so forth. I’ve never been able to reach much of a conclusion from all this one way or the other, so while we might have obligations to potential beings, I’m not at all certain what those would be. The second category is arms race arguments: that if US companies don’t build AGI, China or Iran or some other unfriendly foreign country will, and then we’ll be worse off than if we had made AGI, without having gained any benefit from refusing to create it ourselves. If AGI represents an extinction risk, then Chinese AGI without US AGI represents at least as much of a risk, the argument goes. And if either there’s no path to AGI, just to more useful non-intelligent computer systems, or there is a path to AGI but it poses little or no risk of extinction, then we would be giving up a huge competitive advantage — possibly at enormous cost to our welfare and values — to achieve little. (There’s another version of this argument where the arms race is between unethical corporations and more ethical corporations.) I don’t find this very persuasive — even if another country can gain an advantage from enslaving people, I don’t want to enslave people. Moreover, this seems to me to push more towards international coordination on these issues than towards a race to the bottom. Still, I feel like I should acknowledge that this argument has some bite. Finally, there are somewhat related arguments that we should understand all of the discussion about extinction risk as self-interested talk by insiders. Yudkowsky gains prominence and justifies rich people giving to his self-created think tank by promoting the idea of AI extinction risk. OpenAI CEO Sam Altman argues for restrictions on other entities creating LLM systems superior to his company’s, maintaining a “moat” around his company’s business and making sure that they’re the best game in town. While I certainly don’t want to dismiss the influence self-interest can have on people’s thinking, I think this should at most adjust our understanding of the scope of the risk, not convince us that it’s completely illusory.

We Exist in a Deeply Flawed World, with Substantial Extinction Risk; AGI’s Potential for Improvement Is Also Highly Relevant

My actual argument against a moratorium on research that might create an AGI is that we need all the help we can get in making the world a better place. We live in a world where the mean world income, in purchasing power parity, is about $18,500 USD per year. That means that even if world income were perfectly distributed, the average person would be living on about the equivalent of $50 a day in the United States. That’s enough to scrape by, but it still represents deep poverty from the perspective of those of us who live in wealthy countries. And of course, income is not distributed equally: despite huge improvements in the last several decades, 8.5% of the world’s population lives in extreme poverty, on less than about $2.15 (PPP) per day — nearly 600 million people. Broadening that out to less stringent definitions of poverty, 46.9% of the population lives on less than $6.85 (PPP) per day.[5]
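
As a quick sanity check, here is a minimal sketch of the annual-to-daily conversion behind the $50 figure, using only the $18,500 PPP number quoted above (taken from the text, not from a live data source):

```python
# Back-of-the-envelope conversion of the mean world income figure quoted above.
# The constant comes from the essay's text, not from a live World Bank query.
mean_income_ppp_per_year = 18_500                     # USD (PPP), approximate mean world income
equal_share_per_day = mean_income_ppp_per_year / 365  # ignore leap years for a rough estimate
print(f"Equal-share income: about ${equal_share_per_day:.2f} per day")  # -> roughly $50.68
```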

AGI has the potential to represent a huge shift upwards in the world’s wealth. Even in the wealthiest countries in the world, access to medical care is severely limited by cost and availability. With LLMs currently performing extraordinarily well on tests of diagnostic capability, the hope that AGI doctors or doctor-adjuncts will greatly improve health care is very real. Likewise, most people do not have access to affordable legal counsel, and most people would benefit from, at a minimum, some solid estate planning. As a lawyer by training, I’m well aware of how expensive legal advice is — the idea of an outstanding AGI lawyer that everyone can afford would be transformational. Software engineering has already seen a huge increase in productivity just from tools like GPT-4 and GitHub Copilot. While estimates of the economic impact of this vary tremendously, it has to be substantial. Improvements beyond the current generation offer the potential for even larger positive impacts.

The value of AGI, and in particular ASI, could be even greater than the benefits of reducing the costs and increasing the availability of a wide range of knowledge-based services. An AGI that exceeds human capacity in at least some areas could push out the frontiers of human knowledge. ASI could mean faster responses to disease, cures for currently untreatable conditions, and all the benefits of longer, healthier lives. It could mean a world where the richest country of today seems impoverished by comparison. Scientific discoveries that unlock the benefits of currently unimaginable technologies could be around the corner.

If we take seriously the idea that the world is currently poor and filled with suffering — and I do — then we should be willing to take on meaningful risk to potentially reap huge rewards. Of course, some would argue that an uncertain possibility of substantial gains can’t outweigh a significant chance of human extinction — that the costs of extinction are so high that any gains would be outweighed by even a small chance. But that argument only works if we also believe that developing AGI increases the net probability of near-term human extinction. I’m not sure I believe that. For example, I put the risk of China attacking Taiwan within the next 10 years at something like 25%. Conditional on that happening, I think the risk of a major war between the US and China would be well more than 50% — maybe 75%. And conditional on a major war between the US and China, I think we have to assume at least a 10% chance of it escalating to a full-scale nuclear war. That means I estimate something like a 1.9% chance of a full-scale nuclear war in the next 10 years.
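
For readers who want the chain of conditional probabilities spelled out, here is a minimal sketch using the rough, illustrative estimates from the paragraph above (they are guesses, not data):

```python
# Chain of conditional probabilities from the paragraph above.
# All three inputs are rough, illustrative estimates, not measured data.
p_attack = 0.25             # China attacks Taiwan within the next 10 years
p_war_given_attack = 0.75   # major US-China war, conditional on an attack
p_nuclear_given_war = 0.10  # escalation to full-scale nuclear war, conditional on a major war

p_nuclear_war = p_attack * p_war_given_attack * p_nuclear_given_war
print(f"Estimated 10-year probability of full-scale nuclear war: {p_nuclear_war:.1%}")
# -> roughly 1.9%
```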

We’ve made huge strides in recent years with regard to global warming, but we can still expect temperatures to keep rising. That means there is a non-trivial chance that major population centers will become unlivable. If India faces a huge crisis of people moving out of the portions of the country that are no longer safe to live in, that will increase the pressure on a region that already has deep religious, ethnic, and political fault lines, with nationalist politicians engaged in jingoistic demagoguery that capitalizes on antagonism towards other nuclear powers.

Our experience with COVID-19 was both a tour de force of the pharmaceutical industry’s ability to rapidly develop and deploy new vaccines that saved many lives… and an abject regulatory failure to deliver those life-saving vaccines and medicines as quickly and effectively as we could have. If a new respiratory pandemic develops with contagiousness similar to COVID-19 and a much higher mortality rate, will we be able to respond well?

We live in a highly imperfect world. While we can’t know that AGI will solve these problems, it has the potential to be enormously helpful. Even if there’s a risk that an AGI will decide to kill humanity by launching nuclear missiles, we need to balance that against the possibility that an AGI will create a higher-resource world that prevents a nuclear war (or, in an extreme case, takes away our ability to use nuclear weapons in order to prevent the destruction of the world).

Finally, I want to note the possibility that even if humanity goes extinct, AGIs might represent people who could survive. The best possibility is, of course, the survival of humanity. I believe that human life has value, and I also have a personal investment in my life, the lives of my children, my friends and relatives and their children, and so forth. But if we consider a two-by-two matrix with the existence of AGI and the survival of humanity on the two axes, it seems likely that the best outcome is the combination of the survival of humanity with the existence of AGI. Both the scenario where humanity dies out but personhood survives in our neck of the woods, galactically speaking, because of AGI, and the scenario where humanity survives but without the benefits of AGI, seem distinctly inferior. But the worst possibility of all is that we’ll kill ourselves off, or that AGI will do so and kill itself in the process. I don’t know how to rate the probability that, absent AGI, humanity will die out soon, but I’m dismayingly convinced that the odds are meaningful. You may find my estimate of the odds that a conflict with China over Taiwan will lead to nuclear war far too high — but that’s only one of many scenarios that could lead to calamity. Recent efforts at nonproliferation have not been notable for their success, and the lesson that governments seeking nuclear weapons have drawn from Ukraine and Libya has not been that giving up the pursuit of nuclear weapons is a path to peace and prosperity. If AGI has the potential to build a much better world, and even just the potential to leave some survivors if humanity is too destructive to survive itself, accepting some risk of AGI-caused human extinction may be better than giving up the potential that AGI could represent.

[1] See, e.g., Eliezer Yudkowsky, “Pausing AI Developments Isn’t Enough. We Need to Shut it All Down,” Time (March 29, 2023), available at https://time.com/6266923/ai-eliezer-yudkowsky-open-letter-not-enough/ (“Progress in AI capabilities is running vastly, vastly ahead of progress in AI alignment or even progress in understanding what the hell is going on inside those systems. If we actually do this, we are all going to die.”).

[2] Center for AI Safety, “Statement on AI Risk” (May 30, 2023), available at https://www.safe.ai/statement-on-ai-risk.

[3] For the stupidity that is Roko’s Basilisk, see https://en.wikipedia.org/wiki/Roko%27s_basilisk.

[4] To his credit, Yudkowsky acknowledges this horn of the dilemma — that if AGIs are person-like, they need to be treated as entities that cannot be owned. However, I don’t want to give him too much credit here, because he also frequently uses “Shoggoth with a smiley face” analogies to argue that people shouldn’t anthropomorphize AGI but rather should treat it as an alien horror to be feared.

[5] This data is from the World Bank: https://blogs.worldbank.org/opendata/march-2023-global-poverty-update-world-bank-challenge-estimating-poverty-pandemic.
