What is a path to AGI being an existential risk?

Here is a conjunctive path1 to AI takeover2 inspired by Joe Carlsmith’s report on power-seeking AI. Each step is uncertain and depends on the realization of the previous one. Unlike Carlsmith, we do not assign a probability to each step, since people's estimates for these probabilities vary widely; instead, we argue that the end result is probable enough to warrant attention. You can find a more abstract version of this argument here.

The path goes something like this:

  1. Building human-level AGI is possible in principle3. In this context, AGI refers to AI that can do things like think strategically, do independent scientific research, design new computer systems, engage in sophisticated persuasion, and make and carry out plans.4

  2. Within the foreseeable future, humanity could have the technological capability to construct AGI.

  3. Once it is feasible, people are likely to build agentic AGI to perform tasks autonomously, because doing so will be profitable. Furthermore, some actors will rush toward it, which might trigger an AI arms race.

  4. A singleton5 AGI is deployed by a well-intentioned actor but is misaligned (due to instrumental convergence, the orthogonality thesis, inner/outer misalignment, a treacherous turn, etc.)6 and gains a decisive strategic advantage. Such an AI can outmaneuver humanity and can pursue its misaligned aims either by acting directly in the physical world or by influencing humans to act in ways that harm humanity.

  5. This leads to bad outcomes and possibly even human extinction.7

Carlsmith arrives at a 5% chance of human extinction following this path. You can put your own probabilities into a similar model here8 to derive your own probability of existential catastrophe according to this model.
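Because the path is conjunctive, the overall probability is simply the product of each step's probability conditional on the previous steps. A minimal sketch of this arithmetic is below; the 40% and 20% figures are Carlsmith's conditional estimates mentioned in the footnotes, while the other numbers are illustrative placeholders, not estimates from his report.

```python
# Conjunctive risk model: each step's probability is conditional on
# all previous steps having occurred, so the overall probability of
# the final outcome is the product of the per-step probabilities.
step_probs = {
    "AGI is possible in principle": 0.90,        # placeholder
    "AGI is feasible in the foreseeable future": 0.65,  # placeholder
    "Agentic AGI is built": 0.80,                # placeholder
    "Misaligned AGI gains a decisive advantage": 0.40,  # Carlsmith's conditional estimate
    "Bad outcomes / extinction follow": 0.20,    # Carlsmith's conditional estimate
}

overall = 1.0
for step, p in step_probs.items():
    overall *= p

print(f"Overall probability under these inputs: {overall:.3f}")
```

With these particular placeholders the product comes out to roughly 4%, in the same ballpark as Carlsmith's 5%; small changes to any single step's probability shift the final figure noticeably, which is why estimates for this kind of conjunctive argument vary so widely between people.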

  1. This makes the path liable to the multiple stage fallacy. ↩︎

  2. This path only covers AI takeover, for a case that covers misuse, see here. ↩︎

  3. Some argue that an AI does not need to be as general as humans to be existentially dangerous. One could imagine a powerful narrow AI specialized in weapon-making or political influence having these effects. ↩︎

  4. Examples include Karnofsky’s PASTA or Bensinger’s STEM-level AGI. ↩︎

  5. This might also be possible in a multipolar scenario. ↩︎

  6. Carlsmith estimates a similar scenario to be 40% likely, conditional on the previous events. ↩︎

  7. Carlsmith estimates a similar scenario to be 20% likely, conditional on the previous events. ↩︎

  8. Use this guide to learn how to use the Analytica model. ↩︎