What is a path to AGI being an existential risk?

Here is a conjunctive path to AI catastrophe, inspired by Joe Carlsmith’s report on power-seeking AI. Each step is uncertain and depends on the previous one being realized. The tree branches into accident (misalignment) and misuse cases. Unlike Carlsmith, we do not assign probabilities to these steps, because credences vary greatly, but we argue that the end result is probable enough to warrant attention. You can find a more abstract version of this argument here.

The path goes something like this:

  1. Building human-level AGI is possible

  2. Humanity will have the means to build AGI soon-ish

  3. Humanity will build agentic AGI soon after it is possible

    1. It will be profitable to do so

    2. Some actors will rush for it, which might trigger an AI arms race

  4. Misalignment: A singleton[1] AGI is deployed by a well-intentioned actor but is misaligned (for reasons such as instrumental convergence, the orthogonality thesis, inner or outer misalignment, or a treacherous turn) and gains a decisive strategic advantage

    1. AGI can intellectually outmaneuver humanity

    2. AGI can act in the physical world or influence humans to act in ways that harm humans or humanity


  Misuse (the alternative branch at step 4): An AGI is deployed by a selfish or malicious actor, leading to bad outcomes for most or all humans

    1. AGI leads to [authoritarianism](/?state=6409&question=Isn't%20the%20real%20concern%20AI-enabled%20surveillance%3F)

    2. AGI is used for [terrorism](/?state=6410&question=Isn't%20the%20real%20concern%20AI%20being%20misused%20by%20terrorists%20or%20other%20bad%20actors%3F)

    3. [Autonomous weapons](/?state=6411&question=Isn't%20the%20real%20concern%20autonomous%20weapons%3F) are used to kill entire populations

    4. A [Chaos-GPT](https://decrypt.co/126122/meet-chaos-gpt-ai-tool-destroy-humanity)-like AI is deployed

      1. This leads to bad outcomes and possibly even human extinction

If you would like to try putting your own probabilities into a model, you can do that here. (For further explanation of how to use the Analytica model, you can read this guide.)
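As a rough illustration of how a conjunctive argument like this one combines probabilities, here is a minimal Python sketch. It is not the Analytica model, and every number in it is an arbitrary placeholder rather than an estimate from this article or from Carlsmith. The function name `path_probability` is hypothetical, and the sketch assumes that the misalignment and misuse branches can be treated as independent, so that the chance of at least one occurring is one minus the chance that neither does.

```python
from math import prod


def path_probability(shared_steps, branches):
    """Combine a conjunctive chain of steps with alternative final branches.

    shared_steps: conditional probabilities, each one given that all
        earlier steps have already happened.
    branches: probability of each alternative branch (for example
        misalignment and misuse), given that the shared steps happened.
        Assumption: the branches are independent of one another.
    """
    for p in list(shared_steps) + list(branches):
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"probability out of range: {p}")

    # Conjunctive part: every shared step has to happen.
    chain = prod(shared_steps)

    # Disjunctive part: at least one branch has to happen.
    at_least_one_branch = 1.0 - prod(1.0 - p for p in branches)

    return chain * at_least_one_branch


# Placeholder credences only; substitute your own.
shared = [0.9, 0.5, 0.8]  # steps 1 to 3 of the path above
branches = [0.2, 0.1]     # misalignment branch, misuse branch

print(f"Overall probability: {path_probability(shared, branches):.4f}")
```

Because the shared steps multiply together, lowering your credence in any single step lowers the overall result proportionally, which is why estimates for arguments of this shape vary so much from person to person.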

[1] This might also be possible in a multipolar scenario.