Why would a misaligned superintelligence kill us?

While an AI is unlikely to be malevolent towards humanity, we might still die as a result of its instrumental reasoning, with our deaths being either 1) an intentional goal or 2) a side-effect of some other goal:

1) As an intentional goal: The AI might see a risk of humans interfering with its goals, e.g., by trying to turn it off or by building a rival superintelligent AI. The AI might attempt to remove this risk by killing us.
2) As a side-effect: Whatever a superintelligent AI’s goals are, it’s unlikely that they would be best served by keeping the Earth broadly in its current state, unless the AI specifically values preserving the status quo (e.g., for the sake of humans and other life). Just as we flood large areas to build reservoirs for dams, fundamentally changing the ecosystem, an AI might undertake large-scale projects that make the world unlivable for humans, like enveloping the sun in solar panels, or that use the materials lying around the solar system as resources. This material could include humans themselves: “The AI does not hate you, nor does it love you, but you are made out of atoms which it can use for something else.”¹

The default outcome in either case is probably not just death on a large scale, but human extinction.

On the other hand, keeping humans around would take only a small fraction of a superintelligence’s resources. Some have argued that an AI might be willing to pay that small cost to keep us around, either if it’s only mostly misaligned and cares about us a little bit, or for various decision-theoretic reasons. That could look like anything from giving us free rein over a small part of the universe to putting us into a kind of zoo. Others think it will be unwilling to pay even that cost. And even if humanity survives like this, that’s not an ideal outcome: many of us might still die, survivors might not like their situation, and most of the universe would be outside of humanity’s reach forever.

¹ But any superintelligence that was both powerful and misaligned enough to consider taking humans apart for their atoms would be modifying the rest of the world radically enough to make human life impossible anyway.
