Why, in outline form, should we be concerned about advanced AI?

We sketch here a high-level path by which future AI might threaten human civilization.

First, it seems theoretically possible to build AIs capable of matching most of the cognitive abilities of a typical human. AI researchers also generally believe it is possible to build superintelligent AI that far exceeds human-level intelligence. Increased compute may be the main ingredient needed to build such an AI, in which case it could become technologically feasible to do so soon.

Second, actors such as corporations or governments will have incentives to build these powerful AIs as soon as they can.1 Some of these AIs will likely be made agentic2 since such AIs are expected to be more profitable.3

While non-agentic AIs are less likely to lead to AI takeover, there is a risk that they could be misused – for instance, to conduct invasive surveillance, carry out acts of terror, or simply cause chaos.

Third, the first agentic AIs that are more generally capable than humans4 may be misaligned.5

Finally, there’s a risk that such an AI will be driven to gain a decisive strategic advantage over humanity in order to further its misaligned goals. It may well be capable of doing so, leading to our disempowerment and, possibly, extinction. There are many ways this could play out, and it is hard to be confident in any specific scenario, but there are broad reasons to expect something along these lines.


  1. It’s possible that, once AI surpassing humans is perceived to be imminent, these actors will rush to be the first to build such an AI, triggering an arms race in which safety is de-emphasized. ↩︎

  2. It has been argued (e.g. here and here) that a non-agentic tool AI is likely to become agentic, either through its creators finding an agentic version more useful or through the AI modifying itself. ↩︎

  3. We note elsewhere that things can go wrong even with non-agentic AI, although this seems less likely. ↩︎

  4. The line for what counts as human-level is quite fuzzy, but in this scenario it is the combination of capability and generality that makes such an AI dangerous. ↩︎

  5. It’s also possible that the first superhuman AI to be deployed is safe, either because it is reasonably well aligned or because it is not agentic. But if it does not perform a pivotal act, a second such AI, deployed shortly afterwards by another actor, could be unsafe. ↩︎