Categories

Academia (6) · Actors (5) · Adversarial Training (7) · Agency (6) · Agent Foundations (20) · AGI (19) · AGI Fire Alarm (3) · AI Boxing (2) · AI Takeoff (8) · AI Takeover (6) · Alignment (5) · Alignment Proposals (10) · Alignment Targets (4) · Anthropic (1) · ARC (3) · Autonomous Weapons (1) · Awareness (6) · Benefits (2) · Brain-based AI (3) · Brain-computer Interfaces (1) · CAIS (2) · Capabilities (20) · Careers (14) · Catastrophe (29) · CHAI (1) · CLR (1) · Cognition (5) · Cognitive Superpowers (9) · Coherent Extrapolated Volition (2) · Collaboration (6) · Community (10) · Comprehensive AI Services (1) · Compute (9) · Consciousness (5) · Content (2) · Contributing (29) · Control Problem (7) · Corrigibility (8) · Deception (5) · Deceptive Alignment (8) · Decision Theory (5) · DeepMind (4) · Definitions (86) · Difficulty of Alignment (8) · Do What I Mean (2) · ELK (3) · Emotions (1) · Ethics (7) · Eutopia (5) · Existential Risk (29) · Failure Modes (13) · FAR AI (1) · Forecasting (7) · Funding (10) · Game Theory (1) · Goal Misgeneralization (13) · Goodhart's Law (3) · Governance (25) · Government (3) · Hedonium (1) · Human Level AI (5) · Human Values (11) · Inner Alignment (10) · Instrumental Convergence (5) · Intelligence (15) · Intelligence Explosion (7) · International (3) · Interpretability (17) · Inverse Reinforcement Learning (1) · Language Models (13) · Literature (4) · Living document (2) · LLM (9) · Machine Learning (20) · Maximizers (1) · Mentorship (8) · Mesa-optimization (6) · MIRI (2) · Misuse (4) · Multipolar (4) · Narrow AI (4) · Objections (60) · Open AI (2) · Open Problem (4) · Optimization (4) · Organizations (15) · Orthogonality Thesis (3) · Other Concerns (8) · Outcomes (5) · Outer Alignment (14) · Outreach (5) · People (4) · Philosophy (5) · Pivotal Act (1) · Plausibility (7) · Power Seeking (5) · Productivity (6) · Prosaic Alignment (7) · Quantilizers (2) · Race Dynamics (6) · Ray Kurzweil (1) · Recursive Self-improvement (6) · Regulation (3) · Reinforcement Learning (13) · Research Agendas (26) · Research Assistants (1) · Resources (19) · Robots (7) · S-risk (6) · Sam Bowman (1) · Scaling Laws (6) · Selection Theorems (1) · Singleton (3) · Specification Gaming (10) · Study (13) · Superintelligence (34) · Technological Unemployment (1) · Technology (3) · Timelines (14) · Tool AI (2) · Transformative AI (4) · Transhumanism (2) · Types of AI (2) · Utility Functions (3) · Value Learning (5) · What About (9) · Whole Brain Emulation (6) · Why Not Just (15)

Catastrophe

29 pages tagged "Catastrophe"
What about AI-enabled surveillance?
Is the worry that AI will become malevolent or conscious?
What about automated AI persuasion and propaganda?
Can we list the ways a task could go disastrously wrong and tell an AI to avoid them?
If I only care about helping people alive today, does AI safety still matter?
How quickly could an AI go from harmless to existentially dangerous?
How likely is it that an AI would pretend to be a human to further its goals?
How can I update my emotional state regarding the urgency of AI safety?
Are Google, OpenAI, etc. aware of the risk?
Wouldn't it be a good thing for humanity to die out?
Why might a maximizing AI cause bad outcomes?
Why is AI alignment a hard problem?
Why does AI takeoff speed matter?
What is a "warning shot"?
How likely is extinction from superintelligent AI?
What are the differences between AI safety, AI alignment, AI control, Friendly AI, AI ethics, AI existential safety, and AGI safety?
What are accident and misuse risks?
Can't we limit damage from AI systems in the same ways we limit damage from companies?
Will AI be able to think faster than humans?
What is perverse instantiation?
What about AI that is biased?
What is reward hacking?
Why would a misaligned superintelligence kill everyone?
What is the "sharp left turn"?
Wouldn't AIs need to have a power-seeking drive to pose a serious risk?
Might someone use AI to destroy human civilization?
What is the EU AI Act?
Why would misaligned AI pose a threat that we can’t deal with?
But won't we just design AI to be helpful?

AISafety.info

We’re a global team of specialists and volunteers from various backgrounds who want to ensure that the effects of future AI are beneficial rather than catastrophic.