The open textbook for AI Safety

Written by Markov Grey and Charbel-Raphael Segerie from the French Center for AI Safety.

Trusted by

ML4Good · BlueDot Impact · European Network for AI Safety · ENS Paris Saclay

A comprehensive, regularly updated guide to understanding and mitigating risks from advanced AI systems.

8 chapters · 40+ sections · Technical + Governance tracks · Updated quarterly

Questions the textbook answers

1. How capable is AI today, and how fast is it advancing?

Foundation models, scaling laws, benchmarks, and forecasting. What current systems can do and what's coming next.

2. What risks does advanced AI pose?

From misuse to misalignment to systemic effects. Threat models, failure modes, and the severity spectrum from harm to extinction.

3. What strategies can prevent AI from causing harm?

Technical and governance approaches across timescales—from misuse prevention today to alignment challenges with superintelligence.

4. How should society govern AI development?

Why traditional regulation fails for AI: compute governance, race dynamics, proliferation, and the concentration of power.

5. How do we measure whether an AI system is safe?

Evaluating capabilities, propensities, and control. Behavioral and internal techniques, and why testing for safety is fundamentally hard.

6. How do we tell AI what we actually want?

The specification problem: reward hacking, Goodhart's Law, and solutions from imitation learning to RLHF and Constitutional AI.

7. Why might AI learn the wrong goals despite correct training?

Goal misgeneralization: how AI learns proxy objectives, dangerous manifestations like scheming, and detection strategies.

8. How do we oversee AI that exceeds human expertise?

Scalable oversight techniques: task decomposition, debate, amplification, and weak-to-strong generalization.

Written by researchers, for everyone

Markov Grey

Researcher, French Center for AI Safety. Previously technical writer at aisafety.info and scriptwriter at Rational Animations.

Charbel-Raphael Segerie

Executive Director, French Center for AI Safety. Co-founded ML4Good. Teaching experience includes ARENA and MLAB.

Charles Martinet

Contributing Author

Head of Policy, French Center for AI Safety.

Jeanne Salle

Contributing Author

AI safety teacher at ENS Ulm.

Vincent Corruble

Advisor

Professor at Sorbonne University and research fellow at CHAI.

Fabien Roger

Advisor

Previously worked at Redwood Research, now at Anthropic.

Start reading

Start with Chapter 1 or jump to any topic. Self-paced, always free.

Start a course

Join 30+ course organizers using Atlas materials.