AI Safety

There is often no better way to learn a subject than to read the right book and work through the right exercises. A great textbook can make all the difference.

We believe that AI safety is in need of a great textbook — both to prepare students for research and to welcome researchers from other disciplines.

Writing this textbook is daunting, not least because of how much material we need to cover. It’s also daunting because AI safety is such a young field. We ~~may~~ do not yet know what the right questions are (let alone the right answers).

As such, writing this textbook poses a risk. The textbook defines the field, so the wrong textbook risks establishing the wrong definitions, questions, lines of inquiry, and priorities. That’s why the textbook usually comes well after a field has entered the regime of normal science.¹

Still, we think it’s worth the risk. And we think we can mitigate much of the risk by abandoning the nastier tropes of traditional textbooks.

A Different Kind of Textbook

First, we are working with a continuous and iterative release schedule. We’ll be publishing new chapters as they’re ready, and we’ll be updating existing chapters as we learn more. Version control is a git problem, not a publisher problem.²

Second, we’re sticking to an online format. This makes it easier to update and iterate, and it makes it easier to share and collaborate. It also lets us develop more interactive and engaging content. Our primary inspiration is Distill, an online journal for interactive machine learning research. (Eventually, we will release accompanying PDFs.)

Third, we’re outsourcing much of the writing, so we can move faster and cover more ground. Our primary role is to curate and edit. We’re also working with a team of expert reviewers to ensure that the content is accurate and up to date.

In Context

Currently, much of the AI safety literature is published on the Alignment Forum (AF). This has the advantage of making it easier and faster for researchers to publish and gather feedback. It has the disadvantage that the amount of material increases monotonically. Without a concerted effort at distillation, the barrier to entry grows and grows.

Our aim is to occupy one level of distillation above the “sequences” of the AF. In doing so, we hope to provide a self-contained introduction to the field—enough that an advanced undergrad or graduate student can get up to speed and start contributing to the field.

We’re also aiming for this textbook to become the basis of courses such as AGI Safety Fundamentals, Alignment 201, Intro to ML Safety, AI Safety Camp, ARENA, and MLAB. Our audience includes a wide range of different people: researchers, students, engineers, and those working in public policy. We don’t expect all content to be relevant to all readers, and we suggest that readers skip over content that doesn’t concern them.

About the Editors

Jesse Hoogland is a SERI MATS scholar under the Deceptive Alignment stream with Evan Hubinger. He has a master’s degree in theoretical physics from the University of Amsterdam.

Aaliya Manji has built her career in Education, with the mission of making learning more accessible. She has a degree in Electrical and Electronic Engineering from UCL, with a specific focus on Intelligent Systems.

Footnotes

¹ There are precedents for textbooks written during the revolution. For example, Nielsen & Chuang’s Quantum Computation and Quantum Information was first published in 2000, only six years after Shor’s algorithm and eight years after the Deutsch-Jozsa algorithm.
² This makes citations more of a hassle, and we are working on a solution.

