Aligning AI

Introduction

The Alignment Problem

Timelines & Takeoffs

Part 1. Foundations

Analytic Geometry

Vector Calculus

Information Theory & Probability Theory

Dynamical Systems

Computer Science

Biology & Neuroscience

Part 2. Machine Learning

Neural Networks

Modeling Objectives

Training at Scale

Reinforcement Learning

Adversarial Training

Part 3. Central Problems in AI Safety

Learning Values

Agent Foundations

Learning from Humans

Decomposing Tasks

Interpretability