Aligning AI
Introduction
0 Preface
1 The Alignment Problem
2 Timelines & Takeoffs
3 Outline
Part 1. Foundations
4 Logic
5 Linear Algebra
5.1 Algebraic Structures
5.2 Vector Spaces
5.3 Matrices
5.4 Systems of Linear Equations
5.5 Linear Independence, Basis, and Rank
5.6 Linear Mappings
5.7 Further Reading
6 Analytic Geometry
7 Vector Calculus
8 Information Theory & Probability Theory
9 Dynamical Systems
10 Computer Science
11 Physics
12 Biology & Neuroscience
13 Psychology
14 Economics
Part 2. Machine Learning
15 Neural Networks
16 Optimization
17 Transformers
18 Modeling Objectives
19 Training at Scale
20 Reinforcement Learning
21 Adversarial Training
Part 3. Central Problems in AI Safety
22 Learning Values
23 Power-Seeking
24 Agent Foundations
25 Learning from Humans
26 Decomposing Tasks
27 Interpretability
28 Governance