Chapter Dependencies & Reading Order

Understanding the prerequisite relationships between chapters is essential for both the writing process and for readers who wish to take non-linear paths through the book. This page maps every dependency, highlights the critical path, and suggests several alternative reading orders.

Interdependency Graph

Solid arrows represent hard prerequisites: a reader must have completed the source chapter before the target is accessible. Dashed arrows indicate soft dependencies where familiarity helps but is not mandatory. Chapter nodes colored in red receive deep treatment (≥25 pages with derivations).

```mermaid
graph LR
    C1((Ch 1<br/>Intro)) --> C2((Ch 2<br/>Math))
    C2 --> C3((Ch 3<br/>Classical))
    C2 --> C4((Ch 4<br/>Embeddings))
    C3 --> C5((Ch 5<br/>Seq Models))
    C4 --> C5
    C5 --> C6((Ch 6<br/>Attention)):::deep
    C6 --> C7((Ch 7<br/>Seq2Seq))
    C6 --> C8((Ch 8<br/>Transformer)):::deep
    C7 --> C8
    C8 --> C9((Ch 9<br/>Pre-train)):::deep
    C8 --> C10((Ch 10<br/>Tokenize))
    C9 --> C11((Ch 11<br/>Scaling))
    C9 --> C12((Ch 12<br/>Alignment)):::deep
    C9 --> C13((Ch 13<br/>ICL))
    C11 -.-> C12
    C12 -.-> C13
    C13 -.-> C14((Ch 14<br/>RAG))
    C10 -.-> C14

    classDef deep fill:#e74c3c,color:white,stroke:#c0392b
    classDef default fill:#3498db,color:white,stroke:#2980b9
    linkStyle 0,1,2,3,4,5,6,7,8,9,10,11,12,13 stroke:#2c3e50,stroke-width:2px
    linkStyle 14,15,16,17 stroke:#95a5a6,stroke-width:1px,stroke-dasharray:5
```

Critical Path

The longest chain of hard prerequisites determines the minimum sequential reading required to reach the most advanced material. That path is:

Ch 1 (Introduction) → Ch 2 (Math Foundations) → Ch 4 (Word Representations) → Ch 5 (Sequence Models) → Ch 6 (Attention) → Ch 7 (Seq2Seq & Decoding) → Ch 8 (Transformer) → Ch 9 (Pre-training) → Ch 12 (Alignment)

This nine-chapter spine covers approximately 200 pages and takes a reader from probability basics through RLHF. (Ch 3 can stand in for Ch 4 on the spine; Ch 5 requires both, so the two chains are equally long.) Every other chapter branches off this spine and can be read in parallel once its prerequisites on the spine are met.
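The critical path can be checked mechanically: it is the longest path in the DAG of hard-prerequisite edges. A minimal Python sketch, with the edges transcribed from the dependency table on this page (the function and variable names are illustrative, not from the book):

```python
# Hard prerequisites per chapter, transcribed from the dependency table.
HARD_PREREQS = {
    1: [], 2: [1], 3: [1, 2], 4: [1, 2], 5: [1, 3, 4],
    6: [1, 5], 7: [1, 6], 8: [1, 6, 7], 9: [1, 8], 10: [1, 8],
    11: [1, 9], 12: [1, 9], 13: [1, 9], 14: [1], 15: [1],
}

def longest_chain(target):
    """Longest chain of hard prerequisites ending at `target`, inclusive.

    Ties (e.g. Ch 3 vs Ch 4 as the route into Ch 5) are broken by
    whichever prerequisite is listed first.
    """
    best = []
    for p in HARD_PREREQS[target]:
        chain = longest_chain(p)
        if len(chain) > len(best):
            best = chain
    return best + [target]

spine = longest_chain(12)
print(spine)       # nine chapters, ending at Ch 12 (Alignment)
print(len(spine))  # 9
```

The plain recursion is fine here; with only 15 nodes, memoization or an explicit topological sort would be overkill.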

Full Dependency Table

The table below reads in both directions: it lists what each chapter requires and which later chapters it unlocks. Chapters marked (DEEP) receive extended treatment with derivations and implementation examples.

| Ch | Title | Hard Prerequisites | Required By |
|----|-------|--------------------|-------------|
| 1 | Introduction | — | All |
| 2 | Math Foundations | Ch 1 | Ch 3, Ch 4 |
| 3 | Classical LMs | Ch 1, Ch 2 | Ch 5 |
| 4 | Word Representations | Ch 1, Ch 2 | Ch 5 |
| 5 | Sequence Models | Ch 1, Ch 3, Ch 4 | Ch 6 |
| 6 | Attention (DEEP) | Ch 1, Ch 5 | Ch 7, Ch 8 |
| 7 | Seq2Seq & Decoding | Ch 1, Ch 6 | Ch 8 |
| 8 | Transformer (DEEP) | Ch 1, Ch 6, Ch 7 | Ch 9, Ch 10 |
| 9 | Pre-training (DEEP) | Ch 1, Ch 8 | Ch 11, Ch 12, Ch 13 |
| 10 | Tokenization | Ch 1, Ch 8 | — |
| 11 | Scaling Laws | Ch 1, Ch 9 | — |
| 12 | Alignment (DEEP) | Ch 1, Ch 9 | — |
| 13 | ICL & Prompting | Ch 1, Ch 9 | — |
| 14 | RAG & Agents | Ch 1 | — |
| 15 | Ethics & Future | Ch 1 | — |
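Because the table is bidirectional, one column is derivable from the other. A short Python sketch (names are ours, not the book's) inverts the hard-prerequisite edges and recovers the "Required By" column, which is a useful consistency check when editing either side:

```python
from collections import defaultdict

# Hard prerequisites per chapter, transcribed from the table above.
HARD_PREREQS = {
    1: [], 2: [1], 3: [1, 2], 4: [1, 2], 5: [1, 3, 4],
    6: [1, 5], 7: [1, 6], 8: [1, 6, 7], 9: [1, 8], 10: [1, 8],
    11: [1, 9], 12: [1, 9], 13: [1, 9], 14: [1], 15: [1],
}

# Invert the edges: chapter -> chapters that list it as a prerequisite.
required_by = defaultdict(list)
for ch, prereqs in HARD_PREREQS.items():
    for p in prereqs:
        required_by[p].append(ch)

print(sorted(required_by[1]))  # every other chapter: the "All" entry
print(sorted(required_by[9]))  # [11, 12, 13]
```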

Suggested Reading Paths

Not every reader needs to follow the linear chapter order. The dependency structure supports several focused paths depending on background and goals.

Sequential (Full Book)

Chapters 1 through 15 in order. Recommended for graduate students taking a full-semester course. Approximately 335 pages, 15–16 weeks at one chapter per week.

Fast Track (Core Theory)

Ch 1 → Ch 2 → Ch 6 → Ch 8 → Ch 9 → Ch 12

Six chapters (~155 pages). Skips the historical build-up and jumps directly to attention, transformers, pre-training, and alignment. Assumes mathematical maturity. Best for readers with prior exposure to deep learning who want to understand LLMs specifically.

Practitioner Path

Ch 1 → Ch 8 → Ch 9 → Ch 12 → Ch 13 → Ch 14

Six chapters (~150 pages). Focuses on the transformer architecture and on how models are trained, aligned, prompted, and deployed, with minimal theory. Ideal for software engineers building on top of LLMs who need working intuition without full mathematical derivations.

Theory Path

Ch 1 → Ch 2 → Ch 3 → Ch 4 → Ch 5 → Ch 6 → Ch 8 → Ch 11

Eight chapters (~185 pages). Emphasizes the mathematical progression from count-based models through neural sequence models to scaling laws. Suitable for researchers interested in the theoretical underpinnings of language modeling.

Ethics & Policy Path

Ch 1 → Ch 9 → Ch 12 → Ch 15

Four chapters (~90 pages). Provides enough technical context to understand alignment challenges and their societal implications. Designed for policy analysts, journalists, and non-technical stakeholders.
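Each focused path deliberately skips some hard prerequisites and trusts the reader's background to cover them. That trade-off can be made explicit with a small checker, sketched here in Python (function names are ours; the prerequisite data is transcribed from the dependency table):

```python
# Hard prerequisites per chapter, transcribed from the dependency table.
HARD_PREREQS = {
    1: [], 2: [1], 3: [1, 2], 4: [1, 2], 5: [1, 3, 4],
    6: [1, 5], 7: [1, 6], 8: [1, 6, 7], 9: [1, 8], 10: [1, 8],
    11: [1, 9], 12: [1, 9], 13: [1, 9], 14: [1], 15: [1],
}

def assumed_background(path):
    """Hard prerequisites a reading path skips, i.e. chapters whose
    material the reader is assumed to already know."""
    read, skipped = set(), set()
    for ch in path:
        skipped |= set(HARD_PREREQS[ch]) - read
        read.add(ch)
    return sorted(skipped)

fast_track = [1, 2, 6, 8, 9, 12]
practitioner = [1, 8, 9, 12, 13, 14]
print(assumed_background(fast_track))    # [5, 7]
print(assumed_background(practitioner))  # [6, 7]
```

For the full sequential order the checker returns an empty list, confirming that chapters 1–15 in order never read a chapter before its hard prerequisites.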