Predicting the Next Words
From N-grams to LLMs
Language Modeling from Shannon to GPT
A Springer textbook exploring how machines learn to predict the next word — from classical n-grams through neural networks to modern large language models.
Explore the Book →

What You'll Learn
Chapters 1–3
Master the mathematical tools for evaluating language models — probability, information theory, perplexity — and understand how classical n-gram models set the stage.
Chapters 4–7
From word embeddings to attention mechanisms, trace how neural networks transformed language modeling with distributed representations and sequence processing.
Chapters 8–11
Dive deep into the Transformer architecture, pre-training paradigms like BERT and GPT, tokenization strategies, and the scaling laws that drive modern AI.
Chapters 12–15
Explore alignment (RLHF, DPO), in-context learning, retrieval-augmented generation, agents, and the ethical challenges of language AI.
The Intellectual Arc
Start Exploring
Table of Contents
Browse all 15 chapters with section outlines and metadata.
Reading Paths
Suggested routes through the book for different backgrounds.
Dependency Graph
Visualize how chapters build on each other.
Dashboard
Track progress, search content, and explore statistics.
Chapter Drafts (15 of 15 — COMPLETE)
Full chapter content with math rendering and code examples. View all drafts →