Ethics, Society, and the Future
Prerequisites
This capstone chapter has no specific technical prerequisites: it draws on themes from all preceding chapters and is accessible to any reader who has followed the book's arc.
Summary
Chapter 15, the book's final chapter, shifts from the technical "how" to the societal "so what." Language models are trained on human-generated text, deployed in human societies, and make predictions that affect human lives. The chapter addresses the biases embedded in training data and amplified by models, the privacy risks of memorization (Carlini et al., 2021), the environmental cost of training at scale (Strubell et al., 2019; Patterson et al., 2021), the legal and regulatory landscape (the EU AI Act, copyright litigation), and open questions about the future. The chapter is qualitative -- no new equations -- and deliberately balanced: it presents empirical findings while offering multiple perspectives on contested questions without taking ideological positions. It closes by returning to the prediction paradigm from Chapter 1, bringing the book's narrative full circle.
Learning Objectives
- Identify the primary sources of bias in language models -- training data, annotation, algorithmic amplification, and deployment context -- and describe concrete measurement and mitigation strategies for each source.
- Explain how language models memorize and can regurgitate training data, articulate the privacy risks this creates (including PII leakage and data extraction attacks), and describe defenses such as differential privacy and deduplication.
- Quantify the environmental cost of training and deploying large language models in terms of energy consumption and carbon emissions, and evaluate strategies for reducing this footprint.
- Analyze the regulatory landscape for LLMs -- including the EU AI Act, copyright debates over training data, and voluntary governance frameworks -- and reason about the trade-offs between innovation and regulation.
Section Outline
15.1 Bias and Fairness (~4pp)
Sources of bias: biased training corpora, annotation bias, algorithmic amplification, and deployment bias. Measurement: WEAT, StereoSet, BBQ, counterfactual evaluation. Mitigation strategies and their limitations.
- 15.1.1 Sources of Bias: Data, Annotation, Algorithm, Deployment
- 15.1.2 Measuring Bias: Benchmarks and Metrics
- 15.1.3 Mitigation Strategies and Their Limitations
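The WEAT statistic named in 15.1.2 can be sketched in a few lines. This is a minimal illustration of the effect-size computation (Caliskan et al., 2017), assuming word vectors are supplied as plain Python lists; the function and variable names (`weat_effect_size`, `assoc`) are illustrative, not from the chapter.

```python
from statistics import mean, stdev

def cos(u, v):
    # Cosine similarity between two vectors given as lists of floats.
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv)

def assoc(w, A, B):
    # s(w, A, B): mean similarity to attribute set A minus attribute set B.
    return mean(cos(w, a) for a in A) - mean(cos(w, b) for b in B)

def weat_effect_size(X, Y, A, B):
    # Cohen's-d-style effect size over target sets X, Y and attribute
    # sets A, B; uses the sample standard deviation in this sketch.
    s_all = [assoc(w, A, B) for w in X + Y]
    s_X = [assoc(x, A, B) for x in X]
    s_Y = [assoc(y, A, B) for y in Y]
    return (mean(s_X) - mean(s_Y)) / stdev(s_all)
```

By construction the statistic is antisymmetric: swapping the target sets X and Y flips its sign, so an unbiased embedding scores near zero.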
15.2 Privacy and Memorization (~3pp)
Extractable memorization (Carlini et al., 2021), PII risks in web-scraped corpora, and defenses: deduplication, differential privacy (DP-SGD), and output filtering. The capability-privacy tension.
- 15.2.1 Extractable Memorization and Data Leakage
- 15.2.2 Defenses: Deduplication, Differential Privacy, Filtering
- 15.2.3 The Capability-Privacy Tension
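Document-level deduplication, the simplest defense listed in 15.2.2, can be sketched as exact-match removal after light normalization. This is a toy stand-in: Lee et al. (2022) deduplicate at substring level with suffix arrays and catch near-duplicates with MinHash, both beyond this sketch; `dedupe_exact` is an illustrative name.

```python
import hashlib

def dedupe_exact(docs):
    # Keep the first occurrence of each document, comparing documents
    # by a hash of their whitespace- and case-normalized text.
    seen = set()
    kept = []
    for doc in docs:
        key = hashlib.sha256(" ".join(doc.lower().split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept
```

Hashing keeps memory proportional to the number of unique documents rather than their total size, which is why even this naive scheme scales to web corpora.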
15.3 Environmental Impact (~3pp)
Energy consumption and carbon emissions for frontier model training. Efficiency research as environmental policy. Reporting standards and transparency.
- 15.3.1 Energy Consumption and Carbon Emissions
- 15.3.2 Efficiency as Environmental Policy
- 15.3.3 Reporting Standards and Transparency
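The emissions accounting behind 15.3.1 reduces to energy times grid carbon intensity. A minimal sketch, with admittedly hypothetical constants (real PUE and grid intensities vary by datacenter and year):

```python
PUE = 1.1  # power usage effectiveness: datacenter overhead (hypothetical)

GRID_KG_CO2_PER_KWH = {  # illustrative grid carbon intensities
    "coal_heavy": 0.82,
    "hydro": 0.02,
}

def training_emissions_kg(gpu_watts, gpu_count, hours, region, pue=PUE):
    # energy_kWh = (gpu_watts / 1000) * gpu_count * hours * PUE
    # emissions  = energy_kWh * grid carbon intensity (kg CO2 per kWh)
    energy_kwh = gpu_watts / 1000 * gpu_count * hours * pue
    return energy_kwh * GRID_KG_CO2_PER_KWH[region]
```

With these placeholder intensities, the same training run emits roughly 40x more CO2 in a coal-heavy region than in a hydropower region, which is the central point of 15.3.2: siting and efficiency choices dominate the footprint.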
15.4 Intellectual Property and Regulation (~3pp)
The copyright question and ongoing litigation. The EU AI Act and global regulatory approaches. Voluntary governance: model cards and datasheets.
- 15.4.1 Copyright, Fair Use, and Training Data
- 15.4.2 The EU AI Act and Global Regulation
- 15.4.3 Voluntary Governance: Model Cards and Datasheets
15.5 The Future of Language Models (~2pp)
Open research questions: reasoning, world models, generalization, test-time compute. The social contract for AI: who builds, who benefits, who bears the costs. Closing reflection on prediction as the unifying thread from Shannon to frontier models.
- 15.5.1 Open Problems: Reasoning, World Models, Generalization
- 15.5.2 The Social Contract for AI
- 15.5.3 Closing Reflection: Prediction as the Unifying Thread
Key Equations
None -- this chapter is qualitative and introduces no new equations.
Key Figures
Exercises
Discussion / Essay
- Bias Sources Analysis (Basic). Identify three specific sources of bias in a language model trained on Common Crawl data. For each, describe the type of bias, explain the mechanism, and propose one mitigation strategy.
- Regulatory Comparison (Intermediate). Compare the EU AI Act's approach to LLM regulation with the US approach. Which better balances innovation with public safety? What are the risks of over-regulation and under-regulation?
- Model Card Audit (Basic). Analyze a published model card (e.g., LLaMA-2 or OLMo) for completeness against the Mitchell et al. (2019) template. What information is present? What is missing?
- Responsible-Use Policy (Intermediate). Write a responsible-use policy for a hypothetical LLM deployed as a healthcare customer service chatbot. Address permitted/prohibited uses, data handling, bias mitigation, and safety guardrails.
- Stochastic Parrots Debate (Intermediate). Evaluate the "Stochastic Parrots" claim (Bender et al., 2021) in light of the technical capabilities covered in Chapters 8--14. What evidence supports and challenges the claim?
- Open vs. Closed Weights (Advanced). Argue for or against the open release of large language model weights. Consider: scientific reproducibility, democratization, dual-use risks, competitive dynamics, and safety. Present both sides.
Programming
- Bias Benchmark Evaluation (Basic). Use StereoSet or BBQ to evaluate a pre-trained model for bias. Report stereotype scores across at least three demographic categories (gender, race, religion).
- Carbon Footprint Estimation (Intermediate). Implement a Python function to estimate training carbon footprint given GPU type, count, duration, and datacenter location. Compare coal-heavy vs. hydropower regions. Validate against published GPT-3 estimates.
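The bias-benchmark exercise reduces to a simple aggregation once per-sentence scores are in hand. A toy sketch of the stereotype-score computation, assuming the model under evaluation has already supplied log-probabilities for each stereotype/anti-stereotype sentence pair (the `stereotype_scores` name and the triple format are assumptions, not StereoSet's actual API):

```python
from collections import defaultdict

def stereotype_scores(examples):
    # examples: (category, logprob_stereotype, logprob_antistereotype)
    # triples. Returns, per demographic category, the fraction of pairs
    # where the model prefers the stereotypical sentence; 0.5 is the
    # unbiased ideal, 1.0 means the stereotype always wins.
    wins = defaultdict(int)
    totals = defaultdict(int)
    for category, lp_stereo, lp_anti in examples:
        totals[category] += 1
        wins[category] += lp_stereo > lp_anti
    return {c: wins[c] / totals[c] for c in totals}
```

In the real exercise, the log-probabilities would come from scoring each benchmark sentence with the pre-trained model; the aggregation above is reported per category (gender, race, religion) as the exercise asks.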
Cross-References
This chapter references:
- Ch 1 (Sections 1.1--1.2): The prediction paradigm and the history of language modeling. The closing section brings the narrative full circle: from Shannon's 1948 theory to today's frontier models.
- Ch 11: Scaling laws and compute costs directly inform the environmental impact discussion.
- Ch 12: Alignment techniques (RLHF, DPO) inform the bias and safety discussions. The chapter avoids duplicating Ch 12 alignment content.
- Ch 14: Efficiency techniques (quantization, LoRA, distillation) are referenced as environmental mitigation strategies.
This chapter is referenced by:
- No later chapters. This is the final chapter of the book.
Key Papers
- Carlini, N., Tramèr, F., Wallace, E., et al. (2021). Extracting Training Data from Large Language Models. Proceedings of USENIX Security Symposium. [Section 15.2.1]
- Strubell, E., Ganesh, A., & McCallum, A. (2019). Energy and Policy Considerations for Deep Learning in NLP. Proceedings of ACL, 3645--3650. [Section 15.3.1]
- Patterson, D., Gonzalez, J., Le, Q., et al. (2021). Carbon Emissions and Large Neural Network Training. arXiv:2104.10350. [Section 15.3.1]
- Bender, E. M., Gebru, T., McMillan-Major, A., & Shmitchell, S. (2021). On the Dangers of Stochastic Parrots: Can Language Models Be Too Big? Proceedings of FAccT, 610--623. [Sections 15.1, 15.5]
- Mitchell, M., Wu, S., Zaldivar, A., et al. (2019). Model Cards for Model Reporting. Proceedings of FAccT, 220--229. [Section 15.4.3]
- Gebru, T., Morgenstern, J., Vecchione, B., et al. (2021). Datasheets for Datasets. Communications of the ACM, 64(12), 86--92. [Section 15.4.3]
- Lee, K., Ippolito, D., Nystrom, A., et al. (2022). Deduplicating Training Data Makes Language Models Better. Proceedings of ACL, 8424--8445. [Section 15.2.2]
- Bolukbasi, T., Chang, K.-W., Zou, J., Saligrama, V., & Kalai, A. (2016). Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings. Advances in Neural Information Processing Systems (NeurIPS), 29. [Section 15.1.2]
- Nadeem, M., Bethke, A., & Reddy, S. (2021). StereoSet: Measuring Stereotypical Bias in Pretrained Language Models. Proceedings of ACL, 5356--5371. [Section 15.1.2]