Text Analytics (dv-) HS25

Natural Language Processing and Text Analytics

28 PDFs
5 Notebooks
13 Sessions

Course Overview

This hands-on course takes you from statistical language models to modern transformer architectures through discovery-based learning. You will build practical NLP systems using PyTorch while understanding the mathematical foundations that power today's large language models.

What You'll Learn

  • Build neural language models from scratch using PyTorch
  • Master transformer architectures and attention mechanisms
  • Fine-tune pre-trained models for sentiment analysis and text classification
  • Implement decoding strategies for controllable text generation
  • Understand efficiency techniques: quantization, pruning, and knowledge distillation
  • Apply ethical AI principles to NLP systems

Prerequisites: Programming experience in Python. No prior deep learning knowledge required. Mathematics: basic calculus and linear algebra helpful but not mandatory.

Learning Path

The course follows a progressive structure with three main phases:

Phase 1 Foundations

Weeks 1-5

  • N-Grams & Statistical Models
  • Word Embeddings
  • Neural Networks Primer
  • RNNs & LSTMs

Phase 2 Core Architectures

Weeks 6-9

  • Sequence-to-Sequence
  • Transformers
  • Multi-Agent LLMs
  • Decoding Strategies

Phase 3 Applications

Weeks 10-13

  • Fine-tuning & Transfer
  • Efficiency & Deployment
  • Ethics in NLP
  • Final Projects

Frequently Asked Questions

What programming experience do I need?
Intermediate Python experience is required. You should be comfortable with functions, classes, lists, dictionaries, and file I/O. Experience with NumPy is helpful but not required; we'll cover the basics needed for deep learning.
What software do I need to install?
You need Python 3.8+, PyTorch, NumPy, Matplotlib, and JupyterLab. Alternatively, you can use Google Colab, which runs in the cloud and requires no local setup. Detailed installation instructions are provided in the first session.
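If you install locally, a quick check like the following can confirm that the listed software is present before the first session. This is a minimal sketch, not part of the official course materials: the helper name check_environment is hypothetical, and the import names (torch, numpy, matplotlib, jupyterlab) are assumed to be the standard ones for the packages listed above.

```python
import importlib.util
import sys

# Import names assumed for the packages named in the FAQ answer.
REQUIRED_PACKAGES = ["torch", "numpy", "matplotlib", "jupyterlab"]
MIN_PYTHON = (3, 8)  # "Python 3.8+" from the FAQ answer


def check_environment(packages=REQUIRED_PACKAGES, min_python=MIN_PYTHON):
    """Return a dict mapping each requirement to True (found) or False."""
    report = {"python": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a top-level package is not installed;
        # it does not import the package itself.
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item:12s} {'OK' if ok else 'MISSING'}")
```

Any line reported as MISSING points to a package to install. The official installation instructions from the first session take precedence over this sketch.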
How is the course graded?
There are three assessments: a short presentation (Kurzpräsentation, 20%, individual 5-minute talk), a mid-term presentation (Zwischenpräsentation, 10%, team), and a final presentation (Abschlusspräsentation, 70%, team final project). There are no written exams. See the Assignments page for details.
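To see how the three weights combine, here is a small illustrative calculation. It assumes that all three assessments are graded on the same scale and combined as a plain weighted average. The course page states only the percentages, so treat the formula, the function name weighted_grade, and the example scores as assumptions.

```python
# Weights from the grading answer above; the linear combination is an assumption.
WEIGHTS = {
    "short_presentation": 0.20,    # Kurzpräsentation
    "midterm_presentation": 0.10,  # Zwischenpräsentation
    "final_presentation": 0.70,    # Abschlusspräsentation
}


def weighted_grade(scores):
    """Combine per-assessment scores into one weighted average."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    example = {
        "short_presentation": 5.0,
        "midterm_presentation": 4.0,
        "final_presentation": 6.0,
    }
    print(f"{weighted_grade(example):.2f}")
```

Under this assumption the final project dominates the result: a one-point change in the final presentation moves the overall grade by 0.7 points, compared with 0.2 for the short presentation and 0.1 for the mid-term.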
Can I access materials after the course ends?
Yes! All course materials are available on GitHub and will remain accessible indefinitely. Lecture slides, notebooks, and exercises can be downloaded for offline use.