Deep Reinforcement Learning

MSc Course | University of Twente

Course Overview

An advanced course on deep reinforcement learning with applications to trading, portfolio management, and financial decision-making. It explores how agents learn to make sequential decisions in uncertain financial environments, combining theoretical foundations with hands-on implementation.

Information Systems for the Financial Services Industry, University of Twente, Netherlands, Spring 2025. Coordinator. Developed curriculum.

Topics Covered

Foundations

Markov Decision Processes (MDPs) – States, actions, transitions, rewards
Value Functions – State-value and action-value functions
Bellman Equations – Optimality conditions and dynamic programming
Policy Evaluation and Improvement – Iterative methods for policy optimization
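
To make the Bellman optimality condition concrete, the following is a minimal sketch of value iteration on a tiny MDP. The two-state, two-action MDP, its rewards, and the discount factor of 0.9 are invented for illustration and are not taken from the course material.

```python
# Hypothetical two-state, two-action MDP used purely for illustration.
# P[s][a] is a list of (probability, next_state, reward) tuples.
P = {
    0: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 1.0)]},
    1: {0: [(1.0, 0, 0.0)], 1: [(1.0, 1, 2.0)]},
}
GAMMA = 0.9  # assumed discount factor


def q_value(values, state, action, gamma=GAMMA):
    """Expected one-step return of taking `action` in `state`."""
    return sum(p * (r + gamma * values[s2]) for p, s2, r in P[state][action])


def value_iteration(gamma=GAMMA, tol=1e-10, max_iters=10_000):
    """Apply the Bellman optimality backup until the values stop changing."""
    values = {s: 0.0 for s in P}
    for _ in range(max_iters):
        new_values = {
            s: max(q_value(values, s, a, gamma) for a in P[s]) for s in P
        }
        delta = max(abs(new_values[s] - values[s]) for s in P)
        values = new_values
        if delta < tol:
            break
    # Greedy policy with respect to the converged values.
    policy = {
        s: max(P[s], key=lambda a: q_value(values, s, a, gamma)) for s in P
    }
    return values, policy


values, policy = value_iteration()
print(values, policy)
```

In this toy problem the greedy policy picks action 1 in both states, and the values approach 19 and 20, which can be checked by hand from the Bellman equations.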

Deep Q-Learning

DQN – Deep Q-Networks with experience replay
Double DQN – Addressing overestimation bias
Dueling DQN – Separate value and advantage streams
Prioritized Experience Replay – Efficient sample utilization
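
The difference between the DQN and Double DQN targets can be sketched in a few lines. The function names and the Q-values below are hypothetical. In practice the Q-values would come from the online and target networks rather than hand-written lists.

```python
def dqn_target(reward, done, q_target_next, gamma=0.99):
    """Standard DQN target: the target network selects and evaluates."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def double_dqn_target(reward, done, q_online_next, q_target_next, gamma=0.99):
    """Double DQN target: online network selects, target network evaluates."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=q_online_next.__getitem__)
    return reward + gamma * q_target_next[best_action]


# Made-up next-state Q-values over three actions (e.g. sell, hold, buy).
q_online_next = [1.0, 2.5, 2.0]
q_target_next = [1.2, 1.8, 3.0]

y_dqn = dqn_target(0.5, False, q_target_next)
y_double = double_dqn_target(0.5, False, q_online_next, q_target_next)
print(y_dqn, y_double)
```

Here the online network prefers action 1 while the target network's largest estimate belongs to action 2, so the Double DQN target is lower than the DQN target. Decoupling selection from evaluation in this way is what reduces the overestimation bias.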

Policy Gradient Methods

REINFORCE – Monte Carlo policy gradient
A2C / A3C – Advantage actor-critic with parallel environments
PPO – Proximal Policy Optimization for stable training
SAC – Soft Actor-Critic with entropy regularization
TD3 – Twin Delayed Deep Deterministic Policy Gradient
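
As a small illustration of the Monte Carlo policy gradient idea behind REINFORCE, the sketch below computes discounted returns-to-go and the corresponding surrogate loss for one episode. The rewards, log-probabilities, and discount factor are made up. A real implementation would take the log-probabilities from a policy network and differentiate the loss with an autodiff library such as PyTorch.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def reinforce_loss(log_probs, returns):
    """Surrogate loss whose gradient is the REINFORCE gradient estimate."""
    return -sum(lp * g for lp, g in zip(log_probs, returns))


# Hypothetical three-step episode.
rewards = [1.0, 0.0, 2.0]
returns = discounted_returns(rewards, gamma=0.5)

# Assume the policy assigned probability 0.5 to each action it took.
log_probs = [math.log(0.5)] * 3
loss = reinforce_loss(log_probs, returns)
print(returns, loss)
```

Actor-critic methods such as A2C replace the raw return in this loss with an advantage estimate to reduce variance.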

Advanced Topics

Multi-Agent RL – Market simulation with competing agents
Model-Based RL – Learning environment dynamics
Explainable RL – Interpretable policies for compliance with the European AI Act
Offline RL – Learning from historical trading data

Financial Applications

Students implement RL agents for real-world financial problems:

Stock Trading Strategies – Learning buy/sell/hold policies from market data
Portfolio Rebalancing – Dynamic asset allocation with transaction costs
Market Making – Optimal bid-ask spread management
Risk Management – Hedging strategies and drawdown control
Optimal Execution – Minimizing market impact in large orders
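
One possible way to frame portfolio rebalancing as an RL problem is to reward the agent with the one-period portfolio return net of trading costs. The sketch below assumes a proportional cost on turnover and ignores weight drift between rebalancing dates. The function name, cost rate, and numbers are illustrative assumptions, not the course's reference environment.

```python
def rebalancing_reward(old_weights, new_weights, asset_returns, cost_rate=0.001):
    """One-step reward: return under the new weights minus proportional costs."""
    # Turnover is the total absolute change in portfolio weights.
    turnover = sum(abs(n - o) for n, o in zip(new_weights, old_weights))
    gross_return = sum(w * r for w, r in zip(new_weights, asset_returns))
    return gross_return - cost_rate * turnover


# Hypothetical two-asset example: shift 20% of the portfolio into asset 0.
old_weights = [0.5, 0.5]
new_weights = [0.7, 0.3]
asset_returns = [0.02, -0.01]  # made-up one-period returns

reward = rebalancing_reward(old_weights, new_weights, asset_returns)
print(reward)
```

With a cost term in the reward, an agent is penalized for trading too often, so the learned policy has to weigh expected gains against turnover.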

Doctoral Training

RL methods are also covered at the doctoral level:

Reinforcement Learning for Finance, University of Twente, Netherlands, June 2024. Co-Organizer and Trainer.
European Summer School in Financial Mathematics, TU Delft, Netherlands, September 2023. Lecturer on Deep Reinforcement Learning.

Prerequisites

Machine Learning fundamentals
Python programming (NumPy, PyTorch or TensorFlow)
Probability and statistics
Linear algebra

Assessment

Implementation project (50%)
Written report (30%)
Presentation (20%)