methods-algorithms

MSc Data Science: Methods and Algorithms - Course materials with Python infrastructure


Information

| Property | Value |
| Language | TeX |
| Stars | 0 |
| Forks | 0 |
| Watchers | 0 |
| Open Issues | 0 |
| License | No License |
| Created | 2026-01-07 |
| Last Updated | 2026-03-31 |
| Last Push | 2026-03-31 |
| Contributors | 1 |
| Default Branch | master |
| Visibility | private |

Notebooks

This repository contains 21 notebook(s):

| Notebook | Language | Type |

| L01_linear_regression | PYTHON | jupyter |

| L02_logistic_regression | PYTHON | jupyter |

| L03_kmeans | PYTHON | jupyter |

| L03_knn | PYTHON | jupyter |

| L03_knn_kmeans | PYTHON | jupyter |

| L04_dt | PYTHON | jupyter |

| L04_random_forests | PYTHON | jupyter |

| L04_rf | PYTHON | jupyter |

| L05_pca | PYTHON | jupyter |

| L05_pca_tsne | PYTHON | jupyter |

| L05_tsne | PYTHON | jupyter |

| L06_embeddings | PYTHON | jupyter |

| L06_embeddings_rl | PYTHON | jupyter |

| L06_rl | PYTHON | jupyter |

| L01_linear_regression | PYTHON | jupyter |

| L02_logistic_regression | PYTHON | jupyter |

| L03_knn_kmeans | PYTHON | jupyter |

| L04_random_forests | PYTHON | jupyter |

| L05_pca_tsne | PYTHON | jupyter |

| L06_embeddings_rl | PYTHON | jupyter |

| notebook_template | PYTHON | jupyter |

Datasets

This repository includes 46 dataset(s):

| Dataset | Format | Size |

| continuation-count.json | .json | 0.06 KB |

| L01_deepdive.json | .json | 8.74 KB |

| L01_overview.json | .json | 7.23 KB |

| L02_deepdive.json | .json | 18.55 KB |

| L02_overview.json | .json | 8.97 KB |

| L03_deepdive.json | .json | 9.01 KB |

| L03_overview.json | .json | 6.65 KB |

| L04_deepdive.json | .json | 11.99 KB |

| L04_overview.json | .json | 7.34 KB |

| L05_deepdive.json | .json | 17.1 KB |

| L05_overview.json | .json | 8.83 KB |

| L06_deepdive.json | .json | 7.12 KB |

| L06_overview.json | .json | 5.19 KB |

| ralph-state.json | .json | 1.2 KB |

| ralplan-state.json | .json | 0.56 KB |

| ultrawork-state.json | .json | 12.92 KB |

| audit_report.json | .json | 6.02 KB |

| datasets | | 0.0 KB |

| AGENTS.md | .md | 11.35 KB |

| credit_synthetic.csv | .csv | 21.33 KB |

| customers_synthetic.csv | .csv | 6.05 KB |

| housing_synthetic.csv | .csv | 1.37 KB |

| portfolio_synthetic.csv | .csv | 20.16 KB |

| text_corpus_synthetic.json | .json | 4.74 KB |

| transactions_synthetic.csv | .csv | 5.15 KB |

| L01_Introduction_Linear_Regression.json | .json | 0.16 KB |

| L01_deepdive.json | .json | 0.39 KB |

| L01_overview.json | .json | 0.21 KB |

| L02_Logistic_Regression.json | .json | 0.16 KB |

| L02_deepdive.json | .json | 0.38 KB |

| L02_overview.json | .json | 0.21 KB |

| L03_KNN_KMeans.json | .json | 0.16 KB |

| L03_deepdive.json | .json | 0.37 KB |

| L03_overview.json | .json | 0.21 KB |

| L04_Random_Forests.json | .json | 0.16 KB |

| L04_deepdive.json | .json | 0.37 KB |

| L04_overview.json | .json | 0.21 KB |

| L05_PCA_tSNE.json | .json | 0.16 KB |

| L05_deepdive.json | .json | 0.3 KB |

| L05_overview.json | .json | 0.21 KB |

| L06_Embeddings_RL.json | .json | 0.16 KB |

| L06_deepdive.json | .json | 0.31 KB |

| L06_overview.json | .json | 0.21 KB |

| manifest.json | .json | 52.54 KB |

| quiz_inventory.json | .json | 37.69 KB |

| validation_summary.json | .json | 2.38 KB |

Reproducibility

This repository includes reproducibility tools:

  • Python requirements.txt

Research Keywords

PCA & t-SNE, visualization, Random Forests, Introduction & Linear Regression, Factor models, Portfolio risk decomposition, trading strategies, default prediction, anomaly detection, Sentiment analysis, Credit scoring, Fraud detection, house price prediction, K-Nearest Neighbours & K-Means, Embeddings & Reinforcement Learning, Logistic Regression, Customer segmentation, feature importance

Status

  • Issues: Enabled
  • Wiki: Enabled
  • Pages: Enabled

README

Methods and Algorithms - MSc Data Science

Master core ML algorithms and develop a systematic approach to choosing the right method for data-driven decision making in finance and business contexts.

Course Overview

| Attribute | Value |
| Program | MSc Data Science |
| Sessions | 6 x 3 hours |
| Prerequisites | Python, Statistics, Linear Algebra |
| Domain Focus | Finance and Banking |

Topics

  1. Introduction & Linear Regression - Factor models, house price prediction
  2. Logistic Regression - Credit scoring, default prediction
  3. K-Nearest Neighbours & K-Means - Customer segmentation, anomaly detection
  4. Random Forests - Fraud detection, feature importance
  5. PCA & t-SNE - Portfolio risk decomposition, visualization
  6. Embeddings & Reinforcement Learning - Sentiment analysis, trading strategies
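
As a taste of the first topic, here is a minimal sketch of simple linear regression implemented from scratch with the closed-form least-squares solution. The function name `fit_simple_ols` and the toy area/price numbers are illustrative assumptions; they are not taken from the course notebooks or from `housing_synthetic.csv`.

```python
def fit_simple_ols(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares (closed form)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally sized samples with at least 2 points")

    mean_x = sum(xs) / n
    mean_y = sum(ys) / n

    # Sum of squared deviations of x, and cross-deviations of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Hypothetical toy data (not from the course datasets):
# floor area in square metres vs. price in thousands, built as 3 * area + 1.
areas = [50.0, 70.0, 90.0, 110.0]
prices = [151.0, 211.0, 271.0, 331.0]

intercept, slope = fit_simple_ols(areas, prices)
print(f"price ~ {intercept:.2f} + {slope:.2f} * area")
```

Because the toy prices lie exactly on a line, the fit recovers the slope and intercept used to generate them; real data would leave residuals to inspect.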

Course Structure

Each lecture follows the PMSP framework:

  • Problem (15 min) - Real finance use case
  • Method (45 min) - Algorithm theory and mathematics
  • Solution (45 min) - Implementation and results
  • Practice (75 min) - Hands-on notebook exercises
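
The four PMSP phases can be checked against the 3-hour session length with a few lines of arithmetic. This is only a sanity-check sketch; the names `PMSP_MINUTES` and `SESSION_HOURS` are illustrative and do not come from the course infrastructure.

```python
# Phase durations in minutes, as listed in the PMSP framework.
PMSP_MINUTES = {"Problem": 15, "Method": 45, "Solution": 45, "Practice": 75}
SESSION_HOURS = 3  # each of the 6 sessions lasts 3 hours

total_minutes = sum(PMSP_MINUTES.values())

# The phases should fill the session exactly.
assert total_minutes == SESSION_HOURS * 60

# Share of each phase in the session, as a fraction of total time.
shares = {phase: minutes / total_minutes for phase, minutes in PMSP_MINUTES.items()}
for phase, share in shares.items():
    print(f"{phase:<9}{share:.0%}")
```

The phases add up to 180 minutes, and hands-on practice is the largest single block of each session.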

Materials

  • Slides: LaTeX Beamer (overview + deep dive per topic)
  • Notebooks: Google Colab with synthetic data
  • Quizzes: Moodle format (30 min, timed)
  • Capstone: Open-ended project (report only)

Quick Start

# Install dependencies
pip install -r requirements.txt

# Run course CLI
python infrastructure/course_cli.py status

# Build all slides
python infrastructure/course_cli.py build slides --all

# Validate all content
python infrastructure/course_cli.py validate --all
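
The three CLI steps above could also be scripted from Python, for example in a CI job. The sketch below only assembles the same commands shown in the Quick Start; the helper names `cli_command` and `run_quick_start` are hypothetical, and nothing is assumed about the CLI beyond the subcommands listed above.

```python
import subprocess
import sys

CLI_PATH = "infrastructure/course_cli.py"

# Subcommands exactly as they appear in the Quick Start, in order.
QUICK_START_STEPS = [
    ["status"],
    ["build", "slides", "--all"],
    ["validate", "--all"],
]


def cli_command(args, python=sys.executable):
    """Return the argv list for one course CLI invocation."""
    return [python, CLI_PATH, *args]


def run_quick_start():
    """Run each step from the repository root, stopping at the first failure."""
    for step in QUICK_START_STEPS:
        # check=True raises CalledProcessError on a non-zero exit code.
        subprocess.run(cli_command(step), check=True)
```

Calling `run_quick_start()` from the repository root, after installing `requirements.txt`, runs the steps in order.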

Directory Structure

Methods_and_Algorithms/
├── infrastructure/     # Python course management CLI
├── slides/            # LaTeX slides per topic
├── notebooks/         # Colab notebooks
├── quizzes/           # Moodle XML files
├── datasets/          # Synthetic data
├── docs/              # GitHub Pages site
├── templates/         # Beamer, chart, notebook templates
├── rubrics/           # Grading rubrics
├── capstone/          # Project specification
└── syllabus/          # Multi-format syllabus
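
A quick way to confirm that a checkout matches this layout is to look for each top-level directory. This is a minimal sketch; `EXPECTED_DIRS` simply mirrors the tree above, and `missing_dirs` is a hypothetical helper, not part of the course CLI.

```python
from pathlib import Path

# Top-level directories listed in the tree above.
EXPECTED_DIRS = [
    "infrastructure",
    "slides",
    "notebooks",
    "quizzes",
    "datasets",
    "docs",
    "templates",
    "rubrics",
    "capstone",
    "syllabus",
]


def missing_dirs(root):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [name for name in EXPECTED_DIRS if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    print("Layout OK" if not missing else f"Missing: {', '.join(missing)}")
```

Run it from the repository root; an empty result means every expected directory is present.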

Learning Outcomes

By course completion, students will:

  1. Select appropriate ML methods for business problems
  2. Implement solutions from scratch and with libraries
  3. Interpret results for non-technical stakeholders

License

Course materials for educational use only.