methods-algorithms
MSc Data Science: Methods and Algorithms - Course materials with Python infrastructure
Information
| Property | Value |
|---|---|
| Language | TeX |
| Stars | 0 |
| Forks | 0 |
| Watchers | 0 |
| Open Issues | 0 |
| License | No License |
| Created | 2026-01-07 |
| Last Updated | 2026-03-31 |
| Last Push | 2026-03-31 |
| Contributors | 1 |
| Default Branch | master |
| Visibility | private |
Notebooks
This repository contains 21 notebook(s):
| Notebook | Language | Type |
|---|---|---|
| L01_linear_regression | PYTHON | jupyter |
| L02_logistic_regression | PYTHON | jupyter |
| L03_kmeans | PYTHON | jupyter |
| L03_knn | PYTHON | jupyter |
| L03_knn_kmeans | PYTHON | jupyter |
| L04_dt | PYTHON | jupyter |
| L04_random_forests | PYTHON | jupyter |
| L04_rf | PYTHON | jupyter |
| L05_pca | PYTHON | jupyter |
| L05_pca_tsne | PYTHON | jupyter |
| L05_tsne | PYTHON | jupyter |
| L06_embeddings | PYTHON | jupyter |
| L06_embeddings_rl | PYTHON | jupyter |
| L06_rl | PYTHON | jupyter |
| L01_linear_regression | PYTHON | jupyter |
| L02_logistic_regression | PYTHON | jupyter |
| L03_knn_kmeans | PYTHON | jupyter |
| L04_random_forests | PYTHON | jupyter |
| L05_pca_tsne | PYTHON | jupyter |
| L06_embeddings_rl | PYTHON | jupyter |
| notebook_template | PYTHON | jupyter |
Datasets
This repository includes 46 dataset(s):
| Dataset | Format | Size |
|---|---|---|
| continuation-count.json | .json | 0.06 KB |
| L01_deepdive.json | .json | 8.74 KB |
| L01_overview.json | .json | 7.23 KB |
| L02_deepdive.json | .json | 18.55 KB |
| L02_overview.json | .json | 8.97 KB |
| L03_deepdive.json | .json | 9.01 KB |
| L03_overview.json | .json | 6.65 KB |
| L04_deepdive.json | .json | 11.99 KB |
| L04_overview.json | .json | 7.34 KB |
| L05_deepdive.json | .json | 17.1 KB |
| L05_overview.json | .json | 8.83 KB |
| L06_deepdive.json | .json | 7.12 KB |
| L06_overview.json | .json | 5.19 KB |
| ralph-state.json | .json | 1.2 KB |
| ralplan-state.json | .json | 0.56 KB |
| ultrawork-state.json | .json | 12.92 KB |
| audit_report.json | .json | 6.02 KB |
| datasets | | 0.0 KB |
| AGENTS.md | .md | 11.35 KB |
| credit_synthetic.csv | .csv | 21.33 KB |
| customers_synthetic.csv | .csv | 6.05 KB |
| housing_synthetic.csv | .csv | 1.37 KB |
| portfolio_synthetic.csv | .csv | 20.16 KB |
| text_corpus_synthetic.json | .json | 4.74 KB |
| transactions_synthetic.csv | .csv | 5.15 KB |
| L01_Introduction_Linear_Regression.json | .json | 0.16 KB |
| L01_deepdive.json | .json | 0.39 KB |
| L01_overview.json | .json | 0.21 KB |
| L02_Logistic_Regression.json | .json | 0.16 KB |
| L02_deepdive.json | .json | 0.38 KB |
| L02_overview.json | .json | 0.21 KB |
| L03_KNN_KMeans.json | .json | 0.16 KB |
| L03_deepdive.json | .json | 0.37 KB |
| L03_overview.json | .json | 0.21 KB |
| L04_Random_Forests.json | .json | 0.16 KB |
| L04_deepdive.json | .json | 0.37 KB |
| L04_overview.json | .json | 0.21 KB |
| L05_PCA_tSNE.json | .json | 0.16 KB |
| L05_deepdive.json | .json | 0.3 KB |
| L05_overview.json | .json | 0.21 KB |
| L06_Embeddings_RL.json | .json | 0.16 KB |
| L06_deepdive.json | .json | 0.31 KB |
| L06_overview.json | .json | 0.21 KB |
| manifest.json | .json | 52.54 KB |
| quiz_inventory.json | .json | 37.69 KB |
| validation_summary.json | .json | 2.38 KB |
Reproducibility
This repository includes reproducibility tools:
- Python requirements.txt
Research Keywords
PCA & t-SNE, visualization, Random Forests, Introduction & Linear Regression, Factor models, Portfolio risk decomposition, trading strategies, default prediction, anomaly detection, Sentiment analysis, Credit scoring, Fraud detection, house price prediction, K-Nearest Neighbours & K-Means, Embeddings & Reinforcement Learning, Logistic Regression, Customer segmentation, feature importance
Status
- Issues: Enabled
- Wiki: Enabled
- Pages: Enabled
README
Methods and Algorithms - MSc Data Science
Master core ML algorithms and develop a systematic approach to choosing the right method for data-driven decision making in finance and business contexts.
Course Overview
| Attribute | Value |
|---|---|
| Program | MSc Data Science |
| Sessions | 6 x 3 hours |
| Prerequisites | Python, Statistics, Linear Algebra |
| Domain Focus | Finance and Banking |
Topics
- Introduction & Linear Regression - Factor models, house price prediction
- Logistic Regression - Credit scoring, default prediction
- K-Nearest Neighbours & K-Means - Customer segmentation, anomaly detection
- Random Forests - Fraud detection, feature importance
- PCA & t-SNE - Portfolio risk decomposition, visualization
- Embeddings & Reinforcement Learning - Sentiment analysis, trading strategies
Course Structure
Each lecture follows the PMSP framework:
- Problem (15 min) - Real finance use case
- Method (45 min) - Algorithm theory and mathematics
- Solution (45 min) - Implementation and results
- Practice (75 min) - Hands-on notebook exercises
Materials
- Slides: LaTeX Beamer (overview + deep dive per topic)
- Notebooks: Google Colab with synthetic data
- Quizzes: Moodle format (30 min, timed)
- Capstone: Open-ended project (report only)
Quick Start
# Install dependencies
pip install -r requirements.txt
# Run course CLI
python infrastructure/course_cli.py status
# Build all slides
python infrastructure/course_cli.py build slides --all
# Validate all content
python infrastructure/course_cli.py validate --all
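The Quick Start commands can also be chained from Python so that a failing step stops the sequence. The following is a minimal sketch, not part of the repository: `QUICK_START_STEPS` and `run_steps` are hypothetical names, `pip` is invoked as `python -m pip` to use the active interpreter, and the sketch assumes the course CLI signals failure through a non-zero exit code, which the README does not confirm.

```python
import subprocess
import sys

# Commands taken from the Quick Start section; pip is called via the
# current interpreter (an assumption made for this sketch).
QUICK_START_STEPS = [
    [sys.executable, "-m", "pip", "install", "-r", "requirements.txt"],
    [sys.executable, "infrastructure/course_cli.py", "status"],
    [sys.executable, "infrastructure/course_cli.py", "build", "slides", "--all"],
    [sys.executable, "infrastructure/course_cli.py", "validate", "--all"],
]


def run_steps(steps, runner=subprocess.run):
    """Run each command in order; return the first failing command, or None.

    Assumes a non-zero return code means the step failed (hypothetical
    behaviour for course_cli.py). `runner` can be swapped out for testing.
    """
    for command in steps:
        result = runner(command)
        if result.returncode != 0:
            return command
    return None
```

Calling `run_steps(QUICK_START_STEPS)` from the repository root would run the four steps in order and return the command that failed, or `None` if all of them succeeded.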
Directory Structure
Methods_and_Algorithms/
├── infrastructure/ # Python course management CLI
├── slides/ # LaTeX slides per topic
├── notebooks/ # Colab notebooks
├── quizzes/ # Moodle XML files
├── datasets/ # Synthetic data
├── docs/ # GitHub Pages site
├── templates/ # Beamer, chart, notebook templates
├── rubrics/ # Grading rubrics
├── capstone/ # Project specification
└── syllabus/ # Multi-format syllabus
Learning Outcomes
By course completion, students will:
1. Select appropriate ML methods for business problems
2. Implement solutions from scratch and with libraries
3. Interpret results for non-technical stakeholders
License
Course materials for educational use only.