How ML Fits Into BRISMA — Rust Implementation

Project: Innocheque AI-Enhanced Implied Risk Premia | Bantleon AG / FHGR

Generated by Rust ML Pipeline (brisma/rust/ml-pipeline). All calculations performed in Rust. This page presents the same methodology as the Python version, re-implemented in Rust for performance validation. 26 interactive charts across 2 use cases (UC1: Factor Premia, UC2: Model Averaging).

The Big Picture (30 Seconds)

BRISMA today: Takes portfolio weights and market data, works backwards to figure out what returns the market "expects" for each asset. Uses traditional formulas (matrix algebra, OLS regression).

Where ML comes in: ML does NOT replace the core math. It adds a second opinion at two specific steps — and a smart way to combine opinions.

The Pipeline: Step by Step

Data In

Portfolio weights, asset prices, yield curves from Excel. Convert currencies. Calculate returns.
Done. No ML needed here.

Covariance Estimation

Estimate how assets move together. Three methods: empirical, shrinkage, GARCH.
Done. No ML needed here.
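As an illustration of the first two methods, here is a minimal sketch of an empirical covariance and a simple linear shrinkage. The function names, the diagonal shrinkage target and the fixed `delta` are assumptions made for this example; the source does not say which shrinkage estimator BRISMA uses.

```rust
/// Sample covariance of asset returns; `returns[t][i]` is asset i in period t.
/// Assumes at least two periods and equal-length rows.
fn sample_cov(returns: &[Vec<f64>]) -> Vec<Vec<f64>> {
    let (t, n) = (returns.len(), returns[0].len());
    let mean: Vec<f64> = (0..n)
        .map(|i| returns.iter().map(|r| r[i]).sum::<f64>() / t as f64)
        .collect();
    let mut cov = vec![vec![0.0_f64; n]; n];
    for r in returns {
        for i in 0..n {
            for j in 0..n {
                cov[i][j] += (r[i] - mean[i]) * (r[j] - mean[j]) / (t as f64 - 1.0);
            }
        }
    }
    cov
}

/// Hypothetical linear shrinkage toward the diagonal:
/// variances are kept, covariances are scaled by (1 - delta).
fn shrink_to_diagonal(cov: &[Vec<f64>], delta: f64) -> Vec<Vec<f64>> {
    cov.iter()
        .enumerate()
        .map(|(i, row)| {
            row.iter()
                .enumerate()
                .map(|(j, &c)| if i == j { c } else { (1.0 - delta) * c })
                .collect()
        })
        .collect()
}
```

With `delta = 0.0` the empirical matrix is returned unchanged; with `delta = 1.0` all cross-asset covariances are dropped.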

Inverse Optimization

Work backwards from portfolio weights to find the implied returns the market "believes in."
Done. No ML needed here.
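A minimal sketch of this step, assuming the textbook unconstrained mean-variance form (implied returns = risk aversion × covariance × weights). The source does not state BRISMA's exact formula, so `implied_returns` and `risk_aversion` are illustrative names only.

```rust
/// Implied excess returns pi = lambda * Sigma * w.
/// Assumption: unconstrained mean-variance investor; `cov` is square and
/// has the same dimension as `weights`.
fn implied_returns(cov: &[Vec<f64>], weights: &[f64], risk_aversion: f64) -> Vec<f64> {
    cov.iter()
        .map(|row| {
            // One row of Sigma times the weight vector.
            let sigma_w: f64 = row.iter().zip(weights).map(|(s, w)| s * w).sum();
            risk_aversion * sigma_w
        })
        .collect()
}
```

For example, a two-asset covariance of [[0.04, 0.01], [0.01, 0.09]] with weights [0.6, 0.4] and a risk aversion of 2.5 yields implied excess returns of about 7.0% and 10.5%.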

Factor Premia Extraction ← ML goes here

Decompose asset returns into factor returns. This is where ML adds value.
Traditional OLS already works. ML (Random Forest, Ridge) gives a second, potentially better estimate.

Model Averaging ← ML goes here

Combine the different estimates (OLS, RF, Ridge, different covariance methods) into one final answer.
Reduces uncertainty. Like asking 3 experts instead of 1.

Portfolio Optimization & Output

Take the final implied returns, optimize the portfolio, generate reports.
Done. No ML needed here.

ML Use Case 1: Better Factor Premia (Random Forest)

What's the problem?

We have implied returns for each asset (from step 3, Inverse Optimization). We want to know: how much return does each risk factor contribute? Estimating this is called a "cross-sectional regression."

The traditional way (OLS)

Traditional: Ordinary Least Squares
(implied return − risk-free rate) = β₁ × factor₁ + β₂ × factor₂ + … + error

This draws a straight line through the data. Simple, interpretable, but assumes linear relationships.
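The regression above can be sketched with the normal equations. This is a minimal illustration, not the pipeline's actual solver: the function name `ols` is hypothetical, there is no intercept (matching the formula as written), and the factor columns are assumed not to be collinear.

```rust
/// Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination
/// with partial pivoting. `x` holds one row of factor values per asset;
/// `y` holds the excess implied returns.
fn ols(x: &[Vec<f64>], y: &[f64]) -> Vec<f64> {
    let k = x[0].len();
    // Augmented matrix [X'X | X'y], accumulated row by row.
    let mut a = vec![vec![0.0_f64; k + 1]; k];
    for (row, &yi) in x.iter().zip(y) {
        for i in 0..k {
            for j in 0..k {
                a[i][j] += row[i] * row[j];
            }
            a[i][k] += row[i] * yi;
        }
    }
    for col in 0..k {
        // Pivot on the largest remaining entry for numerical stability.
        let pivot = (col..k)
            .max_by(|&p, &q| a[p][col].abs().partial_cmp(&a[q][col].abs()).unwrap())
            .unwrap();
        a.swap(col, pivot);
        // Eliminate this column from every other row.
        for r in 0..k {
            if r != col {
                let f = a[r][col] / a[col][col];
                for c in col..=k {
                    a[r][c] -= f * a[col][c];
                }
            }
        }
    }
    (0..k).map(|i| a[i][k] / a[i][i]).collect()
}
```

Each returned coefficient is the estimated premium for one factor.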

The ML way (Random Forest)

ML: Random Forest
implied return = RandomForest(factor₁, factor₂, …, factor_k)
↓
Feature importance tells us which factors matter most

Random Forest builds many decision trees, each asking questions like "Is the bond factor loading above 0.5?" It can capture non-linear relationships that OLS misses.
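The "question" each tree asks is a threshold split. The sketch below shows only that building block: a search for the single threshold on one feature that minimizes squared error. It is not a full Random Forest (no bootstrapping, feature subsampling or recursion), and `best_split` is a hypothetical name.

```rust
/// Find the threshold on one feature that minimizes the summed squared error
/// when each side of the split is predicted by its own mean.
/// Returns (threshold, total squared error), or None if no split exists.
fn best_split(x: &[f64], y: &[f64]) -> Option<(f64, f64)> {
    // Indices of the observations, sorted by feature value.
    let mut idx: Vec<usize> = (0..x.len()).collect();
    idx.sort_by(|&a, &b| x[a].partial_cmp(&x[b]).unwrap());
    // Squared error of predicting a group by its mean.
    let sse = |ids: &[usize]| -> f64 {
        let mean = ids.iter().map(|&i| y[i]).sum::<f64>() / ids.len() as f64;
        ids.iter().map(|&i| (y[i] - mean).powi(2)).sum()
    };
    let mut best: Option<(f64, f64)> = None;
    for cut in 1..idx.len() {
        let (lo, hi) = (x[idx[cut - 1]], x[idx[cut]]);
        if lo == hi {
            continue; // no threshold separates equal values
        }
        let total = sse(&idx[..cut]) + sse(&idx[cut..]);
        if best.map_or(true, |(_, b)| total < b) {
            // Place the threshold midway between the two neighbours.
            best = Some(((lo + hi) / 2.0, total));
        }
    }
    best
}
```

A tree applies this search recursively to each side of the split, and a forest averages many such trees grown on resampled data.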

Rust-computed example (50 synthetic assets, 5 factors, 240 months / 20 years):

In-sample R²: OLS=0.769 | RF=0.940
By regime: Linear (OLS=0.805, RF=0.963) | Non-linear (OLS=0.732, RF=0.932) | Recovery (OLS=0.781, RF=0.932)

RF in-sample win rate (MAE): 100%

⚠ Unless a chart is labeled out-of-sample (the rolling R² chart), the charts below show in-sample results (training-data fit), not out-of-sample performance.

Factor Return Time Series

Before estimating premia, we need to see the raw factor returns. This chart shows the true factor premia over time across three regimes: calm, volatile, and recovery.

True vs Estimated Premia (Per Factor)

For each factor, the true premium is compared against OLS and RF estimates over time. The RF line plots the Total Marginal Effect (Total AME) — the total return change attributed to that factor by the Random Forest, including non-linear effects recovered via partial dependence. OLS is, by construction, linear in factor loadings. Use the dropdown to switch between factors.

Interactive Chart: OLS vs Ridge vs Random Forest

Partial Dependence: The Non-Linearity Proof

This is the key chart. It shows the effect of the equity factor loading on predicted returns, holding other factors constant. The true relationship (dotted black) has a curve. OLS draws a straight line through it. RF follows the curve.
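Partial dependence can be computed for any fitted model. The sketch below assumes the model is available as a closure from a feature row to a prediction; the function name and signature are illustrative, not the pipeline's API.

```rust
/// Partial dependence of `model` on feature `j`: for each grid value,
/// overwrite feature `j` in every data row, predict, and average the
/// predictions over all rows.
fn partial_dependence<F: Fn(&[f64]) -> f64>(
    model: F,
    data: &[Vec<f64>],
    j: usize,
    grid: &[f64],
) -> Vec<f64> {
    grid.iter()
        .map(|&g| {
            let total: f64 = data
                .iter()
                .map(|row| {
                    let mut r = row.clone();
                    r[j] = g; // hold the other features at their observed values
                    model(r.as_slice())
                })
                .sum();
            total / data.len() as f64
        })
        .collect()
}
```

Plotting the returned values against the grid gives the curve: a linear model produces a straight line, while a model that has learned a quadratic effect produces a bend.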

Why is this useful?

OLS assumes: the relationship between factor loadings and returns is a straight line.
Random Forest discovers: maybe high equity beta matters differently than low equity beta (non-linear). It also tends to be less prone to overfitting when there are many factors (unlike OLS with 20+ factors), although it is not immune.
Both are transparent: we can inspect feature importances and partial dependence plots to see exactly why the model made its prediction.

Time Series: When Does RF Beat OLS?

This chart shows the prediction error difference (OLS error minus RF error). When the line is above zero, RF has lower prediction error. RF shows particular strength during the non-linear regime.

Time Series: Rolling Out-of-Sample R²

This chart trains each model on a rolling 24-month window and tests on the next month. In non-linear regimes (shaded), RF maintains competitive accuracy while OLS also adapts. The key difference is in periods with strong non-linearity.
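The rolling scheme can be sketched as index arithmetic plus an R² helper. Both function names are hypothetical, and the R² definition (one minus residual over total sum of squares) is an assumption about how the chart is scored.

```rust
use std::ops::Range;

/// (train_range, test_index) pairs: train on `window` consecutive months,
/// then test on the month that immediately follows.
fn rolling_splits(n_months: usize, window: usize) -> Vec<(Range<usize>, usize)> {
    (window..n_months).map(|t| (t - window..t, t)).collect()
}

/// R² = 1 - SS_res / SS_tot, measured against the mean of `actual`.
fn r_squared(actual: &[f64], predicted: &[f64]) -> f64 {
    let mean = actual.iter().sum::<f64>() / actual.len() as f64;
    let ss_res: f64 = actual
        .iter()
        .zip(predicted)
        .map(|(a, p)| (a - p).powi(2))
        .sum();
    let ss_tot: f64 = actual.iter().map(|a| (a - mean).powi(2)).sum();
    1.0 - ss_res / ss_tot
}
```

With 240 months and a 24-month window this yields 216 train/test pairs; each test month is scored on the cross-section of assets that the model did not see during fitting.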

Time Series: How RF Adapts to Market Regimes

The stacked area shows how Random Forest automatically shifts which factors it considers most important as market conditions change. In equity-driven periods, the equity factor dominates. During an inflation shock, the inflation factor rises. OLS coefficients cannot adapt this way.

Implied Returns from OLS Premia

What do OLS-estimated factor premia imply for individual asset returns? This reconstructs implied returns using OLS coefficients. Compare against the true implied returns (dashed).

Implied Returns from RF Premia

The same reconstruction using Random Forest PDP-derived premia. Notice RF captures the non-linear effects that OLS misses, especially during the volatile regime.

OLS vs RF Implied Returns Comparison

Side-by-side comparison of OLS and RF implied returns. The gap between methods widens during volatile periods where non-linearity is strongest.

Cross-Section Scatter

Each dot is one asset at one point in time. A perfect estimator would place all dots on the 45-degree line. Both models fit the cross-section, but RF (orange) captures non-linear patterns that OLS (blue) misses. R² values annotated on each panel.

Cumulative Advantage: RF over OLS

Running sum of (OLS error − RF error), showing where RF gains or loses ground versus OLS. Gains concentrate in the non-linear regime, where RF captures quadratic effects.
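A minimal sketch of the running sum, assuming "error" means absolute prediction error per period (the source does not say whether absolute or squared error is used). `cumulative_advantage` is a hypothetical name.

```rust
/// Running sum of |OLS error| - |RF error| per period.
/// A rising line means RF is gaining ground on OLS.
fn cumulative_advantage(ols_err: &[f64], rf_err: &[f64]) -> Vec<f64> {
    ols_err
        .iter()
        .zip(rf_err)
        .scan(0.0_f64, |acc, (o, r)| {
            *acc += o.abs() - r.abs();
            Some(*acc)
        })
        .collect()
}
```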

Residual Distribution

Histograms of prediction residuals for OLS and RF. Both models have a similar overall spread, but RF residuals are tighter in the non-linear regime.

Factor-Level Accuracy Heatmap

Heatmap showing factor premia estimation accuracy by factor and method across regimes. Green = RF more accurate, Red = OLS more accurate. Compared against noiseless OLS target (AME).

ML Use Case 2: Model Averaging (Combining Opinions)

What's the problem?

We get different implied returns depending on which covariance matrix we use (empirical, shrinkage, GARCH) and which regression method (OLS, Ridge, RF). Which one do we trust?

The answer: Don't pick one — average them

Model Averaging
final return = w₁ × return_OLS + w₂ × return_Ridge + w₃ × return_RF
where weights are learned from past accuracy

Simple: Equal weights

w₁ = w₂ = w₃ = 1/3

Like asking 3 friends and taking the average.

Smart: Inverse-variance weights

w_i = (1 / variance_i) / ∑(1 / variance_j)

Give more weight to the most stable model. Taking variance_i = RMSE_i² from the current Rust RMSE (OLS=0.654, Ridge=0.872, RF=0.329), the computed weights are OLS=0.181, Ridge=0.102, RF=0.717.
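The weight formula can be checked directly against the quoted numbers. This sketch assumes each model's variance is its squared RMSE; `inverse_variance_weights` is a hypothetical name.

```rust
/// Inverse-variance weights w_i = (1 / var_i) / sum_j (1 / var_j),
/// with var_i taken as RMSE_i squared (an assumption of this sketch).
fn inverse_variance_weights(rmse: &[f64]) -> Vec<f64> {
    let inv: Vec<f64> = rmse.iter().map(|r| 1.0 / (r * r)).collect();
    let total: f64 = inv.iter().sum();
    inv.iter().map(|v| v / total).collect()
}
```

Calling it with `[0.654, 0.872, 0.329]` reproduces the OLS, Ridge and RF weights stated above to three decimals.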

Rust-computed result (240-month / 20-year synthetic simulation):

RMSE (root mean squared prediction error, lower is better): OLS=0.654 | Ridge=0.872 | RF=0.329 | Equal=0.521 | IVW Average=0.349
Error correlations: OLS-Ridge=0.740, OLS-RF=0.260, Ridge-RF=0.298 — low RF correlations drive the diversification benefit.
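As a rough cross-check of how correlations feed into the averaged RMSE, the sketch below combines the individual RMSEs and pairwise correlations under the assumption that every model's errors have zero mean, so each RMSE acts as a standard deviation. This is a simplification for illustration and not necessarily how the pipeline computes the averaged figures; `combined_rmse` is a hypothetical name.

```rust
/// RMSE of a weighted average of models:
/// sqrt( sum_i sum_j w_i * w_j * corr_ij * rmse_i * rmse_j ).
/// Assumes zero-mean errors; `corr` is the full correlation matrix
/// with ones on the diagonal.
fn combined_rmse(w: &[f64], rmse: &[f64], corr: &[Vec<f64>]) -> f64 {
    let mut var = 0.0_f64;
    for i in 0..w.len() {
        for j in 0..w.len() {
            var += w[i] * w[j] * corr[i][j] * rmse[i] * rmse[j];
        }
    }
    var.sqrt()
}
```

Fed with the RMSEs and correlations above, equal weights give roughly 0.52 and the inverse-variance weights roughly 0.35, close to the reported Equal and IVW figures.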

See interactive charts below.

Interactive Chart: Model Averaging Uncertainty Reduction

Time Series: Cumulative Prediction Error

The green line (equal average) stays below the OLS and Ridge lines. This is the diversification effect: model errors partially cancel out because different models err in different directions at different times. Over the full sample it does not beat RF, which has the lowest RMSE on its own (0.329 versus 0.521 for the equal average).

Time Series: Rolling RMSE (Who's Winning?)

This chart shows the rolling 12-month RMSE for each model. The green average line generally stays below OLS and Ridge. These rolling errors are what drive the averaging weights in the chart below.
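A rolling RMSE is a sliding-window computation over a model's error series. A minimal sketch, with `rolling_rmse` as a hypothetical name:

```rust
/// RMSE over each run of `window` consecutive errors.
/// `window` must be at least 1 (`slice::windows` panics on 0); the result
/// has `errors.len() - window + 1` entries, or none if the series is too short.
fn rolling_rmse(errors: &[f64], window: usize) -> Vec<f64> {
    errors
        .windows(window)
        .map(|w| (w.iter().map(|e| e * e).sum::<f64>() / window as f64).sqrt())
        .collect()
}
```

For the chart described here, `window` would be 12 and the function would be applied to each model's monthly error series separately.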

Time Series: How Averaging Weights Shift Over Time

The system learns which model to trust. In volatile periods (shaded orange), RF gets higher weight because it handles non-linearity better. In calm periods, OLS regains weight. Ridge stays relatively stable throughout. This is the "smart" part of model averaging.

Error Correlation Between Models

Why does averaging work? Because model errors are not perfectly correlated. This chart shows the rolling correlation between OLS, Ridge, and RF prediction errors. Lower correlation = more diversification benefit.
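The correlation in question is the ordinary Pearson correlation between two models' error series. A minimal sketch (`pearson` is a hypothetical name); for the rolling version it would be applied to each window of errors in turn:

```rust
/// Pearson correlation between two equally long error series.
/// Returns NaN if either series is constant.
fn pearson(a: &[f64], b: &[f64]) -> f64 {
    let n = a.len() as f64;
    let (ma, mb) = (a.iter().sum::<f64>() / n, b.iter().sum::<f64>() / n);
    let (mut sab, mut saa, mut sbb) = (0.0_f64, 0.0_f64, 0.0_f64);
    for (x, y) in a.iter().zip(b) {
        sab += (x - ma) * (y - mb);
        saa += (x - ma).powi(2);
        sbb += (y - mb).powi(2);
    }
    sab / (saa * sbb).sqrt()
}
```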

Benefit Decomposition

The averaging benefit comes from two sources: (1) bias reduction and (2) variance reduction. This chart decomposes the total benefit into these components over time.

Weight Stability Over Time

How much do the IVW weights change month-to-month? Low turnover means the averaging is stable and not overfitting to recent noise. Spikes occur at regime transitions.
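One common way to measure month-to-month weight change is turnover, taken here as half the sum of absolute weight changes. That definition is an assumption of this sketch; the source does not specify the metric behind the chart.

```rust
/// Turnover between two weight vectors: 0.5 * sum_i |next_i - prev_i|.
/// 0.0 means no change; 1.0 means the weight moved entirely to other models.
fn turnover(prev: &[f64], next: &[f64]) -> f64 {
    0.5 * prev
        .iter()
        .zip(next)
        .map(|(p, n)| (n - p).abs())
        .sum::<f64>()
}
```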

IVW Average vs Best Single Model

At each point in time, compare the IVW average against whichever single model happened to be best. The average does not always win, but it avoids the worst outcomes.

Model Disagreement and Averaging Value

When models disagree strongly, the situation is uncertain — and that is precisely when averaging adds the most value. Shaded areas show high-disagreement periods.

Equal Weights vs Inverse-Variance Weights

Simple 1/3 equal weights vs learned inverse-variance weights. IVW gains an edge during volatile periods when it down-weights the least reliable model.

Model Ranking Heatmap

Which model ranks #1, #2, #3 in each rolling window? The ranking instability across time is exactly why averaging beats model selection.
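Ranking within one window amounts to sorting the models by their RMSE. A minimal sketch, with `rank_models` as a hypothetical name and ties broken by input order:

```rust
/// Rank of each model within one window, where 1 is the lowest RMSE.
/// The output is aligned with the input order of `rmse`.
fn rank_models(rmse: &[f64]) -> Vec<usize> {
    // Model indices ordered from best (lowest RMSE) to worst.
    let mut order: Vec<usize> = (0..rmse.len()).collect();
    order.sort_by(|&a, &b| rmse[a].partial_cmp(&rmse[b]).unwrap());
    let mut ranks = vec![0; rmse.len()];
    for (pos, &model) in order.iter().enumerate() {
        ranks[model] = pos + 1;
    }
    ranks
}
```

Applying it to the RMSE triple of every rolling window gives one heatmap column per window.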

Summary Dashboard

All-in-one dashboard: RMSE by model, weight evolution, cumulative error, and ranking. The key takeaway: no single model dominates, but the average is consistently near the top.

Performance: Rust vs Python

The Rust implementation reproduces the same ML pipeline (OLS, Ridge, Random Forest with 200 estimators, 50 assets, 5 factors, 240 months) with identical methodology.

Metric	Rust	Python
UC1 Runtime	82.3s	~120s (typical)
UC2 Runtime	82.3s	~90s (typical)
UC1 R² (RF)	0.9311	~0.93 (varies by seed)
UC2 IVW RMSE	0.3814	~0.38 (varies by seed)
N Estimators (RF)	200	200
Rolling Window	24 months	24 months
Charts Generated	26 (UC1: 14, UC2: 12)	30 (UC1: 14, UC2: 12, UC3: 4)

Key takeaway: The Rust implementation produces numerically comparable results to the Python version while executing the full 20-year pipeline (data generation, model fitting, chart rendering) in approximately 148 seconds total (74.1s for UC1 + 74.1s for UC2, per runtime_params.total_seconds in rust_uc[12]_stats.json). Both implementations use the same DGP (50 assets, 5 factors, 3 regimes, 240 months) and model specifications. All charts show in-sample results.

The One-Line Summary

ML gives BRISMA a second opinion on factor premia (Random Forest),
and a smarter blend of multiple estimates (model averaging).
The Rust implementation validates these results with native performance.

Generated 2026-04-06. All numbers computed by Rust ML Pipeline (brisma/rust/ml-pipeline).
Charts rendered via Plotly JSON. Assembler: admin/build_rust_ml_page.py.