Explainable Regime-Aware Portfolio Optimization: The case for Robust Rolling Regime Detection

Amine Boukardagha and Alex Saunders

Abstract

We propose an explainable framework for cross-asset portfolio optimization under time-varying market regimes. Our approach leverages a robust rolling regime detection model (R2-RD) to identify latent market states with interpretable mean and covariance structures, and a K-Nearest Neighbors (KNN) model as a nonparametric benchmark. Both methods are embedded within a regime-aware mean–variance optimization (MVO) framework and evaluated via a strictly causal expanding-window backtest on monthly data. Using a diversified cross-asset universe, we find that R2-RD delivers superior risk-adjusted returns and drawdown control relative to KNN, highlighting the benefits of parametric structure, temporal consistency, and regime-aware allocation for medium-frequency portfolio construction. The results demonstrate that R2-RD not only improves performance but also provides a transparent mapping between observed market states and portfolio decisions.

Introduction

Financial markets exhibit persistent structural changes driven by macroeconomic cycles, monetary policy regimes, and shifts in risk appetite. Ignoring such regime dynamics can lead to unstable portfolio allocations and poor risk-adjusted performance. Regime-aware portfolio construction seeks to address this issue by conditioning expected returns and risks on latent market states. This paper compares two distinct approaches to regime detection: a robust rolling regime detection model (R2-RD) built on Hidden Markov Models, and a K-Nearest Neighbors (KNN) model that serves as a nonparametric benchmark.

Both methods are evaluated within an identical portfolio optimization framework, enabling a clean empirical comparison of parametric versus nonparametric regime modeling for cross-asset allocation.

Data and Market Universe

The empirical analysis is conducted on a diversified cross-asset universe designed to capture major global risk factors. Monthly log-returns are constructed from adjusted close prices obtained via Yahoo Finance. The asset universe consists of:

This universe spans growth-sensitive, defensive, inflation-hedging, and currency assets, making it well suited for regime-based allocation. All strategies are evaluated using monthly rebalancing with a strictly out-of-sample backtest running from January 2016 to the most recent available data.

Feature Representation

Both regime detection approaches operate on a shared feature space constructed from asset returns and risk measures. At each month t, the feature vector is defined as:

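$$x_t = [\,r_t;\ \sigma_t\,], \qquad \sigma_t = \sqrt{\frac{1}{6}\sum_{i=1}^{6}\left(r_{t-i} - \bar{r}_t\right)^2},$$
where $r_t \in \mathbb{R}^N$ denotes the vector of monthly asset returns, $\bar{r}_t$ is the mean of the six lagged returns, and $\sigma_t$ is a six-month rolling volatility estimate computed asset by asset. This representation captures both directional information and prevailing market uncertainty, which are key drivers of regime differentiation.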
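The feature construction can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the function name `build_features` and the synthetic returns are hypothetical, and the sketch assumes that deviations are measured from the mean of the six lagged returns.

```python
import numpy as np

def build_features(returns: np.ndarray, window: int = 6) -> np.ndarray:
    """Stack r_t with a trailing volatility built from r_{t-window}, ..., r_{t-1}.

    returns: array of shape (T, N) holding monthly log-returns.
    Output: array of shape (T - window, 2N); row j corresponds to month t = j + window.
    """
    T, N = returns.shape
    rows = []
    for t in range(window, T):
        past = returns[t - window:t]        # the six lagged returns
        sigma = past.std(axis=0, ddof=0)    # ddof=0 matches the 1/6 factor
        rows.append(np.concatenate([returns[t], sigma]))
    return np.array(rows)

rng = np.random.default_rng(0)
demo_returns = rng.normal(0.0, 0.04, size=(24, 3))  # synthetic data, illustration only
X = build_features(demo_returns)
print(X.shape)  # (18, 6)
```

Each row holds N returns followed by N volatilities, so both regime detectors see the same 2N-dimensional input.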

Robust Rolling Regime Detection (R2-RD)

Model Specification

Let $z_t \in \{1, \dots, K\}$ denote a latent market regime evolving according to a first-order Markov chain:

$$P(z_t = j \mid z_{t-1} = i) = A_{ij}.$$
Conditional on the regime, feature vectors follow a multivariate Gaussian distribution:
$$x_t \mid z_t = k \sim \mathcal{N}(\mu_k, \Sigma_k).$$
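To make the generative model concrete, the sketch below simulates a path from a two-regime Gaussian HMM. The transition matrix, means, and covariances are made-up illustrative values, and the uniform initial state is an assumption, since the specification above does not state an initial distribution.

```python
import numpy as np

def simulate_gaussian_hmm(A, means, covs, T, seed=0):
    """Draw regimes z_1..z_T from a Markov chain and features x_t ~ N(mu_z, Sigma_z)."""
    rng = np.random.default_rng(seed)
    K = A.shape[0]
    z = np.empty(T, dtype=int)
    x = np.empty((T, means.shape[1]))
    z[0] = rng.integers(K)                      # uniform initial regime (assumption)
    for t in range(T):
        if t > 0:
            z[t] = rng.choice(K, p=A[z[t - 1]])  # row of A gives transition probabilities
        x[t] = rng.multivariate_normal(means[z[t]], covs[z[t]])
    return z, x

# Illustrative parameters: a calm regime and a turbulent regime.
A = np.array([[0.95, 0.05],
              [0.10, 0.90]])
means = np.array([[0.01, 0.03],
                  [-0.02, 0.08]])
covs = np.array([np.eye(2) * 1e-4, np.eye(2) * 4e-4])

z, x = simulate_gaussian_hmm(A, means, covs, T=200)
```

The large diagonal entries of `A` produce persistent regimes, which is the property the rolling estimation relies on.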

Rolling BIC-Based Regime Selection with Regime Emergence Constraint

At each rebalancing date $t$, Hidden Markov Models (HMMs) are estimated on an expanding window using all information available up to $t-1$. Candidate models are fit for a range of regime counts $K \in \{\underline{K}_t, \dots, \bar{K}\}$, and the optimal number of regimes is selected via the Bayesian Information Criterion (BIC):

$$K_t^* = \arg\min_{K} \left\{ -2 \log L_K + p_K \log T \right\},$$
where $L_K$ denotes the maximized likelihood, $p_K$ the number of free parameters, and $T$ the number of observations in the estimation window.
Crucially, the lower bound on the candidate regime count is set dynamically according to:
$$\underline{K}_t = K_{t-1}^*,$$
so that the optimization permits only regime emergence and explicitly rules out regime merging. In other words, the number of regimes is allowed to increase over time as new market structures appear, but previously identified regimes are never removed. This design choice reflects the empirical observation that market regimes tend to fragment over time as new economic environments emerge, while previously learned regimes remain relevant reference states. Enforcing a monotonic regime count stabilizes the regime identification process, prevents spurious regime collapses driven by short-term noise, and preserves the interpretability of regime-dependent return and risk estimates across the backtest.
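The selection rule can be illustrated without fitting any model, by taking the maximized log-likelihoods as given. In the sketch below the log-likelihood values are invented, the helper names are hypothetical, and the parameter count assumes full covariance matrices plus a free initial distribution. The text does not specify how $p_K$ is counted.

```python
import math

def n_free_params(K: int, d: int) -> int:
    """Free parameters of a K-state Gaussian HMM with full covariances in d dimensions."""
    initial = K - 1                      # initial state distribution
    transitions = K * (K - 1)            # each row of A sums to one
    means = K * d
    covariances = K * d * (d + 1) // 2   # symmetric d x d matrix per regime
    return initial + transitions + means + covariances

def bic(log_lik: float, K: int, d: int, T: int) -> float:
    return -2.0 * log_lik + n_free_params(K, d) * math.log(T)

def select_k(log_liks: dict, k_prev: int, k_max: int, d: int, T: int) -> int:
    """BIC-minimizing K in [k_prev, k_max]; the count can never fall below k_prev."""
    candidates = range(k_prev, k_max + 1)
    return min(candidates, key=lambda K: bic(log_liks[K], K, d, T))

# Invented log-likelihoods for K = 1..4 on a window of T = 120 months, d = 4 features.
log_liks = {1: -500.0, 2: -430.0, 3: -425.0, 4: -424.0}
print(select_k(log_liks, k_prev=1, k_max=4, d=4, T=120))  # 2
print(select_k(log_liks, k_prev=3, k_max=4, d=4, T=120))  # 3
```

The second call shows the emergence constraint at work: once three regimes have been selected, the two-regime model is no longer a candidate even though its unconstrained BIC is lower.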

Regime Label Matching for Robust Rolling Regime Detection (R2-RD)

One limitation of standard rolling HMM estimation is the label switching problem: regime labels at successive expanding windows may permute arbitrarily, leading to inconsistent regime identification over time. To address this, we introduce a regime label matching mechanism, inspired by Hirsa et al. (2024), which ensures temporal consistency of regime assignments across windows.
Let $z^{\text{past}}_t$ denote the regime labels from the previous expanding window and $z^{\text{new}}_t$ the labels from the current window over an overlapping period. We construct a similarity matrix $M \in \mathbb{R}^{K \times K}$ with elements

$$M_{ij} = \sum_{t \in \text{overlap}} \mathbf{1}\left(z^{\text{past}}_t = i \ \&\ z^{\text{new}}_t = j\right),$$
counting the number of times each past regime coincides with a new regime.
We then solve the following linear assignment problem:
$$\max_{x} \sum_{i=1}^{K}\sum_{j=1}^{K} M_{ij}\, x_{ij} \quad \text{s.t.} \quad \sum_{j} x_{ij} = 1, \quad \sum_{i} x_{ij} = 1, \quad x_{ij} \in \{0, 1\},$$
where $x_{ij} = 1$ if past regime $i$ is matched to new regime $j$. The solution provides a permutation mapping that aligns the new regime labels with the past ones.
Applying this mapping to $z^{\text{new}}_t$ produces aligned regime labels $z^{\text{aligned}}_t$ that are consistent with historical regimes. These aligned labels are then used to compute regime probabilities $\pi_t(k)$ and the corresponding conditional moments $\mu_t$ and $\Sigma_t$ for portfolio optimization. This approach ensures temporal consistency of regime assignments across windows. In combination with the regime emergence policy (no regime merging), this mechanism constitutes the robust rolling regime detection framework (R2-RD) suitable for cross-asset allocation.
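A small worked example shows the matching step. The sketch below builds the co-occurrence matrix and finds the best permutation by exhaustive search, which is adequate for a handful of regimes. This is a stand-in chosen to keep the example dependency-light, not the paper's solver. For larger $K$, SciPy's `scipy.optimize.linear_sum_assignment` with `maximize=True` solves the same problem in polynomial time. Function names and label sequences are hypothetical, and both windows are assumed to use the same number of regimes.

```python
from itertools import permutations
import numpy as np

def similarity_matrix(z_past, z_new, K):
    """M[i, j] counts overlap months labeled i in the past window and j in the new one."""
    M = np.zeros((K, K), dtype=int)
    for i, j in zip(z_past, z_new):
        M[i, j] += 1
    return M

def match_labels(z_past, z_new, K):
    """Relabel z_new so that it agrees with z_past on as many months as possible."""
    M = similarity_matrix(z_past, z_new, K)
    # perm[i] is the new label assigned to past label i.
    best = max(permutations(range(K)),
               key=lambda perm: sum(M[i, perm[i]] for i in range(K)))
    new_to_past = {j: i for i, j in enumerate(best)}
    return np.array([new_to_past[int(j)] for j in z_new])

z_past = np.array([0, 0, 1, 1, 2, 2, 2])
z_new = np.array([2, 2, 0, 0, 1, 1, 0])   # same regimes, permuted labels, one disagreement
print(match_labels(z_past, z_new, K=3))   # [0 0 1 1 2 2 1]
```

Six of the seven overlap months agree after relabeling. The last month remains a genuine disagreement between the two fits, which the matching correctly leaves in place.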

Regime-Conditional Moments

Regime detection enters portfolio construction through probability-weighted return and covariance estimates:

$$\mu_t = \sum_{k=1}^{K_t^*} \pi_t(k)\, \mu_k, \qquad \Sigma_t = \sum_{k=1}^{K_t^*} \pi_t(k)\, \Sigma_k,$$
where $\pi_t(k) = P(z_t = k \mid \mathcal{F}_{t-1})$. This mixture structure smooths estimation noise and stabilizes portfolio weights across regime transitions.
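The probability-weighted moments reduce to two array contractions. The following sketch uses invented regime parameters for two assets and two regimes. The function name is hypothetical.

```python
import numpy as np

def mixture_moments(pi, means, covs):
    """Probability-weighted mean and covariance across regimes.

    pi: (K,) regime probabilities; means: (K, N); covs: (K, N, N).
    """
    mu = pi @ means                          # sum_k pi_k * mu_k
    sigma = np.tensordot(pi, covs, axes=1)   # sum_k pi_k * Sigma_k
    return mu, sigma

pi = np.array([0.75, 0.25])                        # illustrative regime probabilities
means = np.array([[0.010, 0.002],
                  [-0.030, 0.006]])
covs = np.array([np.eye(2) * 0.0004, np.eye(2) * 0.0016])

mu_t, sigma_t = mixture_moments(pi, means, covs)
```

Because the weights vary smoothly with the regime probabilities, a gradual shift in $\pi_t$ moves the optimizer's inputs gradually rather than switching them abruptly.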

K-Nearest Neighbors Regime Approximation

Local Similarity Estimation

The KNN approach does not model latent regimes explicitly. Instead, it approximates regimes locally by identifying historical months whose feature vectors are closest to the current state:

$$\mathcal{N}_K(t) = \arg\min_{t'} \lVert x_t - x_{t'} \rVert_2.$$

Local Moment Estimation

Expected returns and covariances are estimated directly from the identified neighbors:

$$\mu_t = \frac{1}{K}\sum_{t' \in \mathcal{N}_K(t)} r_{t'+1}, \qquad \Sigma_t = \operatorname{Cov}\left(\{\, r_{t'+1} : t' \in \mathcal{N}_K(t) \,\}\right).$$
While this method adapts rapidly to local changes, it is sensitive to noise, sample sparsity, and instability in covariance estimation, particularly at monthly frequency.
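The neighbor search and local moment estimates can be sketched as follows. The data are synthetic and the function name is hypothetical. The sketch assumes that only months $t' < t$ are eligible, so that every $r_{t'+1}$ used has already been observed at the decision date.

```python
import numpy as np

def knn_moments(X, returns, t, K):
    """Estimate next-month moments at month t from the K most similar past months.

    X: (T, d) features; returns: (T, N) asset returns, aligned with X by row.
    """
    candidates = np.arange(t)                              # t' = 0, ..., t-1
    dist = np.linalg.norm(X[candidates] - X[t], axis=1)    # Euclidean distance to x_t
    neighbors = candidates[np.argsort(dist)[:K]]
    nxt = returns[neighbors + 1]                           # r_{t'+1} for each neighbor
    mu = nxt.mean(axis=0)
    sigma = np.cov(nxt, rowvar=False)                      # sample covariance of K rows
    return neighbors, mu, sigma

rng = np.random.default_rng(1)
X_demo = rng.normal(size=(30, 4))             # synthetic features, illustration only
r_demo = rng.normal(0.0, 0.04, size=(30, 2))  # synthetic returns, illustration only
nbrs, mu_knn, sigma_knn = knn_moments(X_demo, r_demo, t=29, K=5)
```

With only `K` observations behind each covariance estimate, the matrix is noisy and can be rank-deficient whenever `K` does not exceed the number of assets, which illustrates the instability noted above.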

Portfolio Optimization

At each rebalancing date, portfolio weights are determined by solving the following constrained mean–variance optimization problem:

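$$\max_{w}\; w^\top \mu_t - \frac{\lambda}{2}\, w^\top \Sigma_t\, w - \gamma\, \lVert w - w_{t-1} \rVert_1,$$
subject to:
$$\sum_{i=1}^{N} w_i = 1, \qquad \lVert w \rVert_\infty \le \bar{w}.$$
The $\ell_1$ penalty controls turnover, ensuring realistic trading behavior.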
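The effect of the turnover penalty is easiest to see by evaluating the objective for two candidate portfolios. The sketch below is not a solver: it only scores given weights and checks the constraints, and all numerical values are invented for illustration.

```python
import numpy as np

def mvo_objective(w, mu, sigma, w_prev, lam, gamma):
    """Value of the penalized mean-variance objective for a candidate weight vector."""
    return w @ mu - 0.5 * lam * (w @ sigma @ w) - gamma * np.abs(w - w_prev).sum()

def is_feasible(w, w_max, tol=1e-9):
    """Check the budget constraint and the cap on absolute weights."""
    return abs(w.sum() - 1.0) <= tol and np.abs(w).max() <= w_max + tol

mu = np.array([0.010, 0.004])          # illustrative expected returns
sigma = np.diag([0.0025, 0.0004])      # illustrative covariance matrix
w_prev = np.array([0.5, 0.5])          # last month's weights
w_tilt = np.array([0.6, 0.4])          # candidate that tilts toward the first asset

for gamma in (0.0, 0.01):
    hold = mvo_objective(w_prev, mu, sigma, w_prev, lam=4.0, gamma=gamma)
    move = mvo_objective(w_tilt, mu, sigma, w_prev, lam=4.0, gamma=gamma)
    print(gamma, "tilt preferred" if move > hold else "hold preferred")
```

Without the penalty the tilt scores higher. With `gamma = 0.01` the trading cost outweighs the gain and holding the previous weights is preferred.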

Backtesting Framework

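The evaluation framework adheres to strict causality: at each rebalancing date, models are estimated only on data observed before the decision is made, and portfolio weights are fixed before the following month's return is observed.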

Performance is assessed using annualized Sharpe ratio and maximum drawdown.
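Both metrics can be computed directly from a series of monthly portfolio returns. The sketch below assumes simple (not log) returns, a zero risk-free rate, and the sample standard deviation. The text does not state these conventions, so treat them as assumptions.

```python
import numpy as np

def annualized_sharpe(monthly_returns, periods_per_year=12):
    """Mean over standard deviation of monthly returns, scaled by sqrt(12)."""
    r = np.asarray(monthly_returns, dtype=float)
    return r.mean() / r.std(ddof=1) * np.sqrt(periods_per_year)

def max_drawdown(monthly_returns):
    """Most negative peak-to-trough decline of the compounded wealth curve."""
    wealth = np.cumprod(1.0 + np.asarray(monthly_returns, dtype=float))
    wealth = np.concatenate([[1.0], wealth])       # include the starting capital
    running_peak = np.maximum.accumulate(wealth)
    return (wealth / running_peak - 1.0).min()

demo = [0.10, -0.20, 0.05, 0.10]    # invented monthly returns
print(max_drawdown(demo))           # approximately -0.20
print(annualized_sharpe(demo))
```

In the example the wealth curve peaks at 1.10 and falls to 0.88, a 20% drawdown, before partially recovering.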

Pseudocode

R2-RD: Robust Rolling Regime Detection + MVO

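Initialize $K_0^* = 1$, $z^{\text{past}} = \emptyset$
for t = T0 to T:
    Construct feature matrix $\{x_\tau\}_{\tau \le t-1}$
    Step 1: Rolling HMM estimation
    Set $\underline{K}_t \leftarrow K_{t-1}^*$    # Regime emergence policy: no merging
    for K = $\underline{K}_t$ to $\bar{K}$:
        Fit HMM with K regimes using data up to t-1
        Compute $\mathrm{BIC}_K$
    Select $K_t^* \leftarrow \arg\min_K \mathrm{BIC}_K$
    Infer raw regime labels $z^{\text{new}}$ and probabilities $\pi_t(k)$
    if $z^{\text{past}}$ is not empty:
        Step 2: Label matching via linear assignment
        Construct similarity matrix $M_{ij} = \sum_\tau \mathbf{1}[z^{\text{past}}_\tau = i \ \&\ z^{\text{new}}_\tau = j]$
        Solve linear assignment problem: $\max_x \sum_{i,j} M_{ij} x_{ij}$ subject to one-to-one matching
        Apply optimal mapping to $z^{\text{new}}$ to obtain $z^{\text{aligned}}$
    else:
        $z^{\text{aligned}} \leftarrow z^{\text{new}}$
    Update $z^{\text{past}} \leftarrow z^{\text{aligned}}$
    Step 3: Compute regime-conditional moments
    $\mu_t = \sum_{k=1}^{K_t^*} \pi_t(k)\, \mu_k$
    $\Sigma_t = \sum_{k=1}^{K_t^*} \pi_t(k)\, \Sigma_k$
    Step 4: Solve MVO
    Solve the penalized mean–variance problem of the Portfolio Optimization section to obtain $w_t$
    Observe return $r_{t+1}$ and record PnL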

KNN + MVO

for t = T0 to T:
    Construct feature vector $x_t$
    Identify K nearest historical neighbors
    Estimate $\mu_t$, $\Sigma_t$ from neighbors
    Solve MVO to obtain $w_t$
    Observe $r_{t+1}$ and record PnL

Empirical Results

This section presents the out-of-sample performance of the two regime-aware portfolio construction methods evaluated from January 2016 through the most recent available data. Performance is assessed using annualized Sharpe ratio and maximum drawdown, computed from monthly portfolio returns.

Performance Summary

The table below summarizes the risk-adjusted performance and downside risk of the two strategies. The robust rolling regime detection approach (R2-RD) delivers superior performance on both metrics, achieving a higher Sharpe ratio and a substantially smaller maximum drawdown than the KNN-based approach.

Out-of-Sample Performance Comparison (2016–Present)

| Strategy | Sharpe Ratio (Ann.) | Max Drawdown |
|---|---|---|
| R2-RD + MVO | 0.93 | -15.79% |
| KNN + MVO | 0.73 | -30.44% |

Cumulative Performance

The two figures below display the cumulative out-of-sample profit-and-loss (PnL) trajectories for the R2-RD and KNN strategies, respectively. The R2-RD strategy exhibits smoother growth and shallower drawdowns, while the KNN-based approach experiences higher volatility and deeper drawdown episodes.

[Figure]

[Figure]

Model Performance Discussion

The superior performance of R2-RD reflects the benefits of global parametric structure. By smoothing regime transitions and dynamically adapting regime complexity, rolling HMMs produce more stable estimates of return and risk, which translate into controlled portfolio turnover and drawdowns. KNN, while adaptive, relies on local similarity and is more sensitive to noise and covariance instability, particularly at monthly horizons.

R2-RD Regimes Discussion

We now examine the dynamics of the R2-RD regime labels over time and the corresponding asset performance across regimes. The figure below presents the monthly regime assignments from the HMM, where each regime is labeled according to its economic interpretation: Regime 0 (Financial Crisis), Regime 1 (Normal), Regime 2 (Early Pandemic), and Regime 3 (Late Pandemic). The timeline captures major structural shifts in global markets, including the 2008–2009 financial crisis and the COVID-19 pandemic period.

[Figure]

The figure below shows the average monthly returns of each asset conditional on the detected regimes. Several key observations emerge:

[Figure]

These results highlight the economic interpretability of the R2-RD regimes. By linking asset performance to identified regimes, we observe patterns that align with well-known market behavior during crises, normal periods, and pandemic-related disruptions. This explainable mapping between regimes and asset returns provides a robust foundation for regime-aware portfolio construction, as it allows investors to anticipate which assets are likely to perform well or poorly under each detected market state.
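The regime-conditional averages behind this kind of analysis amount to a grouped mean. The sketch below uses invented returns and labels for two assets and two regimes. The function name is hypothetical.

```python
import numpy as np

def mean_returns_by_regime(returns, labels):
    """Average return of each asset over the months assigned to each regime."""
    returns = np.asarray(returns, dtype=float)
    labels = np.asarray(labels)
    return {int(k): returns[labels == k].mean(axis=0) for k in np.unique(labels)}

demo_r = [[0.02, 0.00],
          [0.04, 0.01],
          [-0.10, 0.03],
          [-0.06, 0.05]]            # invented monthly returns for two assets
demo_z = [1, 1, 0, 0]               # invented aligned regime labels

by_regime = mean_returns_by_regime(demo_r, demo_z)
```

Here regime 0 averages to (-0.08, 0.04) and regime 1 to (0.03, 0.005), the type of contrast that lets each regime be read as favorable or unfavorable for a given asset.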

Explainability of R2-RD and Regime-Aware Allocation

A key advantage of the R2-RD framework is its inherent explainability when applied to market data. Unlike black-box models, R2-RD provides interpretable regime definitions and transparent evolution over time:

When combined with mean–variance optimization, R2-RD enables a fully explainable regime-aware allocation model:

In summary, R2-RD provides interpretable regime detection, and its integration with MVO results in a portfolio construction methodology where both the drivers of market states and the resulting allocations are explainable. This stands in contrast to purely nonparametric approaches, which may adapt to data quickly but lack a clear, interpretable mapping between observed market conditions and allocation rationale.