Explainable Regime-Aware Portfolio Optimization: The Case for Robust Rolling Regime Detection

Amine Boukardagha and Alex Saunders

Abstract

We propose an explainable framework for cross-asset portfolio optimization under time-varying market regimes. Our approach leverages a robust rolling regime detection model (R2-RD) to identify latent market states with interpretable mean and covariance structures, and a K-Nearest Neighbors (KNN) model as a nonparametric benchmark. Both methods are embedded within a regime-aware mean–variance optimization (MVO) framework and evaluated via a strictly causal expanding-window backtest on monthly data. Using a diversified cross-asset universe, we find that R2-RD delivers superior risk-adjusted returns and drawdown control relative to KNN, highlighting the benefits of parametric structure, temporal consistency, and regime-aware allocation for medium-frequency portfolio construction. The results demonstrate that R2-RD not only improves performance but also provides a transparent mapping between observed market states and portfolio decisions.

Introduction

Financial markets exhibit persistent structural changes driven by macroeconomic cycles, monetary policy regimes, and shifts in risk appetite. Ignoring such regime dynamics can lead to unstable portfolio allocations and poor risk-adjusted performance. Regime-aware portfolio construction seeks to address this issue by conditioning expected returns and risks on latent market states. This paper compares two distinct approaches to regime detection:

Rolling parametric regime detection using Hidden Markov Models with dynamically optimized regime count and robust regime label tracking (R2-RD);
Nonparametric local regime approximation using K-Nearest Neighbors (KNN).

Both methods are evaluated within an identical portfolio optimization framework, enabling a clean empirical comparison of parametric versus nonparametric regime modeling for cross-asset allocation.

Data and Market Universe

The empirical analysis is conducted on a diversified cross-asset universe designed to capture major global risk factors. Monthly log-returns are constructed from adjusted close prices obtained via Yahoo Finance. The asset universe consists of:

Equities: S&P 500 Index (SPX), representing U.S. equity market risk;
Fixed Income: iShares 7–10 Year Treasury ETF (IEF), proxying interest rate and duration risk;
Commodities: SPDR Gold Shares (GLD) and United States Oil Fund (USO), capturing inflation sensitivity and real asset exposure;
Foreign Exchange: U.S. Dollar Index proxy (UUP), representing global risk-off dynamics and dollar strength.

This universe spans growth-sensitive, defensive, inflation-hedging, and currency assets, making it well suited for regime-based allocation. All strategies are evaluated using monthly rebalancing with a strictly out-of-sample backtest running from January 2016 to the most recent available data.

Feature Representation

Both regime detection approaches operate on a shared feature space constructed from asset returns and risk measures. At each month t, the feature vector is defined as:

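$$x_t = [r_t;\ \sigma_t], \qquad \sigma_t = \sqrt{\frac{1}{6}\sum_{i=1}^{6}\left(r_{t-i} - \bar{r}_t\right)^2},$$

where r_t ∈ R^N denotes the vector of monthly asset returns, r̄_t is the mean of the six lagged returns r_{t-1}, …, r_{t-6}, and σ_t is the resulting six-month rolling volatility estimate (squares and square roots are taken elementwise). This representation captures both directional information and prevailing market uncertainty, which are key drivers of regime differentiation.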
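The feature construction can be sketched in a few lines of NumPy. This is a minimal illustration rather than the authors' code: it assumes the volatility is the population standard deviation of the six lagged returns, and the function name `build_features` and the synthetic input are hypothetical.

```python
import numpy as np

def build_features(returns: np.ndarray, window: int = 6) -> np.ndarray:
    """Stack r_t with the rolling volatility of the `window` preceding months.

    returns: array of shape (T, N) holding monthly log-returns.
    Output:  array of shape (T - window, 2N); row j corresponds to month t = window + j.
    """
    T, _ = returns.shape
    rows = []
    for t in range(window, T):
        past = returns[t - window:t]        # r_{t-6}, ..., r_{t-1}: lagged data only
        sigma_t = past.std(axis=0, ddof=0)  # 1/window normalization (assumed)
        rows.append(np.concatenate([returns[t], sigma_t]))
    return np.asarray(rows)

# Synthetic returns for five assets, for illustration only.
rng = np.random.default_rng(0)
toy_returns = rng.normal(0.0, 0.04, size=(24, 5))
X = build_features(toy_returns)
print(X.shape)  # (18, 10)
```

Because σ_t is built from lagged returns only, each row of `X` is available at the end of month t without using future information.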

Robust Rolling Regime Detection (R2-RD)

Model Specification

Let z_t ∈ {1, …, K} denote a latent market regime evolving according to a first-order Markov chain:

$$P(z_t = j \mid z_{t-1} = i) = A_{ij}.$$

Conditional on the regime, feature vectors follow a multivariate Gaussian distribution:

$$x_t \mid z_t = k \sim \mathcal{N}(\mu_k, \Sigma_k).$$
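To make the model concrete, the following sketch runs the standard forward recursion for a Gaussian HMM with known parameters, producing filtered regime probabilities and the log-likelihood. It is an illustration under assumed parameters, not the estimation procedure used in the paper; the initial distribution `pi0`, the function names, and the toy numbers are all hypothetical.

```python
import numpy as np

def gaussian_logpdf(x, mean, cov):
    """Log-density of N(mean, cov) evaluated at x."""
    d = x.shape[0]
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (d * np.log(2.0 * np.pi) + logdet + maha)

def forward_filter(X, pi0, A, means, covs):
    """Return P(z_t = k | x_1..x_t) for every t, plus the log-likelihood."""
    T, K = X.shape[0], len(pi0)
    filtered = np.zeros((T, K))
    loglik = 0.0
    pred = np.asarray(pi0, dtype=float)      # regime probabilities before seeing x_1
    for t in range(T):
        logb = np.array([gaussian_logpdf(X[t], means[k], covs[k]) for k in range(K)])
        shift = logb.max()                   # rescale to avoid numerical underflow
        joint = pred * np.exp(logb - shift)
        norm = joint.sum()
        filtered[t] = joint / norm
        loglik += np.log(norm) + shift
        pred = filtered[t] @ A               # one-step-ahead regime probabilities
    return filtered, loglik

# Toy two-regime example with a one-dimensional feature (assumed values).
A = np.array([[0.9, 0.1], [0.1, 0.9]])
means = np.array([[-2.0], [2.0]])
covs = np.array([[[1.0]], [[1.0]]])
X = np.array([[-2.1], [-1.9], [2.2]])
filtered, loglik = forward_filter(X, [0.5, 0.5], A, means, covs)
pi_next = filtered[-1] @ A                   # probabilities for the next, unseen month
```

The last line shows how a strictly causal regime forecast can be obtained: the filtered probabilities at the final observed month are propagated one step through the transition matrix.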

Rolling BIC-Based Regime Selection with Regime Emergence Constraint

At each rebalancing date t, Hidden Markov Models are estimated on an expanding window using all information available up to t-1. Candidate models are fit for a range of regime counts K ∈ {K_min,t, …, K_max}, and the optimal number of regimes is selected via the Bayesian Information Criterion (BIC):

$$K_t^* = \arg\min_{K} \left\{ -2 \log L_K + p_K \log T \right\},$$

where L_K denotes the maximized likelihood, p_K the number of free parameters, and T the number of observations in the estimation window.
Crucially, the lower bound on the candidate regime count is set dynamically according to:

$$K_{\min,t} = K_{t-1}^*,$$

so that the optimization permits only regime emergence and explicitly rules out regime merging. In other words, the number of regimes is allowed to increase over time as new market structures appear, but previously identified regimes are never removed. This design choice reflects the empirical observation that market regimes tend to fragment over time as new economic environments emerge, while previously learned regimes remain relevant reference states. Enforcing a monotonic regime count stabilizes the regime identification process, prevents spurious regime collapses driven by short-term noise, and preserves the interpretability of regime-dependent return and risk estimates across the backtest.
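The selection rule can be sketched as follows. The parameter count assumes a Gaussian HMM with full covariance matrices and a free initial distribution, which the text does not spell out; the log-likelihood values and all function names are made up for illustration.

```python
import math

def hmm_num_params(K: int, d: int) -> int:
    """Free parameters of a K-state Gaussian HMM with full covariances in d dimensions."""
    initial = K - 1                      # initial distribution sums to one
    transitions = K * (K - 1)            # each row of A sums to one
    means = K * d
    covariances = K * d * (d + 1) // 2   # symmetric matrices
    return initial + transitions + means + covariances

def bic(loglik: float, K: int, d: int, T: int) -> float:
    return -2.0 * loglik + hmm_num_params(K, d) * math.log(T)

def select_regime_count(logliks: dict, d: int, T: int, k_prev: int) -> int:
    """Minimize BIC over candidates K >= k_prev, so the regime count never decreases."""
    candidates = {K: ll for K, ll in logliks.items() if K >= k_prev}
    return min(candidates, key=lambda K: bic(candidates[K], K, d, T))

# Hypothetical maximized log-likelihoods for K = 1, 2, 3 on T = 200 observations.
toy_logliks = {1: -600.0, 2: -540.0, 3: -535.0}
k_unconstrained = select_regime_count(toy_logliks, d=2, T=200, k_prev=1)
k_constrained = select_regime_count(toy_logliks, d=2, T=200, k_prev=3)
```

With these toy numbers, BIC alone prefers two regimes, but once three regimes have been selected at an earlier date the lower bound keeps the count at three, which is the no-merging behavior described above.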

Regime Label Matching for Robust Rolling Regime Detection (R2-RD)

One limitation of standard rolling HMM estimation is the label switching problem: regime labels at successive expanding windows may permute arbitrarily, leading to inconsistent regime identification over time. To address this, we introduce a regime label matching mechanism, inspired by Hirsa et al. (2024), which ensures temporal consistency of regime assignments across windows.
Let z^past_t denote the regime labels from the previous expanding window and z^new_t the labels from the current window over an overlapping period. We construct a similarity matrix M ∈ R^{K × K} with elements

$$M_{ij} = \sum_{t \in \text{overlap}} \mathbf{1}\left(z^{\text{past}}_t = i \ \text{and}\ z^{\text{new}}_t = j\right),$$

counting the number of times each past regime coincides with a new regime.
We then solve the following linear assignment problem:

$$\max_{x} \sum_{i=1}^{K}\sum_{j=1}^{K} M_{ij}\, x_{ij} \quad \text{s.t.} \quad \sum_{j} x_{ij} = 1, \quad \sum_{i} x_{ij} = 1, \quad x_{ij} \in \{0,1\},$$

where x_ij=1 if past regime i is matched to new regime j. The solution provides a permutation mapping that aligns the new regime labels with the past ones.
Applying this mapping to z^new_t produces aligned regime labels z^aligned_t that are consistent with historical regimes. These aligned labels are then used to compute regime probabilities π_t^(k) and the corresponding conditional moments μ_t and Σ_t for portfolio optimization. This approach ensures:

Temporal consistency of regime assignments,
Smooth evolution of portfolio weights,
Robustness to short-term label permutation artifacts.

In combination with the regime emergence policy (no regime merging), this mechanism constitutes a fully robust rolling regime detection framework (R2-RD) suitable for cross-asset allocation.
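A minimal sketch of the label matching step is shown below, using SciPy's `linear_sum_assignment` (a Hungarian-type solver) as one possible way to solve the assignment problem. The function name `align_labels` and the toy label sequences are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_labels(z_past, z_new, K):
    """Relabel z_new so that it agrees as much as possible with z_past on the overlap."""
    z_past = np.asarray(z_past)
    z_new = np.asarray(z_new)
    overlap = min(len(z_past), len(z_new))
    M = np.zeros((K, K), dtype=int)
    for i, j in zip(z_past[:overlap], z_new[:overlap]):
        M[i, j] += 1                          # co-occurrence counts M_ij
    rows, cols = linear_sum_assignment(-M)    # minimizing -M maximizes total agreement
    mapping = {int(j): int(i) for i, j in zip(rows, cols)}  # new label j -> past label i
    z_aligned = np.array([mapping[int(j)] for j in z_new])
    return z_aligned, mapping

# The new window permutes the labels and adds one extra month at the end.
z_past = [0, 0, 1, 1, 2, 2]
z_new = [1, 1, 2, 2, 0, 0, 0]
z_aligned, mapping = align_labels(z_past, z_new, K=3)
print(z_aligned.tolist())  # [0, 0, 1, 1, 2, 2, 2]
```

When the new window contains more regimes than the previous one, the past labels leave some rows of M empty; in this sketch a newly emerged regime would then receive one of the unused labels, although ties between equally good matchings are resolved arbitrarily by the solver.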

Regime-Conditional Moments

Regime detection enters portfolio construction through probability-weighted return and covariance estimates:

$$\mu_t = \sum_{k=1}^{K_t^*} \pi_t^{(k)} \mu_k, \qquad \Sigma_t = \sum_{k=1}^{K_t^*} \pi_t^{(k)} \Sigma_k,$$

where π_t^(k) = P(z_t = k | F_{t-1}) is the probability of regime k given the information F_{t-1} available through month t-1. This mixture structure smooths estimation noise and stabilizes portfolio weights across regime transitions.
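The probability weighting amounts to two weighted sums. The sketch below assumes that μ_k and Σ_k refer to the return block of the regime parameters (so that they have dimension N); the function name and the numbers are illustrative.

```python
import numpy as np

def mixture_moments(pi, means, covs):
    """Probability-weighted mean vector and covariance matrix across regimes."""
    pi = np.asarray(pi, dtype=float)          # (K,)
    means = np.asarray(means, dtype=float)    # (K, N)
    covs = np.asarray(covs, dtype=float)      # (K, N, N)
    mu_t = pi @ means                         # sum_k pi_k * mu_k
    sigma_t = np.tensordot(pi, covs, axes=1)  # sum_k pi_k * Sigma_k
    return mu_t, sigma_t

# Two regimes, two assets (assumed values).
pi = [0.75, 0.25]
means = [[0.01, 0.002], [-0.03, 0.01]]
covs = [0.001 * np.eye(2), 0.004 * np.eye(2)]
mu_t, sigma_t = mixture_moments(pi, means, covs)
```

The sketch follows the formula as stated; the exact covariance of a Gaussian mixture would include an additional between-regime term built from the deviations μ_k - μ_t, which this weighted average leaves out.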

K-Nearest Neighbors Regime Approximation

Local Similarity Estimation

The KNN approach does not model latent regimes explicitly. Instead, it approximates regimes locally by identifying historical months whose feature vectors are closest to the current state:

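$$N_K(t) = \operatorname*{arg\,min}_{t'} \| x_t - x_{t'} \|_2,$$

where the arg min is understood as returning the K historical months t' with the smallest distances to x_t.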

Local Moment Estimation

Expected returns and covariances are estimated directly from the identified neighbors:

$$\mu_t = \frac{1}{K} \sum_{t' \in N_K(t)} r_{t'+1}, \qquad \Sigma_t = \mathrm{Cov}\left(\{r_{t'+1}\}_{t' \in N_K(t)}\right).$$

While this method adapts rapidly to local changes, it is sensitive to noise, sample sparsity, and instability in covariance estimation, particularly at monthly frequency.
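The neighbor search and local moment estimation can be sketched as below. The sketch restricts neighbors to months t' < t so that r_{t'+1} is already observed, and it uses unscaled Euclidean distance; whether features are standardized first is not stated in the text, so that choice is an assumption. Names and data are hypothetical.

```python
import numpy as np

def knn_moments(X, R, k):
    """Estimate next-month moments from the k months most similar to the current one.

    X: (T, d) features, last row is the current state x_t.
    R: (T, N) returns aligned row by row with X.
    """
    x_t = X[-1]
    candidates = X[:-1]                           # months t' < t only
    dist = np.linalg.norm(candidates - x_t, axis=1)
    neighbors = np.argsort(dist)[:k]              # indices of the k closest months
    next_returns = R[neighbors + 1]               # r_{t'+1} for each neighbor
    mu_t = next_returns.mean(axis=0)
    sigma_t = np.cov(next_returns, rowvar=False)  # sample covariance across neighbors
    return mu_t, sigma_t, neighbors

# Tiny hand-made example with two features and two assets.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.1, 0.0], [5.0, 5.0], [0.0, 0.1]])
R = np.array([[0.01, 0.00], [0.02, 0.01], [0.00, 0.00], [0.04, -0.01], [0.00, 0.00]])
mu_t, sigma_t, neighbors = knn_moments(X, R, k=2)
print(neighbors.tolist())  # [0, 2]
```

With only k observations feeding an N × N covariance matrix, the estimate becomes singular whenever k is not larger than N, which illustrates the covariance instability mentioned above.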

Portfolio Optimization

At each rebalancing date, portfolio weights are determined by solving the following constrained mean–variance optimization problem:

$$\max_{w} \; w^\top \mu_t - \frac{\lambda}{2}\, w^\top \Sigma_t\, w - \gamma\, \| w - w_{t-1} \|_1,$$

subject to:

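$$\sum_{i=1}^{N} w_i = 1, \qquad \|w\|_\infty \le \bar{w}.$$

Here λ is the risk-aversion coefficient, γ the turnover penalty, and w̄ the cap on the absolute weight of any single asset. The ℓ₁ penalty controls turnover, ensuring realistic trading behavior.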
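One way to solve this problem numerically is to split the trade w - w_{t-1} into nonnegative buy and sell parts, which turns the ℓ₁ term into a linear one, and then hand the smooth problem to a general-purpose solver. The sketch below does this with SciPy's SLSQP method; the solver choice, the parameter values, and the function names are assumptions, and a dedicated convex optimization library could be used instead.

```python
import numpy as np
from scipy.optimize import minimize

def objective(w, mu, sigma, w_prev, lam, gamma):
    """Penalized mean-variance objective from the text (to be maximized)."""
    return w @ mu - 0.5 * lam * (w @ sigma @ w) - gamma * np.abs(w - w_prev).sum()

def solve_mvo(mu, sigma, w_prev, lam, gamma, w_max):
    n = len(mu)

    def neg_smooth_objective(v):
        # v stacks the weights with the buy and sell parts of the trade.
        w, buy, sell = v[:n], v[n:2 * n], v[2 * n:]
        return -(w @ mu - 0.5 * lam * (w @ sigma @ w) - gamma * (buy.sum() + sell.sum()))

    constraints = [
        {"type": "eq", "fun": lambda v: v[:n].sum() - 1.0},                      # budget
        {"type": "eq", "fun": lambda v: v[:n] - w_prev - v[n:2 * n] + v[2 * n:]},  # trade split
    ]
    bounds = [(-w_max, w_max)] * n + [(0.0, None)] * (2 * n)
    v0 = np.concatenate([w_prev, np.zeros(2 * n)])   # start from the current portfolio
    result = minimize(neg_smooth_objective, v0, method="SLSQP", bounds=bounds,
                      constraints=constraints, options={"ftol": 1e-12, "maxiter": 500})
    return result.x[:n]

# Three assets with assumed moments and parameters.
mu = np.array([0.006, 0.002, 0.004])
sigma = np.diag([0.0016, 0.0004, 0.0025])
w_prev = np.full(3, 1.0 / 3.0)
lam, gamma, w_max = 5.0, 0.001, 0.6
w = solve_mvo(mu, sigma, w_prev, lam, gamma, w_max)
```

Starting the solver at the previous weights keeps the initial point feasible, and a larger γ leaves the solution closer to that starting point, which is how the penalty limits turnover.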

Backtesting Framework

The evaluation framework adheres to strict causality:

Expanding-window estimation;
Monthly rebalancing;
Out-of-sample period from 2016 to present;
No look-ahead bias.

Performance is assessed using annualized Sharpe ratio and maximum drawdown.
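Both metrics can be computed from a series of monthly portfolio returns as sketched below. The sketch assumes simple (not log) portfolio returns, a zero risk-free rate, and the usual square-root-of-12 annualization; none of these conventions is stated explicitly in the text.

```python
import numpy as np

def annualized_sharpe(monthly_returns, periods_per_year=12):
    """Mean over standard deviation of monthly returns, scaled to annual terms."""
    r = np.asarray(monthly_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def max_drawdown(monthly_returns):
    """Largest peak-to-trough loss of the cumulative wealth curve (a negative number)."""
    r = np.asarray(monthly_returns, dtype=float)
    wealth = np.cumprod(1.0 + r)
    # Running peak, including the initial wealth of 1 before the first month.
    peak = np.maximum.accumulate(np.concatenate([[1.0], wealth]))[1:]
    return (wealth / peak - 1.0).min()

toy = [0.10, -0.20, 0.05, -0.10, 0.30]   # made-up monthly returns
sharpe = annualized_sharpe(toy)
mdd = max_drawdown(toy)
print(round(mdd, 4))  # -0.244
```

In the toy series wealth peaks at 1.10 after the first month and bottoms at 0.8316 after the fourth, giving the drawdown of 24.4%.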

Pseudocode

R2-RD: Robust Rolling Regime Detection + MVO

Initialize K₀^* = 1, z_past = ∅
for t = T₀ to T:
    Construct feature matrix {x_τ}_{τ ≤ t-1}
    # Step 1: Rolling HMM estimation
    Set K_min,t ← K_{t-1}^*    # Regime emergence policy: no merging
    for K = K_min,t to K_max:
        Fit HMM with K regimes using data up to t-1
        Compute BIC_K
    Select K_t^* ← argmin_K BIC_K
    Infer raw regime labels z_new and probabilities π_t^(k)
    if z_past is not empty:
        # Step 2: Label matching via linear assignment
        Construct similarity matrix M_ij = Sum[τ ∈ overlap] 1[z_past,τ = i and z_new,τ = j]
        Solve linear assignment problem: max over x of Sum[i] Sum[j] M_ij x_ij
        Apply optimal mapping to z_new → z_aligned
    else:
        z_aligned ← z_new
    Update z_past ← z_aligned
    # Step 3: Compute regime-conditional moments
    μ_t = Sum[k=1 to K_t^*] π_t^(k) μ_k
    Σ_t = Sum[k=1 to K_t^*] π_t^(k) Σ_k
    # Step 4: Solve MVO
    Solve: max over w of w^T μ_t - (λ/2) w^T Σ_t w - γ ||w - w_{t-1}||₁, subject to the budget and weight-cap constraints, to obtain w_t
    Observe return r_{t+1} and record PnL

KNN + MVO

for t = T₀ to T:
 Construct feature vector x_t
 Identify K nearest historical neighbors
 Estimate μ_t, Σ_t from neighbors
 Solve MVO to obtain w_t
 Observe r_t+1 and record PnL

Empirical Results

This section presents the out-of-sample performance of the two regime-aware portfolio construction methods evaluated from January 2016 through the most recent available data. Performance is assessed using annualized Sharpe ratio and maximum drawdown, computed from monthly portfolio returns.

Performance Summary

The table below summarizes the risk-adjusted performance and downside risk of the two strategies. The robust rolling regime detection approach (R2-RD) delivers superior performance on both metrics, achieving a higher Sharpe ratio (0.93 versus 0.73) and a maximum drawdown roughly half as deep (-15.79% versus -30.44%) relative to the KNN-based approach.

Out-of-Sample Performance Comparison (2016–Present)

Strategy	Sharpe Ratio (Ann.)	Max Drawdown
R2-RD + MVO	0.93	-15.79%
KNN + MVO	0.73	-30.44%

Cumulative Performance

The accompanying figures display the cumulative out-of-sample profit-and-loss (PnL) trajectories for the R2-RD and KNN strategies, respectively. The HMM-based R2-RD strategy exhibits smoother growth and shallower drawdowns, while the KNN-based approach experiences larger volatility and deeper drawdown episodes.

[Figure]

Model Performance Discussion

The superior performance of R2-RD reflects the benefits of global parametric structure. By smoothing regime transitions and dynamically adapting regime complexity, rolling HMMs produce more stable estimates of return and risk, which translate into controlled portfolio turnover and drawdowns. KNN, while adaptive, relies on local similarity and is more sensitive to noise and covariance instability, particularly at monthly horizons.

R2-RD Regimes Discussion

We now examine the dynamics of the R2-RD regime labels over time and the corresponding asset performance across regimes. The figure below presents the monthly regime assignments from the HMM, where each regime is labeled according to its economic interpretation: Regime 0 (Financial Crisis), Regime 1 (Normal), Regime 2 (Early Pandemic), and Regime 3 (Late Pandemic). The timeline clearly captures major structural shifts in global markets, including the 2008–2009 financial crisis and the COVID-19 pandemic period.

[Figure]

The figure following this list shows the average monthly returns of each asset conditional on the detected regimes, with assets labeled SPX, BOND (IEF), GOLD (GLD), OIL (USO), and USD (UUP). Several key observations emerge:

Financial Crisis (Regime 0): Equity (SPX) and commodity (OIL, GOLD) returns are muted or slightly negative, while BOND and USD perform positively, reflecting a typical flight-to-safety pattern. Volatility is elevated, consistent with crisis periods.
Normal (Regime 1): SPX and BOND deliver steady positive returns, while GOLD and USD exhibit small negative average returns, indicating normal market conditions with moderate risk-on behavior.
Early Pandemic (Regime 2): SPX and BOND show strong positive returns, but GOLD suffers a large negative shock, reflecting dislocations in commodity markets during March–April 2020. OIL returns are mildly positive, and USD is roughly neutral.
Late Pandemic (Regime 3): SPX and USD exhibit strong positive returns, GOLD rebounds, and OIL turns slightly negative, consistent with a gradual recovery and central bank liquidity support.

[Figure]

These results highlight the economic interpretability of the R2-RD regimes. By linking asset performance to identified regimes, we observe patterns that align with well-known market behavior during crises, normal periods, and pandemic-related disruptions. This explainable mapping between regimes and asset returns provides a robust foundation for regime-aware portfolio construction, as it allows investors to anticipate which assets are likely to perform well or poorly under each detected market state.

Explainability of R2-RD and Regime-Aware Allocation

A key advantage of the R2-RD framework is its inherent explainability when applied to market data. Unlike black-box models, R2-RD provides interpretable regime definitions and transparent evolution over time:

Parametric structure: Each latent regime is explicitly characterized by a multivariate Gaussian distribution with interpretable mean vectors μ_k and covariance matrices Σ_k. This allows practitioners to understand the expected return and risk profile of each regime, and relate it to macroeconomic or market conditions.
Monotonic regime emergence: The policy preventing regime merging ensures that new regimes are only added when new market dynamics appear, preserving the identity of previously learned regimes. This makes the sequence of regimes over time easily interpretable, and avoids abrupt, arbitrary label changes.
Regime label alignment: The label matching mechanism provides temporal consistency, guaranteeing that the same regime label corresponds to the same market state across rolling windows. This further enhances interpretability of regime transitions and their impact on portfolio behavior.

When combined with mean–variance optimization, R2-RD enables a fully explainable regime-aware allocation model:

The conditional moments μ_t and Σ_t used in the optimization are explicitly weighted by regime probabilities π_t^(k), linking portfolio weights directly to interpretable market states.
The resulting allocations can be analyzed in the context of the underlying regime characteristics. For instance, higher allocations to defensive assets can be traced to regimes with elevated covariance among equities or increased market volatility, providing a transparent rationale for each allocation decision.
This framework allows practitioners and risk managers to understand not only what the portfolio holds at each point in time, but also why those allocations arise, bridging statistical modeling and practical portfolio decision-making.
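Because the conditional moments are linear in the regime probabilities, the expected return and variance of any portfolio split exactly into per-regime contributions. The sketch below illustrates this attribution with made-up weights and regime parameters; the function name `regime_attribution` is hypothetical.

```python
import numpy as np

def regime_attribution(w, pi, means, covs):
    """Split portfolio expected return and variance into per-regime contributions."""
    w = np.asarray(w, dtype=float)            # (N,)
    pi = np.asarray(pi, dtype=float)          # (K,)
    means = np.asarray(means, dtype=float)    # (K, N)
    covs = np.asarray(covs, dtype=float)      # (K, N, N)
    ret_contrib = pi * (means @ w)                          # pi_k * w' mu_k
    var_contrib = pi * np.einsum("i,kij,j->k", w, covs, w)  # pi_k * w' Sigma_k w
    return ret_contrib, var_contrib

# Two regimes, two assets (assumed values).
w = np.array([0.6, 0.4])
pi = np.array([0.75, 0.25])
means = np.array([[0.01, 0.002], [-0.03, 0.01]])
covs = np.array([0.001 * np.eye(2), 0.004 * np.eye(2)])
ret_contrib, var_contrib = regime_attribution(w, pi, means, covs)
```

The contributions sum to w^T μ_t and w^T Σ_t w, so a risk manager can read off how much of the portfolio's expected return or risk is attributable to each regime at a given date.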

In summary, R2-RD provides interpretable regime detection, and its integration with MVO results in a portfolio construction methodology where both the drivers of market states and the resulting allocations are explainable. This stands in contrast to purely nonparametric approaches, which may adapt to data quickly but lack a clear, interpretable mapping between observed market conditions and allocation rationale.