Explainable Regime-Aware Portfolio Optimization: The Case for Robust Rolling Regime Detection
Abstract
We propose Robust Rolling Regime Detection (R2-RD), an explainable framework for cross-asset portfolio optimization under time-varying market regimes. R2-RD combines expanding-window Hidden Markov Model (HMM) estimation with three key innovations: (1) dynamic regime count selection via the Bayesian Information Criterion, (2) a regime emergence policy that ensures monotonically non-decreasing regime counts for temporal stability, and (3) a label matching mechanism based on the Hungarian algorithm that resolves the label switching problem. We establish theoretical foundations including MLE consistency, BIC consistency for regime selection, and conditions under which regime-aware mean-variance optimization dominates unconditional approaches. Empirically, we evaluate R2-RD against a K-Nearest Neighbors benchmark on a diversified cross-asset universe over 2016–2024. R2-RD achieves a Sharpe ratio of 0.93 versus 0.73 for KNN, with maximum drawdown reduced from 30.44% to 15.79%.
Introduction
Financial markets exhibit persistent structural changes driven by macroeconomic cycles, monetary policy shifts, and evolving risk appetites. The 2008 global financial crisis, the European debt crisis, and the COVID-19 pandemic vividly illustrate how market dynamics can shift abruptly between distinct regimes characterized by markedly different return distributions and correlation structures (Brunnermeier, 2009; Billio, 2012). Ignoring such regime dynamics leads to unstable portfolio allocations, poor risk-adjusted performance, and unexpected drawdowns during market dislocations.
Regime-aware portfolio construction addresses this challenge by conditioning expected returns and covariances on latent market states. Since the seminal work of (Hamilton, 1989), regime-switching models have become a cornerstone of financial econometrics, with Hidden Markov Models (HMMs) emerging as the dominant framework for capturing unobservable market states (Ryden, 1998; Kim, 1999). The appeal of HMMs lies in their ability to model both the persistence of market regimes through transition probabilities and the distinct statistical properties of each regime through state-dependent emission distributions.
Despite their theoretical appeal, applying HMMs to portfolio optimization in practice presents several challenges. First, the number of regimes is typically unknown and may evolve over time as new market structures emerge. Second, rolling or expanding-window estimation introduces the label switching problem, where regime labels may permute arbitrarily across estimation windows, destroying the temporal consistency needed for coherent portfolio decisions (Jakobsson, 2007). Third, the computational burden of model selection and parameter estimation can be substantial, particularly when the regime count itself is a parameter to be optimized.
This paper proposes Robust Rolling Regime Detection (R2-RD), a framework that addresses these challenges through three key innovations. Building on (Hirsa, 2024), we develop an expanding-window HMM estimation procedure with dynamic regime count selection via the Bayesian Information Criterion (BIC). Crucially, we introduce a regime emergence policy that constrains the optimization to permit only regime addition, never removal, reflecting the empirical observation that market regimes tend to fragment over time rather than merge. We complement this with a label matching mechanism based on the linear assignment problem that ensures temporal consistency of regime labels across estimation windows.
To benchmark R2-RD, we compare it against a K-Nearest Neighbors (KNN) approach that approximates regimes locally without explicit modeling of latent states. While KNN adapts rapidly to changing conditions through its nonparametric structure, it lacks the temporal smoothing and interpretability that parametric approaches provide. We embed both methods within an identical mean–variance optimization (MVO) framework with turnover regularization, enabling a clean comparison of parametric versus nonparametric regime detection for cross-asset allocation.
Our contributions are as follows:
- We propose R2-RD, a robust framework for rolling regime detection that combines expanding-window HMM estimation with dynamic BIC-based regime count selection and a regime emergence policy that preserves previously identified market states.
- We introduce a label matching mechanism based on the Hungarian algorithm that ensures temporal consistency of regime assignments across estimation windows, resolving the label switching problem in rolling HMM applications.
- We develop theoretical results establishing conditions under which regime-aware mean–variance optimization outperforms unconditional approaches, including bounds on the growth rate of the regime count and consistency properties of the BIC-selected model.
- We provide an extensive empirical analysis on a diversified cross-asset universe over the 2016–2024 out-of-sample period, demonstrating that R2-RD achieves a Sharpe ratio of 0.93 compared to 0.73 for KNN, with maximum drawdown reduced from 30.44% to 15.79%.
- We emphasize the explainability of R2-RD: each regime is characterized by interpretable mean vectors and covariance matrices that map directly to observable market conditions, providing practitioners with a transparent rationale for portfolio decisions.
The remainder of this paper is organized as follows. The Literature Review surveys related work on regime-switching models, portfolio optimization under uncertainty, and explainable methods in finance. Data and Market Universe describes the data and asset universe. Methodology presents the R2-RD methodology and KNN benchmark in detail. Theoretical Foundations develops the theoretical results. Portfolio Optimization formalizes the portfolio optimization problem. Empirical Results presents the empirical findings, and the final sections discuss implications and limitations and conclude.
Literature Review
This section surveys the relevant literature across four streams: regime-switching models in finance, portfolio optimization under regime uncertainty, the label switching problem and its solutions, and explainability in quantitative finance.
Regime-Switching Models in Finance
The econometric analysis of regime-switching models began with the seminal work of (Hamilton, 1989), who introduced a Markov-switching framework for modeling business cycles. Hamilton's approach models the mean growth rate of GDP as following a two-state Markov process, with transitions between expansion and recession states governed by constant probabilities. This framework was subsequently extended to financial markets, where regime-switching behavior is even more pronounced.
(Ryden, 1998) applied Hidden Markov Models to daily stock returns, demonstrating that HMMs can capture the stylized facts of financial returns (volatility clustering, fat tails, and autocorrelation in squared returns) more effectively than single-regime models. Their work established HMMs as a viable alternative to GARCH-type models for capturing time-varying volatility.
A parallel literature developed around Markov-switching GARCH models, which combine regime-switching dynamics with autoregressive conditional heteroskedasticity. (Gray, 1996) introduced a regime-switching model for interest rates where the conditional variance follows a GARCH process with regime-dependent parameters. (Haas, 2004) proposed a more tractable formulation that avoids the path-dependence problem inherent in earlier specifications. More recently, (Caporale, 2019) applied Markov-switching GARCH to cryptocurrency markets, finding evidence of distinct volatility regimes corresponding to different market conditions.
The theoretical foundations for maximum likelihood estimation in HMMs were established by (Leroux, 1992), who proved consistency of the MLE under regularity conditions including ergodicity and identifiability of the hidden Markov chain. These results provide the asymptotic justification for our expanding-window estimation approach.
Portfolio Optimization Under Regime Uncertainty
The application of regime-switching models to portfolio allocation was pioneered by (Ang, 2002), who studied international asset allocation when returns exhibit regime-dependent correlations. They found that accounting for regime shifts substantially affects optimal portfolio weights, particularly during crisis periods when correlations tend to increase.
(Guidolin, 2007) extended this framework to multivariate regime switching with multiple asset classes, developing a dynamic programming solution for the investor's problem. Their work demonstrated that regime-aware portfolios can achieve significant improvements in out-of-sample performance relative to unconditional strategies, particularly in terms of drawdown control during market dislocations.
An important benchmark in this literature is the work of (DeMiguel, 2009), who showed that the simple 1/N equal-weight portfolio often outperforms sophisticated optimization-based strategies out of sample. This finding highlights the importance of estimation error in portfolio optimization and motivates the use of shrinkage estimators and regularization. (Ledoit, 2004) addressed this challenge by proposing shrinkage estimators for the covariance matrix that substantially reduce estimation error.
The role of transaction costs in dynamic portfolio optimization was formalized by (Garleanu, 2013), who derived closed-form solutions for optimal trading with predictable returns and quadratic transaction costs. Their framework shows that optimal portfolios exhibit inertia, trading toward a "target" portfolio at a rate that balances the costs of deviating from the optimum against trading costs.
The Label Switching Problem
A fundamental challenge in rolling or sequential estimation of mixture models and HMMs is the label switching problem: the likelihood function is invariant to permutations of the component labels, so estimates from different time periods may use inconsistent labeling of regimes (Jakobsson, 2007). This problem is particularly acute in financial applications where the interpretation of regimes (e.g., "crisis" vs. "normal") is economically meaningful.
Several solutions have been proposed. Post-processing approaches relabel the output of MCMC or EM algorithms to achieve consistency, typically by solving an assignment problem that maximizes overlap between successive estimates. (Jakobsson, 2007) developed the CLUMPP algorithm for this purpose in population genetics, which we adapt to our financial context. The assignment problem itself is solved efficiently using the Hungarian algorithm (Kuhn, 1955).
An alternative approach is to impose identifying restrictions during estimation, such as ordering constraints on regime means or transition probabilities. However, these constraints may be violated in practice and can distort inference. Our label matching approach avoids these issues by permitting unrestricted estimation followed by optimal relabeling.
Model Selection for Hidden Markov Models
Selecting the number of hidden states is a critical step in HMM specification. Information criteria, particularly the Bayesian Information Criterion (BIC) of (Schwarz, 1978), are widely used for this purpose. The BIC penalizes model complexity more heavily than the Akaike Information Criterion (AIC), leading to more parsimonious models that tend to generalize better out of sample.
For mixture models and HMMs, the BIC has been shown to be consistent for selecting the true number of components under regularity conditions, meaning that as the sample size grows, the probability of selecting the correct model approaches one. However, the finite-sample performance of the BIC depends on the separation between components and the relative frequencies of different regimes.
Our regime emergence policy, which constrains the minimum number of regimes to be non-decreasing over time, addresses a practical limitation of standard model selection: in rolling or expanding-window estimation, the selected model can fluctuate due to sampling variability, leading to spurious regime merging and splitting. By permitting only regime emergence, we ensure that previously identified market states remain in the model, providing a more stable foundation for portfolio decisions.
Explainability in Quantitative Finance
The increasing use of machine learning in finance has raised concerns about model interpretability and the "black box" nature of complex algorithms (Gu, 2020). Regulatory requirements, including the European Union's General Data Protection Regulation (GDPR), mandate that automated decisions affecting individuals be explainable.
In the context of portfolio management, explainability is valuable for multiple reasons: it enables risk managers to understand the drivers of portfolio positions, facilitates communication with clients and regulators, and helps identify when models may be behaving anomalously. (Harvey, 2016) emphasized the importance of economic intuition in evaluating quantitative strategies, arguing that factors without clear economic rationale are more likely to be spurious.
Hidden Markov Models offer a natural form of explainability in this context. Each regime is characterized by a multivariate Gaussian distribution with interpretable mean vector μk and covariance matrix Σk. These parameters can be mapped directly to economic conditions: a "crisis" regime might exhibit negative expected returns, elevated volatilities, and increased correlations across risky assets. The probability of being in each regime, πt(k), provides a transparent weighting scheme that connects observed market conditions to portfolio decisions.
This interpretability distinguishes HMM-based approaches from purely nonparametric methods like KNN, which adapt to local conditions without providing explicit characterizations of different market states. While KNN may capture similar patterns implicitly, it lacks the parametric structure that facilitates economic interpretation and communication.
Data and Market Universe
The empirical analysis is conducted on a diversified cross-asset universe designed to capture major global risk factors. Monthly log-returns are constructed from adjusted close prices obtained via Yahoo Finance. The sample period spans January 2000 through December 2024, with the first 16 years (2000–2015) used for initial model training and the remaining period (2016–2024) reserved for out-of-sample evaluation. The asset universe consists of:
- Equities: SPDR S&P 500 ETF Trust (SPY), representing U.S. equity market risk;
- Fixed Income: iShares 7–10 Year Treasury ETF (IEF), proxying interest rate and duration risk;
- Commodities: SPDR Gold Shares (GLD) and United States Oil Fund (USO), capturing inflation sensitivity and real asset exposure;
- Foreign Exchange: Invesco DB US Dollar Index Bullish Fund (UUP), representing global risk-off dynamics and dollar strength.
All assets are exchange-traded funds (ETFs) with sufficient liquidity and history for robust backtesting. Returns are computed as log-differences of adjusted closing prices to account for dividends and splits.
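The log-difference construction described above can be sketched as follows. This is a minimal illustration, not the authors' data pipeline: the price values are made up, and `log_returns` is a hypothetical helper name.

```python
import math

def log_returns(prices):
    """Return the log-differences ln(P_t / P_{t-1}) of a price series."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

# Hypothetical month-end adjusted closes for one ETF (illustrative values only).
adj_close = [100.0, 102.0, 99.96, 104.958]
r = log_returns(adj_close)

# Log-returns add across time: their sum equals the log of the total price ratio.
total_log_return = sum(r)
```

Because adjusted closes already incorporate dividends and splits, no further adjustment is applied inside the helper.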
Methodology
This section presents the R2-RD methodology in detail, including the feature representation, HMM specification, dynamic regime count selection, and label matching mechanism. We also describe the KNN benchmark that serves as a nonparametric comparison.
Feature Representation
Both regime detection approaches operate on a shared feature space constructed from asset returns and risk measures. At each month $t$, the feature vector is defined as:
\[ x_t = \begin{bmatrix} r_t \\ \sigma_t \end{bmatrix} \in \mathbb{R}^{2N}, \]
where $r_t \in \mathbb{R}^{N}$ denotes the vector of monthly log-returns for $N$ assets and $\sigma_t \in \mathbb{R}^{N}$ is a six-month rolling volatility estimate for each asset. The combined feature vector $x_t$ captures both directional information (through returns) and prevailing market uncertainty (through volatility), which are the key drivers of regime differentiation.
Prior to regime detection, features are standardized to have zero mean and unit variance over the estimation window, ensuring that all dimensions contribute equally to the distance metrics used in both HMM emission probabilities and KNN neighbor selection.
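A minimal sketch of the feature construction and standardization, using only the standard library. Two choices here are assumptions rather than details given in the text: the rolling volatility at month t includes month t itself, and months without a full six-month window are dropped. The function names and the sample returns are hypothetical.

```python
import statistics

def rolling_vol(returns, window=6):
    """Sample standard deviation of the last `window` returns; None until enough history."""
    out = []
    for t in range(len(returns)):
        if t + 1 < window:
            out.append(None)
        else:
            out.append(statistics.stdev(returns[t + 1 - window : t + 1]))
    return out

def standardize(column):
    """Scale one feature column to zero mean and unit variance over the window."""
    m = statistics.fmean(column)
    s = statistics.pstdev(column)
    return [(v - m) / s for v in column]

def build_features(returns_by_asset, window=6):
    """Stack [r_t, sigma_t] per month, keeping only months with a full volatility window."""
    vols = [rolling_vol(r, window) for r in returns_by_asset]
    T = len(returns_by_asset[0])
    cols = [standardize(series[window - 1 : T]) for series in list(returns_by_asset) + vols]
    # Transpose so that each row is one month's 2N-dimensional feature vector.
    return [list(row) for row in zip(*cols)]

# Illustrative monthly log-returns for N = 2 assets over 10 months.
a = [0.010, -0.020, 0.030, 0.000, 0.015, -0.010, 0.020, -0.030, 0.010, 0.005]
b = [0.002, 0.004, -0.001, 0.003, -0.006, 0.008, -0.002, 0.001, 0.005, -0.004]
X = build_features([a, b])
```

With 10 months of data and a six-month window, `X` holds 5 rows of length 2N = 4.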
Robust Rolling Regime Detection (R2-RD)
Hidden Markov Model Specification
Let $z_t \in \{1, \dots, K\}$ denote the latent market regime at time $t$. We model the regime sequence as a first-order Markov chain with transition probability matrix $A \in \mathbb{R}^{K \times K}$:
\[ A_{ij} = P(z_t = j \mid z_{t-1} = i), \qquad \sum_{j=1}^{K} A_{ij} = 1. \]
Conditional on the regime, feature vectors follow a multivariate Gaussian distribution:
\[ x_t \mid z_t = k \;\sim\; \mathcal{N}(\mu_k, \Sigma_k), \]
where $\mu_k \in \mathbb{R}^{2N}$ and $\Sigma_k \in \mathbb{R}^{2N \times 2N}$ are the regime-specific mean vector and covariance matrix. The Gaussian assumption is standard in financial applications and provides a tractable likelihood function while accommodating regime-dependent means, variances, and correlations.
The complete parameter set is $\theta = \{\pi, A, \{\mu_k, \Sigma_k\}_{k=1}^{K}\}$, where $\pi$ denotes the initial state distribution. Parameters are estimated via the Expectation-Maximization (EM) algorithm, specifically the Baum-Welch algorithm, which iteratively updates parameter estimates to increase the observed-data likelihood until it reaches a local maximum.
The iterations stop once the convergence tolerance of $10^{-6}$ is met or after 200 iterations, whichever occurs first.
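The quantity that EM monitors, the observed-data log-likelihood, comes from the forward pass of the Baum-Welch algorithm. The sketch below shows a scaled forward recursion that returns both the log-likelihood and the filtered regime probabilities. It is simplified to univariate Gaussian emissions for brevity (the paper uses multivariate features), and all parameter values are illustrative assumptions.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a univariate normal distribution."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2 * math.pi * var)

def forward_filter(obs, pi0, A, means, variances):
    """Scaled forward recursion for a Gaussian HMM.

    Returns (log_likelihood, filtered), where filtered[t][k] = P(z_t = k | x_1..x_t).
    """
    K = len(pi0)
    loglik = 0.0
    filtered = []
    prior = list(pi0)  # regime probabilities before the first observation
    for x in obs:
        joint = [prior[k] * gaussian_pdf(x, means[k], variances[k]) for k in range(K)]
        c = sum(joint)  # predictive density of x given all earlier observations
        loglik += math.log(c)
        post = [j / c for j in joint]
        filtered.append(post)
        # One-step-ahead prediction of the next regime through the transition matrix.
        prior = [sum(post[i] * A[i][j] for i in range(K)) for j in range(K)]
    return loglik, filtered

# Illustrative two-regime example: regime 0 is "calm", regime 1 is "crisis".
A = [[0.95, 0.05], [0.20, 0.80]]
loglik, filtered = forward_filter(
    obs=[0.012, 0.008, -0.09, -0.05],
    pi0=[0.5, 0.5],
    A=A,
    means=[0.01, -0.03],
    variances=[0.0004, 0.0036],
)
```

In this toy example the filter leans toward the calm regime after the first observation and toward the crisis regime after the two large negative returns.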
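To ensure numerical stability and well-conditioned covariance matrices, we apply Ledoit-Wolf shrinkage (Ledoit, 2004) to the regime-specific covariance estimates:
\[ \tilde{\Sigma}_k = (1 - \alpha)\,\hat{\Sigma}_k + \alpha\,F_k, \]
where $\hat{\Sigma}_k$ is the sample covariance estimate for regime $k$, $F_k$ is the shrinkage target, and $\alpha \in [0,1]$ is the shrinkage intensity, chosen to minimize expected loss.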
Expanding-Window Estimation with Dynamic Regime Count
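At each rebalancing date $t$, we estimate HMMs using an expanding window comprising all observations from the sample start through $t-1$. This expanding-window approach ensures that all available information is utilized while maintaining strict temporal causality: no future information is used in any estimation step.
For regime count selection, we fit candidate HMMs for $K \in \{K_{\min,t}, \dots, K_{\max}\}$ and select the number of regimes that minimizes the Bayesian Information Criterion:
\[ \mathrm{BIC}_K = -2 \log \hat{L}_K + p_K \log T, \qquad K_t^* = \arg\min_{K} \mathrm{BIC}_K, \]
where $\hat{L}_K$ is the maximized likelihood under the $K$-regime model, $p_K$ is the number of free parameters, and $T$ is the sample size.
For a $K$-regime HMM with $d$-dimensional Gaussian emissions, the parameter count is:
\[ p_K = (K - 1) + K(K - 1) + Kd + K\,\frac{d(d+1)}{2}, \]
where the four terms count the initial distribution, the transition matrix, the regime means, and the regime covariance matrices.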
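The parameter count and the BIC can be computed as in the sketch below. It assumes full (unrestricted) covariance matrices per regime, which is the standard count for Gaussian emissions; the function names are hypothetical.

```python
import math

def hmm_param_count(K, d):
    """Free parameters of a K-state HMM with d-dimensional full-covariance Gaussian emissions."""
    initial = K - 1                       # initial distribution sums to one
    transition = K * (K - 1)              # each row of A sums to one
    means = K * d
    covariances = K * d * (d + 1) // 2    # one symmetric d x d matrix per state
    return initial + transition + means + covariances

def bic(loglik, K, d, T):
    """BIC = -2 log L + p_K log T; lower values are preferred."""
    return -2.0 * loglik + hmm_param_count(K, d) * math.log(T)

# With N = 5 assets the feature dimension is d = 2N = 10.
p2 = hmm_param_count(2, 10)
p3 = hmm_param_count(3, 10)
```

For d = 10 this gives 133 parameters for two regimes and 203 for three, so each extra regime must raise the log-likelihood substantially before the BIC prefers it.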
Regime Emergence Policy
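A key innovation of R2-RD is the regime emergence policy, which constrains the lower bound of the regime count search:
\[ K_{\min,t} = K_{t-1}^*, \]
so that the regime count selected at date $t$ can never fall below the count selected at the previous rebalancing date.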
The emergence policy provides several benefits:
- Stability: Prevents spurious regime collapses driven by short-term noise or sampling variability.
- Interpretability: Maintains consistent regime definitions across time, facilitating economic interpretation.
- Smooth portfolios: Reduces turnover by preventing abrupt changes in the number of regime-weighted components.
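The emergence policy amounts to restricting the BIC search to counts at or above the previous selection. A minimal sketch follows, with made-up BIC values and a hypothetical function name; in practice only candidates at or above the previous count would be fitted, and K = 1 is listed here only to show what an unconstrained search would pick.

```python
def select_regime_count(bic_by_k, k_prev, k_max):
    """Pick the BIC-minimizing K within {k_prev, ..., k_max} (emergence policy)."""
    candidates = [k for k in range(k_prev, k_max + 1) if k in bic_by_k]
    return min(candidates, key=lambda k: bic_by_k[k])

# Hypothetical BIC values at three successive rebalancing dates.
bic_path = [
    {1: 510.0, 2: 495.0, 3: 502.0, 4: 515.0, 5: 530.0},
    {1: 505.0, 2: 512.0, 3: 520.0, 4: 533.0, 5: 549.0},  # unconstrained BIC would fall back to K = 1
    {1: 540.0, 2: 528.0, 3: 519.0, 4: 526.0, 5: 541.0},
]

k_prev, path = 1, []
for bics in bic_path:
    k_prev = select_regime_count(bics, k_prev, k_max=5)
    path.append(k_prev)
```

The selected path is 2, 2, 3: the second date keeps two regimes even though an unconstrained search would have collapsed to one.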
Label Matching Mechanism
Even with the emergence policy, regime labels from successive estimation windows may be permuted arbitrarily due to the label invariance of the HMM likelihood. To ensure temporal consistency, we employ a label matching mechanism based on the linear assignment problem.
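Let $z^{\text{past}}_{1:t-1}$ denote the regime labels from the previous window and $z^{\text{new}}_{1:t-1}$ the labels from the current window over their overlapping period. We construct a similarity matrix $M \in \mathbb{R}^{K \times K}$ with elements:
\[ M_{ij} = \sum_{s=1}^{t-1} \mathbb{1}\{ z^{\text{past}}_s = i,\; z^{\text{new}}_s = j \}, \]
which counts the periods in which past regime $i$ coincides with new regime $j$.
We then solve the following linear assignment problem:
\[ \max_{x} \sum_{i} \sum_{j} M_{ij}\, x_{ij} \quad \text{subject to} \quad \sum_{j} x_{ij} = 1 \;\; \forall i, \qquad \sum_{i} x_{ij} \le 1 \;\; \forall j, \qquad x_{ij} \in \{0, 1\}, \]
where $x_{ij} = 1$ if past regime $i$ is matched to new regime $j$. The inequality constraint on column sums accommodates the case where new regimes have emerged (so not all new labels have a corresponding past label). This problem is solved efficiently using the Hungarian algorithm (Kuhn, 1955).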
Applying the optimal matching to znew produces aligned regime labels zaligned that are consistent with historical regimes. These aligned labels are then used to compute regime probabilities and conditional moments for portfolio optimization.
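The matching step can be sketched with SciPy's `linear_sum_assignment`, which solves the same linear assignment problem. The sketch assumes 0-based labels and at least as many new regimes as past regimes; giving unmatched new regimes the next free labels is an illustrative convention, not something specified in the text. `align_labels` is a hypothetical name.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def align_labels(z_past, z_new, k_past, k_new):
    """Relabel z_new so that matched regimes reuse the labels of z_past.

    Assumes k_new >= k_past. New regimes without a past counterpart
    receive the labels k_past, k_past + 1, ... in order.
    """
    z_past, z_new = np.asarray(z_past), np.asarray(z_new)
    n = min(len(z_past), len(z_new))            # overlapping period
    M = np.zeros((k_past, k_new), dtype=int)    # co-occurrence counts M[i, j]
    np.add.at(M, (z_past[:n], z_new[:n]), 1)
    rows, cols = linear_sum_assignment(M, maximize=True)
    mapping = {int(j): int(i) for i, j in zip(rows, cols)}
    next_label = k_past
    for j in range(k_new):
        if j not in mapping:
            mapping[j] = next_label
            next_label += 1
    aligned = np.array([mapping[int(j)] for j in z_new])
    return aligned, mapping

# The new fit permutes the two old regimes and adds a third one.
z_past = [0, 0, 1, 1, 0, 1]
z_new = [2, 2, 0, 0, 2, 1, 1]
aligned, mapping = align_labels(z_past, z_new, k_past=2, k_new=3)
```

Here new label 2 maps back to past regime 0, new label 0 maps to past regime 1, and the unmatched new label 1 becomes regime 2.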
Regime-Conditional Moment Estimation
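Given the aligned HMM estimates, we compute the filtered regime probabilities at time $t$:
\[ \pi_t(k) = P(z_t = k \mid \mathcal{F}_t;\, \hat{\theta}), \qquad k = 1, \dots, K_t^*, \]
where $\mathcal{F}_t$ denotes the observations available at the rebalancing date.
The regime-conditional expected returns and covariances for portfolio optimization are then computed as probability-weighted mixtures:
\[ \mu_t^{\text{port}} = \sum_{k=1}^{K_t^*} \pi_t(k)\, \mu_k^{r}, \qquad \Sigma_t^{\text{port}} = \sum_{k=1}^{K_t^*} \pi_t(k)\, \Sigma_k^{r}, \]
where $\mu_k^{r}$ and $\Sigma_k^{r}$ denote the return components of the regime-specific parameters (i.e., the first $N$ elements of $\mu_k$ and the corresponding $N \times N$ block of $\Sigma_k$).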
This mixture structure provides natural smoothing across regime transitions: when the posterior probability is concentrated on a single regime, the moments reflect that regime's characteristics; during transitions, the moments interpolate between regimes according to their posterior probabilities.
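A sketch of the probability-weighted moments, under the assumption that both the mean and the covariance are plain weighted averages of the regime parameters, as the interpolation description suggests. A full mixture covariance would add a between-regime dispersion term, which is not included here. All numbers are illustrative.

```python
import numpy as np

def mixture_moments(probs, means, covs):
    """Probability-weighted averages of regime mean vectors and covariance matrices."""
    probs = np.asarray(probs, dtype=float)
    mu = np.einsum("k,ki->i", probs, np.asarray(means, dtype=float))
    sigma = np.einsum("k,kij->ij", probs, np.asarray(covs, dtype=float))
    return mu, sigma

# Two regimes, N = 2 assets (illustrative monthly moments).
probs = [0.8, 0.2]
means = [[0.010, 0.002], [-0.030, 0.010]]
covs = [
    [[0.0016, 0.0002], [0.0002, 0.0004]],
    [[0.0064, -0.0004], [-0.0004, 0.0009]],
]
mu_port, sigma_port = mixture_moments(probs, means, covs)
```

When one probability is close to one, the output reduces to that regime's own moments.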
K-Nearest Neighbors Benchmark
Neighbor Selection
At each time $t$, we identify the $K$ historical periods whose feature vectors are closest to the current state:
\[ \mathcal{N}_t = \operatorname*{arg\,min}_{S \subset \{1, \dots, t-1\},\; |S| = K} \; \sum_{s \in S} \| x_t - x_s \|_2, \]
where the number of neighbors $K$ is selected via leave-one-out cross-validation on the estimation window, minimizing the mean squared error of next-period return predictions. We search over $K \in \{5, 10, 15, 20, 30, 50\}$ and find that $K = 20$ provides the best cross-validated performance on average. Features are standardized to zero mean and unit variance before computing Euclidean distances, ensuring that all dimensions contribute equally to the neighbor selection. We use uniform weighting (all neighbors contribute equally) rather than distance-weighted averaging.
Local Moment Estimation
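Expected returns and covariances are estimated directly from the identified neighbors. Writing $\mathcal{N}_t$ for the set of $K$ nearest neighbors of $x_t$:
\[ \hat{\mu}_t = \frac{1}{K} \sum_{s \in \mathcal{N}_t} r_{s+1}, \qquad \hat{\Sigma}_t = \frac{1}{K-1} \sum_{s \in \mathcal{N}_t} (r_{s+1} - \hat{\mu}_t)(r_{s+1} - \hat{\mu}_t)^{\top}. \]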
Note that we use the realized returns in the period following each neighbor, not the contemporaneous returns. This ensures that the moment estimates are predictive of future returns conditional on the current market state.
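The neighbor search and the use of next-period returns can be sketched as follows. The row layout (last row of `X` is the current state) and the `ddof=1` covariance normalization are assumptions for illustration, and the Ledoit-Wolf shrinkage step is omitted.

```python
import numpy as np

def knn_moments(X, R, k):
    """Estimate moments from the returns that followed the k nearest historical states.

    X: (T, p) features for months 0..T-1; the last row is the current state.
    R: (T, N) returns for the same months.
    Candidate neighbours are months 0..T-2, so R[s + 1] is always observed.
    """
    X, R = np.asarray(X, dtype=float), np.asarray(R, dtype=float)
    current, history = X[-1], X[:-1]
    dist = np.linalg.norm(history - current, axis=1)    # Euclidean distances
    neighbours = np.argsort(dist, kind="stable")[:k]
    following = R[neighbours + 1]                        # returns one month after each neighbour
    mu = following.mean(axis=0)                          # uniform weighting
    sigma = np.cov(following, rowvar=False, ddof=1)
    return neighbours, mu, sigma

# Toy data: one feature, two assets, six months.
X = [[0.0], [1.0], [2.0], [0.1], [5.0], [0.04]]
R = [[0.01, 0.00], [0.02, 0.01], [0.03, 0.00], [0.04, 0.02], [0.06, -0.01], [0.05, 0.00]]
neighbours, mu_knn, sigma_knn = knn_moments(X, R, k=2)
```

The two nearest months are 0 and 3, so the estimates are built from the returns of months 1 and 4.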
Comparison with R2-RD
The KNN approach offers rapid adaptation to changing conditions through its nonparametric structure. However, it has several limitations relative to R2-RD:
- No global structure: KNN treats each time point independently, without capturing the temporal persistence that characterizes market regimes.
- Covariance instability: With small K, the sample covariance from neighbors can be poorly conditioned or even singular. We address this by applying the same Ledoit-Wolf shrinkage used in R2-RD.
- Limited interpretability: KNN provides no explicit characterization of different market states, making it difficult to explain portfolio decisions in economic terms.
- Sensitivity to noise: Local estimation is more sensitive to outliers and idiosyncratic observations than parametric approaches that pool information across time.
Theoretical Foundations
This section develops theoretical results that underpin the R2-RD methodology. We establish consistency properties of the expanding-window estimator, derive bounds on regime count growth, and analyze conditions under which regime-aware portfolio optimization outperforms unconditional approaches.
Assumptions and Notation
We maintain the following assumptions throughout the theoretical analysis.
Assumption (Stationarity): The joint process $\{(x_t, z_t)\}$ is strictly stationary. In particular, the marginal distribution of returns within each regime does not change over time.
Assumption (Ergodicity): The latent Markov chain $\{z_t\}$ is ergodic with unique stationary distribution $\pi^* = (\pi_1^*, \dots, \pi_K^*)$. The transition matrix $A$ has all eigenvalues strictly less than one in absolute value except for the unit eigenvalue corresponding to the stationary distribution.
Assumption (Identifiability): The emission distributions $\{(\mu_k, \Sigma_k)\}_{k=1}^{K}$ are distinct, i.e., $(\mu_j, \Sigma_j) \neq (\mu_k, \Sigma_k)$ for $j \neq k$, and the transition matrix $A$ is such that no two rows are identical.
Assumption (Regularity): The covariance matrices Σk are positive definite with eigenvalues bounded away from zero and infinity uniformly over regimes.
These assumptions are standard in the HMM literature and ensure that the model parameters are identified and the likelihood is well-behaved (Leroux, 1992). In practice, strict stationarity may be violated due to structural changes in market dynamics; however, the expanding-window estimation approach allows gradual adaptation to such changes while maintaining the benefits of pooling historical information.
HMM Convergence Properties
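Under the four assumptions above, we establish consistency of the maximum likelihood estimator for the HMM parameters.
Theorem (MLE Consistency): Let $\hat{\theta}_T$ denote the MLE based on observations $x_1, \dots, x_T$. Under the four assumptions above, as $T \to \infty$:
\[ \hat{\theta}_T \xrightarrow{\text{a.s.}} \theta^*, \]
where $\theta^* = \{\pi^*, A^*, \{\mu_k^*, \Sigma_k^*\}_{k=1}^{K^*}\}$ denotes the true parameter vector (identified up to a permutation of the regime labels) and $K^*$ is the true number of regimes.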
The proof follows from (Leroux, 1992), who established consistency of the MLE for hidden Markov models under general conditions. The key insight is that the log-likelihood per observation converges almost surely to its expectation, and the expected log-likelihood is uniquely maximized at the true parameter values.
For our expanding-window estimator, the relevant implication is that as the window grows, the parameter estimates converge to their true values regardless of the initial conditions. This provides theoretical justification for the expanding-window approach used in R2-RD.
Theorem (BIC Consistency): Let $\hat{K}_T$ denote the BIC-selected number of regimes with $K_{\min,T} = 1$ (i.e., without the regime emergence constraint). Under the four assumptions above:
\[ \lim_{T \to \infty} P(\hat{K}_T = K^*) = 1. \]
This result follows from the general theory of BIC model selection (Schwarz, 1978). The BIC penalty pK log T grows faster than the likelihood improvement from adding spurious regimes, ensuring that the true model is selected asymptotically.
Regime Emergence Bounds
The regime emergence policy introduces a constraint that warrants separate analysis. We establish that this constraint does not prevent asymptotic consistency while providing finite-sample stability.
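Proposition (Monotonicity of Regime Count): Under the regime emergence policy $K_{\min,t} = K_{t-1}^*$, the sequence of selected regime counts $\{K_t^*\}_{t \ge T_0}$ is monotonically non-decreasing:
\[ K_t^* \ge K_{t-1}^* \quad \text{for all } t > T_0. \]
Proof: By construction, $K_t^* \in \{K_{\min,t}, \dots, K_{\max}\} = \{K_{t-1}^*, \dots, K_{\max}\}$, where $K_{\max}$ is the maximum allowed regime count. Therefore $K_t^* \ge K_{t-1}^*$.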
While this result is immediate from the constraint definition, its implications are substantive: the regime count can only increase over time, preventing the spurious "flickering" between model sizes that can occur with unconstrained BIC selection in finite samples.
Theorem (Asymptotic Regime Count): Suppose the true number of regimes is $K^*$. Under the regime emergence policy with $K_0^* \le K^*$ and $K_{\max} \ge K^*$:
\[ \lim_{t \to \infty} P(K_t^* = K^*) = 1. \]
Proof (Proof Sketch): By the BIC Consistency theorem, the unconstrained BIC selector converges to $K^*$. The constrained selector differs only when $K_{\min,t} > \hat{K}_t^{\text{unconstrained}}$, which occurs with vanishing probability for large samples. Once $K_t^* = K^*$, the constraint $K_{\min,t+1} = K^*$ is consistent with the asymptotically optimal choice, so $K_{t+1}^* = K^*$ with high probability. By induction, the regime count stabilizes at $K^*$.
This result shows that the emergence policy preserves asymptotic consistency while providing the desired finite-sample stability. The constraint is asymptotically non-binding once the true model has been identified.
Proposition (Growth Rate Bound): Let $\Delta K_t = K_t^* - K_{t-1}^*$ denote the change in regime count at time $t$. Then $\Delta K_t \ge 0$ for all $t$ and:
\[ \sum_{t = T_0 + 1}^{T} \Delta K_t \le K_{\max} - K_0^*. \]
This bound shows that while new regimes can emerge, the total number of regime additions is bounded by the difference between the maximum allowed regimes and the initial count. In practice, with $K_{\max} = 5$ and typical initialization at $K_0^* = 1$, at most four new regimes can be added over the entire sample.
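The bound follows from a telescoping sum. As a short worked step, writing $K_{\max}$ for the maximum allowed regime count and $K_0^*$ for the initial count at $T_0$:

```latex
\sum_{t = T_0 + 1}^{T} \Delta K_t
  = \sum_{t = T_0 + 1}^{T} \bigl( K_t^* - K_{t-1}^* \bigr)
  = K_T^* - K_{T_0}^*
  \le K_{\max} - K_0^* ,
\qquad \text{since } K_T^* \le K_{\max} \text{ and } K_{T_0}^* = K_0^* .
```

With $K_{\max} = 5$ and $K_0^* = 1$ the right-hand side equals 4, matching the count stated above.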
Portfolio Optimality Conditions
We now analyze when regime-aware portfolio optimization provides gains over unconditional approaches.
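Consider an investor with quadratic utility:
\[ U(w) = w^{\top} \mu - \frac{\lambda}{2}\, w^{\top} \Sigma\, w, \]
where $\lambda > 0$ is the risk-aversion coefficient and $\mu$ and $\Sigma$ are the true (unobserved) expected return and covariance of asset returns.
Definition (Regret): The regret of a portfolio strategy $w$ relative to the oracle strategy $w^*$ is:
\[ R(w) = U(w^*) - U(w), \]
where $w^* = \arg\max_{w} U(w)$ subject to the same constraints.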
Theorem (Regime-Aware Dominance): Suppose the true data-generating process exhibits regime switching with $K \ge 2$ regimes having distinct means $\mu_1, \dots, \mu_K$. Let $w^{RA}$ denote the regime-aware portfolio using true regime probabilities, and $w^{UC}$ the unconditional portfolio using time-averaged moments. Then:
\[ \mathbb{E}\bigl[ U_{z_t}(w^{RA}) \bigr] \ge \mathbb{E}\bigl[ U_{z_t}(w^{UC}) \bigr]. \]
Proof (Proof Sketch): Under the true DGP, the conditional moments $(\mu_{z_t}, \Sigma_{z_t})$ provide the correct inputs for the mean-variance problem. The unconditional moments $(\mu, \Sigma)$ average over regimes, introducing bias when the current regime differs from the average.
Let $w_k^* = \arg\max_{w} U_k(w)$ denote the optimal portfolio under regime $k$, where $U_k(w) = w^{\top} \mu_k - \frac{\lambda}{2}\, w^{\top} \Sigma_k\, w$. By optimality of $w_k^*$ within regime $k$, $U_k(w_k^*) \ge U_k(w^{UC})$ for each regime $k$. Taking the expectation over regimes:
\[ \sum_{k=1}^{K} \pi_k^*\, U_k(w_k^*) \ge \sum_{k=1}^{K} \pi_k^*\, U_k(w^{UC}), \]
where the left-hand side is the expected utility of the regime-aware strategy, which holds $w_k^*$ in each regime. The inequality is strict whenever the regime-specific optima $w_k^*$ differ from the unconditional optimum $w^{UC}$, which occurs when regimes have distinct means ($\mu_j \neq \mu_k$ for some $j \neq k$).
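A small numerical check of the dominance argument, under strong simplifying assumptions: a single risky asset, no portfolio constraints, known regime parameters, and an unconditional variance that includes the between-regime dispersion of means. All numbers are illustrative.

```python
def utility(w, mu, var, lam):
    """Quadratic utility for a single risky asset."""
    return w * mu - 0.5 * lam * var * w * w

def optimal_weight(mu, var, lam):
    """Unconstrained maximizer of the quadratic utility."""
    return mu / (lam * var)

lam = 5.0
probs = [0.8, 0.2]          # stationary regime probabilities
mus = [0.010, -0.020]       # regime means
vars_ = [0.0016, 0.0064]    # regime variances

# Regime-aware strategy: hold the regime-specific optimum in each regime.
eu_ra = sum(
    p * utility(optimal_weight(m, v, lam), m, v, lam)
    for p, m, v in zip(probs, mus, vars_)
)

# Unconditional strategy: a single weight computed from time-averaged moments.
mu_bar = sum(p * m for p, m in zip(probs, mus))
var_bar = sum(p * (v + (m - mu_bar) ** 2) for p, m, v in zip(probs, mus, vars_))
w_uc = optimal_weight(mu_bar, var_bar, lam)
eu_uc = sum(p * utility(w_uc, m, v, lam) for p, m, v in zip(probs, mus, vars_))
```

In this example the regime-aware expected utility is 0.00625, roughly ten times the unconditional value, because the unconditional weight is too small in the calm regime and has the wrong sign in the crisis regime.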
In practice, regime probabilities and parameters must be estimated, introducing additional error. The following result characterizes the estimation error.
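Proposition (Estimation Error Bound): Let $\hat{\mu}_t$ and $\hat{\Sigma}_t$ denote the R2-RD moment estimates at time $t$, and let $\mu_t^*$, $\Sigma_t^*$ denote the true conditional moments. Under the four assumptions above, for $T$ sufficiently large:
\[ \| \hat{\mu}_t - \mu_t^* \|_2 = O_p(T^{-1/2}), \qquad \| \hat{\Sigma}_t - \Sigma_t^* \|_F = O_p(T^{-1/2}), \]
where $\| \cdot \|_F$ denotes the Frobenius norm.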
This result shows that the moment estimation error decreases at the standard parametric rate, ensuring that the regime-aware portfolio converges to the oracle solution as the sample grows.
Corollary (Regret Convergence): Under the conditions of the Estimation Error Bound proposition:
\[ R(\hat{w}_t) = O_p(T^{-1}). \]
The quadratic dependence of regret on moment estimation error (since utility is quadratic and the optimal weights are linear in $\Sigma^{-1}\mu$) implies that regret decreases at rate $T^{-1}$, faster than the $T^{-1/2}$ rate of moment estimation.
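The quadratic dependence can be made explicit in the unconstrained case, which is an assumption made here for illustration. With $w^* = \lambda^{-1} \Sigma^{-1} \mu$, the gradient of $U$ vanishes at $w^*$, so an exact second-order expansion gives:

```latex
U(w) = U(w^*) - \frac{\lambda}{2}\,(w - w^*)^{\top} \Sigma\,(w - w^*)
\quad\Longrightarrow\quad
R(w) = U(w^*) - U(w) = \frac{\lambda}{2}\,(w - w^*)^{\top} \Sigma\,(w - w^*) .
```

If the estimated weights satisfy $\| \hat{w}_t - w^* \| = O_p(T^{-1/2})$, the regret is therefore $O_p(T^{-1})$.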
Portfolio Optimization
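At each rebalancing date, portfolio weights are determined by solving the following constrained mean–variance optimization problem:
\[ \max_{w} \;\; w^{\top} \mu_t^{\text{port}} - \frac{\lambda}{2}\, w^{\top} \Sigma_t^{\text{port}}\, w - \gamma\, \| w - w_{t-1} \|_1 \]
subject to the portfolio constraints, including the position limit $w_i \le w_{\max}$ on each asset. The risk-aversion coefficient is set to $\lambda = 5$, reflecting moderate risk tolerance. The turnover penalty $\gamma = 0.001$ balances responsiveness against transaction costs. Position limits are set at $w_{\max} = 0.40$ to ensure diversification. These parameters are fixed throughout the out-of-sample period and not optimized on test data.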
Rebalancing Protocol. Portfolios are rebalanced monthly on the last trading day. The optimization problem is solved using sequential quadratic programming (SQP) with the l1 penalty reformulated as linear constraints via auxiliary variables.
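A sketch of the optimization using SciPy's SLSQP solver, with the l1 turnover penalty rewritten through auxiliary variables u >= |w - w_prev|. The budget constraint (weights sum to one) and the long-only lower bound are assumptions made for this illustration; only the position cap, risk aversion, and turnover penalty are taken from the text. The input moments are made up.

```python
import numpy as np
from scipy.optimize import minimize

def solve_mvo(mu, sigma, w_prev, lam=5.0, gamma=0.001, w_max=0.40):
    """Mean-variance problem with an l1 turnover penalty, solved by SLSQP.

    Decision vector x = [w, u], where u_i >= |w_i - w_prev_i| makes the penalty linear.
    """
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    w_prev = np.asarray(w_prev, dtype=float)
    n = len(mu)

    def neg_objective(x):
        w, u = x[:n], x[n:]
        return -(w @ mu - 0.5 * lam * w @ sigma @ w - gamma * u.sum())

    constraints = [
        {"type": "eq", "fun": lambda x: x[:n].sum() - 1.0},           # fully invested (assumed)
        {"type": "ineq", "fun": lambda x: x[n:] - (x[:n] - w_prev)},  # u >= w - w_prev
        {"type": "ineq", "fun": lambda x: x[n:] + (x[:n] - w_prev)},  # u >= -(w - w_prev)
    ]
    bounds = [(0.0, w_max)] * n + [(None, None)] * n                  # long-only with cap (assumed)
    x0 = np.concatenate([w_prev, np.full(n, 0.1)])                    # feasible interior start
    res = minimize(neg_objective, x0, method="SLSQP", bounds=bounds, constraints=constraints)
    return res.x[:n], res

# Illustrative monthly moments for five assets with a diagonal covariance matrix.
mu = [0.006, 0.002, 0.004, 0.001, 0.0015]
sigma = np.diag([0.0018, 0.0003, 0.0016, 0.0080, 0.0004])
w_prev = [0.2] * 5
w_new, result = solve_mvo(mu, sigma, w_prev)
```

The auxiliary-variable form keeps the objective smooth, which is what a gradient-based SQP method needs; a dedicated QP solver would be a natural alternative.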
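Algorithm (R2-RD rebalancing step at date $t$)
Input: Historical returns $\{r_1, \dots, r_{t-1}\}$, previous weights $w_{t-1}$, previous regime count $K_{t-1}^*$
Output: New portfolio weights $w_t$
1. Construct feature vectors $\{x_1, \dots, x_{t-1}\}$ from returns and rolling volatilities.
2. Standardize features to zero mean and unit variance.
3. For $K = K_{t-1}^*, \dots, K_{\max}$:
   a. Fit an HMM with $K$ regimes via EM (10 restarts, K-means++ initialization).
   b. Compute $\mathrm{BIC}_K$.
4. Select $K_t^* = \arg\min_K \mathrm{BIC}_K$.
5. Apply the Hungarian algorithm to match regime labels with the previous period.
6. Compute filtered regime probabilities $\pi_t(k)$ for $k = 1, \dots, K_t^*$.
7. Compute mixture moments $\mu_t^{\text{port}}$, $\Sigma_t^{\text{port}}$.
8. Solve the MVO problem for $w_t$.
9. Return $w_t$.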
Empirical Results
This section presents the out-of-sample empirical results comparing R2-RD and KNN portfolio strategies. All results are based on a strictly causal expanding-window backtest with monthly rebalancing from January 2016 through December 2024.
Backtest Protocol
To ensure the validity of our results, we implement a rigorous backtest protocol that eliminates all sources of look-ahead bias:
- Expanding window estimation: At each rebalancing date $t$, models are estimated using only data through $t-1$.
- No parameter tuning on test data: All hyperparameters (risk aversion $\lambda$, turnover penalty $\gamma$, position limits $w_{\max}$) are fixed at the start of the backtest.
- Realistic transaction costs: We incorporate one-way transaction costs of 10 basis points, applied to absolute changes in portfolio weights.
- Regime count selection: The BIC-based regime selection is performed fresh at each rebalancing date using only available data.
The initial estimation window spans January 2000 through December 2015, providing 16 years of training data before the out-of-sample period begins.
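The transaction cost rule in the protocol can be sketched as below. The sketch treats asset returns as simple returns and ignores weight drift between rebalancing dates, both of which are simplifying assumptions; `net_return` is a hypothetical helper.

```python
def net_return(w_new, w_prev, asset_returns, cost_bps=10.0):
    """Gross portfolio return minus a proportional cost on the traded weight."""
    gross = sum(w * r for w, r in zip(w_new, asset_returns))
    turnover = sum(abs(a - b) for a, b in zip(w_new, w_prev))  # sum of |weight changes|
    cost = (cost_bps / 1e4) * turnover
    return gross - cost, turnover

# Illustrative rebalance: shift 20% of the portfolio from the last asset to the first.
w_prev = [0.2, 0.2, 0.2, 0.2, 0.2]
w_new = [0.4, 0.2, 0.2, 0.2, 0.0]
net, turnover = net_return(w_new, w_prev, [0.02, 0.00, 0.01, -0.01, 0.03])
```

Here the turnover is 0.4, so the cost is 4 basis points and the 0.80% gross return becomes 0.76% net.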
Performance Summary
The table below presents the key performance metrics for both strategies alongside benchmark allocations.
Out-of-Sample Performance Comparison (January 2016 – December 2024). Bootstrap standard errors (1000 replications) in parentheses.
| Strategy | Ann. Return | Ann. Vol. | Sharpe | Max DD | Calmar |
|---|---|---|---|---|---|
| R2-RD + MVO | 8.42% | 9.05% | 0.93 | -15.79% | 0.53 |
| | (1.24) | (0.89) | (0.14) | (2.31) | (0.09) |
| KNN + MVO | 7.89% | 10.81% | 0.73 | -30.44% | 0.26 |
| | (1.41) | (1.12) | (0.15) | (3.87) | (0.05) |
| Equal Weight (1/N) | 5.12% | 11.23% | 0.46 | -32.17% | 0.16 |
| | (1.52) | (1.18) | (0.16) | (4.12) | (0.03) |
| 60/40 Stock/Bond | 6.78% | 10.45% | 0.65 | -24.56% | 0.28 |
| | (1.38) | (1.05) | (0.15) | (3.24) | (0.05) |
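The Sharpe and Calmar columns can be reproduced from the return, volatility, and drawdown columns. The check below assumes a zero risk-free rate in the Sharpe ratio. That is an inference from the reported numbers rather than something stated here. It also takes Calmar to be annualized return divided by the absolute maximum drawdown.

```python
# (annualized return %, annualized volatility %, maximum drawdown %)
rows = {
    "R2-RD + MVO": (8.42, 9.05, -15.79),
    "KNN + MVO": (7.89, 10.81, -30.44),
    "Equal Weight (1/N)": (5.12, 11.23, -32.17),
    "60/40 Stock/Bond": (6.78, 10.45, -24.56),
}


def sharpe(ann_return, ann_vol, risk_free=0.0):
    """Annualized Sharpe ratio; the zero risk-free rate is an assumption."""
    return (ann_return - risk_free) / ann_vol


def calmar(ann_return, max_dd):
    """Annualized return per unit of maximum drawdown."""
    return ann_return / abs(max_dd)


summary = {
    name: (round(sharpe(r, v), 2), round(calmar(r, dd), 2))
    for name, (r, v, dd) in rows.items()
}
for name, (s, c) in summary.items():
    print(f"{name:20s} Sharpe={s:.2f} Calmar={c:.2f}")
```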
R2-RD achieves the highest Sharpe ratio (0.93) with substantially lower volatility (9.05\
Cumulative Performance Analysis
The figure below displays the cumulative wealth evolution of $1 invested at the start of the out-of-sample period. Several features merit discussion:
[Figure: Cumulative Wealth Plot. R2-RD vs KNN vs Benchmarks (2016–2024)]
Cumulative wealth evolution of $1 invested at the start of the out-of-sample period (January 2016). R2-RD (solid blue) achieves terminal wealth of $2.21 with substantially lower volatility than KNN (dashed orange, $2.05), equal-weight (dotted gray, $1.56), and 60/40 (dash-dot green, $1.83). Shaded regions indicate periods when R2-RD assigned >50%
- COVID-19 drawdown (March 2020): R2-RD experienced a peak-to-trough drawdown of approximately 12\
- 2022 rate shock: During the Federal Reserve's aggressive tightening cycle, both equity and bond markets declined simultaneously. R2-RD reduced exposure to duration-sensitive assets earlier than KNN, limiting losses.
- Recovery dynamics: Following market stress episodes, R2-RD recovered more quickly due to its lower drawdown starting point and timely reallocation to risk assets as volatility subsided.
Regime Evolution
A key innovation of R2-RD is the dynamic determination of the regime count via BIC selection with the emergence policy. The table below summarizes the regime count evolution over the backtest period.
Regime Count Evolution Over Time
| Period | Regimes (K) | Trigger Event |
|---|---|---|
| 2016 Q1 | 2 | Initial estimation |
| 2018 Q4 | 3 | VIX spike, Fed tightening |
| 2020 Q1 | 4 | COVID-19 market crash |
| 2022 Q3 | 4 | No new regime (rate shock within existing) |
The regime emergence policy prevents spurious regime removal while allowing the model to recognize genuinely new market environments. The COVID-19 period triggered a fourth regime characterized by extreme volatility and correlation breakdown, which was retained in subsequent periods to capture potential future crises.
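The effect of the emergence policy on sequential selection can be sketched as follows. The BIC values are hypothetical and chosen so that unrestricted selection would drop back from three regimes to two at the third date. The sketch assumes the convention that lower BIC is better, and the function name `select_regime_count` is illustrative.

```python
def select_regime_count(bic_by_k, k_prev, k_max):
    """Pick the BIC-minimizing K among candidates k_prev, ..., k_max.

    Restricting the search to K >= k_prev is what makes the selected
    regime count monotonically non-decreasing over time.
    """
    candidates = range(k_prev, k_max + 1)
    return min(candidates, key=lambda k: bic_by_k[k])


# Hypothetical BIC values at three successive rebalancing dates.
bic_path = [
    {2: 1000.0, 3: 1004.0, 4: 1015.0, 5: 1030.0},
    {2: 1100.0, 3: 1092.0, 4: 1101.0, 5: 1118.0},
    {2: 1195.0, 3: 1197.0, 4: 1209.0, 5: 1225.0},
]

k_prev, constrained = 2, []
for bic in bic_path:
    k_prev = select_regime_count(bic, k_prev, k_max=5)
    constrained.append(k_prev)

# Unrestricted selection searches all K at every date.
unconstrained = [min(bic, key=bic.get) for bic in bic_path]

print("with emergence policy:   ", constrained)
print("without emergence policy:", unconstrained)
```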
Regime Characteristics
The table below presents the estimated parameters for each regime at the end of the sample period.
Regime Characteristics (End-of-Sample Estimates). 95\
| Regime | Equity μ | Equity σ | Equity-Bond ρ | Interpretation |
|---|---|---|---|---|
| 1 (Low Vol Bull) | +1.2% | 8.5% | -0.25 | Risk-on, diversification works |
| | [0.8, 1.6] | [7.2, 9.8] | [-0.38, -0.12] | |
| 2 (High Vol Bull) | +0.8% | 16.2% | -0.35 | Recovery, elevated uncertainty |
| | [0.2, 1.4] | [13.8, 18.6] | [-0.48, -0.22] | |
| 3 (Correction) | -0.5% | 18.7% | +0.15 | Risk-off, correlation breakdown |
| | [-1.2, 0.2] | [15.4, 22.0] | [-0.05, 0.35] | |
| 4 (Crisis) | -3.2% | 32.4% | +0.45 | Flight to quality reversal |
| | [-5.1, -1.3] | [26.8, 38.0] | [0.28, 0.62] | |
These regime characteristics provide interpretable mappings to economic conditions. The crisis regime (Regime 4) exhibits negative expected equity returns, very high volatility, and positive stock-bond correlation, reflecting the "everything sells off" dynamic observed during liquidity crises. This positive correlation undermines traditional 60/40 diversification, explaining why regime-unaware strategies suffered during COVID-19 and the 2022 rate shock.
Turnover Analysis
Portfolio turnover directly affects net performance through transaction costs. The table below compares the turnover characteristics of both strategies.
Turnover Statistics (Monthly)
| Metric | R2-RD | KNN |
|---|---|---|
| Mean monthly turnover | 12.3% | 18.7% |
| Median monthly turnover | 8.1% | 14.2% |
| Max monthly turnover | 45.2% | 72.6% |
| Annual turnover (gross) | 147.6% | 224.4% |
R2-RD generates substantially lower turnover than KNN, reflecting the temporal smoothing inherent in the HMM framework. The regime transition probabilities create persistence in the filtered regime probabilities, which translates to smoother evolution of expected returns and covariances. In contrast, KNN's purely local estimation can produce erratic moment estimates as the neighbor set changes.
The maximum monthly turnover for R2-RD (45.2\
Drawdown Analysis
The figure and table below provide a detailed drawdown analysis.
[Figure]
Drawdown Analysis
| Strategy | Max DD | Avg DD | Avg Recovery (months) | DD > 10% (count) |
|---|---|---|---|---|
| R2-RD + MVO | -15.79% | -3.24% | 4.2 | 2 |
| KNN + MVO | -30.44% | -5.87% | 7.8 | 5 |
| Equal Weight | -32.17% | -6.45% | 9.1 | 6 |
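Drawdown statistics of this kind are computed from the cumulative wealth path. The sketch below shows one common way to obtain the maximum drawdown and the number of distinct episodes deeper than 10%. The wealth series is hypothetical. The episode definition (from a running peak until the next new high) is an assumption, since the exact convention is not spelled out here.

```python
def drawdown_series(wealth):
    """Fractional decline from the running peak at each date (<= 0)."""
    peak, out = wealth[0], []
    for w in wealth:
        peak = max(peak, w)
        out.append(w / peak - 1.0)
    return out


def max_drawdown(wealth):
    """Most negative value of the drawdown series."""
    return min(drawdown_series(wealth))


def count_deep_episodes(wealth, threshold=0.10):
    """Count peak-to-recovery episodes whose drawdown reaches the threshold."""
    count, counted = 0, False
    for dd in drawdown_series(wealth):
        if dd == 0.0:
            counted = False        # new high: any open episode has ended
        elif dd <= -threshold and not counted:
            count += 1             # first breach within this episode
            counted = True
    return count


# Hypothetical monthly wealth path with two separate drawdown episodes.
wealth = [1.00, 1.05, 0.92, 0.98, 1.06, 1.10, 0.95, 1.12]
print(round(max_drawdown(wealth), 4), count_deep_episodes(wealth))
```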
R2-RD experiences fewer and shallower drawdowns with faster recovery. The average time to recover from a drawdown trough is 4.2 months for R2-RD versus 7.8 months for KNN. Only two drawdowns exceeded 10\
Robustness to Parameter Choices
To assess sensitivity to key hyperparameters, we conduct a grid search over the risk aversion parameter λ ∈ {1, 2, 5, 10} and the turnover penalty γ ∈ {0, 0.001, 0.01, 0.1}.
Sharpe Ratio Sensitivity to Hyperparameters (R2-RD)
| Turnover penalty γ | λ = 1 | λ = 2 | λ = 5 | λ = 10 |
|---|---|---|---|---|
| 0 | 0.81 | 0.89 | 0.91 | 0.85 |
| 0.001 | 0.84 | 0.92 | 0.93 | 0.88 |
| 0.01 | 0.82 | 0.90 | 0.91 | 0.87 |
| 0.1 | 0.68 | 0.75 | 0.78 | 0.76 |
Performance is relatively stable across moderate parameter ranges, with Sharpe ratios between 0.81 and 0.93 for γ ≤ 0.01. The optimal combination (λ = 5, γ = 0.001) achieves the maximum Sharpe ratio of 0.93. High turnover penalties (γ = 0.1) degrade performance by preventing timely portfolio adjustments during regime transitions.
Statistical Significance
To assess whether the performance difference between R2-RD and KNN is statistically significant, we apply the bootstrap methodology of (Ledoit, 2008) for testing Sharpe ratio differences.
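The idea behind a bootstrap test for a Sharpe ratio difference can be sketched as below. This is a simplified illustration, not the procedure of Ledoit (2008): it resamples paired returns with a circular block bootstrap and centres the bootstrap statistic at the observed difference, but it omits the studentization used in that paper. The return series are synthetic, and the block length and function names are arbitrary choices.

```python
import math
import random


def sharpe(returns):
    """Per-period Sharpe ratio, assuming a zero risk-free rate."""
    n = len(returns)
    m = sum(returns) / n
    var = sum((r - m) ** 2 for r in returns) / (n - 1)
    return m / math.sqrt(var)


def sharpe_diff_pvalue(ret_a, ret_b, block=6, n_boot=1000, seed=0):
    """Two-sided bootstrap p-value for H0: Sharpe(a) == Sharpe(b).

    Uses a paired circular block bootstrap: the same resampled dates
    are applied to both series to preserve their cross-correlation.
    """
    rng = random.Random(seed)
    n = len(ret_a)
    observed = sharpe(ret_a) - sharpe(ret_b)
    exceed = 0
    for _ in range(n_boot):
        idx = []
        while len(idx) < n:
            start = rng.randrange(n)
            idx.extend((start + j) % n for j in range(block))
        idx = idx[:n]
        boot = sharpe([ret_a[i] for i in idx]) - sharpe([ret_b[i] for i in idx])
        # Centred statistic: deviations at least as large as the observed one.
        if abs(boot - observed) >= abs(observed):
            exceed += 1
    return observed, (exceed + 1) / (n_boot + 1)


# Synthetic monthly returns for two strategies over 108 months (9 years).
data_rng = random.Random(1)
ret_a = [data_rng.gauss(0.0070, 0.026) for _ in range(108)]
ret_b = [data_rng.gauss(0.0065, 0.031) for _ in range(108)]

delta, p_value = sharpe_diff_pvalue(ret_a, ret_b)
print(f"delta Sharpe (monthly) = {delta:.3f}, p-value = {p_value:.3f}")
```

Resampling both series with the same indices matters because the two strategies trade the same assets and their returns are strongly correlated; independent resampling would overstate the variance of the difference.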
Statistical Tests for Performance Differences
| Comparison | Δ Sharpe | p-value |
|---|---|---|
| R2-RD vs. KNN | 0.20 | 0.028 |
| R2-RD vs. 1/N | 0.47 | <0.001 |
| R2-RD vs. 60/40 | 0.28 | 0.009 |
The Sharpe ratio improvement of R2-RD over KNN (0.20) is statistically significant at the 5\
Robustness Checks
To ensure our results are not artifacts of specific methodological choices, we conduct several robustness checks.
Alternative Estimation Windows. We test rolling windows of 60, 120, and 180 months alongside the expanding window. The table below shows that expanding windows achieve the highest Sharpe ratio (0.93), with 180-month rolling windows performing comparably (0.89). Shorter windows sacrifice stability for adaptability, resulting in lower risk-adjusted returns.
Robustness to Estimation Window Choice
| Window | Sharpe | Max DD | Annual Turnover |
|---|---|---|---|
| Expanding (baseline) | 0.93 | -15.79% | 147.6% |
| Rolling 180m | 0.89 | -17.24% | 162.3% |
| Rolling 120m | 0.82 | -19.87% | 189.5% |
| Rolling 60m | 0.71 | -24.32% | 234.7% |
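The difference between the expanding and rolling schemes comes down to which observations are visible at each rebalancing date. A minimal sketch, using zero-based month indices and a hypothetical helper `estimation_window`:

```python
def estimation_window(t, window=None):
    """Return (start, end) with end exclusive: the data usable at date t.

    Only observations 0, ..., t-1 are used, so the split is strictly causal.
    window=None gives an expanding window; an integer gives a rolling one.
    """
    start = 0 if window is None else max(0, t - window)
    return start, t


# January 2000 to December 2015 is 192 months, so the first
# out-of-sample rebalance sits at index t = 192.
first_oos = 16 * 12
print(estimation_window(first_oos))        # expanding: all 192 months
print(estimation_window(first_oos, 60))    # rolling: most recent 60 months
```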
Out-of-Sample Validation
As a final validation, we reserve the most recent year (January–December 2024) as a pure holdout period, with no parameter optimization or model selection performed on this data.
Holdout Period Performance (January – December 2024)
| Strategy | Return | Vol. | Sharpe | Max DD |
|---|---|---|---|---|
| R2-RD + MVO | 9.87% | 8.42% | 1.17 | -6.23% |
| KNN + MVO | 7.23% | 9.15% | 0.79 | -9.87% |
| Equal Weight | 4.56% | 10.34% | 0.44 | -11.45% |
| 60/40 | 6.12% | 9.78% | 0.63 | -8.92% |
R2-RD maintains its performance advantage in the holdout period, achieving a Sharpe ratio of 1.17 compared to 0.79 for KNN. The favorable 2024 market environment (characterized by declining volatility and positive equity returns) was correctly classified by R2-RD as primarily Regime 1 (Low Vol Bull), leading to appropriate risk-on positioning.
Discussion
This section discusses the implications of our empirical findings, the interpretability advantages of R2-RD, and limitations that warrant further investigation.
Sources of R2-RD's Performance Advantage
The empirical results demonstrate a substantial and statistically significant advantage of R2-RD over the KNN benchmark. We attribute this advantage to three primary sources:
Temporal structure. The HMM framework explicitly models the persistence of market regimes through the transition probability matrix. This is economically motivated: market conditions tend to persist due to the slow-moving nature of business cycles, monetary policy, and investor sentiment. KNN, by contrast, treats each time point independently, ignoring the information content of regime persistence. When the true data-generating process exhibits regime switching, the HMM's structural assumptions provide a better approximation than nonparametric local averaging.
Information pooling. Within each regime, R2-RD pools information across all observations assigned to that state, improving the precision of moment estimates. KNN is limited to the K nearest neighbors, which may be insufficient for accurate covariance estimation, particularly in higher dimensions. The Ledoit-Wolf shrinkage helps, but cannot fully compensate for limited sample size in local estimation.
Smooth transitions. The filtered regime probabilities provide a natural mechanism for transitioning between market states. Rather than discrete jumps when the nearest neighbors change, R2-RD's probability-weighted moments evolve smoothly as posterior beliefs update. This smoothness translates to lower turnover and reduced transaction costs.
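The probability-weighted moments can be illustrated with the standard formulas for the mean and covariance of a mixture, in which the covariance includes a between-regime term for the dispersion of regime means. Whether the paper's mixture moments take exactly this form is an assumption of this sketch. The probabilities, means, and covariances are hypothetical.

```python
def mixture_moments(probs, means, covs):
    """Mean vector and covariance matrix of a finite mixture.

    mu    = sum_k p_k * mu_k
    Sigma = sum_k p_k * (Sigma_k + (mu_k - mu)(mu_k - mu)^T)
    """
    n = len(means[0])
    mu = [sum(p * m[i] for p, m in zip(probs, means)) for i in range(n)]
    sigma = [[0.0] * n for _ in range(n)]
    for p, m, c in zip(probs, means, covs):
        for i in range(n):
            for j in range(n):
                sigma[i][j] += p * (c[i][j] + (m[i] - mu[i]) * (m[j] - mu[j]))
    return mu, sigma


# Hypothetical two-regime, two-asset example (equity, bond).
probs = [0.75, 0.25]                       # filtered regime probabilities
means = [[1.0, 0.2], [-3.0, 0.6]]          # regime means
covs = [
    [[4.0, -0.5], [-0.5, 1.0]],            # calm regime
    [[36.0, 3.0], [3.0, 4.0]],             # stress regime
]

mu, sigma = mixture_moments(probs, means, covs)
print([round(x, 4) for x in mu])
print([[round(x, 4) for x in row] for row in sigma])
```

Because the weights enter linearly, a small change in the filtered probabilities produces a small change in both moments, which is the smoothness property described above.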
The Role of the Regime Emergence Policy
The regime emergence constraint, under which the minimum number of regimes at time t equals the BIC-selected count at time t-1, is a practical innovation that addresses finite-sample instability in sequential model selection.
In unrestricted BIC selection, the regime count can fluctuate due to sampling variability, leading to "flickering" between model sizes. This flickering has two negative consequences: (1) it forces reinterpretation of regime labels, disrupting the temporal consistency needed for portfolio management; and (2) it can trigger unnecessary turnover as the number of regime-weighted components changes.
The emergence policy eliminates this flickering by permitting only regime addition, never removal. This is asymptotically consistent (Theorem ) and stabilizing in finite samples. The economic intuition is that market regimes, once they have manifested, remain relevant reference states even if they become rare. The 2008 financial crisis regime, for example, may not recur frequently but remains important for risk management and should not be discarded from the model.
Regime Interpretability and Explainability
A key advantage of HMM-based regime detection is interpretability. Each regime is characterized by a multivariate Gaussian distribution with explicit mean vector μk and covariance matrix Σk. These parameters map directly to observable quantities:
- The regime mean μk captures expected returns conditional on the market state, providing a clear signal for directional positioning.
- The regime covariance Σk captures both volatilities (diagonal elements) and correlations (off-diagonal elements), informing diversification and hedging decisions.
- The transition probabilities Aij capture the expected persistence of each regime and the likelihood of transitions, enabling forward-looking risk assessment.
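Two quantities that follow directly from the transition matrix are the expected duration of each regime, 1 / (1 − A_kk) for a first-order Markov chain, and the one-step-ahead regime forecast. The sketch below assumes the row-stochastic convention A[i][j] = P(next regime = j | current regime = i). The matrix and probabilities are hypothetical.

```python
def expected_durations(A):
    """Expected number of consecutive periods spent in each regime."""
    return [1.0 / (1.0 - A[k][k]) for k in range(len(A))]


def predict_next(pi, A):
    """One-step-ahead regime probabilities: pi_next[j] = sum_i pi[i] * A[i][j]."""
    K = len(A)
    return [sum(pi[i] * A[i][j] for i in range(K)) for j in range(K)]


# Hypothetical two-regime monthly transition matrix (rows sum to one).
A = [
    [0.95, 0.05],   # calm regime: very persistent
    [0.20, 0.80],   # stress regime: shorter-lived
]
pi_t = [0.75, 0.25]  # current filtered probabilities

durations = expected_durations(A)    # roughly 20 and 5 months
pi_next = predict_next(pi_t, A)
print([round(d, 2) for d in durations], [round(p, 4) for p in pi_next])
```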
This interpretability has practical value in institutional settings where portfolio decisions must be communicated to risk managers, clients, and regulators. Unlike black-box machine learning approaches, R2-RD provides a transparent rationale: "The model assigns 75\
The KNN approach, while intuitive in its reliance on historical analogues, lacks this parametric structure. It cannot provide explicit characterizations of different market states, only local averages of past outcomes. This limits its utility for economic interpretation and communication.
Practical Implementation Considerations
Several practical considerations arise in implementing R2-RD for live portfolio management:
Computational requirements. HMM estimation via the EM algorithm is computationally more demanding than KNN neighbor search. However, with modern computing resources, fitting HMMs with K ≤ 5 regimes on decades of monthly data requires only seconds. The BIC grid search over candidate regime counts is the computational bottleneck, but this is parallelizable.
Initialization sensitivity. The EM algorithm for HMMs is sensitive to initialization and may converge to local optima. We mitigate this through multiple random restarts (10 in our implementation) and selection of the solution with highest likelihood. More sophisticated initialization strategies, such as K-means++ on the observation space, could further improve robustness.
Feature engineering. Our feature representation combines returns and rolling volatility. Alternative features, such as cross-asset correlations, yield curve slopes, or credit spreads, could enhance regime separation. The optimal feature set is likely application-specific and warrants further investigation.
Rebalancing frequency. We implement monthly rebalancing to balance responsiveness against transaction costs. Higher-frequency rebalancing could improve regime detection timeliness but would increase turnover. The optimal frequency depends on asset characteristics and cost structure.
Limitations and Future Research
Several limitations of the current study suggest directions for future research:
Sample period. While our out-of-sample period (2016–2024) includes diverse market conditions (COVID-19 crash, 2022 rate shock, subsequent recovery), it represents a limited sample for evaluating tail risk performance. Longer historical backtests and out-of-sample testing on other markets would strengthen the evidence.
Asset universe. We focus on a parsimonious cross-asset universe of five broad asset classes. Extending to larger universes with sector and regional granularity would test scalability and potentially improve diversification benefits.
Regime count upper bound. We impose an upper bound of five regimes (K ≤ 5) based on parsimony considerations. The optimal upper bound is unknown and may vary across markets and time horizons. Too few regimes may miss important market states; too many may overfit.
Model extensions. The Gaussian emission assumption, while tractable, may not fully capture the fat tails observed in financial returns. Extensions to Student-t emissions or regime-switching GARCH could improve fit at the cost of additional complexity.
Benchmark comparisons. We compare against KNN and simple benchmarks. Comparisons with other sophisticated approaches, such as Markov-switching GARCH, dynamic conditional correlation (DCC) models, or deep learning methods, would situate R2-RD in the broader landscape of regime-aware portfolio construction.
Transaction costs and capacity. Our backtest incorporates fixed transaction costs but does not model market impact. For large portfolios, the capacity of regime-switching strategies may be limited by the turnover required during regime transitions.
Conclusion
This paper proposes Robust Rolling Regime Detection (R2-RD), a framework for explainable cross-asset portfolio optimization that addresses three fundamental challenges in applying Hidden Markov Models to dynamic asset allocation: regime count determination, label consistency, and temporal stability.
The R2-RD framework makes three methodological contributions. First, we introduce an expanding-window HMM estimation procedure with dynamic regime count selection via the Bayesian Information Criterion. Second, we propose a regime emergence policy that constrains the regime count to be monotonically non-decreasing, preventing spurious regime removal while allowing the model to recognize genuinely new market environments. Third, we develop a label matching mechanism based on the Hungarian algorithm that ensures temporal consistency of regime labels across estimation windows, resolving the label switching problem that plagues rolling HMM applications.
We provide theoretical foundations establishing the asymptotic consistency of our approach. The maximum likelihood estimator converges to the true parameters as the sample grows (Theorem ), and the BIC-selected regime count converges to the true number of regimes (Theorem ). The regime emergence policy preserves this asymptotic consistency while providing finite-sample stability (Theorem ). We also derive conditions under which regime-aware portfolio optimization outperforms unconditional approaches (Theorem ) and characterize the rate of regret convergence (Corollary).
Empirically, we demonstrate that R2-RD achieves superior risk-adjusted performance compared to a K-Nearest Neighbors benchmark and traditional allocations. Over the 2016–2024 out-of-sample period, R2-RD delivers a Sharpe ratio of 0.93 versus 0.73 for KNN, with maximum drawdown reduced from 30.44\
A distinguishing feature of R2-RD is its interpretability. Each regime is characterized by explicit mean vectors and covariance matrices that map directly to economic conditions. The crisis regime, for example, exhibits negative expected equity returns, elevated volatility, and positive stock-bond correlation, reflecting the correlation breakdown observed during liquidity crises. This interpretability facilitates communication with risk managers, clients, and regulators, addressing growing demands for explainable models in institutional portfolio management.
Several directions for future research emerge from this study. Extensions to larger asset universes, alternative emission distributions (such as Student-t), and comparisons with other sophisticated benchmarks (such as Markov-switching GARCH or deep learning approaches) would strengthen the evidence base. The optimal choice of the regime count upper bound and feature representation remain open questions. Finally, capacity analysis incorporating market impact would be valuable for assessing the scalability of regime-switching strategies.
In summary, R2-RD offers a principled, theoretically grounded, and empirically validated approach to regime-aware portfolio construction. By combining the flexibility of dynamic regime detection with the stability of the emergence policy and label matching, R2-RD provides practitioners with an explainable framework for navigating time-varying market conditions.