CB-Speeches-Analysis

Central Bank Speech Sentiment vs Macroeconomic Conditions - PCA, PELT breakpoints, rolling regression analysis


Information

| Property | Value |
| --- | --- |
| Language | Python |
| Stars | 0 |
| Forks | 0 |
| Watchers | 0 |
| Open Issues | 0 |
| License | MIT License |
| Created | 2026-01-15 |
| Last Updated | 2026-02-19 |
| Last Push | 2026-01-15 |
| Contributors | 1 |
| Default Branch | main |
| Visibility | private |

Datasets

This repository includes 30 dataset(s):

| Dataset | Format | Size |
| --- | --- | --- |
| data | | 0.0 KB |
| betas_inflation.csv | .csv | 15.6 KB |
| betas_macro.csv | .csv | 15.48 KB |
| breakpoints.json | .json | 0.74 KB |
| corr_inflation.csv | .csv | 0.2 KB |
| corr_macro.csv | .csv | 0.19 KB |
| correlation_matrix.csv | .csv | 0.19 KB |
| eigenvectors.csv | .csv | 0.78 KB |
| gigando_speeches_ner_v2.parquet | .parquet | 0.13 KB |
| inventory.json | .json | 16.7 KB |
| macroeconomic_data.csv | .csv | 14.2 KB |
| merged_inflation.csv | .csv | 20.25 KB |
| merged_macro.csv | .csv | 20.26 KB |
| pca_components.csv | .csv | 42.94 KB |
| pca_loadings.csv | .csv | 0.87 KB |
| processed_macro.csv | .csv | 41.75 KB |
| r2_inflation.csv | .csv | 15.8 KB |
| r2_macro.csv | .csv | 15.66 KB |
| raw_macro.csv | .csv | 18.0 KB |
| rolling_mean.csv | .csv | 35.97 KB |
| rolling_results_inflation.csv | .csv | 27.88 KB |
| rolling_results_macro.csv | .csv | 27.62 KB |
| rolling_std.csv | .csv | 41.62 KB |
| scaled_macro.csv | .csv | 41.75 KB |
| sentiment_aggregated.csv | .csv | 15.44 KB |
| sentiment_raw.csv | .csv | 6.57 KB |
| sentiment_standardized.csv | .csv | 15.44 KB |
| speeches_summary.csv | .csv | 85.73 KB |
| data | | 0.0 KB |
| index.html | .html | 4.05 KB |

Reproducibility

This repository includes reproducibility tools:

  • Python requirements.txt

Status

  • Issues: Enabled
  • Wiki: Enabled
  • Pages: Enabled

README

CB Speeches Analysis

Badges: CI, Deploy, Python 3.9+, License: MIT

Analyzing the relationship between Federal Reserve speech sentiment and macroeconomic conditions using PCA, structural break detection, and rolling regression.

Key Finding

Central bank speech sentiment (hawkish/dovish) is almost uncorrelated with the macroeconomic indices: the correlation is 0.005. This empirical result challenges common assumptions about the effectiveness of central bank communication.

| Metric | Value |
| --- | --- |
| Macro-Hawkish Correlation | 0.005 |
| Variance Explained (2 PCs) | 72% |
| US Fed Speeches Analyzed | 2,421 |
| Time Period | 1996-2025 |
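
The Output Files section describes correlation_matrix.csv as holding first-difference correlations. As a rough illustration of what such a figure measures, the sketch below correlates the month-over-month changes of two series. The function name and the synthetic random-walk inputs are assumptions made for this example; they are not the repository's code or data.

```python
import numpy as np
import pandas as pd


def first_difference_correlation(a: pd.Series, b: pd.Series) -> float:
    """Pearson correlation between the period-over-period changes of two series."""
    diffs = pd.concat([a.diff(), b.diff()], axis=1).dropna()
    return float(diffs.iloc[:, 0].corr(diffs.iloc[:, 1]))


# Synthetic stand-ins: two independent random walks (hypothetical data).
rng = np.random.default_rng(42)
dates = pd.date_range("1996-01-01", periods=120, freq="MS")
macro_index = pd.Series(rng.normal(size=120).cumsum(), index=dates)
hawkish = pd.Series(rng.normal(size=120).cumsum(), index=dates)

corr = first_difference_correlation(macro_index, hawkish)
print(f"first-difference correlation: {corr:.3f}")
```

Differencing before correlating is a common guard against spurious correlation between trending series, which is presumably why the pipeline reports correlations on changes rather than levels.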

Quick Start

# Clone
git clone https://github.com/Digital-AI-Finance/CB-Speeches-Analysis.git
cd CB-Speeches-Analysis

# Install
pip install -r requirements.txt

# Run analysis
python -c "from analysis.run_all import run_pipeline; run_pipeline(use_cached=True, verbose=True)"

# Generate charts
cd charts && python run_all_charts.py

# Launch dashboard
streamlit run app.py

Project Structure

CB-Speeches-Analysis/
├── analysis/              # Core pipeline modules
│   ├── config.py          # Configuration and parameters
│   ├── run_all.py         # Master pipeline orchestrator
│   ├── data_loader.py     # FRED API / CSV loading
│   ├── preprocessing.py   # Rolling standardization
│   ├── pca_analysis.py    # PCA dimensionality reduction
│   ├── breakpoint_detection.py  # PELT algorithm
│   ├── speech_sentiment.py      # Sentiment aggregation
│   └── rolling_regression.py    # Rolling betas and R-squared
├── charts/                # 12 chart folders + utilities
│   ├── 01_scaled_macro_timeseries/
│   ├── 02_principal_components/
│   ├── ...
│   └── run_all_charts.py
├── data/                  # Input and output data
│   ├── gigando_speeches_ner_v2.parquet  # 20k+ CB speeches
│   ├── macroeconomic_data.csv           # FRED data
│   └── *.csv, *.json                    # Pipeline outputs
├── tests/                 # Unit tests
├── docs/                  # GitHub Pages website
├── app.py                 # Streamlit dashboard
├── generate_dashboard.py  # Static HTML generator
└── requirements.txt

Analysis Pipeline

  1. Data Loading - Load FRED macroeconomic data and CB speeches
  2. Preprocessing - 12-month rolling standardization
  3. PCA Analysis - Extract Macro Strength Index (PC1) and Inflation Index (PC2)
  4. Breakpoint Detection - PELT algorithm identifies structural breaks
  5. Sentiment Aggregation - Monthly hawkish/dovish counts from speeches
  6. Rolling Regression - 36-month rolling betas and R-squared
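
Steps 2 and 3 above can be sketched as follows. This is a minimal illustration on synthetic data, not the code in analysis/preprocessing.py or analysis/pca_analysis.py: the function names, the column names, and the choice to include the current month in each rolling window are all assumptions.

```python
import numpy as np
import pandas as pd

ROLLING_WINDOW = 12  # months, as listed under Key Parameters


def rolling_standardize(df: pd.DataFrame, window: int = ROLLING_WINDOW) -> pd.DataFrame:
    """Z-score each column against its own trailing window (current month included)."""
    mean = df.rolling(window).mean()
    std = df.rolling(window).std()
    return ((df - mean) / std).dropna()


def pca(X: np.ndarray, n_components: int = 2):
    """PCA via SVD of the column-centered data.

    Returns (scores, loadings, explained_variance_ratio) for the leading components.
    """
    Xc = X - X.mean(axis=0)
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    loadings = Vt[:n_components]
    scores = Xc @ loadings.T
    return scores, loadings, explained[:n_components]


# Hypothetical monthly panel standing in for the FRED series.
rng = np.random.default_rng(42)
dates = pd.date_range("1996-01-01", periods=240, freq="MS")
columns = ["fed_funds", "cpi", "ppi", "gdp", "unemployment", "payrolls"]
raw = pd.DataFrame(rng.normal(size=(240, 6)).cumsum(axis=0), index=dates, columns=columns)

scaled = rolling_standardize(raw)
scores, loadings, explained = pca(scaled.to_numpy())
print("variance explained by 2 PCs:", round(float(explained.sum()), 3))
```

In the README's terminology, the first column of `scores` would play the role of the Macro Strength Index and the second the Inflation Index. On random data the components carry no such interpretation.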

Data Sources

| Source | Description | Period |
| --- | --- | --- |
| FRED | Fed Funds Rate, CPI, PPI, GDP, Unemployment, Nonfarm Payrolls | 1996-2025 |
| BIS/Gigando | 2,421 US Federal Reserve speeches with sentiment labels | 1996-2025 |
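
The speeches carry sentiment labels, and the pipeline aggregates them into monthly hawkish/dovish counts. A minimal sketch of that aggregation is shown below. The column names `date` and `label` and the five example rows are hypothetical; the actual schema of gigando_speeches_ner_v2.parquet is not documented here.

```python
import pandas as pd

# Hypothetical speech-level records: one row per speech.
speeches = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2020-01-05", "2020-01-20", "2020-01-28", "2020-02-11", "2020-02-25"]
        ),
        "label": ["hawkish", "dovish", "hawkish", "dovish", "dovish"],
    }
)


def monthly_sentiment_counts(df: pd.DataFrame) -> pd.DataFrame:
    """Count speeches per calendar month and sentiment label."""
    month = df["date"].dt.to_period("M")
    # One row per month, one column per label; missing combinations become 0.
    return df.groupby([month, "label"]).size().unstack(fill_value=0)


counts = monthly_sentiment_counts(speeches)
print(counts)
```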

Key Parameters

| Parameter | Value | Description |
| --- | --- | --- |
| ROLLING_WINDOW | 12 | Months for standardization |
| REGRESSION_WINDOW | 36 | Months for rolling regression |
| PELT_PENALTY | 4 | Breakpoint detection sensitivity |
| RANDOM_STATE | 42 | Reproducibility seed |
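
To make the role of REGRESSION_WINDOW concrete, here is a minimal sketch of a rolling univariate regression that yields one beta and one R-squared per trailing window. It is an illustration under assumptions, not the contents of analysis/rolling_regression.py: the function name is invented, and the README does not state which variable is regressed on which.

```python
import numpy as np

REGRESSION_WINDOW = 36  # months, as listed under Key Parameters


def rolling_ols(y, x, window: int = REGRESSION_WINDOW):
    """Fit y = alpha + beta * x on each trailing window.

    Returns two arrays (betas, r_squared), each of length len(y) - window + 1.
    """
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    betas, r_squared = [], []
    for end in range(window, len(y) + 1):
        xs, ys = x[end - window:end], y[end - window:end]
        design = np.column_stack([np.ones(window), xs])  # intercept + slope
        coef, *_ = np.linalg.lstsq(design, ys, rcond=None)
        resid = ys - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(np.sum((ys - ys.mean()) ** 2))
        betas.append(coef[1])
        r_squared.append(1.0 - ss_res / ss_tot)
    return np.array(betas), np.array(r_squared)


# Synthetic example with a known slope of 0.5 (hypothetical data).
rng = np.random.default_rng(42)
x = rng.normal(size=120)
y = 0.5 * x + rng.normal(scale=0.1, size=120)
betas, r2 = rolling_ols(y, x)
print(len(betas), round(float(betas.mean()), 2))
```

A longer window gives smoother but slower-moving betas. A shorter one reacts faster to regime changes at the cost of noisier estimates.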

Output Files

| File | Description |
| --- | --- |
| pca_components.csv | PC1 (Macro Index), PC2 (Inflation Index) |
| pca_loadings.csv | Component weights on each macro variable |
| breakpoints.json | PELT-detected structural break dates |
| sentiment_aggregated.csv | Monthly hawkish/dovish counts |
| correlation_matrix.csv | First-difference correlations |
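
For readers unfamiliar with how PELT arrives at the break dates stored in breakpoints.json, the sketch below is a small self-contained version of the algorithm. It is not the repository's implementation: the README does not say which library or cost function analysis/breakpoint_detection.py uses, so the squared-error (mean-shift) cost, the function name, and the synthetic signal are all assumptions.

```python
import numpy as np

PELT_PENALTY = 4  # as listed under Key Parameters


def pelt_mean_shift(signal, penalty: float = PELT_PENALTY):
    """Change-point indices minimising within-segment squared error
    plus `penalty` per change point (PELT with a mean-shift cost)."""
    y = np.asarray(signal, dtype=float)
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y**2)))

    def cost(a: int, b: int) -> float:
        # Squared error of y[a:b] around its own mean, in O(1) via prefix sums.
        seg = s1[b] - s1[a]
        return (s2[b] - s2[a]) - seg * seg / (b - a)

    best = np.empty(n + 1)  # best[t]: optimal penalised cost of y[:t]
    best[0] = -penalty
    prev = np.zeros(n + 1, dtype=int)  # start index of the last segment of y[:t]
    candidates = [0]
    for t in range(1, n + 1):
        totals = np.array([best[s] + cost(s, t) for s in candidates])
        k = int(np.argmin(totals))
        best[t] = totals[k] + penalty
        prev[t] = candidates[k]
        # Pruning step: drop start points that can no longer be optimal.
        candidates = [s for s, v in zip(candidates, totals) if v <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to recover the breaks.
    breaks, t = [], n
    while prev[t] > 0:
        t = int(prev[t])
        breaks.append(t)
    return sorted(breaks)


# Synthetic series with one mean shift at index 100 (hypothetical data).
rng = np.random.default_rng(42)
series = np.concatenate([rng.normal(0.0, 0.1, 100), rng.normal(3.0, 0.1, 100)])
breaks = pelt_mean_shift(series)
print(breaks)
```

A larger penalty makes each additional break more expensive and so yields fewer breakpoints; a smaller one yields more. The returned positions are integer indices into the series, so mapping them to dates means looking them up in the series' date index.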

Charts

12 publication-ready figures covering:

  • Scaled macroeconomic time series
  • Principal components over time
  • Structural breakpoints (Macro and Inflation indices)
  • Speech sentiment distribution
  • Rolling regression results (betas, R-squared)
  • Correlation matrix and PCA loadings heatmaps

Running Tests

python -m pytest tests/ -v

Documentation

Citation

@article{taibi2025cbspeeches,
  title={Central Bank Communication and Macroeconomic Conditions:
         A PCA-Based Framework for Analyzing Narrative-Reality Disconnect},
  author={Taibi, Gabin and Osterrieder, Joerg},
  journal={Working Paper},
  year={2025},
  institution={University of Zurich}
}

License

MIT License - see LICENSE for details.

Acknowledgments

Part of the SNSF Narrative Digital Finance project (Grant IZCOZ0_213370).