CB-Speeches-Analysis
Central Bank Speech Sentiment vs Macroeconomic Conditions - PCA, PELT breakpoints, rolling regression analysis
Information
| Property | Value |
|---|---|
| Language | Python |
| Stars | 0 |
| Forks | 0 |
| Watchers | 0 |
| Open Issues | 0 |
| License | MIT License |
| Created | 2026-01-15 |
| Last Updated | 2026-02-19 |
| Last Push | 2026-01-15 |
| Contributors | 1 |
| Default Branch | main |
| Visibility | private |
Datasets
This repository includes 30 dataset(s):
| Dataset | Format | Size |
|---|---|---|
| data | | 0.0 KB |
| betas_inflation.csv | .csv | 15.6 KB |
| betas_macro.csv | .csv | 15.48 KB |
| breakpoints.json | .json | 0.74 KB |
| corr_inflation.csv | .csv | 0.2 KB |
| corr_macro.csv | .csv | 0.19 KB |
| correlation_matrix.csv | .csv | 0.19 KB |
| eigenvectors.csv | .csv | 0.78 KB |
| gigando_speeches_ner_v2.parquet | .parquet | 0.13 KB |
| inventory.json | .json | 16.7 KB |
| macroeconomic_data.csv | .csv | 14.2 KB |
| merged_inflation.csv | .csv | 20.25 KB |
| merged_macro.csv | .csv | 20.26 KB |
| pca_components.csv | .csv | 42.94 KB |
| pca_loadings.csv | .csv | 0.87 KB |
| processed_macro.csv | .csv | 41.75 KB |
| r2_inflation.csv | .csv | 15.8 KB |
| r2_macro.csv | .csv | 15.66 KB |
| raw_macro.csv | .csv | 18.0 KB |
| rolling_mean.csv | .csv | 35.97 KB |
| rolling_results_inflation.csv | .csv | 27.88 KB |
| rolling_results_macro.csv | .csv | 27.62 KB |
| rolling_std.csv | .csv | 41.62 KB |
| scaled_macro.csv | .csv | 41.75 KB |
| sentiment_aggregated.csv | .csv | 15.44 KB |
| sentiment_raw.csv | .csv | 6.57 KB |
| sentiment_standardized.csv | .csv | 15.44 KB |
| speeches_summary.csv | .csv | 85.73 KB |
| data | | 0.0 KB |
| index.html | .html | 4.05 KB |
Reproducibility
This repository includes reproducibility tools:
- Python requirements.txt
Status
- Issues: Enabled
- Wiki: Enabled
- Pages: Enabled
README
CB Speeches Analysis
Analyzing the relationship between Federal Reserve speech sentiment and macroeconomic conditions using PCA, structural break detection, and rolling regression.
Key Finding
The correlation between CB speech sentiment (hawkish/dovish) and the macroeconomic indices is near zero (0.005), an empirical result that challenges common assumptions about the effectiveness of central bank communication.
| Metric | Value |
|---|---|
| Macro-Hawkish Correlation | 0.005 |
| Variance Explained (2 PCs) | 72% |
| US Fed Speeches Analyzed | 2,421 |
| Time Period | 1996-2025 |
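As a rough illustration of how a correlation like the one above can be computed, the sketch below correlates two monthly series both in levels and in first differences. It runs on synthetic data. The column names `macro_index` and `hawkish` are hypothetical, and whether the reported 0.005 is a level or a first-difference correlation is an assumption, not something this README states.

```python
import numpy as np
import pandas as pd

# Synthetic stand-ins for the real series (assumption: monthly frequency).
rng = np.random.default_rng(42)
dates = pd.date_range("1996-01-01", periods=120, freq="MS")
macro_index = pd.Series(rng.normal(size=120).cumsum(), index=dates, name="macro_index")
hawkish = pd.Series(rng.poisson(5, size=120).astype(float), index=dates, name="hawkish")

df = pd.concat([macro_index, hawkish], axis=1)

# Correlation in levels can be inflated by shared trends.
level_corr = df["macro_index"].corr(df["hawkish"])

# First differences remove the trend before correlating.
diffs = df.diff().dropna()
diff_corr = diffs.corr().loc["macro_index", "hawkish"]

print(f"level correlation:            {level_corr:.3f}")
print(f"first-difference correlation: {diff_corr:.3f}")
```

Differencing is a common way to avoid spurious correlation between trending series, which is one plausible reason to report the first-difference version.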
Quick Start
# Clone
git clone https://github.com/Digital-AI-Finance/CB-Speeches-Analysis.git
cd CB-Speeches-Analysis
# Install
pip install -r requirements.txt
# Run analysis
python -c "from analysis.run_all import run_pipeline; run_pipeline(use_cached=True, verbose=True)"
# Generate charts
cd charts && python run_all_charts.py
# Launch dashboard
streamlit run app.py
Project Structure
CB-Speeches-Analysis/
├── analysis/ # Core pipeline modules
│ ├── config.py # Configuration and parameters
│ ├── run_all.py # Master pipeline orchestrator
│ ├── data_loader.py # FRED API / CSV loading
│ ├── preprocessing.py # Rolling standardization
│ ├── pca_analysis.py # PCA dimensionality reduction
│ ├── breakpoint_detection.py # PELT algorithm
│ ├── speech_sentiment.py # Sentiment aggregation
│ └── rolling_regression.py # Rolling betas and R-squared
├── charts/ # 12 chart folders + utilities
│ ├── 01_scaled_macro_timeseries/
│ ├── 02_principal_components/
│ ├── ...
│ └── run_all_charts.py
├── data/ # Input and output data
│ ├── gigando_speeches_ner_v2.parquet # 20k+ CB speeches
│ ├── macroeconomic_data.csv # FRED data
│ └── *.csv, *.json # Pipeline outputs
├── tests/ # Unit tests
├── docs/ # GitHub Pages website
├── app.py # Streamlit dashboard
├── generate_dashboard.py # Static HTML generator
└── requirements.txt
Analysis Pipeline
- Data Loading - Load FRED macroeconomic data and CB speeches
- Preprocessing - 12-month rolling standardization
- PCA Analysis - Extract Macro Strength Index (PC1) and Inflation Index (PC2)
- Breakpoint Detection - PELT algorithm identifies structural breaks
- Sentiment Aggregation - Monthly hawkish/dovish counts from speeches
- Rolling Regression - 36-month rolling betas and R-squared
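The preprocessing and PCA steps above can be sketched as follows. This is a minimal illustration on synthetic data, not the code in `analysis/preprocessing.py` or `analysis/pca_analysis.py`: the column names are hypothetical, and PCA is done with a plain NumPy SVD so the example needs nothing beyond NumPy and pandas.

```python
import numpy as np
import pandas as pd

ROLLING_WINDOW = 12  # months, as listed under Key Parameters

# Synthetic monthly macro data (hypothetical column names).
rng = np.random.default_rng(42)
dates = pd.date_range("1996-01-01", periods=120, freq="MS")
cols = ["fed_funds", "cpi", "ppi", "gdp", "unemployment", "payrolls"]
raw = pd.DataFrame(rng.normal(size=(120, 6)).cumsum(axis=0), index=dates, columns=cols)

# Rolling standardization: z-score each value against its trailing 12 months.
rolling = raw.rolling(ROLLING_WINDOW)
scaled = ((raw - rolling.mean()) / rolling.std()).dropna()

# PCA via SVD of the column-centered matrix.
X = scaled.to_numpy() - scaled.to_numpy().mean(axis=0)
U, S, Vt = np.linalg.svd(X, full_matrices=False)
explained = S**2 / np.sum(S**2)  # share of variance per component

# Scores (time series of each component) and loadings (weights per variable).
# Component signs are arbitrary, so interpretation needs a look at the loadings.
components = pd.DataFrame(U[:, :2] * S[:2], index=scaled.index, columns=["PC1", "PC2"])
loadings = pd.DataFrame(Vt[:2].T, index=cols, columns=["PC1", "PC2"])

print(f"variance explained by 2 PCs: {explained[:2].sum():.1%}")
```

On random data the two components have no economic meaning. Labeling PC1 as a macro strength index and PC2 as an inflation index depends on the loadings obtained from the real FRED series.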
Data Sources
| Source | Description | Period |
|---|---|---|
| FRED | Fed Funds Rate, CPI, PPI, GDP, Unemployment, Nonfarm Payrolls | 1996-2025 |
| BIS/Gigando | 2,421 US Federal Reserve speeches with sentiment labels | 1996-2025 |
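For orientation, monthly hawkish/dovish counts can be derived from labeled speeches along these lines. The `date` and `sentiment` column names and the `net_hawkish` measure are hypothetical; the actual schema of `gigando_speeches_ner_v2.parquet` is not described here.

```python
import pandas as pd

# Toy speech table (assumed schema: one row per speech with a sentiment label).
speeches = pd.DataFrame({
    "date": pd.to_datetime(
        ["2020-01-05", "2020-01-20", "2020-01-28", "2020-02-11", "2020-02-25"]
    ),
    "sentiment": ["hawkish", "dovish", "hawkish", "dovish", "dovish"],
})

# Count speeches per calendar month and label, one column per label.
monthly = (
    speeches.groupby([speeches["date"].dt.to_period("M"), "sentiment"])
    .size()
    .unstack(fill_value=0)
)

# A simple net measure (illustrative only): hawkish minus dovish counts.
monthly["net_hawkish"] = monthly["hawkish"] - monthly["dovish"]
print(monthly)
```

Months without any speeches do not appear in the result, so a real pipeline would likely reindex to a complete monthly range before merging with macro data.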
Key Parameters
| Parameter | Value | Description |
|---|---|---|
| `ROLLING_WINDOW` | 12 | Months for standardization |
| `REGRESSION_WINDOW` | 36 | Months for rolling regression |
| `PELT_PENALTY` | 4 | Breakpoint detection sensitivity |
| `RANDOM_STATE` | 42 | Reproducibility seed |
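To show what `REGRESSION_WINDOW` controls, here is a minimal rolling-regression sketch on synthetic data. The helper `rolling_ols` is hypothetical, and regressing sentiment on a macro index (rather than the reverse) is an assumption; `analysis/rolling_regression.py` may be set up differently.

```python
import numpy as np
import pandas as pd

REGRESSION_WINDOW = 36  # months

def rolling_ols(y, x, window):
    """Slope (beta) and R-squared of y on x over each trailing window."""
    rows = []
    for end in range(window, len(y) + 1):
        ys = y.iloc[end - window:end].to_numpy()
        xs = x.iloc[end - window:end].to_numpy()
        beta, intercept = np.polyfit(xs, ys, 1)   # simple OLS with intercept
        resid = ys - (beta * xs + intercept)
        r2 = 1.0 - resid.var() / ys.var()         # equals 1 - SSE/SST
        rows.append((y.index[end - 1], beta, r2))
    return pd.DataFrame(rows, columns=["date", "beta", "r2"]).set_index("date")

# Synthetic monthly series standing in for a macro index and a sentiment score.
rng = np.random.default_rng(42)
dates = pd.date_range("1996-01-01", periods=120, freq="MS")
macro_index = pd.Series(rng.normal(size=120), index=dates)
sentiment = pd.Series(0.5 * macro_index.to_numpy() + rng.normal(size=120), index=dates)

results = rolling_ols(sentiment, macro_index, REGRESSION_WINDOW)
print(results.head())
```

Each row is dated at the last month of its window, so the first estimate appears 36 months into the sample. A longer window gives smoother betas at the cost of slower reaction to regime changes.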
Output Files
| File | Description |
|---|---|
| `pca_components.csv` | PC1 (Macro Index), PC2 (Inflation Index) |
| `pca_loadings.csv` | Component weights on each macro variable |
| `breakpoints.json` | PELT-detected structural break dates |
| `sentiment_aggregated.csv` | Monthly hawkish/dovish counts |
| `correlation_matrix.csv` | First-difference correlations |
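To give a sense of how PELT arrives at break dates like those in `breakpoints.json`, below is a compact pure-NumPy sketch of the algorithm with a squared-error (mean shift) cost. The function name `pelt_mean_shift` is hypothetical and the cost model is an assumption; `analysis/breakpoint_detection.py` may rely on a library implementation and a different cost.

```python
import numpy as np

def pelt_mean_shift(y, penalty):
    """Return indices where a new segment starts, using PELT with an L2 cost."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    s1 = np.concatenate(([0.0], np.cumsum(y)))
    s2 = np.concatenate(([0.0], np.cumsum(y**2)))

    def cost(s, t):
        # Sum of squared deviations of y[s:t] from its own mean.
        seg_sum = s1[t] - s1[s]
        return (s2[t] - s2[s]) - seg_sum**2 / (t - s)

    best = np.empty(n + 1)          # best[t]: optimal penalized cost of y[:t]
    best[0] = -penalty
    last_cp = np.zeros(n + 1, dtype=int)
    candidates = [0]
    for t in range(1, n + 1):
        totals = [best[s] + cost(s, t) + penalty for s in candidates]
        i = int(np.argmin(totals))
        best[t] = totals[i]
        last_cp[t] = candidates[i]
        # Pruning: drop start points that can never be optimal again.
        candidates = [s for s, tot in zip(candidates, totals) if tot - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored last changepoints.
    cps, t = [], n
    while last_cp[t] > 0:
        t = int(last_cp[t])
        cps.append(t)
    return sorted(cps)

# Synthetic series with one mean shift after 60 observations.
rng = np.random.default_rng(42)
y = np.concatenate([rng.normal(0.0, 0.2, 60), rng.normal(3.0, 0.2, 60)])

print(pelt_mean_shift(y, penalty=4))       # moderate penalty
print(pelt_mean_shift(y, penalty=1000.0))  # very high penalty suppresses breaks
```

The penalty is the cost charged per additional segment: lower values let the algorithm report more breaks, higher values fewer. Mapping the returned integer positions back to dates is a lookup into the series index.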
Charts
12 publication-ready figures covering:
- Scaled macroeconomic time series
- Principal components over time
- Structural breakpoints (Macro and Inflation indices)
- Speech sentiment distribution
- Rolling regression results (betas, R-squared)
- Correlation matrix and PCA loadings heatmaps
Running Tests
Documentation
Citation
@article{taibi2025cbspeeches,
title={Central Bank Communication and Macroeconomic Conditions:
A PCA-Based Framework for Analyzing Narrative-Reality Disconnect},
author={Taibi, Gabin and Osterrieder, Joerg},
journal={Working Paper},
year={2025},
institution={University of Zurich}
}
License
MIT License - see LICENSE for details.
Acknowledgments
Part of the SNSF Narrative Digital Finance project (Grant IZCOZ0_213370).