CB-Speeches-Analysis

Central Bank Speech Sentiment vs Macroeconomic Conditions - PCA, PELT breakpoints, rolling regression analysis

Information

Property	Value
Language	Python
Stars	0
Forks	0
Watchers	0
Open Issues	0
License	MIT License
Created	2026-01-15
Last Updated	2026-02-19
Last Push	2026-01-15
Contributors	1
Default Branch	main
Visibility	private

Datasets

This repository includes 30 dataset(s):

| Dataset | Format | Size |
| --- | --- | --- |
| data | | 0.0 KB |
| betas_inflation.csv | .csv | 15.6 KB |
| betas_macro.csv | .csv | 15.48 KB |
| breakpoints.json | .json | 0.74 KB |
| corr_inflation.csv | .csv | 0.2 KB |
| corr_macro.csv | .csv | 0.19 KB |
| correlation_matrix.csv | .csv | 0.19 KB |
| eigenvectors.csv | .csv | 0.78 KB |
| gigando_speeches_ner_v2.parquet | .parquet | 0.13 KB |
| inventory.json | .json | 16.7 KB |
| macroeconomic_data.csv | .csv | 14.2 KB |
| merged_inflation.csv | .csv | 20.25 KB |
| merged_macro.csv | .csv | 20.26 KB |
| pca_components.csv | .csv | 42.94 KB |
| pca_loadings.csv | .csv | 0.87 KB |
| processed_macro.csv | .csv | 41.75 KB |
| r2_inflation.csv | .csv | 15.8 KB |
| r2_macro.csv | .csv | 15.66 KB |
| raw_macro.csv | .csv | 18.0 KB |
| rolling_mean.csv | .csv | 35.97 KB |
| rolling_results_inflation.csv | .csv | 27.88 KB |
| rolling_results_macro.csv | .csv | 27.62 KB |
| rolling_std.csv | .csv | 41.62 KB |
| scaled_macro.csv | .csv | 41.75 KB |
| sentiment_aggregated.csv | .csv | 15.44 KB |
| sentiment_raw.csv | .csv | 6.57 KB |
| sentiment_standardized.csv | .csv | 15.44 KB |
| speeches_summary.csv | .csv | 85.73 KB |
| data | | 0.0 KB |
| index.html | .html | 4.05 KB |

Reproducibility

This repository includes reproducibility tools:

Python requirements.txt

Status

Issues: Enabled
Wiki: Enabled
Pages: Enabled

README

CB Speeches Analysis

Analyzing the relationship between Federal Reserve speech sentiment and macroeconomic conditions using PCA, structural break detection, and rolling regression.

Key Finding

Near-zero correlation (0.005) between Federal Reserve speech sentiment (hawkish/dovish) and the macroeconomic indices. This empirical result challenges assumptions about the effectiveness of central bank communication.

Metric	Value
Macro-Hawkish Correlation	0.005
Variance Explained (2 PCs)	72%
US Fed Speeches Analyzed	2,421
Time Period	1996-2025
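
The output file list describes `correlation_matrix.csv` as first-difference correlations, so a correlation like the one reported above could be computed roughly as follows. This is a minimal sketch on synthetic data: the series names, the random-walk inputs, and the helper `first_difference_correlation` are all hypothetical, and whether the headline 0.005 was computed exactly this way is an assumption.

```python
import numpy as np
import pandas as pd


def first_difference_correlation(a: pd.Series, b: pd.Series) -> float:
    """Pearson correlation between the month-over-month changes of two series."""
    diffs = pd.concat([a.diff(), b.diff()], axis=1).dropna()
    return float(diffs.iloc[:, 0].corr(diffs.iloc[:, 1]))


# Synthetic monthly random walks (illustrative only, not the repository's data).
rng = np.random.default_rng(42)
idx = pd.date_range("1996-01-01", periods=120, freq="MS")
macro_index = pd.Series(rng.normal(size=120).cumsum(), index=idx, name="macro_index")
hawkish = pd.Series(rng.normal(size=120).cumsum(), index=idx, name="hawkish")

rho = first_difference_correlation(macro_index, hawkish)
print(f"first-difference correlation: {rho:.3f}")
```

Differencing matters here because two trending series can show a large correlation in levels even when their month-to-month changes are unrelated.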

Quick Start

# Clone
git clone https://github.com/Digital-AI-Finance/CB-Speeches-Analysis.git
cd CB-Speeches-Analysis

# Install
pip install -r requirements.txt

# Run analysis
python -c "from analysis.run_all import run_pipeline; run_pipeline(use_cached=True, verbose=True)"

# Generate charts (subshell keeps the working directory at the repo root)
(cd charts && python run_all_charts.py)

# Launch dashboard
streamlit run app.py

Project Structure

CB-Speeches-Analysis/
├── analysis/              # Core pipeline modules
│   ├── config.py          # Configuration and parameters
│   ├── run_all.py         # Master pipeline orchestrator
│   ├── data_loader.py     # FRED API / CSV loading
│   ├── preprocessing.py   # Rolling standardization
│   ├── pca_analysis.py    # PCA dimensionality reduction
│   ├── breakpoint_detection.py  # PELT algorithm
│   ├── speech_sentiment.py      # Sentiment aggregation
│   └── rolling_regression.py    # Rolling betas and R-squared
├── charts/                # 12 chart folders + utilities
│   ├── 01_scaled_macro_timeseries/
│   ├── 02_principal_components/
│   ├── ...
│   └── run_all_charts.py
├── data/                  # Input and output data
│   ├── gigando_speeches_ner_v2.parquet  # 20k+ CB speeches
│   ├── macroeconomic_data.csv           # FRED data
│   └── *.csv, *.json                    # Pipeline outputs
├── tests/                 # Unit tests
├── docs/                  # GitHub Pages website
├── app.py                 # Streamlit dashboard
├── generate_dashboard.py  # Static HTML generator
└── requirements.txt

Analysis Pipeline

1. Data Loading - Load FRED macroeconomic data and CB speeches
2. Preprocessing - 12-month rolling standardization
3. PCA Analysis - Extract Macro Strength Index (PC1) and Inflation Index (PC2)
4. Breakpoint Detection - PELT algorithm identifies structural breaks
5. Sentiment Aggregation - Monthly hawkish/dovish counts from speeches
6. Rolling Regression - 36-month rolling betas and R-squared
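
The preprocessing and PCA steps can be sketched as below. This is an illustration under assumptions, not the code in `preprocessing.py` or `pca_analysis.py`: the column names and synthetic data are hypothetical, the z-score uses a trailing window, and PCA is done by eigendecomposition of the covariance matrix rather than with any particular library.

```python
import numpy as np
import pandas as pd

ROLLING_WINDOW = 12  # months, matching the value listed under Key Parameters


def rolling_standardize(df: pd.DataFrame, window: int = ROLLING_WINDOW) -> pd.DataFrame:
    """Z-score each column against its own trailing mean and standard deviation."""
    mean = df.rolling(window).mean()
    std = df.rolling(window).std()
    return ((df - mean) / std).dropna()


def pca(df: pd.DataFrame, n_components: int = 2):
    """PCA via eigendecomposition of the covariance matrix."""
    centered = df - df.mean()
    cov = np.cov(centered.to_numpy(), rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    loadings = pd.DataFrame(
        eigvecs[:, order],
        index=df.columns,
        columns=[f"PC{i + 1}" for i in range(n_components)],
    )
    scores = centered @ loadings  # component time series
    explained = eigvals[order] / eigvals.sum()  # share of variance per component
    return scores, loadings, explained


# Synthetic monthly data with one shared factor (hypothetical column names).
rng = np.random.default_rng(42)
idx = pd.date_range("1996-01-01", periods=240, freq="MS")
cols = ["fed_funds", "cpi", "ppi", "gdp", "unemployment", "payrolls"]
common = rng.normal(size=(240, 1)).cumsum(axis=0)
raw = pd.DataFrame(common + rng.normal(size=(240, 6)).cumsum(axis=0), index=idx, columns=cols)

scaled = rolling_standardize(raw)
scores, loadings, explained = pca(scaled)
print(explained.round(2))
```

The sign of each component is arbitrary in PCA, so an index built from PC1 or PC2 may need to be flipped so that higher values carry the intended meaning.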

Data Sources

Source	Description	Period
FRED	Fed Funds Rate, CPI, PPI, GDP, Unemployment, Nonfarm Payrolls	1996-2025
BIS/Gigando	2,421 US Federal Reserve speeches with sentiment labels	1996-2025
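
As a rough illustration of how labelled speeches turn into the monthly hawkish/dovish counts used by the pipeline, consider the sketch below. The column names `date` and `sentiment` and the sample rows are assumptions made for the example; the actual schema of the speeches dataset is not documented here.

```python
import pandas as pd


def monthly_sentiment_counts(speeches: pd.DataFrame) -> pd.DataFrame:
    """Count speeches per calendar month and sentiment label."""
    month = pd.to_datetime(speeches["date"]).dt.to_period("M")
    counts = (
        speeches.groupby([month, speeches["sentiment"]])
        .size()
        .unstack(fill_value=0)  # one column per label, 0 where a label is absent
    )
    counts.index.name = "month"
    return counts


# Hypothetical sample rows; real data would come from the speeches file.
speeches = pd.DataFrame(
    {
        "date": ["2020-03-03", "2020-03-15", "2020-03-20", "2020-04-09"],
        "sentiment": ["dovish", "dovish", "hawkish", "dovish"],
    }
)
counts = monthly_sentiment_counts(speeches)
print(counts)
```

Months without any speeches do not appear in the result, so a reindex over the full monthly range would be needed before merging with the macroeconomic series.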

Key Parameters

Parameter	Value	Description
`ROLLING_WINDOW`	12	Months for standardization
`REGRESSION_WINDOW`	36	Months for rolling regression
`PELT_PENALTY`	4	Breakpoint detection sensitivity
`RANDOM_STATE`	42	Reproducibility seed
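
To show how `REGRESSION_WINDOW` enters the analysis, here is a minimal rolling-regression sketch. It assumes a single regressor with an intercept and regresses a sentiment series on a macro index; the direction of the regression, the function name `rolling_ols`, and the synthetic data are assumptions for illustration, not the contents of `rolling_regression.py`.

```python
import numpy as np
import pandas as pd

REGRESSION_WINDOW = 36  # months, matching the parameter table


def rolling_ols(y: pd.Series, x: pd.Series, window: int = REGRESSION_WINDOW) -> pd.DataFrame:
    """Regress y on x (with intercept) over each trailing window."""
    rows = []
    for end in range(window, len(y) + 1):
        xs = x.iloc[end - window:end].to_numpy()
        ys = y.iloc[end - window:end].to_numpy()
        beta, alpha = np.polyfit(xs, ys, deg=1)  # slope first, then intercept
        resid = ys - (alpha + beta * xs)
        r2 = 1.0 - resid.var() / ys.var()
        rows.append({"beta": beta, "alpha": alpha, "r2": r2})
    # Each row is labelled with the last month of its window.
    return pd.DataFrame(rows, index=y.index[window - 1:])


# Synthetic data with a known slope of 0.5 (illustrative only).
rng = np.random.default_rng(42)
idx = pd.date_range("1996-01-01", periods=120, freq="MS")
macro_index = pd.Series(rng.normal(size=120), index=idx)
sentiment = 0.5 * macro_index + pd.Series(rng.normal(scale=0.05, size=120), index=idx)

result = rolling_ols(sentiment, macro_index)
print(result.tail())
```

A shorter window reacts faster to regime changes but gives noisier betas; 36 months trades some responsiveness for stability.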

Output Files

File	Description
`pca_components.csv`	PC1 (Macro Index), PC2 (Inflation Index)
`pca_loadings.csv`	Component weights on each macro variable
`breakpoints.json`	PELT-detected structural break dates
`sentiment_aggregated.csv`	Monthly hawkish/dovish counts
`correlation_matrix.csv`	First-difference correlations
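
For readers unfamiliar with how break dates such as those in `breakpoints.json` are found, the following is a small, dependency-free sketch of PELT. It assumes a piecewise-constant-mean model with a squared-error cost, which may differ from the cost model used in `breakpoint_detection.py`; the function name `pelt_mean_shift` and the test signal are hypothetical.

```python
from itertools import accumulate

PELT_PENALTY = 4  # matching the value listed under Key Parameters


def pelt_mean_shift(signal, penalty):
    """Return change-point indices for a piecewise-constant-mean model (L2 cost)."""
    n = len(signal)
    s1 = [0.0, *accumulate(signal)]                 # prefix sums
    s2 = [0.0, *accumulate(v * v for v in signal)]  # prefix sums of squares

    def cost(start, end):
        # Sum of squared deviations from the mean over signal[start:end].
        total = s1[end] - s1[start]
        return (s2[end] - s2[start]) - total * total / (end - start)

    best = [0.0] * (n + 1)  # best[t]: optimal penalised cost of signal[:t]
    best[0] = -penalty
    last = [0] * (n + 1)    # last[t]: start of the final segment in that optimum
    candidates = [0]
    for t in range(1, n + 1):
        values = [best[s] + cost(s, t) + penalty for s in candidates]
        best[t] = min(values)
        last[t] = candidates[values.index(best[t])]
        # Pruning step: drop segment starts that can never be optimal later on.
        candidates = [s for s, v in zip(candidates, values) if v - penalty <= best[t]]
        candidates.append(t)

    # Walk back through the stored segment starts to recover the breaks.
    breaks, t = [], n
    while last[t] > 0:
        breaks.append(last[t])
        t = last[t]
    return breaks[::-1]


# A series whose mean jumps from 0 to 5 at position 50.
signal = [0.0] * 50 + [5.0] * 50
print(pelt_mean_shift(signal, PELT_PENALTY))  # [50]
```

Each returned index is the first observation of a new segment, so mapping it onto the series' date index gives the break date. Raising the penalty yields fewer breaks; lowering it yields more.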

Charts

12 publication-ready figures covering:
- Scaled macroeconomic time series
- Principal components over time
- Structural breakpoints (Macro and Inflation indices)
- Speech sentiment distribution
- Rolling regression results (betas, R-squared)
- Correlation matrix and PCA loadings heatmaps

Running Tests

python -m pytest tests/ -v

Citation

@article{taibi2025cbspeeches,
  title={Central Bank Communication and Macroeconomic Conditions:
         A PCA-Based Framework for Analyzing Narrative-Reality Disconnect},
  author={Taibi, Gabin and Osterrieder, Joerg},
  journal={Working Paper},
  year={2025},
  institution={University of Zurich}
}