Research Objectives
Work Program & Methodology | SNSF Grant IZCOZ0_213370
The proposed work program comprises four work packages (WPs). Its approach is conceptually new relative to existing research in financial market analysis, which leaves three gaps:
- Limited studies employing narratives and textual information in financial market analysis
- Unexplored emotional evolution - studies on bubbles have largely ignored narratives underlying times of change
- Methodological gap - finance research has not kept pace with advances in automated text analysis and large text corpora
Our empirical setup studies the interplay of narratives, language evolution, financial innovation, and market performance in a comprehensive cross-layer framework where evolutions, causality, and interactions can be measured explicitly.
WP1: Text Data & Text Analytics
Text mining techniques are frequently used to forecast price developments across asset classes (FX, equities, bonds, commodities). Our approach applies NLP and text mining techniques to asset allocation and prediction, and combines them with structural break and change point detection methods.
While techniques exist for predicting cryptocurrency price bubbles from social media data, comparable work on classic financial assets remains scarce.
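As a minimal sketch of how headline text can be turned into a numeric series for later break detection, the example below builds a daily sentiment index from a word list. The lexicons, the scoring rule, and the sample headlines are illustrative assumptions, not the project's actual method or data; a real study would use a validated financial dictionary or a trained model.

```python
from collections import defaultdict
from datetime import date

# Hypothetical toy lexicons, for illustration only.
POSITIVE = {"gain", "growth", "rally", "beat", "upgrade"}
NEGATIVE = {"loss", "crash", "default", "miss", "downgrade"}


def headline_score(text: str) -> float:
    """Return (pos - neg) / (pos + neg), or 0.0 if no lexicon word occurs."""
    tokens = [t.strip(".,;:!?\"'()").lower() for t in text.split()]
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    total = pos + neg
    return (pos - neg) / total if total else 0.0


def daily_sentiment(headlines):
    """Average the headline scores per calendar day, in date order."""
    buckets = defaultdict(list)
    for day, text in headlines:
        buckets[day].append(headline_score(text))
    return {day: sum(s) / len(s) for day, s in sorted(buckets.items())}


# Made-up headlines, not taken from any of the project's data sources.
sample = [
    (date(2024, 1, 2), "Stocks rally as earnings beat forecasts"),
    (date(2024, 1, 2), "Bank shares crash after downgrade"),
    (date(2024, 1, 3), "Bond default fears trigger loss"),
]
index = daily_sentiment(sample)
```

The resulting per-day series can be treated like any other time series, for example as input to a change point detector.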
Research Questions
WP2: Structural Breaks Detection & Asset Price Bubbles
Despite recent advances, econometric detection of asset price bubbles cannot be achieved with satisfactory certainty. Few scientific papers address the live detection of structural breaks systematically, and most existing solutions have not been validated on real-world data.
Three-Step Approach
- Step 1: Ex-post structural break detection methods to identify past breaks in real macroeconomic and financial time series
- Step 2: Apply well-established methods to live detection and evaluate their ex-ante performance
- Step 3: Use NLP and text analysis as a supporting or primary method for detecting breakpoints
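The contrast between the retrospective dating in Step 1 and the live detection in Step 2 can be sketched with two standard textbook techniques: least-squares dating of a single mean shift on the full sample, and a two-sided CUSUM that processes observations one at a time. Both are stand-ins chosen for illustration, not the methods the project commits to, and the series, the calibration window, and the `drift` and `threshold` values are assumptions.

```python
from statistics import fmean, pstdev


def sse(values):
    """Sum of squared deviations from the segment mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def single_mean_break(series, min_segment=3):
    """Ex-post dating: index k minimising SSE(series[:k]) + SSE(series[k:])."""
    if len(series) < 2 * min_segment:
        raise ValueError("series too short for the minimum segment length")
    candidates = range(min_segment, len(series) - min_segment + 1)
    return min(candidates, key=lambda k: sse(series[:k]) + sse(series[k:]))


def live_cusum_alarm(series, calibration=5, drift=0.5, threshold=4.0):
    """Live detection: first index at which a two-sided CUSUM alarms, or None.

    The first `calibration` points estimate the baseline mean and standard
    deviation; later points are consumed sequentially, as in live use.
    """
    base = series[:calibration]
    mu, sigma = fmean(base), pstdev(base)
    if sigma == 0:
        raise ValueError("calibration window has zero variance")
    upper = lower = 0.0
    for t in range(calibration, len(series)):
        z = (series[t] - mu) / sigma
        upper = max(0.0, upper + z - drift)  # accumulates upward shifts
        lower = max(0.0, lower - z - drift)  # accumulates downward shifts
        if upper > threshold or lower > threshold:
            return t
    return None


# Synthetic series with a mean shift at index 6 (made-up numbers).
series = [0.1, -0.2, 0.0, 0.2, -0.1, 0.1, 2.1, 1.9, 2.0, 2.2, 1.8, 2.0]
break_index = single_mean_break(series)      # uses the full sample
alarm_index = live_cusum_alarm(series)       # uses only past observations
detection_delay = alarm_index - break_index  # one simple ex-ante performance measure
```

Comparing the live alarm time against the ex-post break date gives a detection delay, one possible way to check ex-ante performance against breaks identified retrospectively.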
Research Questions
WP3: Narratives for Structural Breaks
Narratives "go viral" and spread worldwide with economic impact (Shiller 2017). There is considerable evidence that people respond strongly to narratives in marketing, journalism, education, health interventions, and philanthropy.
Methodology (Evolved)
- Multimodal influence framework extending NLP to images, video, and audio for financial pricing
- TOPol: Semi-unsupervised framework using transformer embeddings, UMAP, and Leiden clustering
- BERTopic-based narrative shift analysis quantifying polarity drift across economic regimes
- Semantic centroid movement and keyword evolution tracking at structural breakpoints
Note: The original 2x2 experimental design evolved into the multimodal influence framework, which provides a more comprehensive theoretical foundation.
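The idea of tracking semantic centroid movement at a breakpoint can be illustrated as follows: average the document embeddings on each side of the break and measure the cosine distance between the two centroids. This is a simplified sketch, not the TOPol or BERTopic pipeline itself; the two-dimensional vectors are made-up placeholders for transformer embeddings, and the function names are hypothetical.

```python
import math


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_distance(a, b):
    """1 - cosine similarity; 0 means same direction, 1 means orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def centroid_shift(before, after):
    """Distance between the embedding centroids before and after a break."""
    return cosine_distance(centroid(before), centroid(after))


# Placeholder embeddings for documents on each side of a breakpoint.
before = [[1.0, 0.0], [0.9, 0.1]]
after = [[0.0, 1.0], [0.1, 0.9]]
shift = centroid_shift(before, after)
```

A value near zero suggests the narrative content is stable across the break, while larger values indicate that the average document has moved in embedding space.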
Research Questions
WP4: Multidimensional AI and ML Solutions
AI and ML techniques possess substantial potential to revolutionize financial markets. New technologies transform business models and markets for trading, credit, and blockchain-based finance, generating efficiencies and refining financial services.
Since the preceding work packages examine structural breaks and asset price bubbles from various perspectives using different techniques, WP4 examines whether these methods can be combined into a fully integrated framework.
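One simple way to think about combining detectors is a weighted average of per-period break scores, with a flag raised when the combined score passes a cutoff. The sketch below assumes each method already outputs a score in [0, 1] per period; the method names, scores, weights, and cutoff are hypothetical, and the integrated framework may well use a different combination rule.

```python
def combine_scores(scores_by_method, weights=None):
    """Weighted average of per-period break scores from several detectors."""
    methods = list(scores_by_method)
    if weights is None:
        weights = {m: 1.0 for m in methods}  # equal weights by default
    length = len(scores_by_method[methods[0]])
    if any(len(scores_by_method[m]) != length for m in methods):
        raise ValueError("all score series must have the same length")
    total = sum(weights[m] for m in methods)
    return [
        sum(weights[m] * scores_by_method[m][t] for m in methods) / total
        for t in range(length)
    ]


def flag_breaks(combined, cutoff=0.5):
    """Indices of periods whose combined score reaches the cutoff."""
    return [t for t, score in enumerate(combined) if score >= cutoff]


# Made-up scores from a price-based and a text-based detector.
scores = {
    "price_cusum": [0.0, 0.2, 1.0, 0.5],
    "narrative_shift": [0.0, 0.5, 0.5, 0.0],
}
combined = combine_scores(scores)
flags = flag_breaks(combined)
```

Unequal weights can be passed via `weights`, for instance to favour whichever detector performed better in ex-ante evaluation.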
Research Questions
Unique Contributions
Our approach to detecting structural breakpoints is distinctive in two respects: (1) an exclusive focus on ex-ante forecasting (live detection) that can be adjusted easily to rapidly changing markets; and (2) the combination of existing data sources with new quantitative models and new frameworks for understanding markets.
Enhanced Prediction Models
Financial market models with greater accuracy in detecting asset price bubbles and structural breaks
Novel Datasets & Methods
Deeper insights into the interplay between market narratives and financial indicators
Practical Tools
For asset management and regulatory bodies to better anticipate and react to market crises
Our research utilizes diverse textual and financial datasets:
RavenPack
Financial news headlines for sentiment analysis and narrative detection
LSEG (Refinitiv)
Earnings call transcripts for corporate narrative analysis
BIS Gingado
Worldwide central bank speeches for monetary policy narratives
SEC EDGAR
10-K and 10-Q filings for regulatory text analysis
Deutsche Börse
Nanosecond-level Xetra/Eurex trading data for HFT research
St. Louis Fed FRED
Macroeconomic indicators (CPI, GDP, unemployment, federal funds rate)
Custom data pipelines collect, format, and pre-process the textual data. The infrastructure handles large-scale processing on dedicated computing resources.
Step 1: Data Collection Tool
Custom data pipelines collect and structure datasets from RavenPack, LSEG, BIS, and SEC EDGAR. The data are categorized, dated, and stored for analysis. Market data come from Deutsche Börse and FRED.
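A minimal sketch of what "categorized, dated, and stored" could look like in practice: each document is normalised into a common record with a source, a category, a parsed date, and cleaned text, then serialised as one JSON line. The schema, field names, and sample document are assumptions for illustration, not the project's actual pipeline.

```python
import json
from dataclasses import asdict, dataclass
from datetime import date


@dataclass(frozen=True)
class TextRecord:
    """Hypothetical common schema for documents from all text sources."""
    source: str      # e.g. "RavenPack", "SEC EDGAR"
    category: str    # e.g. "news_headline", "10-K"
    published: date
    text: str


def normalise(source, category, published_iso, raw_text):
    """Parse the ISO date and collapse runs of whitespace in the text."""
    return TextRecord(
        source=source,
        category=category,
        published=date.fromisoformat(published_iso),
        text=" ".join(raw_text.split()),
    )


def to_json_line(record):
    """Serialise a record as one JSON line with an ISO-formatted date."""
    payload = asdict(record)
    payload["published"] = record.published.isoformat()
    return json.dumps(payload, sort_keys=True)


# Made-up document, not an actual filing.
record = normalise("SEC EDGAR", "10-K", "2023-02-15",
                   "  Annual report\n for fiscal   year 2022 ")
line = to_json_line(record)
```

Storing one record per line keeps the corpus appendable and easy to filter by source, category, or date before analysis.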
Step 2: Research Execution
The research addresses the questions across the four building blocks, formulating data-driven hypotheses and testing them within each work package.