Work Package 2: Graph-Based Methodology Development

Lead: Yiting Liu (University of Twente & BFH)
Duration: Months 3-9
Status: Completed


Research Context

Graph Neural Networks (GNNs) have emerged as a powerful paradigm for learning on structured data, achieving state-of-the-art results across diverse domains including social networks, molecular chemistry, and recommendation systems. This work package develops novel GNN methodologies specifically designed for credit risk assessment, addressing the unique challenges of constructing meaningful graphs from tabular financial data.

Theoretical Foundations

The application of GNNs to credit risk builds upon two foundational bodies of literature:

Message Passing Neural Networks: The theoretical framework of message passing (Gilmer et al., 2017) provides the basis for GNN architectures. Nodes iteratively update their representations by aggregating information from neighbors:

\[\mathbf{h}_v^{(k+1)} = \text{UPDATE}^{(k)}\left(\mathbf{h}_v^{(k)}, \text{AGGREGATE}^{(k)}\left(\{\mathbf{h}_u^{(k)} : u \in \mathcal{N}(v)\}\right)\right)\]
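The update/aggregate scheme above can be sketched in a few lines of plain Python. This is a toy illustration only: mean aggregation and a simple averaging update stand in for the learned AGGREGATE and UPDATE functions of a real GNN.

```python
# Toy message-passing round: AGGREGATE = elementwise mean over neighbours,
# UPDATE = average of a node's own state and the aggregated message.
# Real GNNs parameterise both functions and learn them by gradient descent.

def aggregate(neighbor_states):
    """Mean-aggregate neighbour representations, dimension by dimension."""
    dim = len(neighbor_states[0])
    return [sum(s[d] for s in neighbor_states) / len(neighbor_states)
            for d in range(dim)]

def update(h_v, message):
    """Combine a node's current state with the aggregated message."""
    return [(a + b) / 2 for a, b in zip(h_v, message)]

def message_passing_round(h, adj):
    """One synchronous round: every node v aggregates over N(v), then updates."""
    return {
        v: update(h_v, aggregate([h[u] for u in adj[v]])) if adj[v] else h_v
        for v, h_v in h.items()
    }

# Toy path graph 0 - 1 - 2 with 2-dimensional node states.
adj = {0: [1], 1: [0, 2], 2: [1]}
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
h1 = message_passing_round(h, adj)
```

After one round, each node's state has moved toward its neighbourhood average, which is exactly the smoothing behaviour the homophily assumption below relies on.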

Homophily in Networks: The concept of homophily, first formalized by Lazarsfeld and Merton (1954) and extensively studied by McPherson et al. (2001), posits that similarity breeds connection. In credit contexts, this manifests as borrowers with similar characteristics exhibiting correlated default behavior, a pattern we exploit for graph construction.


Objectives

  1. Develop novel graph neural network architectures for credit risk
  2. Design homophily-guided graph construction methodology
  3. Create interpretable credit scoring models suitable for regulatory requirements
  4. Benchmark comprehensively against traditional machine learning approaches

Literature Review: GNNs in Finance

Evolution of Graph-Based Methods

The application of graph methods to financial problems has evolved substantially:

| Era | Approach | Key Works | Limitations |
|-------|--------------------|----------------------------|----------------------------|
| 2000s | Network centrality | Battiston et al. (2007) | Manual feature engineering |
| 2010s | Graph kernels | Shervashidze et al. (2011) | Scalability issues |
| 2016+ | Spectral GNNs | Kipf & Welling (2017) | Fixed graph structure |
| 2018+ | Attention GNNs | Velickovic et al. (2018) | No graph construction |
| 2020+ | Dynamic GNNs | Pareja et al. (2020) | Computational cost |

Credit Risk Applications

Recent applications of GNNs to credit risk include:

  • Fraud Detection: Weber et al. (2019) applied GNNs to transaction networks for anti-money laundering
  • Corporate Credit: Cheng et al. (2020) used supply chain networks for SME credit assessment
  • P2P Lending: Ma et al. (2021) incorporated social networks in lending platforms

However, these approaches assume pre-existing network structures. Our methodology addresses the fundamental challenge of graph construction from tabular loan data.


Core Innovation: Homophily-Guided Graph Construction

The Graph Construction Problem

Traditional GNN applications benefit from natural graph structures (social networks, molecular bonds). Credit risk data, however, consists primarily of tabular features without inherent relational structure. This creates a fundamental methodological challenge:

How do we construct meaningful graphs from tabular loan data?

Homophily-Guided Approach

Our methodology exploits the homophily principle: borrowers with similar default behavior tend to share observable characteristics. The construction process:

Step 1: Feature Similarity Computation

For each pair of borrowers $(i, j)$, compute multi-metric similarity:

\[S_{ij} = \omega_1 \cdot \text{cos}(\mathbf{x}_i, \mathbf{x}_j) + \omega_2 \cdot \text{sim}_{\text{euc}}(\mathbf{x}_i, \mathbf{x}_j) + \omega_3 \cdot \text{jac}(\mathbf{x}_i^{cat}, \mathbf{x}_j^{cat})\]

where $\text{sim}_{\text{euc}}$ is a Euclidean-distance-based similarity (e.g. $\exp(-\|\mathbf{x}_i - \mathbf{x}_j\|_2)$, so that all three terms increase with similarity) and the weights $\omega_k$ are learnable or cross-validated.
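A toy version of $S_{ij}$ follows. The weights $(0.4, 0.3, 0.3)$ and the $\exp(-d)$ distance-to-similarity transform are illustrative stand-ins, not the cross-validated values from the study.

```python
import math

# Sketch of the multi-metric similarity S_ij: cosine similarity on
# continuous features, exp(-Euclidean distance) as a similarity in (0, 1],
# and Jaccard overlap on categorical feature sets. Weights are illustrative.

def cosine(x, y):
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return num / den if den else 0.0

def euclidean_sim(x, y):
    # Map the Euclidean distance into (0, 1] so larger means more similar.
    return math.exp(-math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))))

def jaccard(cat_x, cat_y):
    union = cat_x | cat_y
    return len(cat_x & cat_y) / len(union) if union else 0.0

def similarity(x_i, x_j, cat_i, cat_j, w=(0.4, 0.3, 0.3)):
    return (w[0] * cosine(x_i, x_j)
            + w[1] * euclidean_sim(x_i, x_j)
            + w[2] * jaccard(cat_i, cat_j))

# Two borrowers with identical continuous features, partial categorical overlap.
s = similarity([0.8, 0.2], [0.8, 0.2], {"employed", "homeowner"}, {"employed"})
```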

Step 2: Homophily-Guided Edge Filtering

During training, edges are retained based on label consistency:

\[A_{ij} = \mathbb{1}[S_{ij} > \tau] \cdot \mathbb{1}[y_i = y_j]\]

This ensures the constructed graph exhibits high homophily, enabling effective message passing.

Step 3: Inference Graph Construction

For prediction on new borrowers, edges connect to similar training nodes:

\[A_{i,\text{new}} = \mathbb{1}[S_{i,\text{new}} > \tau]\]

The model aggregates information from similar historical borrowers with known outcomes.
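Steps 2 and 3 can be sketched directly from the two indicator formulas above. The similarity matrix, labels, and threshold here are toy values; in practice $S$ comes from Step 1 and $\tau$ from cross-validation.

```python
# Sketch of homophily-guided edge filtering (training) and similarity-only
# edge construction (inference). S is a precomputed similarity matrix,
# y the default labels, tau the similarity threshold -- all toy values.

def training_adjacency(S, y, tau):
    """Training edges: S_ij > tau AND matching labels (homophily filter)."""
    n = len(S)
    return [[1 if i != j and S[i][j] > tau and y[i] == y[j] else 0
             for j in range(n)] for i in range(n)]

def inference_edges(s_new, tau):
    """Inference edges: connect a new borrower to every training node
    whose similarity exceeds tau (labels of new borrowers are unknown)."""
    return [1 if s > tau else 0 for s in s_new]

S = [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.8],
     [0.3, 0.8, 1.0]]
y = [1, 1, 0]                               # default labels of training nodes
A_train = training_adjacency(S, y, tau=0.7)  # edge 1-2 dropped: labels differ
A_new = inference_edges([0.95, 0.4, 0.75], tau=0.7)
```

Note that the pair (1, 2) is similar enough ($S = 0.8 > \tau$) but is filtered out during training because the labels disagree, which is what keeps the training graph high-homophily.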

Theoretical Justification

The homophily-guided approach is theoretically grounded in:

  1. Label Propagation Theory: High-homophily graphs enable effective semi-supervised learning (Zhu et al., 2003)
  2. Smoothness Assumption: Connected nodes should have similar labels, which we enforce by construction
  3. Information Aggregation: Neighbors provide relevant context when they share similar characteristics and outcomes
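The label-propagation intuition behind point 1 can be demonstrated on a toy graph: clamp the two labelled endpoints of a path and repeatedly average neighbours, a simplified version of the harmonic-function propagation of Zhu et al. (2003).

```python
# Toy illustration of label propagation on a high-homophily graph:
# labelled nodes are clamped, unlabelled nodes repeatedly take the mean
# of their neighbours' scores (simplified Zhu et al.-style propagation).

def propagate(adj, labels, known, iters=50):
    """adj: neighbour lists; labels: initial scores; known: clamped indices."""
    f = list(labels)
    for _ in range(iters):
        f = [f[v] if v in known or not nbrs
             else sum(f[u] for u in nbrs) / len(nbrs)
             for v, nbrs in enumerate(adj)]
    return f

# Path graph 0 - 1 - 2 - 3; node 0 labelled 1 (default), node 3 labelled 0.
adj = [[1], [0, 2], [1, 3], [2]]
f = propagate(adj, [1.0, 0.0, 0.0, 0.0], known={0, 3})
```

The scores converge to a linear interpolation between the two labelled nodes (2/3 and 1/3 for the interior nodes): on a homophilous graph, proximity to labelled defaults smoothly raises the predicted risk.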

GNN Architecture Design

Graph Attention Networks (GAT)

We adopt Graph Attention Networks as the primary architecture due to their interpretability advantages:

Multi-Head Attention Mechanism:

\[\alpha_{ij}^{(k)} = \frac{\exp\left(\text{LeakyReLU}\left(\mathbf{a}^{(k)T}[\mathbf{W}^{(k)}\mathbf{h}_i \| \mathbf{W}^{(k)}\mathbf{h}_j]\right)\right)}{\sum_{l \in \mathcal{N}(i)} \exp\left(\text{LeakyReLU}\left(\mathbf{a}^{(k)T}[\mathbf{W}^{(k)}\mathbf{h}_i \| \mathbf{W}^{(k)}\mathbf{h}_l]\right)\right)}\]

Node Update:

\[\mathbf{h}_i' = \sigma\left(\frac{1}{K}\sum_{k=1}^{K}\sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(k)} \mathbf{W}^{(k)}\mathbf{h}_j\right)\]
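The two equations above can be traced numerically with a single attention head ($K = 1$) in NumPy. The weights $\mathbf{W}$ and $\mathbf{a}$ are random stand-ins here; in the actual model they are learned by gradient descent.

```python
import numpy as np

# Single-head GAT attention sketch: project node features with W, score
# each neighbour pair with the attention vector a over [Wh_i || Wh_j],
# softmax over the neighbourhood, then take the attention-weighted sum.

rng = np.random.default_rng(0)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_update(H, nbrs_i, i, W, a):
    """Return attention coefficients alpha_ij and the updated h_i (K = 1)."""
    Wh = H @ W.T                                    # project all nodes
    scores = np.array([
        leaky_relu(a @ np.concatenate([Wh[i], Wh[j]])) for j in nbrs_i
    ])
    alpha = np.exp(scores - scores.max())           # numerically stable softmax
    alpha /= alpha.sum()
    h_new = sum(al * Wh[j] for al, j in zip(alpha, nbrs_i))
    return alpha, h_new

H = rng.normal(size=(4, 5))        # 4 nodes, 5 input features
W = rng.normal(size=(3, 5))        # projection to 3 hidden dims
a = rng.normal(size=(6,))          # attention vector over [Wh_i || Wh_j]
alpha, h1 = gat_update(H, nbrs_i=[1, 2, 3], i=0, W=W, a=a)
```

The coefficients $\alpha_{ij}$ sum to one over $\mathcal{N}(i)$; these are exactly the per-neighbour weights later used in the interpretability analysis.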

Architecture Components

Input Embedding Layer

Projects heterogeneous features (continuous, categorical, temporal) into unified embedding space. Categorical features use learned embeddings; continuous features pass through linear transformation with batch normalization.

Graph Attention Layers

Two stacked GAT layers with 8 attention heads each. First layer expands representation; second layer aggregates neighbor information. Skip connections prevent over-smoothing.

Readout Layer

Final node representations pass through MLP classifier with dropout regularization. Sigmoid activation produces default probability.

Complete Architecture

Input: Feature matrix X (n x d), Adjacency matrix A (n x n)
       |
   [Embedding Layer]
   - Continuous: Linear(d_cont, 64) + BatchNorm + ReLU
   - Categorical: Embedding(vocab, 64) + Dropout(0.1)
   - Concatenation: (n x 128)
       |
   [GAT Layer 1]
   - Multi-head attention: 8 heads x 16 dims = 128 dims
   - Activation: ELU
   - Dropout: 0.5
       |
   [GAT Layer 2]
   - Multi-head attention: 8 heads x 16 dims = 128 dims
   - Skip connection from Layer 1
   - Activation: ELU
       |
   [Readout MLP]
   - Linear(128, 64) + ReLU + Dropout(0.3)
   - Linear(64, 1) + Sigmoid
       |
Output: Default probability p (n x 1)

Training Methodology

Loss Function

We employ weighted binary cross-entropy to handle class imbalance:

\[\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[w_1 \cdot y_i \log(\hat{y}_i) + w_0 \cdot (1-y_i)\log(1-\hat{y}_i)\right]\]

where $w_1 = \frac{N}{2 \cdot N_1}$ and $w_0 = \frac{N}{2 \cdot N_0}$ balance positive and negative classes.
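The class weights and the loss can be computed as follows; labels and predicted probabilities are toy values, and the clipping constant is a standard guard against $\log(0)$.

```python
import math

# Weighted binary cross-entropy with the class-balancing weights
# w1 = N / (2 * N1), w0 = N / (2 * N0) defined above. Toy data.

def class_weights(y):
    n, n1 = len(y), sum(y)
    n0 = n - n1
    return n / (2 * n1), n / (2 * n0)      # (w1 for defaults, w0 for non-defaults)

def weighted_bce(y, p, w1, w0, eps=1e-12):
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)    # clip to avoid log(0)
        total += w1 * yi * math.log(pi) + w0 * (1 - yi) * math.log(1 - pi)
    return -total / len(y)

y = [1, 0, 0, 0]                  # 25% default rate -> defaults up-weighted
w1, w0 = class_weights(y)
loss = weighted_bce(y, [0.8, 0.2, 0.1, 0.3], w1, w0)
```

With a 25% default rate, $w_1 = 2.0$ and $w_0 = 2/3$, so a misclassified default costs three times as much as a misclassified non-default.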

Optimization

| Parameter | Value | Rationale |
|----------------|------------|--------------------------|
| Optimizer | Adam | Adaptive learning rates |
| Learning rate | 0.001 | Standard for GAT |
| Weight decay | 5e-4 | L2 regularization |
| Dropout | 0.5 | Prevent overfitting |
| Batch size | Full graph | Transductive setting |
| Early stopping | 50 epochs | Validation AUC patience |

Hyperparameter Optimization

Key hyperparameters optimized via grid search with 5-fold cross-validation:

| Hyperparameter | Search Space | Optimal |
|---------------------------|----------------------|---------|
| Hidden dimensions | [32, 64, 128] | 64 |
| Attention heads | [4, 8, 16] | 8 |
| GAT layers | [1, 2, 3] | 2 |
| Similarity threshold | [0.5, 0.6, 0.7, 0.8] | 0.7 |
| Similarity metric weights | Uniform, Learned | Learned |
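Enumerating this grid is straightforward with the standard library; the sketch below lists the candidate configurations only (the scoring step, 5-fold CV AUC in the study, is omitted).

```python
from itertools import product

# Enumerate the hyperparameter grid from the table above. Each candidate
# would be scored by 5-fold cross-validated AUC; scoring is omitted here.

grid = {
    "hidden_dim": [32, 64, 128],
    "heads": [4, 8, 16],
    "layers": [1, 2, 3],
    "tau": [0.5, 0.6, 0.7, 0.8],
}

def grid_configs(grid):
    """Yield every combination in the grid as a config dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid_configs(grid))   # 3 * 3 * 3 * 4 = 108 candidates
```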

Experimental Results

Primary Performance Comparison

Comprehensive benchmarking of 12 methods (11 baselines plus our Homophily-GAT) across 5 datasets:

| Model Category | Method | Bondora AUC | LendingClub AUC | Avg Rank |
|----------------|-----------------------|-------------|-----------------|----------|
| Linear | Logistic Regression | 0.721 | 0.708 | 11.2 |
| Linear | Linear SVM | 0.718 | 0.705 | 11.8 |
| Tree | Decision Tree | 0.689 | 0.672 | 13.0 |
| Tree | Random Forest | 0.756 | 0.741 | 7.4 |
| Boosting | XGBoost | 0.771 | 0.756 | 5.2 |
| Boosting | LightGBM | 0.769 | 0.754 | 5.6 |
| Neural | MLP | 0.748 | 0.735 | 8.2 |
| Neural | TabNet | 0.778 | 0.762 | 4.4 |
| Graph | GCN | 0.782 | 0.768 | 4.0 |
| Graph | GAT | 0.791 | 0.774 | 3.2 |
| Graph | GraphSAGE | 0.788 | 0.771 | 3.4 |
| Graph | Homophily-GAT (Ours) | 0.812 | 0.798 | 1.4 |

Statistical Significance

Paired t-tests comparing Homophily-GAT to the strongest non-graph baseline (TabNet) across 5 datasets:

| Dataset | Homophily-GAT | TabNet | Difference | p-value |
|---------------|---------------|--------|------------|---------|
| Bondora | 0.812 | 0.778 | +0.034 | 0.003** |
| LendingClub | 0.798 | 0.762 | +0.036 | 0.002** |
| German Credit | 0.781 | 0.752 | +0.029 | 0.018* |
| Prosper | 0.803 | 0.771 | +0.032 | 0.008** |
| Home Credit | 0.809 | 0.775 | +0.034 | 0.004** |

(* p < 0.05, ** p < 0.01)

Ablation Study

Component-wise contribution analysis:

| Configuration | AUC | Delta |
|-----------------------------|-------|--------|
| Full Homophily-GAT | 0.812 | - |
| Without homophily filtering | 0.789 | -0.023 |
| Random graph construction | 0.776 | -0.036 |
| Single attention head | 0.798 | -0.014 |
| Single GAT layer | 0.794 | -0.018 |
| Without skip connections | 0.801 | -0.011 |

Interpretability Analysis

Attention Weight Interpretation

The attention mechanism provides interpretable credit decisions at multiple levels:

Feature-Level: Attention weights reveal which borrower characteristics the model prioritizes:

| Feature Category | Avg Attention Weight | Interpretation |
|--------------------|----------------------|------------------------|
| Payment History | 0.28 | Strongest predictor |
| Credit Utilization | 0.19 | Capacity indicator |
| Employment Tenure | 0.15 | Stability signal |
| Debt-to-Income | 0.14 | Affordability measure |
| Loan Amount | 0.12 | Risk exposure |
| Other | 0.12 | Combined minor factors |

Neighbor-Level: For each prediction, the model identifies which similar borrowers influenced the decision:

Example: Borrower #12345 (Predicted: High Risk, p=0.73)
Top Influential Neighbors:
  - Neighbor #8891: Similarity=0.89, Default=Yes, Attention=0.15
  - Neighbor #2234: Similarity=0.85, Default=Yes, Attention=0.12
  - Neighbor #5567: Similarity=0.82, Default=No,  Attention=0.08

Regulatory Compliance

The interpretability features support regulatory requirements:

  1. GDPR Right to Explanation: Attention weights provide human-readable decision factors
  2. Fair Lending Compliance: Protected attributes can be excluded while monitoring indirect effects
  3. Model Documentation: Architecture and training process fully documented for audit

Computational Considerations

Scalability Analysis

| Dataset (Size) | Nodes | Edges | Training Time | Memory |
|----------------------|-------|-------|---------------|--------|
| Small (German) | 1K | 50K | 2 min | 0.5 GB |
| Medium (Bondora) | 134K | 8M | 45 min | 8 GB |
| Large (LendingClub) | 2.26M | 150M | 6 hours | 64 GB |

Efficiency Optimizations

  • Mini-batch Training: GraphSAGE-style sampling for large graphs
  • Sparse Operations: Efficient sparse matrix representations
  • GPU Acceleration: CUDA-optimized attention computations
  • Approximate k-NN: Faiss library for similarity computation
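For small datasets, an exact brute-force k-NN serves as the reference the approximate Faiss search is checked against; a NumPy sketch:

```python
import numpy as np

# Exact brute-force k-NN for similarity-graph construction. The project
# uses Faiss for approximate search at scale; this O(n^2) version is the
# exact drop-in reference for small datasets.

def knn_edges(X, k):
    """Return, for each row of X, the indices of its k nearest rows (L2)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    np.fill_diagonal(d2, np.inf)                          # exclude self-loops
    return np.argsort(d2, axis=1)[:, :k]

# Two well-separated clusters of two borrowers each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
nbrs = knn_edges(X, k=1)    # each point's nearest neighbour is its cluster-mate
```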

Deliverables

| Deliverable | Status | Description |
|-----------------------|-----------|--------------------------------|
| Methodology paper | Completed | JMIS submission ready |
| Code implementation | Completed | PyTorch Geometric based |
| Benchmark experiments | Completed | 12 methods, 5 datasets |
| Visualization tools | Completed | Attention map visualizations |
| Documentation | Completed | API reference and tutorials |

References

  • Gilmer, J., et al. (2017). Neural message passing for quantum chemistry. ICML.
  • Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. ICLR.
  • Velickovic, P., et al. (2018). Graph attention networks. ICLR.
  • McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology.
  • Zhu, X., Ghahramani, Z., & Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. ICML.

  • Liu, Y., Osterrieder, J., et al. “Credit Risk Prediction via Graph Neural Networks with Homophily-Guided Graph Construction” (JMIS Submission)
  • Baals, L.J., Liu, Y., et al. “A Systematic Literature Review on Graph-Based Models in Credit Risk Assessment” (In Preparation)

Next Steps

Methodology validated and ready for WP3: Validation on real-world scenarios.


(c) Joerg Osterrieder 2025