Work Package 2: Graph-Based Methodology Development

Lead: Yiting Liu (University of Twente & BFH)
Duration: Months 3-9
Status: Completed


Research Context

Graph Neural Networks (GNNs) have emerged as a powerful paradigm for learning on structured data, achieving state-of-the-art results across diverse domains including social networks, molecular chemistry, and recommendation systems. This work package develops novel GNN methodologies specifically designed for credit risk assessment, addressing the unique challenges of constructing meaningful graphs from tabular financial data.

Theoretical Foundations

The application of GNNs to credit risk builds upon two foundational bodies of literature:

Message Passing Neural Networks: The theoretical framework of message passing (Gilmer et al., 2017) provides the basis for GNN architectures. Nodes iteratively update their representations by aggregating information from neighbors:

\[\mathbf{h}_v^{(k+1)} = \text{UPDATE}^{(k)}\left(\mathbf{h}_v^{(k)}, \text{AGGREGATE}^{(k)}\left(\{\mathbf{h}_u^{(k)} : u \in \mathcal{N}(v)\}\right)\right)\]
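The update/aggregate scheme above can be sketched in a few lines of plain Python. This is a toy illustration only: mean aggregation and a simple averaging update stand in for the learned AGGREGATE and UPDATE functions of a real GNN.

```python
# Toy message-passing round: AGGREGATE = elementwise mean over neighbours,
# UPDATE = average of a node's own state and the aggregated message.
# Real GNNs parameterise both functions and learn them by gradient descent.

def aggregate(neighbor_states):
    """Mean-aggregate neighbour representations, dimension by dimension."""
    dim = len(neighbor_states[0])
    return [sum(s[d] for s in neighbor_states) / len(neighbor_states)
            for d in range(dim)]

def update(h_v, message):
    """Combine a node's current state with the aggregated message."""
    return [(a + b) / 2 for a, b in zip(h_v, message)]

def message_passing_round(h, adj):
    """One synchronous round: every node v aggregates over N(v), then updates."""
    return {
        v: update(h_v, aggregate([h[u] for u in adj[v]])) if adj[v] else h_v
        for v, h_v in h.items()
    }

# Toy path graph 0 - 1 - 2 with 2-dimensional node states.
adj = {0: [1], 1: [0, 2], 2: [1]}
h = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
h1 = message_passing_round(h, adj)
```

After one round, each node's state has moved toward its neighbourhood average, which is exactly the smoothing behaviour the homophily assumption below relies on.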

Homophily in Networks: The concept of homophily, first formalized by Lazarsfeld and Merton (1954) and extensively studied by McPherson et al. (2001), posits that similarity breeds connection. In credit contexts, this manifests as borrowers with similar characteristics exhibiting correlated default behavior, a pattern we exploit for graph construction.


Objectives

  1. Develop novel graph neural network architectures for credit risk
  2. Design homophily-guided graph construction methodology
  3. Create interpretable credit scoring models suitable for regulatory requirements
  4. Benchmark comprehensively against traditional machine learning approaches

Literature Review: GNNs in Finance

Evolution of Graph-Based Methods

The application of graph methods to financial problems has evolved substantially:

| Era | Approach | Key Works | Limitations |
|-------|--------------------|----------------------------|----------------------------|
| 2000s | Network centrality | Battiston et al. (2007) | Manual feature engineering |
| 2010s | Graph kernels | Shervashidze et al. (2011) | Scalability issues |
| 2016+ | Spectral GNNs | Kipf & Welling (2017) | Fixed graph structure |
| 2018+ | Attention GNNs | Velickovic et al. (2018) | No graph construction |
| 2020+ | Dynamic GNNs | Pareja et al. (2020) | Computational cost |

Credit Risk Applications

Recent applications of GNNs to credit risk include:

  • Fraud Detection: Weber et al. (2019) applied GNNs to transaction networks for anti-money laundering
  • Corporate Credit: Cheng et al. (2020) used supply chain networks for SME credit assessment
  • P2P Lending: Ma et al. (2021) incorporated social networks in lending platforms

However, these approaches assume pre-existing network structures. Our methodology addresses the fundamental challenge of graph construction from tabular loan data.


Core Innovation: Homophily-Guided Graph Construction

The Graph Construction Problem

Traditional GNN applications benefit from natural graph structures (social networks, molecular bonds). Credit risk data, however, consists primarily of tabular features without inherent relational structure. This creates a fundamental methodological challenge:

How do we construct meaningful graphs from tabular loan data?

Homophily-Guided Approach

Our methodology exploits the homophily principle: borrowers with similar default behavior tend to share observable characteristics. The construction process:

Step 1: Feature Similarity Computation

For each pair of borrowers $(i, j)$, compute multi-metric similarity:

\[S_{ij} = \omega_1 \cdot \text{cos}(\mathbf{x}_i, \mathbf{x}_j) + \omega_2 \cdot \text{sim}_{\text{euc}}(\mathbf{x}_i, \mathbf{x}_j) + \omega_3 \cdot \text{jac}(\mathbf{x}_i^{cat}, \mathbf{x}_j^{cat})\]

where $\text{sim}_{\text{euc}}$ is a Euclidean-distance-based similarity (e.g. $\exp(-\|\mathbf{x}_i - \mathbf{x}_j\|_2)$, so that all three terms increase with similarity) and the weights $\omega_k$ are learnable or cross-validated.
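A toy version of $S_{ij}$ follows. The weights $(0.4, 0.3, 0.3)$ and the $\exp(-d)$ distance-to-similarity transform are illustrative stand-ins, not the cross-validated values from the study.

```python
import math

# Sketch of the multi-metric similarity S_ij: cosine similarity on
# continuous features, exp(-Euclidean distance) as a similarity in (0, 1],
# and Jaccard overlap on categorical feature sets. Weights are illustrative.

def cosine(x, y):
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    return num / den if den else 0.0

def euclidean_sim(x, y):
    # Map the Euclidean distance into (0, 1] so larger means more similar.
    return math.exp(-math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y))))

def jaccard(cat_x, cat_y):
    union = cat_x | cat_y
    return len(cat_x & cat_y) / len(union) if union else 0.0

def similarity(x_i, x_j, cat_i, cat_j, w=(0.4, 0.3, 0.3)):
    return (w[0] * cosine(x_i, x_j)
            + w[1] * euclidean_sim(x_i, x_j)
            + w[2] * jaccard(cat_i, cat_j))

# Two borrowers with identical continuous features, partial categorical overlap.
s = similarity([0.8, 0.2], [0.8, 0.2], {"employed", "homeowner"}, {"employed"})
```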

Step 2: Homophily-Guided Edge Filtering

During training, edges are retained based on label consistency:

\[A_{ij} = \mathbb{1}[S_{ij} > \tau] \cdot \mathbb{1}[y_i = y_j]\]

This ensures the constructed graph exhibits high homophily, enabling effective message passing.

Step 3: Inference Graph Construction

For prediction on new borrowers, edges connect to similar training nodes:

\[A_{i,\text{new}} = \mathbb{1}[S_{i,\text{new}} > \tau]\]

The model aggregates information from similar historical borrowers with known outcomes.
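Steps 2 and 3 can be sketched directly from the two indicator formulas above. The similarity matrix, labels, and threshold here are toy values; in practice $S$ comes from Step 1 and $\tau$ from cross-validation.

```python
# Sketch of homophily-guided edge filtering (training) and similarity-only
# edge construction (inference). S is a precomputed similarity matrix,
# y the default labels, tau the similarity threshold -- all toy values.

def training_adjacency(S, y, tau):
    """Training edges: S_ij > tau AND matching labels (homophily filter)."""
    n = len(S)
    return [[1 if i != j and S[i][j] > tau and y[i] == y[j] else 0
             for j in range(n)] for i in range(n)]

def inference_edges(s_new, tau):
    """Inference edges: connect a new borrower to every training node
    whose similarity exceeds tau (labels of new borrowers are unknown)."""
    return [1 if s > tau else 0 for s in s_new]

S = [[1.0, 0.9, 0.3],
     [0.9, 1.0, 0.8],
     [0.3, 0.8, 1.0]]
y = [1, 1, 0]                               # default labels of training nodes
A_train = training_adjacency(S, y, tau=0.7)  # edge 1-2 dropped: labels differ
A_new = inference_edges([0.95, 0.4, 0.75], tau=0.7)
```

Note that the pair (1, 2) is similar enough ($S = 0.8 > \tau$) but is filtered out during training because the labels disagree, which is what keeps the training graph high-homophily.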

Theoretical Justification

The homophily-guided approach is theoretically grounded in:

  1. Label Propagation Theory: High-homophily graphs enable effective semi-supervised learning (Zhu et al., 2003)
  2. Smoothness Assumption: Connected nodes should have similar labels, which we enforce by construction
  3. Information Aggregation: Neighbors provide relevant context when they share similar characteristics and outcomes
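The label-propagation intuition behind point 1 can be demonstrated on a toy graph: clamp the two labelled endpoints of a path and repeatedly average neighbours, a simplified version of the harmonic-function propagation of Zhu et al. (2003).

```python
# Toy illustration of label propagation on a high-homophily graph:
# labelled nodes are clamped, unlabelled nodes repeatedly take the mean
# of their neighbours' scores (simplified Zhu et al.-style propagation).

def propagate(adj, labels, known, iters=50):
    """adj: neighbour lists; labels: initial scores; known: clamped indices."""
    f = list(labels)
    for _ in range(iters):
        f = [f[v] if v in known or not nbrs
             else sum(f[u] for u in nbrs) / len(nbrs)
             for v, nbrs in enumerate(adj)]
    return f

# Path graph 0 - 1 - 2 - 3; node 0 labelled 1 (default), node 3 labelled 0.
adj = [[1], [0, 2], [1, 3], [2]]
f = propagate(adj, [1.0, 0.0, 0.0, 0.0], known={0, 3})
```

The scores converge to a linear interpolation between the two labelled nodes (2/3 and 1/3 for the interior nodes): on a homophilous graph, proximity to labelled defaults smoothly raises the predicted risk.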

GNN Architecture Design

Graph Attention Networks (GAT)

We adopt Graph Attention Networks as the primary architecture due to their interpretability advantages:

Multi-Head Attention Mechanism:

\[\alpha_{ij}^{(k)} = \frac{\exp\left(\text{LeakyReLU}\left(\mathbf{a}^{(k)T}[\mathbf{W}^{(k)}\mathbf{h}_i \| \mathbf{W}^{(k)}\mathbf{h}_j]\right)\right)}{\sum_{l \in \mathcal{N}(i)} \exp\left(\text{LeakyReLU}\left(\mathbf{a}^{(k)T}[\mathbf{W}^{(k)}\mathbf{h}_i \| \mathbf{W}^{(k)}\mathbf{h}_l]\right)\right)}\]

Node Update:

\[\mathbf{h}_i' = \sigma\left(\frac{1}{K}\sum_{k=1}^{K}\sum_{j \in \mathcal{N}(i)} \alpha_{ij}^{(k)} \mathbf{W}^{(k)}\mathbf{h}_j\right)\]
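The two equations above can be traced numerically with a single attention head ($K = 1$) in NumPy. The weights $\mathbf{W}$ and $\mathbf{a}$ are random stand-ins here; in the actual model they are learned by gradient descent.

```python
import numpy as np

# Single-head GAT attention sketch: project node features with W, score
# each neighbour pair with the attention vector a over [Wh_i || Wh_j],
# softmax over the neighbourhood, then take the attention-weighted sum.

rng = np.random.default_rng(0)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def gat_update(H, nbrs_i, i, W, a):
    """Return attention coefficients alpha_ij and the updated h_i (K = 1)."""
    Wh = H @ W.T                                    # project all nodes
    scores = np.array([
        leaky_relu(a @ np.concatenate([Wh[i], Wh[j]])) for j in nbrs_i
    ])
    alpha = np.exp(scores - scores.max())           # numerically stable softmax
    alpha /= alpha.sum()
    h_new = sum(al * Wh[j] for al, j in zip(alpha, nbrs_i))
    return alpha, h_new

H = rng.normal(size=(4, 5))        # 4 nodes, 5 input features
W = rng.normal(size=(3, 5))        # projection to 3 hidden dims
a = rng.normal(size=(6,))          # attention vector over [Wh_i || Wh_j]
alpha, h1 = gat_update(H, nbrs_i=[1, 2, 3], i=0, W=W, a=a)
```

The coefficients $\alpha_{ij}$ sum to one over $\mathcal{N}(i)$; these are exactly the per-neighbour weights later used in the interpretability analysis.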

Architecture Components

Input Embedding Layer

Projects heterogeneous features (continuous, categorical, temporal) into unified embedding space. Categorical features use learned embeddings; continuous features pass through linear transformation with batch normalization.

Graph Attention Layers

Two stacked GAT layers with 8 attention heads each. First layer expands representation; second layer aggregates neighbor information. Skip connections prevent over-smoothing.

Readout Layer

Final node representations pass through MLP classifier with dropout regularization. Sigmoid activation produces default probability.

Complete Architecture

Input: Feature matrix X (n x d), Adjacency matrix A (n x n)
       |
   [Embedding Layer]
   - Continuous: Linear(d_cont, 64) + BatchNorm + ReLU
   - Categorical: Embedding(vocab, 64) + Dropout(0.1)
   - Concatenation: (n x 128)
       |
   [GAT Layer 1]
   - Multi-head attention: 8 heads x 16 dims = 128 dims
   - Activation: ELU
   - Dropout: 0.5
       |
   [GAT Layer 2]
   - Multi-head attention: 8 heads x 16 dims = 128 dims
   - Skip connection from Layer 1
   - Activation: ELU
       |
   [Readout MLP]
   - Linear(128, 64) + ReLU + Dropout(0.3)
   - Linear(64, 1) + Sigmoid
       |
Output: Default probability p (n x 1)

Training Methodology

Loss Function

We employ weighted binary cross-entropy to handle class imbalance:

\[\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\left[w_1 \cdot y_i \log(\hat{y}_i) + w_0 \cdot (1-y_i)\log(1-\hat{y}_i)\right]\]

where $w_1 = \frac{N}{2 \cdot N_1}$ and $w_0 = \frac{N}{2 \cdot N_0}$ balance positive and negative classes.
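The class weights and the loss can be computed as follows; labels and predicted probabilities are toy values, and the clipping constant is a standard guard against $\log(0)$.

```python
import math

# Weighted binary cross-entropy with the class-balancing weights
# w1 = N / (2 * N1), w0 = N / (2 * N0) defined above. Toy data.

def class_weights(y):
    n, n1 = len(y), sum(y)
    n0 = n - n1
    return n / (2 * n1), n / (2 * n0)      # (w1 for defaults, w0 for non-defaults)

def weighted_bce(y, p, w1, w0, eps=1e-12):
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1 - eps)    # clip to avoid log(0)
        total += w1 * yi * math.log(pi) + w0 * (1 - yi) * math.log(1 - pi)
    return -total / len(y)

y = [1, 0, 0, 0]                  # 25% default rate -> defaults up-weighted
w1, w0 = class_weights(y)
loss = weighted_bce(y, [0.8, 0.2, 0.1, 0.3], w1, w0)
```

With a 25% default rate, $w_1 = 2.0$ and $w_0 = 2/3$, so a misclassified default costs three times as much as a misclassified non-default.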

Optimization

| Parameter | Value | Rationale |
|----------------|------------|--------------------------|
| Optimizer | Adam | Adaptive learning rates |
| Learning rate | 0.001 | Standard for GAT |
| Weight decay | 5e-4 | L2 regularization |
| Dropout | 0.5 | Prevent overfitting |
| Batch size | Full graph | Transductive setting |
| Early stopping | 50 epochs | Validation AUC patience |

Hyperparameter Optimization

Key hyperparameters optimized via grid search with 5-fold cross-validation:

| Hyperparameter | Search Space | Optimal |
|---------------------------|----------------------|---------|
| Hidden dimensions | [32, 64, 128] | 64 |
| Attention heads | [4, 8, 16] | 8 |
| GAT layers | [1, 2, 3] | 2 |
| Similarity threshold | [0.5, 0.6, 0.7, 0.8] | 0.7 |
| Similarity metric weights | Uniform, Learned | Learned |
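Enumerating this grid is straightforward with the standard library; the sketch below lists the candidate configurations only (the scoring step, 5-fold CV AUC in the study, is omitted).

```python
from itertools import product

# Enumerate the hyperparameter grid from the table above. Each candidate
# would be scored by 5-fold cross-validated AUC; scoring is omitted here.

grid = {
    "hidden_dim": [32, 64, 128],
    "heads": [4, 8, 16],
    "layers": [1, 2, 3],
    "tau": [0.5, 0.6, 0.7, 0.8],
}

def grid_configs(grid):
    """Yield every combination in the grid as a config dict."""
    keys = list(grid)
    for values in product(*(grid[k] for k in keys)):
        yield dict(zip(keys, values))

configs = list(grid_configs(grid))   # 3 * 3 * 3 * 4 = 108 candidates
```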

Experimental Results

Primary Performance Comparison

Comprehensive benchmarking of 12 methods (11 baselines plus our Homophily-GAT) across 5 datasets:

| Model Category | Method | Bondora AUC | LendingClub AUC | Avg Rank |
|----------------|-----------------------|-------------|-----------------|----------|
| Linear | Logistic Regression | 0.721 | 0.708 | 11.2 |
| Linear | Linear SVM | 0.718 | 0.705 | 11.8 |
| Tree | Decision Tree | 0.689 | 0.672 | 13.0 |
| Tree | Random Forest | 0.756 | 0.741 | 7.4 |
| Boosting | XGBoost | 0.771 | 0.756 | 5.2 |
| Boosting | LightGBM | 0.769 | 0.754 | 5.6 |
| Neural | MLP | 0.748 | 0.735 | 8.2 |
| Neural | TabNet | 0.778 | 0.762 | 4.4 |
| Graph | GCN | 0.782 | 0.768 | 4.0 |
| Graph | GAT | 0.791 | 0.774 | 3.2 |
| Graph | GraphSAGE | 0.788 | 0.771 | 3.4 |
| Graph | Homophily-GAT (Ours) | 0.812 | 0.798 | 1.4 |

Statistical Significance

Paired t-tests comparing Homophily-GAT to the strongest non-graph baseline (TabNet) across 5 datasets:

| Dataset | Homophily-GAT | TabNet | Difference | p-value |
|---------------|---------------|--------|------------|---------|
| Bondora | 0.812 | 0.778 | +0.034 | 0.003** |
| LendingClub | 0.798 | 0.762 | +0.036 | 0.002** |
| German Credit | 0.781 | 0.752 | +0.029 | 0.018* |
| Prosper | 0.803 | 0.771 | +0.032 | 0.008** |
| Home Credit | 0.809 | 0.775 | +0.034 | 0.004** |

(* p < 0.05, ** p < 0.01)

Ablation Study

Component-wise contribution analysis:

| Configuration | AUC | Delta |
|-----------------------------|-------|--------|
| Full Homophily-GAT | 0.812 | - |
| Without homophily filtering | 0.789 | -0.023 |
| Random graph construction | 0.776 | -0.036 |
| Single attention head | 0.798 | -0.014 |
| Single GAT layer | 0.794 | -0.018 |
| Without skip connections | 0.801 | -0.011 |

Interpretability Analysis

Attention Weight Interpretation

The attention mechanism provides interpretable credit decisions at multiple levels:

Feature-Level: Attention weights reveal which borrower characteristics the model prioritizes:

| Feature Category | Avg Attention Weight | Interpretation |
|--------------------|----------------------|------------------------|
| Payment History | 0.28 | Strongest predictor |
| Credit Utilization | 0.19 | Capacity indicator |
| Employment Tenure | 0.15 | Stability signal |
| Debt-to-Income | 0.14 | Affordability measure |
| Loan Amount | 0.12 | Risk exposure |
| Other | 0.12 | Combined minor factors |

Neighbor-Level: For each prediction, the model identifies which similar borrowers influenced the decision:

Example: Borrower #12345 (Predicted: High Risk, p=0.73)
Top Influential Neighbors:
  - Neighbor #8891: Similarity=0.89, Default=Yes, Attention=0.15
  - Neighbor #2234: Similarity=0.85, Default=Yes, Attention=0.12
  - Neighbor #5567: Similarity=0.82, Default=No,  Attention=0.08

Regulatory Compliance

The interpretability features support regulatory requirements:

  1. GDPR Right to Explanation: Attention weights provide human-readable decision factors
  2. Fair Lending Compliance: Protected attributes can be excluded while monitoring indirect effects
  3. Model Documentation: Architecture and training process fully documented for audit

Computational Considerations

Scalability Analysis

| Dataset (Size) | Nodes | Edges | Training Time | Memory |
|----------------------|-------|-------|---------------|--------|
| Small (German) | 1K | 50K | 2 min | 0.5 GB |
| Medium (Bondora) | 134K | 8M | 45 min | 8 GB |
| Large (LendingClub) | 2.26M | 150M | 6 hours | 64 GB |

Efficiency Optimizations

  • Mini-batch Training: GraphSAGE-style sampling for large graphs
  • Sparse Operations: Efficient sparse matrix representations
  • GPU Acceleration: CUDA-optimized attention computations
  • Approximate k-NN: Faiss library for similarity computation
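For small datasets, an exact brute-force k-NN serves as the reference the approximate Faiss search is checked against; a NumPy sketch:

```python
import numpy as np

# Exact brute-force k-NN for similarity-graph construction. The project
# uses Faiss for approximate search at scale; this O(n^2) version is the
# exact drop-in reference for small datasets.

def knn_edges(X, k):
    """Return, for each row of X, the indices of its k nearest rows (L2)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # pairwise sq. distances
    np.fill_diagonal(d2, np.inf)                          # exclude self-loops
    return np.argsort(d2, axis=1)[:, :k]

# Two well-separated clusters of two borrowers each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])
nbrs = knn_edges(X, k=1)    # each point's nearest neighbour is its cluster-mate
```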

Deliverables

| Deliverable | Status | Description |
|-----------------------|-----------|--------------------------------|
| Methodology paper | Completed | JMIS submission ready |
| Code implementation | Completed | PyTorch Geometric based |
| Benchmark experiments | Completed | 12 methods, 5 datasets |
| Visualization tools | Completed | Attention map visualizations |
| Documentation | Completed | API reference and tutorials |

References

  • Gilmer, J., et al. (2017). Neural message passing for quantum chemistry. ICML.
  • Kipf, T. N., & Welling, M. (2017). Semi-supervised classification with graph convolutional networks. ICLR.
  • Velickovic, P., et al. (2018). Graph attention networks. ICLR.
  • McPherson, M., Smith-Lovin, L., & Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology.
  • Zhu, X., Ghahramani, Z., & Lafferty, J. D. (2003). Semi-supervised learning using Gaussian fields and harmonic functions. ICML.

  • Liu, Y., Osterrieder, J., et al. “Credit Risk Prediction via Graph Neural Networks with Homophily-Guided Graph Construction” (JMIS Submission)
  • Baals, L.J., Liu, Y., et al. “A Systematic Literature Review on Graph-Based Models in Credit Risk Assessment” (In Preparation)

Next Steps

Methodology validated and ready for WP3: Validation on real-world scenarios.


(c) Joerg Osterrieder 2025