AI Orchestrator

A Specialized and Secure AI Orchestrator for Swiss Financial Compliance


WP2: Domain Adaptation & Hallucination Reduction



Overview

| Attribute | Value |
| --- | --- |
| Duration | M1-M12 |
| FHGR Hours | 800h |
| Wecan Hours | 300h |
| Total Hours | 1,100h |
| Lead | FHGR Research Lead |

Objectives

  1. Investigate and validate model training approaches for Swiss compliance documents
  2. Develop efficient fine-tuning methods (LoRA, adapters) for domain adaptation
  3. Create robust hallucination detection and mitigation mechanisms
  4. Build an annotated dataset of 300+ compliance documents
  5. Achieve a 40% reduction in hallucination rate vs. baseline

Technical Approach

Model Training Strategy

Base Model (7-13B)          Domain Adaptation
     |                            |
     v                            v
+----------+              +---------------+
| Llama 3  |              | LoRA Adapters |
| Mistral  |    +------>  | QLoRA         |
| Qwen     |              | Full Fine-tune|
+----------+              +---------------+
     |                            |
     +------------+---------------+
                  |
                  v
         Domain-Adapted Model
         (Swiss Compliance)
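To illustrate why parameter-efficient methods such as LoRA are favoured over full fine-tuning in the diagram above, the sketch below compares trainable-parameter counts for a single weight matrix. The layer size and rank are illustrative assumptions, not project settings:

```python
def lora_param_counts(d_in: int, d_out: int, rank: int) -> tuple[int, int]:
    """Trainable-parameter counts for full fine-tuning vs. a LoRA adapter
    on one d_in x d_out weight matrix (bias terms ignored)."""
    full = d_in * d_out           # full fine-tune: every weight is trainable
    lora = rank * (d_in + d_out)  # LoRA: low-rank factors A (r x d_in) and B (d_out x r)
    return full, lora

# Example: a 4096x4096 attention projection at rank 16
full, lora = lora_param_counts(4096, 4096, 16)
print(full, lora, lora / full)  # LoRA trains well under 1% of the weights
```

QLoRA applies the same low-rank update on top of a quantized base model, shrinking the memory footprint further at little cost in quality.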

Hallucination Detection

| Method | Description | Target |
| --- | --- | --- |
| Source Span Verification | Check extracted values against source text | 90% coverage |
| Cross-Reference Validation | Validate against multiple document sections | 85% accuracy |
| Confidence Scoring | Uncertainty quantification per field | Calibrated scores |
| Human-in-the-Loop | Flag low-confidence extractions | <5% manual review |
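A minimal sketch of the first and last rows of the table, combining a deliberately strict source-span check with a human-in-the-loop confidence threshold. Field names, the sample text, and the 0.85 threshold are illustrative assumptions:

```python
def verify_spans(extracted: dict[str, str], source_text: str) -> dict[str, bool]:
    """Source span verification: an extracted value counts as grounded only
    if it appears verbatim in the source document (a strict, simple check)."""
    return {field: value in source_text for field, value in extracted.items()}

def needs_review(confidences: dict[str, float], threshold: float = 0.85) -> list[str]:
    """Human-in-the-loop: flag fields whose confidence falls below the threshold."""
    return [f for f, c in confidences.items() if c < threshold]

source = "The client, Example AG, reported CHF 2.5 million in assets."
extracted = {"company": "Example AG", "amount": "CHF 3.1 million"}
print(verify_spans(extracted, source))  # the amount is absent from the source: likely hallucinated
```

A production verifier would normalise whitespace, numbers, and units before matching; the verbatim check above trades recall for simplicity.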

Activities

M1-M3: Foundation

| Activity | Owner | Output |
| --- | --- | --- |
| Create annotation guidelines v0.1 | FHGR | Guidelines document |
| Set up document collection structure | FHGR | Repository structure |
| Define quality criteria | FHGR | Quality checklist |
| Identify 100 candidate documents | Wecan | Document list |
| Begin anonymization | Wecan | Redacted documents |
| Train annotation team | FHGR | Trained annotators |
| Begin baseline evaluation of 5 LLMs | FHGR | Evaluation framework |

M4-M6: Model Evaluation

| Activity | Owner | Output |
| --- | --- | --- |
| LoRA/adapter training experiments | FHGR | Training results |
| Integrate domain vocabulary | FHGR | Custom tokenizer |
| Complete model evaluation on 300 docs | FHGR | Evaluation report |
| Document selected training approach | FHGR | D2.1 |
| Prepare training comparison report | FHGR | Comparison matrix |

M7-M12: Refinement

| Activity | Owner | Output |
| --- | --- | --- |
| Refine domain adaptation | FHGR | Improved models |
| Complete hallucination detection | FHGR | D2.2 |
| Finalize annotated dataset | FHGR | D2.3 (300 docs) |
| Deploy hybrid prototype | FHGR | D2.4 |

Deliverables

| ID | Deliverable | Due | Owner | Status |
| --- | --- | --- | --- | --- |
| D2.1 | Training approach documentation | M6 | FHGR | Complete |
| D2.2 | Hallucination detection methods | M6 | FHGR | Complete |
| D2.3 | Annotated dataset (300 docs) | M12 | Wecan | Complete |
| D2.4 | Hybrid deployment prototype | M12 | Wecan | Complete |

All deliverable templates are complete; see deliverables/ for details.


Dataset Specification

Document Types

| Type | Count | Languages |
| --- | --- | --- |
| KYC Forms | 100 | DE, FR, EN |
| Regulatory Filings | 80 | DE, FR, IT |
| Compliance Questionnaires | 70 | DE, FR, EN |
| Annual Reports | 50 | DE, FR, IT, EN |
| **Total** | **300** | All |

Annotation Schema

| Field Type | Examples | Annotation Method |
| --- | --- | --- |
| Entity | Company name, person, address | BIO tagging |
| Numeric | Amounts, percentages, dates | Value + unit |
| Boolean | Yes/No fields, checkboxes | Binary |
| Table | Financial data, lists | Cell-level |
| Relationship | Entity connections | Relation labels |
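For the entity row, BIO tagging marks each token as Beginning, Inside, or Outside an entity span. A minimal sketch (tokenization and labels are illustrative, not the project's actual schema):

```python
def bio_tags(tokens: list[str], entity_spans: list[tuple[int, int, str]]) -> list[str]:
    """Convert (start, end, label) token spans (end exclusive) into BIO tags."""
    tags = ["O"] * len(tokens)  # default: Outside any entity
    for start, end, label in entity_spans:
        tags[start] = f"B-{label}"            # Beginning of the span
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"            # Inside the span
    return tags

tokens = ["Example", "AG", ",", "Zurich"]
spans = [(0, 2, "ORG"), (3, 4, "LOC")]
print(bio_tags(tokens, spans))  # ['B-ORG', 'I-ORG', 'O', 'B-LOC']
```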

Quality Metrics

| Metric | Target | Measurement |
| --- | --- | --- |
| Inter-annotator agreement | kappa > 0.8 | Cohen’s kappa |
| Annotation coverage | >95% of fields | Field completion rate |
| Error rate | <2% | Post-review corrections |
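Cohen's kappa corrects observed agreement for the agreement two annotators would reach by chance. A self-contained sketch of the standard formula (the labels below are illustrative):

```python
def cohens_kappa(a: list[str], b: list[str]) -> float:
    """Cohen's kappa for two annotators' labels over the same items:
    kappa = (p_observed - p_chance) / (1 - p_chance)."""
    assert len(a) == len(b) and len(a) > 0
    n = len(a)
    labels = set(a) | set(b)
    p_observed = sum(x == y for x, y in zip(a, b)) / n
    p_chance = sum((a.count(lab) / n) * (b.count(lab) / n) for lab in labels)
    return (p_observed - p_chance) / (1 - p_chance)

# Perfect agreement yields kappa = 1.0
print(cohens_kappa(["yes", "no", "yes"], ["yes", "no", "yes"]))  # 1.0
```

The project's >0.8 target demands near-perfect agreement; annotation rounds falling below it would trigger guideline revisions and re-annotation.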

Model Candidates

| Model | Parameters | License | Status |
| --- | --- | --- | --- |
| Llama 3 | 8B, 70B | Meta | Evaluate |
| Mistral | 7B | Apache 2.0 | Evaluate |
| Qwen 2 | 7B, 14B | Apache 2.0 | Evaluate |
| Gemma | 7B | Google | Evaluate |
| Phi-3 | 3.8B | Microsoft | Evaluate |

Evaluation Criteria

| Criterion | Weight | Measurement |
| --- | --- | --- |
| Extraction accuracy | 40% | Field-level F1 |
| Hallucination rate | 25% | Fabrication detection |
| Inference speed | 15% | Pages/minute |
| Memory footprint | 10% | Peak GPU RAM |
| License compatibility | 10% | Commercial use OK |
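The weights above can be combined into a single composite score per candidate model. The per-criterion scores below are made-up placeholders, not evaluation results; each is normalised to [0, 1] with higher being better:

```python
# Weights taken from the evaluation criteria table (sum to 1.0).
WEIGHTS = {
    "extraction_accuracy": 0.40,
    "hallucination_rate": 0.25,   # score = 1 - rate, so a lower rate scores higher
    "inference_speed": 0.15,
    "memory_footprint": 0.10,
    "license_compatibility": 0.10,
}

def composite_score(scores: dict[str, float]) -> float:
    """Weighted sum of normalised criterion scores for one candidate model."""
    return sum(WEIGHTS[c] * scores[c] for c in WEIGHTS)

# Hypothetical candidate (illustrative numbers only)
candidate = {
    "extraction_accuracy": 0.82,
    "hallucination_rate": 0.90,
    "inference_speed": 0.70,
    "memory_footprint": 0.60,
    "license_compatibility": 1.00,
}
print(round(composite_score(candidate), 3))
```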

Objective Alignment

| Objective | WP2 Contribution |
| --- | --- |
| OBJ3: 40% Hallucination Reduction | Primary owner |
| OBJ7: On-Premise Model | Model selection and optimization |
| OBJ8: 500 Multilingual Documents | Dataset creation |


Milestone Checkpoints

- MS1 (M4)
- MS2 (M6)
- MS3 (M12)


Dependencies

Inputs

| From | Input | Required By |
| --- | --- | --- |
| Wecan | Raw compliance documents | M1 |
| Wecan | Document anonymization | M2 |
| Wecan | Domain expert feedback | M3-M12 |

Outputs

| To | Output | Available |
| --- | --- | --- |
| WP3 | Domain-adapted models | M6 |
| WP4 | Extraction capabilities | M12 |
| WP5 | Pre-trained components | M12 |
