AI Orchestrator

A Specialized and Secure AI Orchestrator for Swiss Financial Compliance


Project Inventory

Innosuisse Innovation Project 133.672 IP-SBM: A Specialized and Secure AI Orchestrator for Swiss Financial Compliance


1. Project Summary

| Attribute | Value |
| --- | --- |
| Application | 133.672 IP-SBM |
| Duration | 24 months (January 2026 - December 2027) |
| Total Budget | CHF 591,240 |
| Total Hours | 5,800 hours |
| Innosuisse Contribution | CHF 327,336 (55%) |
| Partner Contribution | CHF 263,904 (45%) |
| Leveraging Ratio | 1:1.25 |
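
The contribution shares in the summary can be cross-checked with a few lines of arithmetic. The sketch below uses only the figures from the table; the function name `share_percent` is illustrative and not part of the project. It does not attempt to reproduce the leveraging ratio, since the source does not state how that ratio is defined.

```python
# Figures copied from the Project Summary table (CHF).
TOTAL_BUDGET = 591_240
INNOSUISSE_CONTRIBUTION = 327_336
PARTNER_CONTRIBUTION = 263_904


def share_percent(part: int, total: int) -> float:
    """Return part/total as a percentage, rounded to one decimal place."""
    return round(100 * part / total, 1)


# The two contributions should add up to the total budget exactly.
contributions_match = INNOSUISSE_CONTRIBUTION + PARTNER_CONTRIBUTION == TOTAL_BUDGET

innosuisse_share = share_percent(INNOSUISSE_CONTRIBUTION, TOTAL_BUDGET)
partner_share = share_percent(PARTNER_CONTRIBUTION, TOTAL_BUDGET)

print(contributions_match)  # True
print(innosuisse_share)     # 55.4
print(partner_share)        # 44.6
```

The unrounded shares (55.4% and 44.6%) agree with the 55% and 45% stated in the summary.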

2. Partners

2.1 FHGR - Fachhochschule Graubuenden (Research Partner)

| Attribute | Value |
| --- | --- |
| Role | Research lead |
| Hours | 3,500 (60%) |
| Budget Contribution | CHF 350,000 |
| Key Expertise | NLP, Document AI, Applied ML |
| Responsibilities | Domain adaptation, hallucination reduction, scientific validation |
| Deliverables | D1.1-D1.6, D2.1-D2.2, D3.1, D3.3, D4.2-D4.3, D5.2-D5.3 (14 total) |

2.2 WeCanGroup SA (Implementation Partner)

| Attribute | Value |
| --- | --- |
| Role | Implementation lead |
| Hours | 2,300 (40%) |
| Budget Contribution | CHF 241,240 |
| Key Expertise | Swiss compliance, CRM integration, Financial services |
| Responsibilities | Platform integration, pilot deployments, commercialization |
| Deliverables | D2.3-D2.4, D3.2, D4.1, D5.1 (5 total) |

3. Quantifiable Objectives (OBJ1-OBJ8)

3.1 Scientific Objectives

| ID | Objective | Target | Validation | Milestone |
| --- | --- | --- | --- | --- |
| OBJ1 | Document Accuracy | 90% on 50-100 page docs | Blind assessment on held-out test set | MS5 (M20) |
| OBJ2 | Zero-Shot Schema Mapping | F1 > 85% on 50 enterprise schemas | No schema-specific training | MS4 (M16) |
| OBJ3 | Hallucination Reduction | 40% reduction vs baseline | Baseline: 23.5% total hallucination rate | MS3 (M12) |
| OBJ8 | Multilingual Validation | 500 documents (DE/FR/IT/EN) | Distribution: 40% DE, 30% FR, 15% IT, 15% EN | MS3 (M12) |
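
The OBJ3 and OBJ8 targets translate into concrete numbers. The following sketch assumes that the 40% figure is a relative reduction against the 23.5% baseline (the source does not say whether it is relative or in percentage points) and derives the per-language document counts from the stated distribution. All variable names are illustrative.

```python
# OBJ3: baseline total hallucination rate, in percent.
BASELINE_RATE_PERCENT = 23.5
# Assumption: the 40% target is a relative reduction of the baseline rate.
RELATIVE_REDUCTION = 0.40

target_rate_percent = round(BASELINE_RATE_PERCENT * (1 - RELATIVE_REDUCTION), 1)

# OBJ8: 500 documents split by the stated language distribution.
CORPUS_SIZE = 500
LANGUAGE_SHARE_PERCENT = {"DE": 40, "FR": 30, "IT": 15, "EN": 15}

documents_per_language = {
    lang: CORPUS_SIZE * pct // 100 for lang, pct in LANGUAGE_SHARE_PERCENT.items()
}

print(target_rate_percent)      # 14.1
print(documents_per_language)   # {'DE': 200, 'FR': 150, 'IT': 75, 'EN': 75}
```

Under that reading, OBJ3 is met when the measured hallucination rate is at or below 14.1%.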

3.2 Economic Objectives

| ID | Objective | Target | Validation | Milestone |
| --- | --- | --- | --- | --- |
| OBJ4 | Processing Time | < 2 hours per 100-page doc | Baseline: 2-3 weeks manual | MS5 (M20) |
| OBJ5 | Institution Deployments | 3-5 Swiss financial institutions | Production use, min 1 month, 50+ docs | MS5 (M20) |

3.3 Technological Objectives

| ID | Objective | Target | Validation | Milestone |
| --- | --- | --- | --- | --- |
| OBJ6 | TRL Advancement | TRL 3 to TRL 5-6 | Evidence documentation | MS5 (M20) |
| OBJ7 | On-Premise Deployment | 7-13B parameters | Cost comparison vs cloud | MS4 (M16) |

4. Work Packages

4.1 WP1: Project Management (M1-M24)

| Attribute | Value |
| --- | --- |
| Lead | FHGR |
| Hours | FHGR: 400h, Wecan: 200h, Total: 600h |
| Objectives | Governance, coordination, risk management |

Deliverables:

| ID | Name | Due | Owner |
| --- | --- | --- | --- |
| D1.1 | Project Kickoff Report | M1 | FHGR |
| D1.2 | Quarterly Progress Reports | M3+ | FHGR |
| D1.3 | Stakeholder Engagement Plan | M2 | FHGR |
| D1.4 | Data Management Plan | M2 | FHGR |
| D1.5 | Risk Mitigation Reports | Ongoing | FHGR |
| D1.6 | Final Project Report | M24 | FHGR |

4.2 WP2: Domain Adaptation & Hallucination Control (M1-M12)

| Attribute | Value |
| --- | --- |
| Lead | FHGR |
| Hours | FHGR: 800h, Wecan: 300h, Total: 1,100h |
| Objectives | Train domain-adapted LLM, reduce hallucinations, create annotated dataset |

Key Targets:

Deliverables:

| ID | Name | Due | Owner |
| --- | --- | --- | --- |
| D2.1 | Training Approach Documentation | M6 | FHGR |
| D2.2 | Hallucination Detection Methods | M6 | FHGR |
| D2.3 | Annotated Dataset (300 docs) | M12 | Wecan |
| D2.4 | Hybrid Deployment Prototype | M12 | Wecan |

4.3 WP3: Long Document Understanding (M4-M15)

| Attribute | Value |
| --- | --- |
| Lead | FHGR |
| Hours | FHGR: 600h, Wecan: 500h, Total: 1,100h |
| Objectives | OCR pipeline, hierarchical attention, 50-100 page processing |

Key Targets:

Deliverables:

| ID | Name | Due | Owner |
| --- | --- | --- | --- |
| D3.1 | Technology Evaluation Report | M6 | FHGR |
| D3.2 | Document Extraction Prototypes | M12 | Wecan |
| D3.3 | Validation Report (100 docs) | M15 | FHGR |

4.4 WP4: Multi-Source Information Fusion (M10-M21)

| Attribute | Value |
| --- | --- |
| Lead | FHGR |
| Hours | FHGR: 1,100h, Wecan: 500h, Total: 1,600h |
| Objectives | CRM integration, zero-shot schema mapping, field matching |

Key Targets:

Deliverables:

| ID | Name | Due | Owner |
| --- | --- | --- | --- |
| D4.1 | CRM Form Pre-filling System | M16 | Wecan |
| D4.2 | Field Matching Validation | M20 | FHGR |
| D4.3 | Scientific Validation Report | M21 | FHGR |

4.5 WP5: Intelligent Document Pre-Filling (M14-M24)

| Attribute | Value |
| --- | --- |
| Lead | Wecan |
| Hours | FHGR: 600h, Wecan: 800h, Total: 1,400h |
| Objectives | Automated form filling, format support, public benchmark |

Key Targets:

Deliverables:

| ID | Name | Due | Owner |
| --- | --- | --- | --- |
| D5.1 | Complete Pre-filling System | M20 | Wecan |
| D5.2 | Open Benchmark Dataset | M22 | FHGR |
| D5.3 | Accuracy Validation (300+ forms) | M24 | FHGR |
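
The per-work-package hours listed above can be checked against the partner totals from Section 2 (FHGR 3,500h, Wecan 2,300h, 5,800h overall). This is a small consistency sketch; the dictionary layout is illustrative only.

```python
# Hours per work package as (FHGR, Wecan), copied from WP1-WP5 above.
WP_HOURS = {
    "WP1": (400, 200),
    "WP2": (800, 300),
    "WP3": (600, 500),
    "WP4": (1_100, 500),
    "WP5": (600, 800),
}

fhgr_total = sum(fhgr for fhgr, _ in WP_HOURS.values())
wecan_total = sum(wecan for _, wecan in WP_HOURS.values())
wp_totals = {wp: fhgr + wecan for wp, (fhgr, wecan) in WP_HOURS.items()}

print(fhgr_total, wecan_total, fhgr_total + wecan_total)  # 3500 2300 5800
print(wp_totals["WP4"])                                    # 1600
```

The sums match the partner hours and the 5,800-hour project total.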


5. Milestones (MS1-MS5)

5.1 MS1: Project Foundation (M4 - April 2026)

| Criterion | Target |
| --- | --- |
| Team onboarding | 100% complete |
| Infrastructure | Development environment operational |
| Document collection | 100+ annotated documents |
| LLM baseline | Established for all WPs |
| Risk register | Complete and approved |
| Data governance | Approved by steering committee |
| Pilot commitment | 2 LOIs secured |

Decision: GO/NO-GO by steering committee

5.2 MS2: Technical Validation (M6 - June 2026)

| Criterion | Target |
| --- | --- |
| D2.1 submitted | Training approach documented |
| D2.2 submitted | Hallucination detection methods validated |
| D3.1 submitted | Technology evaluation complete |
| Baseline accuracy | Measured and documented |
| OCR pipeline | Operational |

Decision: Training approach selected, model candidates shortlisted

5.3 MS3: Research Documentation (M12 - December 2026)

| Criterion | Target |
| --- | --- |
| D2.3 | 300 documents annotated |
| D2.4 | Hybrid deployment prototype operational |
| D3.2 | Extraction prototypes validated on 100 docs |
| First paper | Submitted to peer-reviewed venue |
| OBJ3 validated | 40% hallucination reduction achieved |
| OBJ8 validated | 500 multilingual documents processed |

Decision: Innosuisse mid-project review, TRL 6 evidence

5.4 MS4: Field Matching (M16 - April 2027)

| Criterion | Target |
| --- | --- |
| D4.1 | CRM system operational |
| Pilot partners | 3+ confirmed |
| D3.3 | Validation report (100 docs) complete |
| Schema mapping | F1 > 85% on 50 schemas |
| Accuracy | > 90% on extended documents |
| OBJ2 validated | Zero-shot mapping proven |
| OBJ7 validated | On-premise deployment demonstrated |

Decision: CRM API integration confirmed
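
The MS4 criterion "Schema mapping F1 > 85%" depends on how F1 is computed, which the source does not specify. As one possible reading, the sketch below scores a predicted field mapping against a gold mapping for a single schema; the function name and the field names are hypothetical, and averaging across the 50 schemas is left open.

```python
def mapping_f1(predicted: dict[str, str], gold: dict[str, str]) -> float:
    """F1 of predicted source->target field pairs against the gold pairs."""
    # A prediction counts as correct only if the source field exists in the
    # gold mapping and points to the same target field.
    true_positives = sum(
        1 for source, target in predicted.items() if gold.get(source) == target
    )
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Hypothetical document fields mapped to hypothetical CRM fields.
gold = {
    "surname": "last_name",
    "given_name": "first_name",
    "dob": "birth_date",
    "iban": "account_number",
    "domicile": "address",
}
predicted = {
    "surname": "last_name",
    "given_name": "first_name",
    "dob": "birth_date",
    "iban": "phone",  # wrong target field
}

score = mapping_f1(predicted, gold)
meets_target = score > 0.85
print(round(score, 3), meets_target)  # 0.667 False
```

Here precision is 3/4 and recall is 3/5, so the example mapping falls short of the 85% threshold.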

5.5 MS5: Validation (M20 - August 2027)

| Criterion | Target |
| --- | --- |
| D4.2 | Field matching validated (300 cases) |
| D4.3 | Scientific validation report complete |
| D5.1 | Pre-filling system complete |
| Security audit | Passed |
| TRL evidence | TRL 5-6 documentation complete |
| OBJ1 validated | 90% accuracy on blind assessment |
| OBJ4 validated | < 2 hours processing time |
| OBJ5 validated | 3-5 institution deployments |
| OBJ6 validated | TRL advancement documented |

Decision: Commercial readiness, project success criteria met
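
The milestone headings pair project months (M4, M6, ...) with calendar months. Given the January 2026 start from the Project Summary, the mapping can be sketched as follows; the helper name `project_month_to_date` is illustrative.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)
START_YEAR, START_MONTH = 2026, 1   # project month M1 = January 2026
PROJECT_MONTHS = 24


def project_month_to_date(m: int) -> str:
    """Convert a project month number (M1..M24) to 'Month YYYY'."""
    if not 1 <= m <= PROJECT_MONTHS:
        raise ValueError(f"project month must be in 1..{PROJECT_MONTHS}, got {m}")
    index = (START_MONTH - 1) + (m - 1)   # zero-based months since January of START_YEAR
    return f"{MONTH_NAMES[index % 12]} {START_YEAR + index // 12}"


milestones = {"MS1": 4, "MS2": 6, "MS3": 12, "MS4": 16, "MS5": 20}
for name, month in milestones.items():
    print(name, f"M{month}", project_month_to_date(month))
```

This reproduces the dates in the headings above, for example M16 as April 2027 and M20 as August 2027.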


6. Technical Specifications

6.1 Model Specifications

| Attribute | Primary | Backup |
| --- | --- | --- |
| Model | Mistral v0.3 | Llama 3.1 |
| Parameters | 7B | 8B |
| License | Apache 2.0 | Meta Community |
| VRAM (FP16) | 14 GB | 16 GB |
| Languages | FR excellent, DE/IT good | DE good, FR good, IT limited |
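
The FP16 VRAM figures follow from the parameter counts: each FP16 weight takes 2 bytes. The sketch below estimates memory for the weights alone (activations and the KV cache need additional memory on top). The function name is illustrative, and the 4-bit line is included only as a comparison point for quantized deployment.

```python
def weight_memory_gb(params_billion: float, bits_per_param: int) -> float:
    """Memory for the model weights alone, in GB (1 GB = 10**9 bytes)."""
    # params_billion * 1e9 parameters * (bits / 8) bytes each, divided by 1e9.
    return params_billion * bits_per_param / 8


print(weight_memory_gb(7, 16))  # 14.0  (primary model, FP16)
print(weight_memory_gb(8, 16))  # 16.0  (backup model, FP16)
print(weight_memory_gb(7, 4))   # 3.5   (primary model, 4-bit weights)
```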

6.2 Fine-Tuning Approach

| Technique | Description |
| --- | --- |
| Method | LoRA (Low-Rank Adaptation) |
| Quantization | QLoRA for deployment |
| Training data | 300 anonymized compliance documents |
| Terminology | 1,720+ domain terms (FINMA, SBA, FATF) |
| Quality threshold | Inter-annotator Kappa >= 0.80 |
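
The quality threshold refers to inter-annotator agreement. The source does not say which kappa statistic is used; the sketch below assumes Cohen's kappa for two annotators labelling the same items, with made-up labels for illustration.

```python
from collections import Counter


def cohens_kappa(labels_a: list[str], labels_b: list[str]) -> float:
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("both annotators must label the same non-empty item list")
    n = len(labels_a)
    # Observed agreement: fraction of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected chance agreement from each annotator's label frequencies.
    counts_a, counts_b = Counter(labels_a), Counter(labels_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Made-up example: the annotators disagree on the last of ten items.
annotator_a = ["PER", "PER", "ORG", "ORG", "DATE", "DATE", "AMT", "AMT", "PER", "ORG"]
annotator_b = ["PER", "PER", "ORG", "ORG", "DATE", "DATE", "AMT", "AMT", "PER", "PER"]

kappa = cohens_kappa(annotator_a, annotator_b)
print(round(kappa, 3), kappa >= 0.80)  # 0.865 True
```

With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual alternative.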

6.3 Infrastructure Requirements

| Configuration | GPU | RAM | Use Case |
| --- | --- | --- | --- |
| Minimum | RTX 4090 24 GB | 64 GB | Small institution |
| Recommended | A100 40 GB | 128 GB | Medium institution |
| Enterprise | 2x A100 80 GB | 256 GB | Large institution |

6.4 Deployment Models

| Model | Description | Data Location |
| --- | --- | --- |
| On-Premise | Full stack on customer infrastructure | Customer |
| Hybrid | Compute in cloud, data on-premise | Customer |
| Managed | Wecan-operated Swiss data center | Wecan (Swiss) |

7. Document Types

7.1 Target Document Categories

| Category | Pages | Languages | Priority |
| --- | --- | --- | --- |
| KYC Forms | 10-50 | DE/FR/IT/EN | High |
| Proof of Residence | 1-5 | DE/FR/IT/EN | High |
| Bank Statements | 5-20 | DE/FR/IT/EN | High |
| Tax Returns | 20-50 | DE/FR/IT/EN | Medium |
| Employment Contracts | 5-15 | DE/FR/IT/EN | Medium |
| Company Registrations | 10-30 | DE/FR/IT/EN | Medium |
| Trust Documents | 30-100 | EN/DE | Low |
| Board Resolutions | 5-15 | DE/FR/IT/EN | Low |

7.2 Entity Types to Extract

| Entity | Code | Description |
| --- | --- | --- |
| Person Name | PER | Full name of individual |
| Organization | ORG | Company, institution |
| Address | ADDR | Physical location |
| Date | DATE | Temporal reference |
| Amount | AMT | Monetary value |
| Account Number | ACCT | Bank account, IBAN |
| ID Number | IDNO | Passport, AHV number |
| Phone | PHONE | Telephone number |
| Email | EMAIL | Email address |
| Legal Reference | LEGAL | Law, regulation citation |
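
The entity codes above map naturally onto an enumeration. The sketch below is one way to represent them in code; the `ExtractedEntity` record and its `page` and `confidence` fields are hypothetical and do not come from the project.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """Entity codes from the table above; values are the entity names."""
    PER = "Person Name"
    ORG = "Organization"
    ADDR = "Address"
    DATE = "Date"
    AMT = "Amount"
    ACCT = "Account Number"
    IDNO = "ID Number"
    PHONE = "Phone"
    EMAIL = "Email"
    LEGAL = "Legal Reference"


@dataclass(frozen=True)
class ExtractedEntity:
    """One extracted value; the page and confidence fields are assumptions."""
    entity_type: EntityType
    text: str
    page: int
    confidence: float


# Made-up example value for illustration only.
entity = ExtractedEntity(EntityType.ORG, "Example Bank AG", page=3, confidence=0.97)
print(entity.entity_type.name, entity.entity_type.value)  # ORG Organization
```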

8. Compliance Requirements

8.1 Regulatory Framework

| Regulation | Requirement | Implementation |
| --- | --- | --- |
| FINMA | Data sovereignty | Swiss-only processing |
| FADP | Purpose limitation | Strict scope controls |
| GDPR | Data minimization | Configurable retention |
| AML | Audit trail | Immutable logging |
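
The source does not describe how immutable logging is implemented. One common building block is a hash-chained log, where each entry stores the hash of its predecessor so that later modification becomes detectable. The sketch below illustrates that idea only; it is tamper-evident rather than truly immutable, and all names and event strings are hypothetical.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _entry_hash(event: str, prev_hash: str) -> str:
    """SHA-256 over a canonical JSON encoding of the event and previous hash."""
    payload = json.dumps({"event": event, "prev": prev_hash}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_entry(log: list[dict], event: str) -> None:
    """Append an event, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev": prev_hash, "hash": _entry_hash(event, prev_hash)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash and check each link back to the genesis value."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _entry_hash(entry["event"], prev_hash):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list[dict] = []
append_entry(audit_log, "document uploaded")
append_entry(audit_log, "entities extracted")
append_entry(audit_log, "form pre-filled")
print(verify_chain(audit_log))   # True

audit_log[1]["event"] = "entities deleted"  # simulate tampering
print(verify_chain(audit_log))   # False
```

In practice the chain head would also need to be anchored somewhere outside the writer's control, such as write-once storage, for the log to serve as an audit trail.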

8.2 Security Standards

| Standard | Status | Evidence |
| --- | --- | --- |
| ISO 27001 | Target | Security framework |
| SOC 2 Type II | Target | Annual audit |
| Penetration testing | Quarterly | External auditor |
| Encryption | Required | AES-256 at rest, TLS 1.3 in transit |
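
For the in-transit half of the encryption requirement, a client can refuse anything older than TLS 1.3. The sketch below uses Python's standard `ssl` module and is only an illustration of the setting; the helper name is hypothetical, and the project's actual stack and configuration are not described in the source. Encryption at rest is not covered here.

```python
import ssl


def make_tls13_client_context() -> ssl.SSLContext:
    """Client-side context that verifies certificates and requires TLS 1.3."""
    context = ssl.create_default_context()            # certificate and hostname checks on
    context.minimum_version = ssl.TLSVersion.TLSv1_3  # reject TLS 1.2 and older
    return context


context = make_tls13_client_context()
print(context.minimum_version.name)  # TLSv1_3
```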

9. Success Metrics Summary

| Dimension | Metric | Target | Validation |
| --- | --- | --- | --- |
| Accuracy | Document extraction | 90% | MS5 blind test |
| Speed | Processing time | < 2 hours | MS5 benchmark |
| Cost | On-premise vs cloud | 60-70% savings | MS4 comparison |
| Scale | Deployments | 3-5 institutions | MS5 validation |
| Model | Parameters | 7-13B on-premise | MS4 demonstration |
| TRL | Advancement | TRL 5-6 | MS5 evidence |
| Data | Annotated documents | 500+ | MS3 validation |
| Hallucination | Reduction | 40% | MS3 baseline comparison |
| Languages | Coverage | DE/FR/IT/EN | MS3 corpus |
| Schema | Zero-shot mapping | F1 > 85% | MS4 test |

10. Budget Breakdown

10.1 By Category

| Category | Amount (CHF) | Percentage |
| --- | --- | --- |
| Personnel | 523,500 | 88.5% |
| Infrastructure | 32,000 | 5.4% |
| Data acquisition | 22,000 | 3.7% |
| Travel | 8,500 | 1.4% |
| Other | 5,240 | 0.9% |
| Total | 591,240 | 100% |

10.2 By Partner

| Partner | Hours | Hourly Rate | Total |
| --- | --- | --- | --- |
| FHGR | 3,500h | CHF 100/h | CHF 350,000 |
| Wecan | 2,300h | approx. CHF 105/h | CHF 241,240 |
| Total | 5,800h | - | CHF 591,240 |
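
The hourly rates can be recovered from the hours and totals in this table. A short check, with illustrative names, shows that the Wecan rate of CHF 105/h is a rounded figure: 2,300h at exactly CHF 105/h would give CHF 241,500 rather than CHF 241,240.

```python
# (hours, total in CHF) per partner, copied from the table above.
PARTNER_BUDGET = {
    "FHGR": (3_500, 350_000),
    "Wecan": (2_300, 241_240),
}

effective_rate = {
    partner: round(total / hours, 2)
    for partner, (hours, total) in PARTNER_BUDGET.items()
}
total_hours = sum(hours for hours, _ in PARTNER_BUDGET.values())
total_budget = sum(total for _, total in PARTNER_BUDGET.values())

print(effective_rate)             # {'FHGR': 100.0, 'Wecan': 104.89}
print(total_hours, total_budget)  # 5800 591240
```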

11. Risk Categories

| Category | Count | Top Risks |
| --- | --- | --- |
| Technical | 28 | Model performance, integration issues |
| Resource | 18 | Team availability, partner constraints |
| Process | 14 | GO/NO-GO criteria, deliverable quality |
| External | 12 | CRM access, pilot partner delays |
| Total | 72 | Average of 3 per month over 24 months |

12. Repository Structure

/
├── CLAUDE.md                 # Project instructions
├── AUTONOMOUS_INSTRUCTIONS.md # Autonomous execution guide
├── status.md                 # Current project status
├── changelog.md              # Version history
├── deliverables/             # Deliverable templates (19)
├── docs/                     # Documentation
│   ├── assets/charts/        # 10 visualizations
│   └── work-packages/        # WP1-WP5 details
├── execution/                # M01-M24 JSON files (314 tasks)
├── presentations/            # Beamer slides
├── scripts/                  # Generation scripts
├── tools/                    # Validation tools
└── web/                      # Wiki interface

13. Current State

| Metric | Value | Status |
| --- | --- | --- |
| Execution Plans | 24/24 months | Complete |
| Total Tasks | 314 | Complete |
| Budget Allocated | 5,798h of 5,800h | 99.97% |
| Deliverable Templates | 19/19 | In Progress |
| Charts | 10/10 | Complete |
| Validation | 0 errors, 0 warnings | Passed |
| Version | 1.5.0 | Current |

Generated: 2026-01-04

Source: Innosuisse Application 133.672 IP-SBM