Project Inventory
Innosuisse Innovation Project 133.672 IP-SBM
A Specialized and Secure AI Orchestrator for Swiss Financial Compliance
1. Project Summary
| Attribute | Value |
|---|---|
| Application | 133.672 IP-SBM |
| Duration | 24 months (January 2026 - December 2027) |
| Total Budget | CHF 591,240 |
| Total Hours | 5,800 hours |
| Innosuisse Contribution | CHF 327,336 (55%) |
| Partner Contribution | CHF 263,904 (45%) |
| Leveraging Ratio | 1:1.25 |

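The funding split can be re-derived from the figures in the summary table. The sketch below is illustrative only: the variable names are assumptions, and it reads the leveraging ratio as Innosuisse francs per partner franc, a definition the source does not state.

```python
# Figures copied from the Project Summary table (CHF).
TOTAL_BUDGET = 591_240
INNOSUISSE = 327_336
PARTNER = 263_904

# The two contributions should add up to the total budget.
assert INNOSUISSE + PARTNER == TOTAL_BUDGET

innosuisse_share = INNOSUISSE / TOTAL_BUDGET  # fraction funded by Innosuisse
partner_share = PARTNER / TOTAL_BUDGET        # fraction funded by the partners

# Assumed reading of "leveraging ratio": Innosuisse CHF per partner CHF.
leverage = INNOSUISSE / PARTNER

print(f"Innosuisse: {innosuisse_share:.1%}, partners: {partner_share:.1%}")
print(f"Leveraging ratio: 1:{leverage:.2f}")
```

Under this reading the ratio comes out at roughly 1:1.24, close to the rounded 1:1.25 stated in the table.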
2. Partners
2.1 FHGR - Fachhochschule Graubuenden (Research Partner)
| Attribute | Value |
|---|---|
| Role | Research lead |
| Hours | 3,500 (60%) |
| Budget Contribution | CHF 350,000 |
| Key Expertise | NLP, Document AI, Applied ML |
| Responsibilities | Domain adaptation, hallucination reduction, scientific validation |
| Deliverables | D1.1-D1.6, D2.1-D2.2, D3.1, D3.3, D4.2-D4.3, D5.2-D5.3 (14 total) |

2.2 WeCanGroup SA (Implementation Partner)
| Attribute | Value |
|---|---|
| Role | Implementation lead |
| Hours | 2,300 (40%) |
| Budget Contribution | CHF 241,240 |
| Key Expertise | Swiss compliance, CRM integration, Financial services |
| Responsibilities | Platform integration, pilot deployments, commercialization |
| Deliverables | D2.3-D2.4, D3.2, D4.1, D5.1 (5 total) |

3. Quantifiable Objectives (OBJ1-OBJ8)
3.1 Scientific Objectives
| ID | Objective | Target | Validation | Milestone |
|---|---|---|---|---|
| OBJ1 | Document Accuracy | 90% on 50-100 page docs | Blind assessment on held-out test set | MS5 (M20) |
| OBJ2 | Zero-Shot Schema Mapping | F1 > 85% on 50 enterprise schemas | No schema-specific training | MS4 (M16) |
| OBJ3 | Hallucination Reduction | 40% reduction vs baseline | Baseline: 23.5% total hallucination rate | MS3 (M12) |
| OBJ8 | Multilingual Validation | 500 documents (DE/FR/IT/EN) | Distribution: 40% DE, 30% FR, 15% IT, 15% EN | MS3 (M12) |

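The OBJ8 language distribution translates into concrete per-language document counts. A minimal sketch, assuming the percentages apply to the full 500-document corpus; the function and variable names are illustrative, not from the project.

```python
TOTAL_DOCS = 500
# Distribution stated for OBJ8, in whole percent.
LANGUAGE_PERCENT = {"DE": 40, "FR": 30, "IT": 15, "EN": 15}


def docs_per_language(total, percent_by_lang):
    """Split a corpus size across languages by whole-number percentages.

    Uses integer arithmetic; for totals that do not divide evenly the
    result is floored and the remainder would need separate handling.
    """
    if sum(percent_by_lang.values()) != 100:
        raise ValueError("language percentages must sum to 100")
    return {lang: total * pct // 100 for lang, pct in percent_by_lang.items()}


counts = docs_per_language(TOTAL_DOCS, LANGUAGE_PERCENT)
print(counts)
```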
3.2 Economic Objectives
| ID | Objective | Target | Validation | Milestone |
|---|---|---|---|---|
| OBJ4 | Processing Time | < 2 hours per 100-page doc | Baseline: 2-3 weeks manual | MS5 (M20) |
| OBJ5 | Institution Deployments | 3-5 Swiss financial institutions | Production use, min 1 month, 50+ docs | MS5 (M20) |

3.3 Technological Objectives
| ID | Objective | Target | Validation | Milestone |
|---|---|---|---|---|
| OBJ6 | TRL Advancement | TRL 3 to TRL 5-6 | Evidence documentation | MS5 (M20) |
| OBJ7 | On-Premise Deployment | 7-13B parameters | Cost comparison vs cloud | MS4 (M16) |

4. Work Packages
4.1 WP1: Project Management (M1-M24)
| Attribute | Value |
|---|---|
| Lead | FHGR |
| Hours | FHGR: 400h, Wecan: 200h, Total: 600h |
| Objectives | Governance, coordination, risk management |

Deliverables:
| ID | Name | Due | Owner |
|----|------|-----|-------|
| D1.1 | Project Kickoff Report | M1 | FHGR |
| D1.2 | Quarterly Progress Reports | M3+ | FHGR |
| D1.3 | Stakeholder Engagement Plan | M2 | FHGR |
| D1.4 | Data Management Plan | M2 | FHGR |
| D1.5 | Risk Mitigation Reports | Ongoing | FHGR |
| D1.6 | Final Project Report | M24 | FHGR |
4.2 WP2: Domain Adaptation & Hallucination Control (M1-M12)
| Attribute | Value |
|---|---|
| Lead | FHGR |
| Hours | FHGR: 800h, Wecan: 300h, Total: 1,100h |
| Objectives | Train domain-adapted LLM, reduce hallucinations, create annotated dataset |

Key Targets:
- Base model: Mistral v0.3 (7B) or Llama 3.1 (8B)
- Fine-tuning: LoRA/QLoRA on 300 documents
- Terminology: 1,720+ Swiss financial terms (FINMA, SBA, FATF)
- Hallucination rate: 23.5% baseline -> approx. 14% target (23.5% x 0.60 = 14.1%, a 40% relative reduction)
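The hallucination target follows from the baseline by simple arithmetic. The sketch below shows the calculation and a pass/fail check for OBJ3; it assumes the 40% is a relative reduction of the 23.5% baseline rate, and the function names are illustrative.

```python
BASELINE_PCT = 23.5        # baseline hallucination rate from OBJ3
REQUIRED_REDUCTION = 0.40  # 40% reduction, read here as relative to baseline


def target_rate(baseline_pct, reduction):
    """Rate that remains after applying a relative reduction to the baseline."""
    return baseline_pct * (1.0 - reduction)


def relative_reduction(baseline_pct, measured_pct):
    """Fraction by which a measured rate undercuts the baseline."""
    return (baseline_pct - measured_pct) / baseline_pct


def meets_obj3(measured_pct):
    """True if the measured rate achieves at least the required reduction."""
    return relative_reduction(BASELINE_PCT, measured_pct) >= REQUIRED_REDUCTION


print(f"Target rate: {target_rate(BASELINE_PCT, REQUIRED_REDUCTION):.1f}%")
print(meets_obj3(14.0), meets_obj3(15.0))
```

A measured rate of 14.0% passes under this reading, while 15.0% (about a 36% reduction) does not.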
Deliverables:
| ID | Name | Due | Owner |
|----|------|-----|-------|
| D2.1 | Training Approach Documentation | M6 | FHGR |
| D2.2 | Hallucination Detection Methods | M6 | FHGR |
| D2.3 | Annotated Dataset (300 docs) | M12 | Wecan |
| D2.4 | Hybrid Deployment Prototype | M12 | Wecan |
4.3 WP3: Long Document Understanding (M4-M15)
| Attribute | Value |
|---|---|
| Lead | FHGR |
| Hours | FHGR: 600h, Wecan: 500h, Total: 1,100h |
| Objectives | OCR pipeline, hierarchical attention, 50-100 page processing |

Key Targets:
- OCR accuracy: < 2% character error rate
- Document length: 50-100 pages
- Processing: Hierarchical attention architecture
- Accuracy: 90% on extended documents
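The OCR target is expressed as a character error rate (CER). A minimal sketch of the usual definition, edit distance divided by reference length, is given below; the source does not specify how CER is computed in the project, so the normalization and the sample strings are assumptions.

```python
def levenshtein(ref, hyp):
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete a reference character
                            curr[j - 1] + 1,      # insert a hypothesis character
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def cer(reference, hypothesis):
    """Character error rate: edit distance normalized by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return levenshtein(reference, hypothesis) / len(reference)


# Hypothetical ground truth and OCR output for a single line.
truth = "Kontoinhaber: Muster AG, 8001 Zürich"
ocr = "Kontoinhaber: Muster AG, 8001 Zurich"
rate = cer(truth, ocr)
print(f"CER = {rate:.2%}, within 2% target: {rate < 0.02}")
```

On real documents the rate would be aggregated over the whole validation set rather than judged per line, since a single wrong character in a short line already exceeds 2%.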
Deliverables:
| ID | Name | Due | Owner |
|----|------|-----|-------|
| D3.1 | Technology Evaluation Report | M6 | FHGR |
| D3.2 | Document Extraction Prototypes | M12 | Wecan |
| D3.3 | Validation Report (100 docs) | M15 | FHGR |
4.4 WP4
| Attribute | Value |
|---|---|
| Lead | FHGR |
| Hours | FHGR: 1,100h, Wecan: 500h, Total: 1,600h |
| Objectives | CRM integration, zero-shot schema mapping, field matching |

Key Targets:
- Schema mapping: F1 > 85% on 50 enterprise schemas
- CRM integration: Wecan Comply + 3-5 external systems
- Field matching: 300 validation cases
- Zero-shot: No schema-specific training required
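The schema-mapping target is stated as an F1 score. The sketch below shows one way F1 could be computed for a single schema, treating each source-field to target-field pair as a prediction. The evaluation protocol is not given in the source, and all field names here are hypothetical.

```python
def mapping_f1(gold, predicted):
    """F1 over field mappings; a prediction counts only if the target matches."""
    correct = sum(1 for field, target in predicted.items()
                  if gold.get(field) == target)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical gold-standard mapping from extracted fields to CRM fields.
gold = {
    "client_name": "Account.Name",
    "date_of_birth": "Contact.Birthdate",
    "iban": "Account.IBAN",
    "domicile": "Account.BillingCountry",
}
# Hypothetical model output: two correct, one wrong, one field left unmapped.
predicted = {
    "client_name": "Account.Name",
    "date_of_birth": "Contact.Birthdate",
    "iban": "Account.BillingCountry",
}

score = mapping_f1(gold, predicted)
print(f"F1 = {score:.3f}, meets OBJ2 threshold: {score > 0.85}")
```

Whether the 85% threshold applies per schema or as an average over the 50 schemas is left open here; the source does not say.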
Deliverables:
| ID | Name | Due | Owner |
|----|------|-----|-------|
| D4.1 | CRM Form Pre-filling System | M16 | Wecan |
| D4.2 | Field Matching Validation | M20 | FHGR |
| D4.3 | Scientific Validation Report | M21 | FHGR |
4.5 WP5: Intelligent Document Pre-Filling (M14-M24)
| Attribute | Value |
|---|---|
| Lead | Wecan |
| Hours | FHGR: 600h, Wecan: 800h, Total: 1,400h |
| Objectives | Automated form filling, format support, public benchmark |

Key Targets:
- Form types: Text fields, checkboxes, dropdowns
- Formats: PDF, Excel, variable layouts
- Accuracy: 90% on 300+ forms
- Benchmark: Open dataset for reproducibility
Deliverables:
| ID | Name | Due | Owner |
|----|------|-----|-------|
| D5.1 | Complete Pre-filling System | M20 | Wecan |
| D5.2 | Open Benchmark Dataset | M22 | FHGR |
| D5.3 | Accuracy Validation (300+ forms) | M24 | FHGR |
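The per-work-package hours can be rolled up and compared with the partner totals from Section 2. A small consistency check, with illustrative names; the block listed between WP3 and WP5 (deliverables D4.x) is labelled WP4 here as an assumption.

```python
# Hours per work package and partner, copied from Section 4.
WP_HOURS = {
    "WP1": {"FHGR": 400, "Wecan": 200},
    "WP2": {"FHGR": 800, "Wecan": 300},
    "WP3": {"FHGR": 600, "Wecan": 500},
    "WP4": {"FHGR": 1100, "Wecan": 500},  # assumed label for the D4.x block
    "WP5": {"FHGR": 600, "Wecan": 800},
}


def hours_by_partner(wp_hours):
    """Sum the hours of each partner across all work packages."""
    totals = {}
    for allocation in wp_hours.values():
        for partner, hours in allocation.items():
            totals[partner] = totals.get(partner, 0) + hours
    return totals


totals = hours_by_partner(WP_HOURS)
print(totals, "total:", sum(totals.values()))
```

The sums reproduce the 3,500 h (FHGR), 2,300 h (Wecan) and 5,800 h (project) figures stated elsewhere in this inventory.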
5. Milestones (MS1-MS5)
5.1 MS1: Project Foundation (M4 - April 2026)
| Criterion | Target |
|---|---|
| Team onboarding | 100% complete |
| Infrastructure | Development environment operational |
| Document collection | 100+ annotated documents |
| LLM baseline | Established for all WPs |
| Risk register | Complete and approved |
| Data governance | Approved by steering committee |
| Pilot commitment | 2 LOIs secured |

Decision: GO/NO-GO by steering committee
5.2 MS2: Technical Validation (M6 - June 2026)
| Criterion | Target |
|---|---|
| D2.1 submitted | Training approach documented |
| D2.2 submitted | Hallucination detection methods validated |
| D3.1 submitted | Technology evaluation complete |
| Baseline accuracy | Measured and documented |
| OCR pipeline | Operational |

Decision: Training approach selected, model candidates shortlisted
5.3 MS3: Research Documentation (M12 - December 2026)
| Criterion | Target |
|---|---|
| D2.3 | 300 documents annotated |
| D2.4 | Hybrid deployment prototype operational |
| D3.2 | Extraction prototypes validated on 100 docs |
| First paper | Submitted to peer-reviewed venue |
| OBJ3 validated | 40% hallucination reduction achieved |
| OBJ8 validated | 500 multilingual documents processed |

Decision: Innosuisse mid-project review, TRL 6 evidence
5.4 MS4: Field Matching (M16 - April 2027)
| Criterion | Target |
|---|---|
| D4.1 | CRM system operational |
| Pilot partners | 3+ confirmed |
| D3.3 | Validation report (100 docs) complete |
| Schema mapping | F1 > 85% on 50 schemas |
| Accuracy | > 90% on extended documents |
| OBJ2 validated | Zero-shot mapping proven |
| OBJ7 validated | On-premise deployment demonstrated |

Decision: CRM API integration confirmed
5.5 MS5: Validation (M20 - August 2027)
| Criterion | Target |
|---|---|
| D4.2 | Field matching validated (300 cases) |
| D4.3 | Scientific validation report complete |
| D5.1 | Pre-filling system complete |
| Security audit | Passed |
| TRL evidence | TRL 5-6 documentation complete |
| OBJ1 validated | 90% accuracy on blind assessment |
| OBJ4 validated | < 2 hours processing time |
| OBJ5 validated | 3-5 institution deployments |
| OBJ6 validated | TRL advancement documented |

Decision: Commercial readiness, project success criteria met
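Milestone months (M4, M6, ...) map to calendar months relative to the project start. A minimal sketch of that conversion, assuming M1 is January 2026 as given in the Project Summary; the helper name is illustrative.

```python
from datetime import date

PROJECT_START = date(2026, 1, 1)  # M1 = January 2026, per the Project Summary

# Milestone due months as listed in Section 5.
MILESTONES = {"MS1": 4, "MS2": 6, "MS3": 12, "MS4": 16, "MS5": 20}


def month_to_date(project_month):
    """First day of the calendar month for a 1-based project month index."""
    if project_month < 1:
        raise ValueError("project months are 1-based")
    index = (PROJECT_START.month - 1) + (project_month - 1)
    return date(PROJECT_START.year + index // 12, index % 12 + 1, 1)


for name, month in MILESTONES.items():
    due = month_to_date(month)
    print(f"{name}: M{month} -> {due.year}-{due.month:02d}")
```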
6. Technical Specifications
6.1 Model Specifications
| Attribute | Primary | Backup |
|---|---|---|
| Model | Mistral v0.3 | Llama 3.1 |
| Parameters | 7B | 8B |
| License | Apache 2.0 | Meta Community |
| VRAM (FP16) | 14 GB | 16 GB |
| Languages | FR excellent, DE/IT good | DE good, FR good, IT limited |

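The FP16 VRAM figures follow from the parameter counts at 2 bytes per parameter. The sketch below reproduces that estimate for the model weights only; it ignores the KV cache, activations and framework overhead, and the lower-precision byte sizes are included only as illustrative assumptions.

```python
# Approximate storage per parameter for common weight precisions.
BYTES_PER_PARAM = {"fp16": 2.0, "int8": 1.0, "int4": 0.5}


def weight_memory_gb(params_billions, dtype="fp16"):
    """Memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    total_bytes = params_billions * 1e9 * BYTES_PER_PARAM[dtype]
    return total_bytes / 1e9


for name, size in [("Mistral v0.3 (7B)", 7), ("Llama 3.1 (8B)", 8)]:
    print(f"{name}: {weight_memory_gb(size):.0f} GB in FP16")
```

Actual memory use during inference is higher than the weight footprint, which is one reason the minimum configuration in Section 6.3 lists a 24 GB GPU rather than a 16 GB one.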
6.2 Fine-Tuning Approach
| Technique | Description |
|---|---|
| Method | LoRA (Low-Rank Adaptation) |
| Quantization | QLoRA for deployment |
| Training data | 300 anonymized compliance documents |
| Terminology | 1,720+ domain terms (FINMA, SBA, FATF) |
| Quality threshold | Inter-annotator Kappa >= 0.80 |

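The annotation quality threshold is given as an inter-annotator kappa. The source does not say which kappa statistic is meant; the sketch below assumes Cohen's kappa for two annotators and uses made-up labels drawn from the entity codes in this inventory.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items with identical labels.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical annotations of four text spans.
annotator_1 = ["PER", "PER", "ORG", "ORG"]
annotator_2 = ["PER", "PER", "ORG", "PER"]
kappa = cohens_kappa(annotator_1, annotator_2)
print(f"kappa = {kappa:.2f}, meets threshold: {kappa >= 0.80}")
```

With more than two annotators, a multi-rater statistic such as Fleiss' kappa would be the usual alternative.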
6.3 Infrastructure Requirements
| Configuration | GPU | RAM | Use Case |
|---|---|---|---|
| Minimum | RTX 4090 24GB | 64GB | Small institution |
| Recommended | A100 40GB | 128GB | Medium institution |
| Enterprise | 2x A100 80GB | 256GB | Large institution |

6.4 Deployment Models
| Model | Description | Data Location |
|---|---|---|
| On-Premise | Full stack on customer infrastructure | Customer |
| Hybrid | Compute in cloud, data on-premise | Customer |
| Managed | Wecan-operated Swiss data center | Wecan (Swiss) |

7. Document Types
7.1 Target Document Categories
| Category | Pages | Languages | Priority |
|---|---|---|---|
| KYC Forms | 10-50 | DE/FR/IT/EN | High |
| Proof of Residence | 1-5 | DE/FR/IT/EN | High |
| Bank Statements | 5-20 | DE/FR/IT/EN | High |
| Tax Returns | 20-50 | DE/FR/IT/EN | Medium |
| Employment Contracts | 5-15 | DE/FR/IT/EN | Medium |
| Company Registrations | 10-30 | DE/FR/IT/EN | Medium |
| Trust Documents | 30-100 | EN/DE | Low |
| Board Resolutions | 5-15 | DE/FR/IT/EN | Low |

| Entity | Code | Description |
|---|---|---|
| Person Name | PER | Full name of individual |
| Organization | ORG | Company, institution |
| Address | ADDR | Physical location |
| Date | DATE | Temporal reference |
| Amount | AMT | Monetary value |
| Account Number | ACCT | Bank account, IBAN |
| ID Number | IDNO | Passport, AHV number |
| Phone | PHONE | Telephone number |
| Email | EMAIL | Email address |
| Legal Reference | LEGAL | Law, regulation citation |

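The entity codes above lend themselves to a fixed enumeration in code. The sketch below is one possible representation, not the project's actual data model; the `Entity` record and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class EntityType(Enum):
    """Entity codes from the table above, mapped to their display names."""
    PER = "Person Name"
    ORG = "Organization"
    ADDR = "Address"
    DATE = "Date"
    AMT = "Amount"
    ACCT = "Account Number"
    IDNO = "ID Number"
    PHONE = "Phone"
    EMAIL = "Email"
    LEGAL = "Legal Reference"


@dataclass(frozen=True)
class Entity:
    """Hypothetical record for one extracted entity."""
    type: EntityType
    text: str
    page: int


# Look up the type by its code; the value is a sample string, not real data.
entity = Entity(EntityType["ORG"], "Muster AG", page=3)
print(entity.type.name, "-", entity.type.value, "-", entity.text)
```

Looking up by code (`EntityType["ORG"]`) raises `KeyError` for unknown codes, which gives a cheap validation step when loading annotations.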
8. Compliance Requirements
8.1 Regulatory Framework
| Regulation | Requirement | Implementation |
|---|---|---|
| FINMA | Data sovereignty | Swiss-only processing |
| FADP | Purpose limitation | Strict scope controls |
| GDPR | Data minimization | Configurable retention |
| AML | Audit trail | Immutable logging |

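The source does not describe how immutable logging is implemented. One common pattern for a tamper-evident audit trail is a hash chain, where each entry stores the hash of its predecessor. The sketch below illustrates that pattern only; the entry layout and function names are assumptions, and a production system would also need durable, access-controlled storage.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder predecessor for the first entry


def _digest(prev_hash, event):
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    log.append({"event": event, "prev": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify(log):
    """Recompute the chain; any edited or reordered entry breaks it."""
    prev_hash = GENESIS_HASH
    for entry in log:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log = []
append_entry(audit_log, {"action": "document_uploaded", "doc_id": "example-1"})
append_entry(audit_log, {"action": "fields_extracted", "doc_id": "example-1"})
print(verify(audit_log))
```

Changing any stored event after the fact causes `verify` to return `False`, which is what makes the trail tamper-evident rather than merely append-only.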
8.2 Security Standards
| Standard | Status | Evidence |
|---|---|---|
| ISO 27001 | Target | Security framework |
| SOC 2 Type II | Target | Annual audit |
| Penetration testing | Quarterly | External auditor |
| Encryption | Required | AES-256 at rest, TLS 1.3 in transit |

9. Success Metrics Summary
| Dimension | Metric | Target | Validation |
|---|---|---|---|
| Accuracy | Document extraction | 90% | MS5 blind test |
| Speed | Processing time | < 2 hours | MS5 benchmark |
| Cost | On-premise vs cloud | 60-70% savings | MS4 comparison |
| Scale | Deployments | 3-5 institutions | MS5 validation |
| Model | Parameters | 7-13B on-premise | MS4 demonstration |
| TRL | Advancement | TRL 5-6 | MS5 evidence |
| Data | Annotated documents | 500+ | MS3 validation |
| Hallucination | Reduction | 40% | MS3 baseline comparison |
| Languages | Coverage | DE/FR/IT/EN | MS3 corpus |
| Schema | Zero-shot mapping | F1 > 85% | MS4 test |

10. Budget Breakdown
10.1 By Category
| Category | Amount (CHF) | Percentage |
|---|---|---|
| Personnel | 523,500 | 88.5% |
| Infrastructure | 32,000 | 5.4% |
| Data acquisition | 22,000 | 3.7% |
| Travel | 8,500 | 1.4% |
| Other | 5,240 | 0.9% |
| Total | 591,240 | 100% |

10.2 By Partner
| Partner | Hours | Hourly Rate | Total |
|---|---|---|---|
| FHGR | 3,500h | CHF 100/h | CHF 350,000 |
| Wecan | 2,300h | approx. CHF 105/h | CHF 241,240 |
| Total | 5,800h | - | CHF 591,240 |

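The hourly rates can be cross-checked against the partner totals. A short sketch with illustrative names; it derives the effective rate from the stated totals rather than assuming the listed rate is exact.

```python
# Hours and totals per partner, copied from the table above (CHF).
PARTNERS = {
    "FHGR": {"hours": 3_500, "total_chf": 350_000},
    "Wecan": {"hours": 2_300, "total_chf": 241_240},
}


def effective_rate(total_chf, hours):
    """Average hourly rate implied by a partner's total and hours."""
    return total_chf / hours


for name, row in PARTNERS.items():
    rate = effective_rate(row["total_chf"], row["hours"])
    print(f"{name}: CHF {rate:.2f}/h")

project_total = sum(row["total_chf"] for row in PARTNERS.values())
print(f"Project total: CHF {project_total:,}")
```

The Wecan total implies an effective rate of about CHF 104.89/h; at exactly CHF 105/h the same hours would come to CHF 241,500, so the listed rate is best read as rounded.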
11. Risk Categories
| Category | Count | Top Risks |
|---|---|---|
| Technical | 28 | Model performance, integration issues |
| Resource | 18 | Team availability, partner constraints |
| Process | 14 | GO/NO-GO criteria, deliverable quality |
| External | 12 | CRM access, pilot partner delays |
| Total | 72 | 3 per month |

12. Repository Structure
/
├── CLAUDE.md # Project instructions
├── AUTONOMOUS_INSTRUCTIONS.md # Autonomous execution guide
├── status.md # Current project status
├── changelog.md # Version history
├── deliverables/ # Deliverable templates (19)
├── docs/ # Documentation
│ ├── assets/charts/ # 10 visualizations
│ └── work-packages/ # WP1-WP5 details
├── execution/ # M01-M24 JSON files (314 tasks)
├── presentations/ # Beamer slides
├── scripts/ # Generation scripts
├── tools/ # Validation tools
└── web/ # Wiki interface
13. Current State
| Metric |
Value |
Status |
| Execution Plans |
24/24 months |
Complete |
| Total Tasks |
314 |
Complete |
| Budget Allocated |
5,798h of 5,800h |
99.97% |
| Deliverable Templates |
19/19 |
In Progress |
| Charts |
10/10 |
Complete |
| Validation |
0 errors, 0 warnings |
Passed |
| Version |
1.5.0 |
Current |
Generated: 2026-01-04
Source: Innosuisse Application 133.672 IP-SBM