AI Orchestrator

A Specialized and Secure AI Orchestrator for Swiss Financial Compliance

Project on GitHub: Digital-AI-Finance/wecan-innosuisse-ai-draft

Data Pipeline


Overview

The AI Orchestrator project uses a six-stage pipeline to transform the Innosuisse application PDF into structured data, monthly execution plans, GitHub issues, and interactive documentation.


Pipeline Architecture

graph TD
    A[Innosuisse Application PDF] -->|extract_pdf.py| B[extracted_content.json]
    B -->|generate_project_plan.py| C[project_plan_data.json]
    B -->|generate_project_plan.py| D[generated_project_plan.md]
    C -->|generate_execution_jsons.py| E[24 Monthly JSONs M01-M24]
    E -->|generate_dashboard.py| F[execution/dashboard.html]
    E -->|update_wiki.py| G[web/wiki.html]
    E -->|consolidate_by_partner.py| H[48 GitHub Issues]
    H -->|fetch_progress.py| I[progress_snapshot.json]
    I -->|generate_progress_dashboard.py| J[web/progress.html]
    I -->|generate_monthly_report.py| K[Monthly Status Reports]
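
The diagram above is a dependency graph, so a valid run order can be derived from it mechanically. The following is a minimal sketch, not part of the project's scripts: the `PIPELINE` dictionary and the `build_order` helper are illustrative names, and the node labels are shortened from the diagram.

```python
from graphlib import TopologicalSorter

# Each key is an artifact; its value is the set of artifacts it is built from.
# The entries mirror the diagram; the dictionary itself is an illustration.
PIPELINE = {
    "extracted_content.json": {"Innosuisse Application PDF"},
    "project_plan_data.json": {"extracted_content.json"},
    "generated_project_plan.md": {"extracted_content.json"},
    "monthly JSONs (M01-M24)": {"project_plan_data.json"},
    "execution/dashboard.html": {"monthly JSONs (M01-M24)"},
    "web/wiki.html": {"monthly JSONs (M01-M24)"},
    "GitHub issues": {"monthly JSONs (M01-M24)"},
    "progress_snapshot.json": {"GitHub issues"},
    "web/progress.html": {"progress_snapshot.json"},
    "monthly status reports": {"progress_snapshot.json"},
}


def build_order(graph):
    """Return artifacts in an order where every input precedes its outputs."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    for step, artifact in enumerate(build_order(PIPELINE), start=1):
        print(f"{step:2d}. {artifact}")
```

Any order produced this way regenerates an artifact only after everything it depends on, which is the property to preserve when rerunning part of the pipeline by hand.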

Stage 1: PDF Extraction

Script: scripts/extract_pdf.py

| Input | Output |
| --- | --- |
| Innosuisse application PDF (42 pages) | data/extracted_content.json (587 KB) |

Extracts:

Dependencies: pdfplumber
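
For orientation, a page-by-page extraction with pdfplumber might look roughly like the sketch below. It is not the contents of scripts/extract_pdf.py: the function names and the output layout (`page_count`, `pages`, `text`, `tables`) are assumptions made for illustration, and only `pdfplumber.open`, `page.extract_text()` and `page.extract_tables()` are real library calls.

```python
import json
from pathlib import Path


def build_payload(pages):
    """Wrap per-page text and tables in a JSON-serializable dict.

    `pages` is a list of (text, tables) tuples, one per PDF page.
    The output layout is an assumption, not the project's actual schema.
    """
    return {
        "page_count": len(pages),
        "pages": [
            {"number": number, "text": text or "", "tables": tables or []}
            for number, (text, tables) in enumerate(pages, start=1)
        ],
    }


def extract_pdf(pdf_path, out_path):
    """Read every page with pdfplumber and write the payload as JSON."""
    # Third-party dependency, imported here so build_payload works without it.
    import pdfplumber

    with pdfplumber.open(pdf_path) as pdf:
        pages = [(page.extract_text(), page.extract_tables()) for page in pdf.pages]

    payload = build_payload(pages)
    Path(out_path).write_text(
        json.dumps(payload, ensure_ascii=False, indent=2), encoding="utf-8"
    )
    return payload
```

Keeping the pdfplumber call separate from the payload builder means the JSON layout can be tested without a PDF on disk.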


Stage 2: Project Plan Generation

Script: scripts/generate_project_plan.py

| Input | Output |
| --- | --- |
| data/extracted_content.json | data/project_plan_data.json (47.8 KB) |
| | docs/generated_project_plan.md |

Generates:


Stage 3: Execution JSON Generation

Script: scripts/generate_execution_jsons.py

| Input | Output |
| --- | --- |
| data/project_plan_data.json | 24 files: execution/M01_January2026.json through M24_December2027.json |
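
The file names follow a month-index, month-name, year pattern. The sketch below generates the 24 names under the assumption that the months are consecutive calendar months starting in January 2026. Only the first and last names are given above, so the intermediate ones are inferred. `monthly_filenames` is a hypothetical helper, not a function from the project.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def monthly_filenames(start_year=2026, months=24):
    """Return names like M01_January2026.json for consecutive months.

    Assumes the plan starts in January of `start_year`.
    """
    names = []
    for index in range(months):
        year = start_year + index // 12
        month_name = MONTH_NAMES[index % 12]
        names.append(f"M{index + 1:02d}_{month_name}{year}.json")
    return names


if __name__ == "__main__":
    for name in monthly_filenames():
        print(f"execution/{name}")
```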

Each JSON file contains:

Totals: 314 tasks, 19 deliverables, 5 milestones, 72 risks, 5,798 hours
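
Totals like these can be recomputed from the monthly files. The sketch below assumes each file holds `tasks`, `deliverables`, `milestones` and `risks` lists and that each task carries an `hours` number. Those key names are guesses for illustration and may not match the real schema in execution/schema.json.

```python
import json
from pathlib import Path


def summarize(months):
    """Add up item counts and task hours across monthly plan dicts."""
    totals = {"tasks": 0, "deliverables": 0, "milestones": 0, "risks": 0, "hours": 0}
    for month in months:
        tasks = month.get("tasks", [])
        totals["tasks"] += len(tasks)
        totals["hours"] += sum(task.get("hours", 0) for task in tasks)
        for key in ("deliverables", "milestones", "risks"):
            totals[key] += len(month.get(key, []))
    return totals


def load_months(directory="execution"):
    """Load every Mnn_*.json file in name order."""
    paths = sorted(Path(directory).glob("M[0-9][0-9]_*.json"))
    return [json.loads(path.read_text(encoding="utf-8")) for path in paths]


if __name__ == "__main__":
    print(summarize(load_months()))
```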


Stage 4: HTML Generation

Dashboard

Script: scripts/generate_dashboard.py

| Input | Output |
| --- | --- |
| execution/M01-M24 JSON files | execution/dashboard.html |

Interactive dashboard with all 24 months embedded.
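
One common way to embed data in a single HTML file is to serialize it into a JSON script tag. The sketch below shows that technique only. It is not taken from scripts/generate_dashboard.py, and the `render_dashboard` name, the `months` element id and the page skeleton are all assumptions.

```python
import json


def render_dashboard(months):
    """Return a standalone HTML page with the monthly data embedded as JSON."""
    # Escape "</" so a string such as "</script>" inside the data cannot
    # close the script element early; "\/" is a valid JSON escape for "/".
    data = json.dumps(months, ensure_ascii=False).replace("</", "<\\/")
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8"><title>Execution Dashboard</title></head>\n'
        "<body>\n"
        f'<script id="months" type="application/json">{data}</script>\n'
        "</body></html>\n"
    )
```

Client-side code can then read the data with `JSON.parse(document.getElementById("months").textContent)`, so the page works offline as a single file.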

Wiki

Script: scripts/update_wiki.py

| Input | Output |
| --- | --- |
| data/ JSONs, docs/, deliverables/ | web/wiki.html |

Three-tab wiki: PROJECT (overview), PLAN (monthly details), APPLICATION (PDF content).


Stage 5: GitHub Integration

Issue Creation

Script: tools/consolidate_by_partner.py

| Input | Output |
| --- | --- |
| Execution JSON files | 48 GitHub issues (2 per month: FHGR + Wecan) |

Each issue contains a checklist of tasks for that partner/month.
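
The per-partner grouping can be sketched as follows. This builds issue titles and checklist bodies in memory and makes no GitHub API calls. The `partner` and `title` task keys, the title format and the function names are hypothetical, not taken from tools/consolidate_by_partner.py.

```python
def issue_body(task_titles):
    """Render one unchecked Markdown checkbox per task."""
    return "\n".join(f"- [ ] {title}" for title in task_titles)


def plan_issues(months, partners=("FHGR", "Wecan")):
    """Build one issue per (month, partner) pair.

    `months` maps a month id such as "M01" to a list of task dicts
    with assumed "partner" and "title" keys.
    """
    issues = []
    for month_id, tasks in months.items():
        for partner in partners:
            titles = [task["title"] for task in tasks if task["partner"] == partner]
            issues.append({"title": f"{month_id} {partner}", "body": issue_body(titles)})
    return issues
```

With 24 months and two partners this yields 48 issue drafts, matching the count above.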

Progress Tracking

Script: tools/fetch_progress.py

| Input | Output |
| --- | --- |
| GitHub API (48 issues) | data/progress_snapshot.json |

Parses checkbox states and accomplishment comments from GitHub issues.
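
Checkbox parsing of this kind usually comes down to one regular expression over the issue body. The sketch below covers that step in isolation, assuming GitHub-style task list items (`- [ ]` and `- [x]`). It does not fetch anything from the API, and `parse_checklist` and `completion` are illustrative names.

```python
import re

# Matches "- [ ] text", "- [x] text" and "* [X] text" on a single line.
CHECKBOX = re.compile(r"^\s*[-*]\s+\[([ xX])\]\s+(.+?)\s*$")


def parse_checklist(body):
    """Return one {"task", "done"} dict per checkbox line in an issue body."""
    items = []
    # splitlines() also handles the \r\n line endings some issue bodies use.
    for line in body.splitlines():
        match = CHECKBOX.match(line)
        if match:
            mark, text = match.groups()
            items.append({"task": text, "done": mark in "xX"})
    return items


def completion(body):
    """Return (done, total) for the checkboxes in an issue body."""
    items = parse_checklist(body)
    return sum(item["done"] for item in items), len(items)
```

Lines that are not checkboxes, such as free-text notes, are ignored rather than treated as errors.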


Stage 6: Reporting

Progress Dashboard

Script: tools/generate_progress_dashboard.py

| Input | Output |
| --- | --- |
| data/progress_snapshot.json | web/progress.html |

Visual dashboard showing task completion and accomplishment timeline.

Monthly Reports

Script: tools/generate_monthly_report.py

| Input | Output |
| --- | --- |
| data/progress_snapshot.json | reports/YYYY-MM_progress_report.md |

Pre-filled monthly status reports for steering committee review.
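
The YYYY-MM naming and the pre-filling step can be sketched as below. Only the path pattern comes from the table above. The report layout, the section names and both function names are assumptions for illustration.

```python
from datetime import date


def report_path(day):
    """Return the report path for the month containing `day`."""
    return f"reports/{day:%Y-%m}_progress_report.md"


def render_report(day, done, total):
    """Pre-fill a report with completion figures, leaving notes empty."""
    percent = 100 * done / total if total else 0.0
    lines = [
        f"Progress Report {day:%Y-%m}",
        "",
        f"Tasks completed: {done} of {total} ({percent:.1f}%)",
        "",
        "Highlights:",
        "Blockers:",
        "Decisions requested:",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    today = date.today()
    print(report_path(today))
    print(render_report(today, done=0, total=0))
```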


Data File Locations

| File | Location | Size | Purpose |
| --- | --- | --- | --- |
| extracted_content.json | data/ | 587 KB | Raw PDF extraction |
| project_plan_data.json | data/ | 47.8 KB | Structured plan data |
| progress_snapshot.json | data/ | Variable | GitHub progress cache |
| M01-M24 JSONs | execution/ | ~15 KB each | Monthly execution plans |
| schema.json | execution/ | ~3 KB | JSON schema for validation |
| wiki.html | web/ | ~4,500 lines | Interactive wiki |
| progress.html | web/ | Variable | Progress dashboard |
| dashboard.html | execution/ | Variable | Execution dashboard |

Validation

Run at any time to verify data integrity:

python tools/validate_all_months.py

Checks: structural integrity, budget compliance (5,800 hours), dependency chains, milestone criteria, and deliverable assignments.

Expected result: PASSED (0 errors, 0 warnings)
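
To make two of these checks concrete, the sketch below shows one way a budget check and a dependency check could work. It is not the logic of tools/validate_all_months.py. It assumes tasks with `id`, `hours` and `depends_on` keys, treats 5,800 hours as an upper limit, and invents the `FAILED` status. Only the `PASSED (0 errors, 0 warnings)` wording comes from this page.

```python
def check_budget(months, budget_hours=5800):
    """Return an error if planned task hours exceed the budget."""
    total = sum(
        task.get("hours", 0) for month in months for task in month.get("tasks", [])
    )
    if total > budget_hours:
        return [f"budget exceeded: {total}h > {budget_hours}h"]
    return []


def check_dependencies(months):
    """Return errors for unknown dependencies or ones scheduled too late."""
    scheduled = {}  # task id -> index of the month it is planned in
    for index, month in enumerate(months):
        for task in month.get("tasks", []):
            scheduled[task["id"]] = index

    errors = []
    for index, month in enumerate(months):
        for task in month.get("tasks", []):
            for dep in task.get("depends_on", []):
                if dep not in scheduled:
                    errors.append(f"{task['id']}: unknown dependency {dep}")
                elif scheduled[dep] > index:
                    errors.append(f"{task['id']}: {dep} is scheduled later")
    return errors


def format_result(errors, warnings=()):
    """Summarize in the PASSED/FAILED style; FAILED is an assumed label."""
    status = "PASSED" if not errors else "FAILED"
    return f"{status} ({len(errors)} errors, {len(warnings)} warnings)"
```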


See also: Tools Reference, Developer Setup