Overview
The project includes 15 active scripts and tools, plus 11 archived tools. Scripts in scripts/ handle the core data pipeline (PDF extraction through HTML generation). Tools in tools/ handle GitHub integration, validation, and reporting.
Core Scripts (scripts/)
extract_pdf.py
- Purpose: Extract all text, tables, and structure from the Innosuisse application PDF
- Input: Application/ folder (PDF file)
- Output: data/extracted_content.json (587 KB, 42 pages, 13 sections, 140 tables)
- Usage: python scripts/extract_pdf.py
- Dependencies: pdfplumber
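The extraction step can be sketched as follows. This is a minimal illustration and not the actual contents of extract_pdf.py: the function names (`extract_pdf`, `summarize_pages`) and the JSON layout are assumptions, and only `pdfplumber.open`, `page.extract_text()` and `page.extract_tables()` are real pdfplumber calls.

```python
import json
from pathlib import Path


def summarize_pages(pages):
    """Wrap per-page results in a JSON-ready dict (layout is an assumption)."""
    return {
        "page_count": len(pages),
        "table_count": sum(len(p["tables"]) for p in pages),
        "pages": pages,
    }


def extract_pdf(pdf_path, out_path="data/extracted_content.json"):
    """Read every page's text and tables with pdfplumber and write one JSON file."""
    import pdfplumber  # third-party dependency listed above

    pages = []
    with pdfplumber.open(pdf_path) as pdf:
        for number, page in enumerate(pdf.pages, start=1):
            pages.append({
                "page": number,
                "text": page.extract_text() or "",  # extract_text() may return None
                "tables": page.extract_tables(),
            })

    out = Path(out_path)
    out.parent.mkdir(parents=True, exist_ok=True)
    out.write_text(
        json.dumps(summarize_pages(pages), ensure_ascii=False, indent=2),
        encoding="utf-8",
    )
    return out
```

The sketch does not detect sections, which the real script reports alongside pages and tables.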
generate_project_plan.py
- Purpose: Generate month-by-month project plan from extracted PDF data
- Input: data/extracted_content.json
- Output: docs/generated_project_plan.md, data/project_plan_data.json
- Usage: python scripts/generate_project_plan.py
- Notes: Uses work package definitions from PDF Section 5
generate_execution_jsons.py
- Purpose: Create 24 individual monthly execution JSON files with tasks, deliverables, risks
- Input: data/project_plan_data.json
- Output: execution/M01_January2026.json through execution/M24_December2027.json
- Usage: python scripts/generate_execution_jsons.py
- Notes: Generates 314 tasks across 24 months
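The file naming scheme above (M01_January2026.json through M24_December2027.json) can be reproduced with a short helper. This is a sketch only; the function name `execution_filenames` and its parameters are hypothetical, not taken from the script.

```python
MONTH_NAMES = (
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
)


def execution_filenames(start_year=2026, start_month=1, count=24):
    """Return the monthly file names, numbered M01..M<count>."""
    names = []
    for i in range(count):
        offset = start_month - 1 + i          # months elapsed since January of start_year
        year = start_year + offset // 12
        month_name = MONTH_NAMES[offset % 12]
        names.append(f"M{i + 1:02d}_{month_name}{year}.json")
    return names
```

An explicit English month tuple is used instead of `calendar.month_name` so the names do not change with the system locale.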
generate_dashboard.py
- Purpose: Generate interactive HTML dashboard from execution JSONs
- Input: execution/M01-M24 JSON files
- Output: execution/dashboard.html
- Usage: python scripts/generate_dashboard.py
update_wiki.py
- Purpose: Update wiki.html with Project/Plan/Application tabbed views
- Input: data/ JSONs, docs/, deliverables/
- Output: web/wiki.html
- Usage: python scripts/update_wiki.py
Tools (tools/)
add_to_project.py
- Purpose: Add all GitHub issues to a GitHub Project board
- Input: GitHub API (requires gh CLI)
- Output: Issues added to project board
- Usage: python tools/add_to_project.py
consolidate_by_partner.py
- Purpose: Create consolidated GitHub issues, one per partner (FHGR and Wecan) for each of the 24 months, 48 in total
- Input: Execution JSON files
- Output: 48 GitHub issues with task checklists
- Usage: python tools/consolidate_by_partner.py
- Notes: Includes retry with exponential backoff
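Retry with exponential backoff generally looks like the sketch below. The tool's actual retry parameters are not documented here, so the attempt count, delays and the name `retry_with_backoff` are all assumptions chosen for illustration.

```python
import time


def retry_with_backoff(func, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       sleep=time.sleep):
    """Call func(); on failure wait base_delay * 2**attempt (capped) and retry."""
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            delay = min(max_delay, base_delay * 2 ** attempt)
            sleep(delay)
```

Usage would wrap each issue-creation call, for example `retry_with_backoff(lambda: create_issue(payload))`, where `create_issue` stands in for whatever function talks to GitHub. A production version would typically catch only transient errors and add random jitter to the delay.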
fetch_progress.py
- Purpose: Fetch task completion progress from GitHub issues
- Input: GitHub API (48 consolidated issues)
- Output: data/progress_snapshot.json
- Usage: python tools/fetch_progress.py
- Config: GITHUB_REPO env var (default: wecangroup/wecan-innosuisse-ai)
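Reading the repository from the environment with a fallback can be sketched as follows. The helper name `resolve_repo` is hypothetical; only the GITHUB_REPO variable and its default come from the configuration note above.

```python
import os

DEFAULT_REPO = "wecangroup/wecan-innosuisse-ai"


def resolve_repo(environ=None):
    """Return GITHUB_REPO if set and non-empty, otherwise the default repository."""
    environ = os.environ if environ is None else environ
    return environ.get("GITHUB_REPO") or DEFAULT_REPO
```

To point the tool at a fork, the variable would be set for a single run, for example `GITHUB_REPO=myorg/myfork python tools/fetch_progress.py` (the fork name is a placeholder).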
sync_to_project.py
- Purpose: Sync all issues to GitHub Project board (only adds missing ones)
- Input: GitHub API
- Output: Issues synced to project
- Usage: python tools/sync_to_project.py
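The "only adds missing ones" behaviour amounts to a set difference between the repository's issues and those already on the board. The sketch below shows that comparison in isolation; the function name and the use of issue numbers as identifiers are assumptions, and the GitHub calls that would supply the two lists are left out.

```python
def issues_to_add(repo_issue_numbers, project_issue_numbers):
    """Return repo issues not yet on the project board, preserving repo order."""
    on_board = set(project_issue_numbers)
    return [n for n in repo_issue_numbers if n not in on_board]
```

Because already-present issues are filtered out first, running the sync repeatedly should be safe.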
update_github_issues.py
- Purpose: Update GitHub issues with enhanced execution data (M19-M24)
- Input: Execution JSON files
- Output: Updated GitHub issues with dependency info and risk summaries
- Usage: python tools/update_github_issues.py
generate_monthly_report.py
- Purpose: Generate pre-filled monthly status report from progress data
- Input: data/progress_snapshot.json
- Output: reports/YYYY-MM_progress_report.md
- Usage: python tools/generate_monthly_report.py [month_number]
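The mapping from a month number to the YYYY-MM report path can be sketched as below. It assumes month 1 corresponds to January 2026, as the execution file names suggest; the function name `report_path` is hypothetical, and the behaviour when month_number is omitted is not covered.

```python
def report_path(month_number, start_year=2026, start_month=1):
    """Map project month 1..24 to reports/YYYY-MM_progress_report.md."""
    if not 1 <= month_number <= 24:
        raise ValueError("month_number must be between 1 and 24")
    offset = start_month - 1 + month_number - 1   # months since January of start_year
    year = start_year + offset // 12
    month = offset % 12 + 1
    return f"reports/{year}-{month:02d}_progress_report.md"
```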
generate_progress_dashboard.py
- Purpose: Generate HTML progress dashboard from snapshot data
- Input: data/progress_snapshot.json
- Output: web/progress.html
- Usage: python tools/generate_progress_dashboard.py
validate_all_months.py
- Purpose: Comprehensive validation of all 24 execution JSONs
- Input: execution/M01-M24 JSON files, execution/schema.json
- Output: Validation report (PASS/FAIL with error details)
- Usage: python tools/validate_all_months.py
- Checks: Structural integrity, budget compliance, dependency chains, milestone criteria, deliverable assignments
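A structural check of the kind listed above might look like this sketch. It covers only file count and top-level keys; the key names (`tasks`, `deliverables`, `risks`) are guessed from the description of the execution JSONs and are not taken from execution/schema.json, and the budget, dependency and milestone checks are not shown.

```python
import json
from pathlib import Path

REQUIRED_KEYS = ("tasks", "deliverables", "risks")  # assumed key names


def validate_month(path):
    """Return a list of error strings for one monthly execution file."""
    try:
        data = json.loads(Path(path).read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError) as exc:
        return [f"{path}: cannot load ({exc})"]
    if not isinstance(data, dict):
        return [f"{path}: top level is not an object"]
    return [f"{path}: missing '{key}'" for key in REQUIRED_KEYS if key not in data]


def validate_all(directory="execution", expected=24):
    """Check every M??_*.json file and return ("PASS" | "FAIL", errors)."""
    files = sorted(Path(directory).glob("M[0-9][0-9]_*.json"))
    errors = []
    if len(files) != expected:
        errors.append(f"expected {expected} monthly files, found {len(files)}")
    for file in files:
        errors.extend(validate_month(file))
    return ("PASS" if not errors else "FAIL"), errors
```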
enhance_execution_jsons.py
- Purpose: Enhance execution JSONs with additional detail (consolidates 4 archived scripts)
- Input: Execution JSON files
- Output: Enhanced execution JSON files
- Usage: python tools/enhance_execution_jsons.py [start_month] [end_month]
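The optional [start_month] [end_month] arguments could be parsed as in the sketch below. The defaults of 1 and 24 and the function name `parse_month_range` are assumptions; the real tool may behave differently when arguments are omitted.

```python
import argparse


def parse_month_range(argv=None):
    """Parse optional start/end month arguments into an inclusive range."""
    parser = argparse.ArgumentParser(description="Enhance execution JSONs")
    parser.add_argument("start_month", nargs="?", type=int, default=1)
    parser.add_argument("end_month", nargs="?", type=int, default=24)
    args = parser.parse_args(argv)
    if not 1 <= args.start_month <= args.end_month <= 24:
        parser.error("expected 1 <= start_month <= end_month <= 24")
    return range(args.start_month, args.end_month + 1)
```

With this shape, `python tools/enhance_execution_jsons.py 7 12` would cover months 7 through 12, the span formerly handled by the archived enhance_m7_m12.py.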
create_metainfo.py
- Purpose: Generate metainfo.txt for Quantlet metadata from Python files
- Input: Any .py file
- Output: metainfo.txt following Quantlet standards
- Usage: python tools/create_metainfo.py
Archived Tools
11 tools that have been superseded or were single-use:
| Tool | Purpose | Superseded By |
| --- | --- | --- |
| create_infrastructure_issues.py | Create GitHub verification infrastructure | consolidate_by_partner.py |
| reimport_failed.py | Re-import failed GitHub issue imports | One-time use |
| enhance_m1_m6.py | Enhance months 1-6 | enhance_execution_jsons.py |
| enhance_m7_m12.py | Enhance months 7-12 | enhance_execution_jsons.py |
| enhance_m13_m18.py | Enhance months 13-18 | enhance_execution_jsons.py |
| enhance_m19_m24.py | Enhance months 19-24 | enhance_execution_jsons.py |
| fix_execution_gaps.py | Fix gaps in execution data | One-time use |
| update_deliverables_v190.py | Update deliverables for v1.9.0 | One-time use |
| update_status_files.py | Update status files | One-time use |
| update_wiki_v190.py | Update wiki for v1.9.0 | update_wiki.py |
| update_work_packages.py | Update work package docs | One-time use |
Pipeline Flow
The data flows through scripts in this sequence:
- PDF Extraction: extract_pdf.py reads the Innosuisse application PDF and outputs extracted_content.json
- Plan Generation: generate_project_plan.py converts extracted data into project_plan_data.json
- Execution JSONs: generate_execution_jsons.py creates 24 monthly files (M01-M24)
- Dashboard: generate_dashboard.py creates execution/dashboard.html
- Wiki: update_wiki.py creates web/wiki.html
- GitHub Issues: consolidate_by_partner.py creates 48 issues on GitHub
- Progress Tracking: fetch_progress.py pulls completion data into progress_snapshot.json
- Progress Dashboard: generate_progress_dashboard.py creates web/progress.html
- Reports: generate_monthly_report.py creates monthly status reports
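The sequence above can be expressed as a small runner that invokes each script in order and stops at the first failure. This is a sketch rather than an existing project file: `PIPELINE` and `run_pipeline` are hypothetical names, and the script paths are taken from the Usage lines on this page.

```python
import subprocess
import sys

# Scripts in the order listed under Pipeline Flow.
PIPELINE = [
    "scripts/extract_pdf.py",
    "scripts/generate_project_plan.py",
    "scripts/generate_execution_jsons.py",
    "scripts/generate_dashboard.py",
    "scripts/update_wiki.py",
    "tools/consolidate_by_partner.py",
    "tools/fetch_progress.py",
    "tools/generate_progress_dashboard.py",
    "tools/generate_monthly_report.py",
]


def run_pipeline(steps=PIPELINE, runner=subprocess.run):
    """Run each step with the current interpreter; check=True aborts on failure."""
    for script in steps:
        runner([sys.executable, script], check=True)
```

Since consolidate_by_partner.py creates issues on GitHub, a full rerun may not be appropriate once the issues exist; passing a shorter `steps` list allows running only the local generation steps.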
See also: Developer Setup, Data Pipeline