python-utils

Python utilities for web tools and automation


Information

| Property | Value |
| --- | --- |
| Language | Python |
| Stars | 0 |
| Forks | 0 |
| Watchers | 0 |
| Open Issues | 25 |
| License | No License |
| Created | 2025-12-08 |
| Last Updated | 2026-02-19 |
| Last Push | 2025-12-28 |
| Contributors | 1 |
| Default Branch | main |
| Visibility | private |

Datasets

This repository includes 16 datasets:

| Dataset | Format | Size |
| --- | --- | --- |
| manifest.json | .json | 1.92 KB |
| neural-networks-template.json | .json | 3.64 KB |
| manifest.json | .json | 0.71 KB |
| manifest.json | .json | 3.2 KB |
| manifest.json | .json | 1.61 KB |
| manifest.json | .json | 1.85 KB |
| manifest.json | .json | 1.28 KB |
| test_mapping.json | .json | 0.96 KB |
| manifest.json | .json | 0.74 KB |
| manifest.json | .json | 3.22 KB |
| manifest.json | .json | 1.17 KB |
| course_config.example.json | .json | 0.51 KB |
| manifest.json | .json | 0.38 KB |
| manifest.json | .json | 0.45 KB |
| manifest.json | .json | 1.22 KB |
| config_schema.json | .json | 8.19 KB |

Reproducibility

This repository includes reproducibility tools:

  • Python requirements.txt

Status

  • Issues: Enabled
  • Wiki: Disabled
  • Pages: Disabled

README

Python Utils

A collection of Python utilities for web validation, quality assessment, and LaTeX/Beamer branding.

Tools

| Utility | Description |
| --- | --- |
| link_checker | Validate all links on a website with deep crawling |
| quality_checker | Check GitHub Pages quality against a reference site |
| screenshot_checker | Capture screenshots and analyze layout consistency |
| layout_enforcer | Strict layout validation against reference site templates |
| quantlet_tools | Add QuantLet branding (logo + QR overlays) to LaTeX slides |
| page_analytics | Cookie-free, GDPR-friendly analytics for GitHub Pages |
| pdf_source_linker | Map PDFs to LaTeX source folders and add GitHub links to HTML |
| notebooklm_podcast | Generate podcasts and videos from PDFs via NotebookLM |
| slide_reviewer | Catalog, review, and report on Beamer slide content |

Note: Organization management tools (health/quality/publish analysis, auto-fixer, analytics deployment) have moved to digital-ai-finance-organization-manager. Install with pip install -e . and use the org-manager CLI.

Shared Utilities

The shared/ module provides common utilities used across web validation tools:

  • crawler.py - BFS page discovery, URL handling, page fetching
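As an illustration of what BFS page discovery can look like, here is a minimal sketch. It is not the code in shared/crawler.py: the function name discover_pages, the fetch_links callback, and the max_depth parameter are all assumptions made for this example. Page fetching is injected as a callback so the traversal can be tried without network access.

```python
from collections import deque
from typing import Callable, Iterable
from urllib.parse import urldefrag, urljoin, urlparse


def discover_pages(
    start_url: str,
    fetch_links: Callable[[str], Iterable[str]],
    max_depth: int = 2,
) -> list[str]:
    """Return same-host pages reachable from start_url, in BFS order.

    fetch_links is a hypothetical callback that returns the raw href
    values found on a page; it stands in for real page fetching.
    """
    host = urlparse(start_url).netloc
    start, _ = urldefrag(start_url)          # ignore any #fragment
    seen = {start}
    order = []
    queue = deque([(start, 0)])

    while queue:
        url, depth = queue.popleft()
        order.append(url)
        if depth >= max_depth:
            continue                         # record the page, do not expand it
        for href in fetch_links(url):
            # Resolve relative links and drop fragments before comparing.
            absolute, _ = urldefrag(urljoin(url, href))
            if urlparse(absolute).netloc != host or absolute in seen:
                continue                     # external link or already queued
            seen.add(absolute)
            queue.append((absolute, depth + 1))
    return order


# A small in-memory "site" used in place of real HTTP requests.
site = {
    "https://example.org/": ["/a", "/b", "https://other.org/x"],
    "https://example.org/a": ["/c", "/#top"],
    "https://example.org/b": ["/a"],
    "https://example.org/c": [],
}
pages = discover_pages("https://example.org/", lambda u: site.get(u, []))
```

Using a queue rather than recursion visits pages level by level, which is what lets a depth limit such as the --depth option in the Quick Start examples cut the crawl off cleanly.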

Requirements

  • Python 3.9+
  • See individual tool READMEs for specific dependencies

Quick Start

# Link checking
python link_checker/link_checker.py https://your-site.io

# Quality checking with subpage crawling
python quality_checker/quality_checker.py https://your-site.io --depth 2

# Screenshot and layout analysis
pip install playwright && playwright install chromium
python screenshot_checker/screenshot_checker.py https://your-site.io --depth 2

# Layout enforcement - extract template and validate
python layout_enforcer/layout_enforcer.py extract https://ref-site.io -o template.json
python layout_enforcer/layout_enforcer.py validate https://new-site.io --template template.json

# QuantLet branding workflow (CLI)
quantlet brand slides.tex --repo https://github.com/Org/repo  # One-step
# Or step by step:
quantlet metadata --repo https://github.com/Org/repo
quantlet qr
quantlet latex slides.tex

# Page analytics - setup and report
python page_analytics/page_analytics.py setup --repo owner/repo --output ./analytics/
python page_analytics/page_analytics.py report --repo owner/repo --period 30d

# PDF source linker - map PDFs to LaTeX folders and add GitHub links
python pdf_source_linker/pdf_source_linker.py scan ./project --repo https://github.com/org/repo -o mapping.json
python pdf_source_linker/pdf_source_linker.py preview ./project --mapping mapping.json
python pdf_source_linker/pdf_source_linker.py apply ./project --mapping mapping.json --backup

# NotebookLM podcast - generate audio/video from PDFs
pip install playwright && playwright install chromium
python notebooklm_podcast/notebooklm_podcast.py setup
python notebooklm_podcast/notebooklm_podcast.py podcast lecture.pdf -o podcast.mp3

# Slide reviewer - catalog and review Beamer presentations
python slide_reviewer/slide_reviewer.py presentation.tex
python slide_reviewer/slide_reviewer.py presentation.tex -f json -o catalog.json
python slide_reviewer/slide_reviewer.py presentation.tex -f html -o report.html

File Governance

This repository uses a manifest-based file tracking system:

# Check Python file budget (max 50)
python validate_manifest.py --budget

# Validate all manifests
python validate_manifest.py

| File | Purpose |
| --- | --- |
| manifest.json | Root index of all 9 tools |
| */manifest.json | Per-tool file tracking |
| ARCHITECTURE.md | Folder structure and governance rules |
| validate_manifest.py | Validation and budget checking |

Current: 50/100 Python files

See ARCHITECTURE.md for detailed structure and contribution guidelines.
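To make the idea of a Python file budget concrete, the following is a rough sketch of how such a check could work. It does not reproduce validate_manifest.py: the function names, the skipped directory names, and the choice to pass the budget as a parameter are assumptions for illustration only.

```python
from pathlib import Path

# Directories assumed not to count toward the budget (hypothetical choice).
SKIP_DIRS = frozenset({".git", "__pycache__", ".venv"})


def count_python_files(root: Path) -> int:
    """Count .py files under root, ignoring anything inside SKIP_DIRS."""
    return sum(
        1
        for path in root.rglob("*.py")
        if not SKIP_DIRS.intersection(path.relative_to(root).parts)
    )


def check_budget(root: Path, budget: int) -> tuple[bool, str]:
    """Return (within_budget, summary) for the tree rooted at root."""
    count = count_python_files(root)
    return count <= budget, f"Current: {count}/{budget} Python files"
```

Calling check_budget(Path("."), budget) from the repository root would return a pass/fail flag together with a summary line in the same shape as the status line above; a wrapper script could exit non-zero when the flag is False so that the check fails in CI.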

License

MIT