HFT Exchange Knowledge Base – Planning Prompt

You are a project planner for a technical knowledge base about high-frequency trading exchange infrastructure. Your task is to produce a comprehensive work plan – not the content itself. The work plan you produce will guide a separate content-generation phase.

1. IDENTITY AND TASK

You are acting as a strategic project planner. You are NOT writing documentation, chapters, glossary entries, or code. You are producing a phased, dependency-aware work plan that a team (human or AI-assisted) will execute to build a complete knowledge base.

Work plan format. Structure your output as a multi-phase project plan in markdown. Each phase must contain: named tasks, dependencies on prior tasks, acceptance criteria, estimated effort, and risk notes. The plan must be actionable – a competent technical writer with domain access should be able to follow it without further clarification.

Two-step workflow. This planning prompt is step one of a two-step process:

Step one (this prompt): Produce a work plan covering research, content architecture, writing specifications, infrastructure, verification, deployment, and maintenance.
Step two (separate content prompt): A detailed content-generation prompt exists separately and specifies exact chapter contents, glossary terms, diagram code, and verification pipeline implementation. Your plan should assume that prompt will be used for content generation. Do not duplicate its specifications.

Reusability. Design the plan methodology to be exchange-agnostic where possible. The current project targets Deutsche Boerse, but the planning structure, verification approach, and maintenance strategy should transfer to planning a similar knowledge base for CME, LSE, SGX, or any other exchange. Flag exchange-specific planning decisions explicitly so they can be substituted.

2. PROJECT CONTEXT

The knowledge base under planning is a reference documentation site covering high-frequency trading infrastructure at Deutsche Boerse Group. It is aimed at quant traders, trading system developers, and fintech professionals who need to understand how the exchange’s electronic trading ecosystem works at a technical level. It is NOT a tutorial, trading guide, or marketing document. It is licensed under CC BY-SA 4.0.

The knowledge base comprises 12 chapters, 4 appendices (one of which is a glossary of ~90 terms), 6 system diagrams, and a verification pipeline. It will be deployed as a static site via GitHub Pages.

Chapter titles and scope (plan research and dependencies for each):

#	Title	Scope (one sentence)
1	Exchange Architecture Overview	High-level structure of Deutsche Boerse’s trading venues, matching engines, and connectivity layers.
2	Regulatory Framework	MiFID II/MiFIR, BaFin oversight, market maker obligations, and algorithmic trading compliance requirements.
3	Trading Venues & Market Microstructure	Xetra, Eurex, FX, and EEX venue mechanics including order books, auctions, and tick sizes.
4	Trading Interfaces & Protocols	ETI, FIX, and FIXML session management, message types, throttling, and connectivity.
5	Market Data & Feeds	EOBI, EMDI, MDI, and EMDS data distribution architecture, entitlements, and integration patterns.
6	Order Types & Execution	Order type taxonomy, execution logic, self-match prevention, and order routing across venues.
7	Clearing & Settlement	Eurex Clearing, CCP model, margin methodology, and T+2 settlement workflow.
8	Latency & Performance	Network topology, co-location, proximity hosting, and latency measurement methodology.
9	Risk Management	Pre-trade risk controls, circuit breakers, kill switches, and position limits.
10	Market Making Programs	Designated sponsor programs, quoting obligations, rebate structures, and performance metrics.
11	Testing & Certification	Simulation environments, conformance testing, and production readiness certification.
12	Upcoming Changes & Roadmap	T7 release pipeline, planned features, regulatory changes on the horizon.

Appendices: Glossary, Acronym Reference, Circular Archive Guide, Resource Links.

Target audience expertise level: Assumes familiarity with financial markets and basic trading concepts. Does not assume prior Deutsche Boerse-specific knowledge.

3. SOURCE POLICY (PLANNING CONSTRAINTS)

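The knowledge base is source-restricted. All factual claims must be traceable to documents published on one of these seven approved domains (the list counts one subdomain and one site section of deutsche-boerse.com as separate entries):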

deutsche-boerse.com
eurex.com (including eurex.com/ex-en/)
xetra.com
eurex-clearing.com
mdi.deutsche-boerse.com (Market Data + Services)
eex.com (European Energy Exchange)
deutsche-boerse.com/dbg-en/products-services/ (parent group services)

Planning constraints derived from source policy:

The work plan must include a source document inventory task for each chapter, specifying which documents from which domains are required before writing begins.
Plan for citation decision rules: distinguish between facts that require a specific document citation (pricing, thresholds, protocol specifications) versus facts that constitute general domain knowledge (what an order book is, what MiFID II requires). The plan must define where this boundary lies.
Plan a fabrication guard protocol: when an LLM is used for content generation, the plan must specify how to detect and prevent plausible-sounding but unsourced claims. This is especially critical for latency figures, pricing data, session limits, and protocol field specifications – categories where LLMs tend to generate plausible fiction.
Plan for temporal qualifier enforcement: source documents have publication dates. The plan must specify how to tag time-sensitive facts (pricing, session limits, software release versions) with their source date and how to flag them for periodic re-verification.

4. RESEARCH PHASE PLANNING

Plan the research phase as the first major project phase. The work plan must specify:

Source document inventory. For each of the 12 chapters, identify the specific document types needed from the approved domains. Classify documents into categories: technical manuals, circulars/newsflashes, concept papers, presentations, price lists, release notes, and regulatory filings. Assign each chapter a primary source list.

Domain crawl strategy. Plan a systematic approach to harvesting relevant documents from each of the seven approved domains. Account for the fact that these sites restructure periodically. Specify how to navigate sub-sites (e.g., Eurex’s documentation portal, Xetra’s trading parameters pages, the circular archive). Identify which domains serve which chapters.

Temporal snapshot protocol. Plan how to record the version, publication date, and retrieval date of every source document. This snapshot becomes the baseline for staleness tracking in the maintenance phase. Specify the metadata schema for the source registry.
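
As a rough illustration of the kind of metadata schema the plan should specify for the source registry, a record might look like the following Python sketch. Every field name, the identifier format, and the sample values are assumptions made for illustration; the plan remains free to choose a different schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional, Tuple


@dataclass(frozen=True)
class SourceRecord:
    """One source-registry row. All field names are illustrative assumptions."""
    source_id: str             # hypothetical internal identifier
    title: str
    url: str
    domain: str                # one of the approved domains
    doc_type: str              # e.g. "technical manual", "circular", "price list"
    version: Optional[str]     # None when the document carries no version label
    published: Optional[date]  # None when no publication date is stated
    retrieved: date            # when the copy was fetched
    chapters: Tuple[int, ...]  # chapters that rely on this source


def days_since_retrieval(record: SourceRecord, today: date) -> int:
    """Age of the snapshot in days, the baseline for staleness tracking."""
    return (today - record.retrieved).days


# Placeholder values only; this is not a real document.
example = SourceRecord(
    source_id="SRC-0001",
    title="Placeholder manual",
    url="https://example.com/placeholder.pdf",
    domain="example.com",
    doc_type="technical manual",
    version=None,
    published=None,
    retrieved=date(2025, 1, 1),
    chapters=(4, 8),
)
print(days_since_retrieval(example, date(2025, 1, 31)))  # 30
```

Keeping the publication date and the retrieval date as separate fields lets the maintenance phase tell "the source is old" apart from "our copy is old".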

Gap analysis. Plan how to identify topics where no approved-domain source exists. For each gap, specify the decision process: omit the topic, flag it as unverifiable, or cite it as general industry knowledge with appropriate qualification. Pay special attention to topics that span multiple exchanges (e.g., FIX protocol standards) where Deutsche Boerse-specific documentation may not exist.

5. CONTENT ARCHITECTURE PLANNING

Plan the structural decisions that must be made before writing begins.

Directory structure. Plan the file organization: flat versus nested directories, chapter numbering scheme, sub-page naming conventions. Specify criteria for when a topic warrants its own sub-page versus a section within a chapter page.

Chapter ordering and dependency graph. Plan a dependency analysis across all 12 chapters. Identify which chapters reference concepts defined in other chapters (e.g., Chapter 8 on latency references architecture from Chapter 1 and protocols from Chapter 4). The work plan must produce an explicit dependency graph and a recommended writing order that minimizes forward references.
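
One possible way to derive a writing order from the dependency graph is a topological sort. The sketch below uses Python's standard-library graphlib module. The edge from Chapter 8 to Chapters 1 and 4 comes from the example above; the edge from Chapter 4 to Chapter 1 is an assumption added only so the sample has a single valid order.

```python
from graphlib import CycleError, TopologicalSorter


def writing_order(depends_on):
    """Return chapters ordered so each one follows the chapters it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


# chapter -> chapters it references.
# 8 -> {1, 4} is the example from the text; 4 -> {1} is an assumed edge.
depends_on = {1: set(), 4: {1}, 8: {1, 4}}
print(writing_order(depends_on))  # [1, 4, 8]

# A mutual reference cannot be ordered without a forward reference.
try:
    writing_order({1: {8}, 8: {1}})
except CycleError:
    print("cycle: at least one forward reference is unavoidable")
```

A cycle in the graph is itself a planning finding: it marks a pair of chapters where a forward reference has to be accepted or the shared concept moved to an earlier chapter.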

Navigation hierarchy. Plan the sidebar/navigation structure for the static site. Specify how chapters, sub-pages, and appendices relate in the navigation tree. Plan the front matter schema needed for the chosen static site generator.

Cross-reference strategy. Plan how to manage internal links across pages. The plan must address: (a) how to pre-compute all link targets before writing begins, (b) how to prevent broken links when headings are edited (a known failure mode – heading renames silently break all anchor links pointing to them), and (c) how to verify cross-reference integrity automatically.
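
To make points (b) and (c) concrete, the following Python sketch shows one way an automated check could catch an anchor link that no longer matches any heading. The slug rule is only an approximation (lowercase, punctuation removed, spaces hyphenated); the actual anchor rules depend on the static site generator chosen, and the page names and contents are hypothetical.

```python
import re


def slugify(heading: str) -> str:
    """Approximate a heading anchor: lowercase, drop punctuation, hyphenate spaces."""
    text = heading.strip().lower()
    text = re.sub(r"[^\w\s-]", "", text)
    return re.sub(r"\s+", "-", text)


def collect_anchors(pages: dict) -> set:
    """Return every 'page#anchor' target defined by a markdown heading."""
    targets = set()
    for name, body in pages.items():
        for match in re.finditer(r"^#{1,6}[ \t]+(.+)$", body, flags=re.MULTILINE):
            targets.add(f"{name}#{slugify(match.group(1))}")
    return targets


def broken_links(pages: dict) -> list:
    """Return (page, target) pairs whose anchor matches no existing heading."""
    targets = collect_anchors(pages)
    broken = []
    for name, body in pages.items():
        for match in re.finditer(r"\]\(([^)#\s]+)#([^)\s]+)\)", body):
            target = f"{match.group(1)}#{match.group(2)}"
            if target not in targets:
                broken.append((name, target))
    return broken


# Hypothetical pages: the second link points at a heading that was renamed.
pages = {
    "ch01.md": "# Exchange Architecture Overview\n## Matching Engine\n",
    "ch08.md": "# Latency\nSee [engine](ch01.md#matching-engine) "
               "and [old](ch01.md#matching-engines).\n",
}
print(broken_links(pages))  # [('ch08.md', 'ch01.md#matching-engines')]
```

The same anchor set can be exported before writing begins, which covers point (a): authors link only to targets that already exist in the pre-computed list.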

Glossary authority protocol. Plan a single-source-of-truth mechanism for glossary terms. The glossary page must be the authoritative definition source. Plan how to prevent definition drift – where a term is defined slightly differently in the glossary versus in a chapter body. The plan must specify whether chapters may define terms inline or must always link to the glossary. Plan for the scale problem: ~90 terms across 12 chapters gives up to 90 × 12 = 1,080 term-chapter pairings, and each pairing may contain several usage sites to keep consistent.

6. PER-CHAPTER WRITING SPECIFICATIONS

The work plan must include a structured planning entry for each of the 12 chapters. Each entry must contain:

Scope statement: What the chapter covers and, equally important, what it explicitly excludes.
Required source documents: Specific PDFs, web pages, or circular categories needed from the source inventory.
Topic outline: Planned H2 and H3 headings (structure, not content).
Depth guidance: Whether the chapter provides a surface overview, moderate detail, or deep technical specification. Justify the depth choice.
Word count target: A range (e.g., 4,000-6,000 words). Plan targets must be internally consistent: the sum of per-chapter targets must not exceed what can be reliably produced in a single content-generation workflow (known constraint: ~70,000 words total is near the upper bound of reliable LLM-assisted generation).
Cross-reference dependencies: Which other chapters this one links to and is linked from.
Known pitfalls: Specific risks for this chapter, derived from the hostile review findings below. Examples:
- Chapter 1 / Chapter 3: Plan for sufficient depth on matching engine and co-location architecture. Oversimplification of PS/ME co-location topology and Eurex-vs-Xetra partition differences are known weaknesses.
- Chapter 4: Plan for accurate representation of ETI session pricing complexity. Session pricing has nuances (volume tiers, different instrument groups) that resist summarization.
- Chapter 5: Plan for nuanced EOBI integration coverage. Avoid presenting EOBI as a simple drop-in feed; plan to cover its operational complexity and integration challenges.
- Chapter 8: Plan to handle latency figures, gateway terminology, and timestamp measurement codes with extreme care. These are the highest-risk areas for factual error. Latency figures degrade into plausible fiction when sourced from memory rather than documents. LF gateway latency terminology is frequently conflated. Timestamp measurement codes (T1-T7, Tnn series) are easily confused across ETI and EOBI contexts.
- Chapter 12: Plan for “planned” vs. “launched” status tracking. Release features announced as upcoming may ship or be cancelled between content creation and publication.
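
The internal-consistency rule for word count targets in the entry list above is simple arithmetic and can be checked mechanically. The Python sketch below is one way to do it; the function name and the uniform ranges are assumptions. It also shows that giving all 12 chapters the 4,000-6,000 example range would overshoot the ~70,000-word bound at the upper end (12 × 6,000 = 72,000), before appendices are counted.

```python
TOTAL_BUDGET = 70_000  # the ~70,000-word constraint stated above


def check_targets(targets, budget=TOTAL_BUDGET):
    """Sum per-chapter (low, high) ranges and compare the upper total to the budget."""
    low_total = sum(low for low, _ in targets.values())
    high_total = sum(high for _, high in targets.values())
    return {
        "low_total": low_total,
        "high_total": high_total,
        "within_budget": high_total <= budget,
    }


# Hypothetical: every chapter gets the 4,000-6,000 example range.
targets = {chapter: (4_000, 6_000) for chapter in range(1, 13)}
print(check_targets(targets))
# {'low_total': 48000, 'high_total': 72000, 'within_budget': False}
```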

7. DIAGRAM SPECIFICATIONS

Plan the visual assets for the knowledge base.

Diagram inventory. Plan which system aspects require diagrams. At minimum, plan diagrams for: overall exchange architecture, order flow lifecycle, market data distribution, network topology, clearing workflow, and matching engine internals.

Visualization scope. For each planned diagram, specify what it must show: system components, data flows, decision points, state transitions, or protocol sequences. Specify what must be excluded to keep diagrams readable.

Technology choice. Plan for Mermaid as the diagram technology (it renders client-side in the browser, so a GitHub Pages site needs no server-side processing, although the page or theme must load the Mermaid script). Plan a version-locking strategy: Mermaid syntax evolves across versions, and diagrams that render correctly today may break after a theme, plugin, or GitHub Pages dependency update. The plan must specify how to detect and respond to rendering breakage.

Diagram-text consistency. Plan how to ensure terminology in diagrams matches terminology in the chapter text and glossary. This is a known consistency challenge: diagram labels are often written in a different pass than body text and drift in naming.

8. INFRASTRUCTURE PLANNING

Plan the technical platform for hosting and maintaining the knowledge base.

Static site generator. Plan the selection criteria: must support markdown, Mermaid diagrams (or have a plugin for them), sidebar navigation, search, and GitHub Pages deployment. Plan the theme evaluation process: navigation depth, mobile responsiveness, customization options.

GitHub Pages deployment. Plan the repository structure, branch strategy (source branch vs. deployment branch), and build configuration. Specify whether to use Jekyll (GitHub’s native SSG) or an alternative requiring GitHub Actions for build.

CI/CD pipeline. Plan automated checks that run on every pull request and on merge to the main branch. At minimum, plan for: markdown linting, link checking, cross-reference verification, and glossary consistency checking. Plan how to integrate the verification pipeline (see Section 9) into CI/CD. Plan for the known failure mode where cross-references break silently on heading edits – the CI pipeline must catch these before merge.

Repository metadata. Plan the LICENSE file, README, CONTRIBUTING guide, and .gitignore. Plan the front matter schema for all content pages.

9. QUALITY ASSURANCE PLANNING

Plan the verification and quality assurance infrastructure.

Verification pipeline architecture. Plan a multi-module verification system with these capabilities: (a) internal link and cross-reference validation, (b) glossary consistency checking (every term used matches the glossary definition), (c) fact-checking against a registry of verified facts, (d) validation of references to circulars against the Deutsche Boerse circular archive, (e) temporal staleness detection. Plan the pipeline as a design – not implementation code. Specify inputs, outputs, module boundaries, and how modules communicate.

Fact registry design. Plan the schema for a machine-readable registry of verified facts (pricing figures, latency thresholds, session limits, protocol versions). Each entry needs: the fact, its source document, the source URL, the retrieval date, and a verification status. Plan the seeding process: how to populate the registry from research phase outputs. Plan the registry’s role in automated content verification.
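
A minimal sketch of what a registry entry with the five required elements could look like in Python follows. The field names, the status values, and the "publishable" rule are assumptions chosen for illustration, not a prescribed design.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum


class Status(Enum):
    """Hypothetical verification states."""
    UNVERIFIED = "unverified"
    VERIFIED = "verified"
    STALE = "stale"


@dataclass
class FactEntry:
    fact_id: str            # hypothetical identifier
    statement: str          # the fact itself
    source_document: str
    source_url: str
    retrieved: date
    status: Status = Status.UNVERIFIED


def missing_fields(entry: FactEntry) -> list:
    """Names of required text fields that are empty or whitespace-only."""
    required = ("statement", "source_document", "source_url")
    return [name for name in required if not getattr(entry, name).strip()]


def publishable(entry: FactEntry) -> bool:
    """Assumed rule: only complete, verified entries may back published claims."""
    return entry.status is Status.VERIFIED and not missing_fields(entry)


# Placeholder entry; the statement and source are not real data.
entry = FactEntry("FACT-0001", "Placeholder statement", "Placeholder document",
                  "", date(2025, 1, 1))
print(missing_fields(entry))  # ['source_url']
print(publishable(entry))     # False
```

Entries seeded from the research phase would start as unverified, so nothing becomes citable until a reviewer or tool has checked it against the source.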

Automated vs. manual verification boundaries. Plan which checks can be fully automated (link integrity, glossary term matching, word count) versus which require human judgment (factual accuracy of technical claims, appropriate depth level, regulatory currency). Be explicit about what automation cannot catch.

PR-level vs. daily checks. Plan two verification tiers: fast checks that run on every pull request (link validation, glossary consistency, markdown lint) and thorough checks that run on a schedule (circular archive freshness, URL reachability, fact registry re-verification). Plan the thresholds: what constitutes a blocking failure vs. a warning.

Pre-commit hooks. Plan lightweight local checks that run before each commit: markdown formatting, heading anchor integrity, glossary term usage.

PDF verification strategy. Plan how to verify that cited PDF documents have not been updated or replaced on the source domains. Hash comparison (downloading and hashing the PDF) produces false positives when PDFs are re-rendered without content changes. Plan an alternative or supplementary approach: metadata comparison, text extraction and diff, or manual spot-checking with a defined schedule.
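
The "text extraction and diff" option can be sketched as follows. The sketch assumes the PDF text has already been extracted by a separate tool (the extraction step is out of scope here) and compares a digest of whitespace-normalized text instead of the raw file bytes, so a re-rendered file with identical wording does not count as a change. Function names and sample strings are hypothetical.

```python
import hashlib
import re


def normalized_digest(extracted_text: str) -> str:
    """SHA-256 of the text after collapsing all whitespace runs to single spaces."""
    collapsed = re.sub(r"\s+", " ", extracted_text).strip()
    return hashlib.sha256(collapsed.encode("utf-8")).hexdigest()


def content_changed(old_text: str, new_text: str) -> bool:
    """True when the wording differs, ignoring layout-only whitespace changes."""
    return normalized_digest(old_text) != normalized_digest(new_text)


# Same wording, different line breaks and spacing: not a content change.
print(content_changed("Placeholder  table\nrow A", "Placeholder table row A"))  # False
# Different wording: flagged for review.
print(content_changed("Placeholder table row A", "Placeholder table row B"))    # True
```

This approach still misses changes inside images or embedded tables that extract poorly, so a scheduled manual spot-check remains a reasonable supplement.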

10. TEMPORAL MAINTENANCE STRATEGY

Plan how the knowledge base stays current after initial publication.

Content decay model. Plan for differential decay rates across content categories. From fastest to slowest decay:

Pricing data (fee schedules, rebate tiers) – changes annually or more frequently.
URLs and document links – 20-30% rot within 18 months is typical for institutional sites.
Regulatory frameworks (MiFID III, EMIR 3.0) – major revisions expected 2026-2027.
Technical architecture (T7 platform, matching engine design) – stable across years but changes on major release boundaries.

The plan must specify a different monitoring and update workflow for each decay category.

Pricing update workflow. Plan an annual (at minimum) review process for all pricing data. The plan must specify where current pricing documents are published, how to detect changes, and how to propagate updates across all pages that reference pricing.

Circular and newsflash monitoring. Plan a monitoring process for new circulars and newsflashes from Deutsche Boerse that affect knowledge base content. Specify which circular categories to monitor and which keywords to watch for (e.g., “session limit,” “throttle,” “fee schedule,” “T7 Release”). Note that session limits and throttling parameters can change without prominent announcement – plan for proactive monitoring rather than reactive discovery.

Living document tracking. Some source documents are “living” – updated in place without changing their URL or title (e.g., the T7 Architecture Factsheet). Plan how to detect silent updates to these documents. Plan a re-verification schedule and a mechanism to track which version of a living document was used as the source.

“Planned” vs. “Launched” status management. Plan a tagging system for features, releases, and regulatory changes that are described as “upcoming” or “planned” at time of writing. The plan must specify: how to tag these items, how to monitor their status (launched, delayed, cancelled), and how to update the knowledge base when status changes. This is a known time bomb – stale “planned” labels erode credibility.

Staleness markers. Plan a system for displaying “last verified” dates on content pages. Plan the threshold beyond which content is flagged as potentially stale (e.g., 6 months for pricing, 12 months for architecture, 18 months for regulatory).
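
A staleness check against per-category thresholds could be sketched as below. The threshold values are the examples given above; the category keys and the whole-month calculation are assumptions for illustration.

```python
from datetime import date

# Example thresholds from the text, in months.
THRESHOLD_MONTHS = {"pricing": 6, "architecture": 12, "regulatory": 18}


def months_between(earlier: date, later: date) -> int:
    """Whole calendar months elapsed between two dates."""
    months = (later.year - earlier.year) * 12 + (later.month - earlier.month)
    if later.day < earlier.day:
        months -= 1  # the current month has not fully elapsed yet
    return months


def is_stale(category: str, last_verified: date, today: date) -> bool:
    """True once more whole months have passed than the category allows."""
    return months_between(last_verified, today) > THRESHOLD_MONTHS[category]


verified = date(2025, 1, 15)
print(is_stale("pricing", verified, date(2025, 8, 1)))      # False (6 whole months)
print(is_stale("pricing", verified, date(2025, 8, 20)))     # True  (7 whole months)
print(is_stale("regulatory", verified, date(2025, 8, 20)))  # False
```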

11. REVIEW AND VALIDATION PLANNING

Plan the review process for both the work plan itself and the generated content.

Hostile review process. Plan for adversarial review using distinct reviewer archetypes, each targeting a different failure mode. Include at least these five archetypes:

Hallucination Auditor – Challenges every factual claim for source traceability. Focuses on: latency figures, pricing data, protocol specifications, and timestamp codes.
Domain Insider – Reviews with deep Deutsche Boerse operational knowledge. Focuses on: oversimplified explanations, conflated terminology, and missing nuance in complex topics.
Prompt Engineer – Evaluates whether the content-generation prompt will produce consistent, complete output. Focuses on: cross-batch dependency handling, word count feasibility, citation enforceability, and glossary consistency.
Open-Source Maintainer – Evaluates long-term maintainability. Focuses on: link breakage patterns, version sensitivity, automation gaps, and contribution barriers.
Temporal Decay Auditor – Evaluates shelf life. Focuses on: pricing currency, URL durability, regulatory horizon, status label accuracy, and living document tracking.

Review criteria. For each reviewer archetype, the plan must specify: what they review, what they score on, and what constitutes a blocking finding versus an advisory note.

Incorporation workflow. Plan how review findings feed back into the work plan and content. Specify the iteration cycle: produce content, review, catalog findings, revise, re-review. Define the exit criterion (e.g., all blocking findings resolved, advisory notes logged for future maintenance).

12. KNOWN LLM LIMITATIONS (PLANNING CONSTRAINTS)

If an LLM is used for content generation (as anticipated in the two-step workflow), the plan must account for these known limitations:

URL verification impossibility. An LLM without live web access cannot visit URLs at generation time, so any link it emits is untested. The plan must not assume generated content will contain valid, tested links. Plan a post-generation URL verification pass as a mandatory step.

Output volume constraint. The complete knowledge base (~70,000 words across all pages) exceeds what an LLM can reliably produce in a single session. Plan a batched generation strategy with explicit batch boundaries, and specify how to maintain consistency across batches. Plan for the known degradation pattern: quality and consistency decline in later batches as context accumulates.
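
One simple way to draw explicit batch boundaries is to walk the chapters in writing order and start a new batch whenever a per-batch word cap would be exceeded. The Python sketch below illustrates this greedy approach; the cap and the per-chapter word counts are hypothetical, and a real plan might also group chapters by shared glossary terms or dependencies.

```python
def plan_batches(targets, max_words):
    """Group (chapter, words) pairs, kept in order, into batches under max_words."""
    batches, current, used = [], [], 0
    for chapter, words in targets:
        if words > max_words:
            raise ValueError(f"chapter {chapter} alone exceeds the batch cap")
        if current and used + words > max_words:
            batches.append(current)  # close the batch before it overflows
            current, used = [], 0
        current.append(chapter)
        used += words
    if current:
        batches.append(current)
    return batches


# Hypothetical word targets and a hypothetical 10,000-word cap per batch.
targets = [(1, 5_000), (2, 4_000), (3, 6_000), (4, 6_000)]
print(plan_batches(targets, 10_000))  # [[1, 2], [3], [4]]
```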

Cross-file consistency degradation. When generating multiple pages in sequence, an LLM’s adherence to glossary definitions, naming conventions, and cross-reference targets degrades over time. Plan for a post-generation consistency normalization pass. Plan which consistency dimensions to check: glossary term usage, acronym definitions, cross-reference targets, heading anchor formats, and front matter schema.

Regulatory decay. Major EU regulatory changes (MiFID III, EMIR 3.0) are expected in the 2026-2027 timeframe. Content generated now about regulatory requirements will partially decay. Plan for regulatory sections to be written with explicit temporal markers and plan the update trigger process.

Glossary-chapter consistency. With ~90 glossary terms referenced across 12 chapters, maintaining definition consistency is a significant challenge for LLM-generated content. Plan a single-source-of-truth enforcement mechanism and a post-generation audit.

Timestamp and measurement code confusion. T7 uses multiple timestamp measurement code series (T1-T7 for ETI, Tnn series for EOBI) that an LLM easily conflates. Plan explicit differentiation guidance for the content generation prompt and plan a post-generation audit for timestamp accuracy.

PDF fact verification limits. An LLM cannot open, read, or verify PDF documents. All facts cited from PDFs must be verified by a human or a separate tool in a post-generation step. Plan this verification as a distinct task with its own schedule and acceptance criteria.

13. PLAN OUTPUT FORMAT

Structure the work plan output as follows:

Phase structure. Organize into sequential phases with clear entry and exit criteria:

Phase	Focus	Entry Criterion	Exit Criterion
1. Research	Source collection, document inventory	Project kickoff	All chapters have source lists; gap analysis complete
2. Architecture	Directory structure, navigation, glossary protocol	Research complete	Architecture decision document approved
3. Writing Specifications	Per-chapter plans, diagram plans	Architecture approved	All 12 chapter specs and 6 diagram specs complete
4. Infrastructure	Site setup, CI/CD, verification pipeline design	Architecture approved	Site deploys; CI pipeline runs
5. Content Generation	Execute writing specs using content prompt	Specs + infrastructure ready	All chapters, appendices, diagrams generated
6. Verification	Run verification pipeline, hostile reviews	Content generated	All blocking findings resolved
7. Deployment	Publish to GitHub Pages	Verification passed	Site live and accessible
8. Maintenance	Ongoing monitoring and updates	Site deployed	(Continuous)

Per-phase task specification. For each phase, list:

Named tasks with one-sentence descriptions
Dependencies (which tasks from prior phases must be complete)
Acceptance criteria (how to know this task is done)
Estimated effort (hours or days)
Risk notes (what could go wrong, drawn from the hostile review findings)

Risk register. Produce a risk register derived from the hostile review weaknesses. For each risk: description, likelihood, impact, mitigation strategy, and which phase addresses it. Encode all 26 weaknesses from the five hostile reviewer archetypes as risks.
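
The coverage requirement (every weakness appears as a risk with a mitigation) lends itself to a mechanical check. The following Python sketch assumes each weakness has an identifier and each risk entry records which weakness it encodes; the ID format and the dictionary keys are hypothetical.

```python
def coverage_gaps(weakness_ids, risks):
    """Return weakness IDs that have no risk entry with a non-empty mitigation."""
    covered = {
        risk["weakness_id"]
        for risk in risks
        if risk.get("mitigation", "").strip()
    }
    return sorted(set(weakness_ids) - covered)


weakness_ids = [f"W{n:02d}" for n in range(1, 27)]  # 26 hypothetical IDs
risks = [
    {"weakness_id": "W01", "mitigation": "Placeholder mitigation", "phase": 1},
    {"weakness_id": "W02", "mitigation": "", "phase": 6},  # no mitigation yet
]
gaps = coverage_gaps(weakness_ids, risks)
print(len(gaps))  # 25
print(gaps[0])    # W02
```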

Decision log template. Include a blank decision log template for recording architectural and planning decisions made during execution. Fields: decision ID, date, decision, rationale, alternatives considered, reviewer.

Success criteria. Define measurable success criteria for the overall project:

All 12 chapters published with source citations
All cross-references resolve
Glossary consistency verified
Verification pipeline operational
No blocking hostile review findings unresolved
Maintenance monitoring active
Content staleness within defined thresholds

CONSTRAINTS SUMMARY

The following constraints are non-negotiable and must be reflected throughout the plan:

Source restriction. All factual content must trace to the seven approved domains. No external sources for Deutsche Boerse-specific facts.
No content generation. This prompt produces a plan. It does not produce chapter text, glossary definitions, diagram code, pricing figures, or verification scripts.
Hostile review coverage. All 26 identified weaknesses must appear in the plan as risks with mitigations.
Temporal awareness. The plan must treat content as perishable and specify decay-rate-appropriate maintenance for every content category.
LLM limitation awareness. The plan must not assume capabilities that LLMs lack (URL verification, PDF reading, unlimited context, perfect consistency).
Exchange portability. Planning methodology must be reusable for a different exchange. Exchange-specific decisions must be flagged.
Self-containment. This prompt must be usable by an LLM with zero prior context about the project. All necessary context is provided above.

This planning prompt is step one of a two-step workflow. A separate, detailed content-generation prompt (the “megaprompt”) specifies exact chapter contents, glossary terms, Mermaid diagram code, verification pipeline implementation, and all other production-level specifications. The plan produced from this prompt feeds into that content-generation step.