
Week 11: Domain Applications

Code, finance, and healthcare agents with domain-specific constraints

Week 11 of 12

Learning Objectives

Paper	Authors	Year	Link
SWE-bench: Real-World GitHub Issues	Jimenez et al.	2024	arXiv
AlphaCodium: Flow Engineering for Code	Ridnik et al.	2024	arXiv
FinAgent: Multimodal Trading Agent	Li et al.	2024	arXiv

3-4 hours | Code agents | Financial agents | Healthcare agents

Study of domain-specific agents for code, finance, and healthcare

Devin: An Autonomous Software Engineer
Cognition AI (2024)
Technical Report Link

AlphaCodium: From Prompt Engineering to Flow Engineering - Ridnik, T., et al. (2024) arXiv
MDAgents: Adaptive Collaboration of LLMs for Medical Decision-Making - Kim, Y., et al. (2024) arXiv
LLM Agents for Financial Applications Survey - Li, S., et al. (2024) arXiv

100 Points | 6-8 hours | Expert

Build a domain-specific agent with specialized tools

Task	Points	Description
Domain Analysis	25	Analyze domain requirements
Agent Implementation	45	Build specialized agent
Evaluation	30	Evaluate on domain tasks

Domain Maturity Landscape:

Code (High): Clear success criteria (tests pass), sandboxed execution, active deployment
Finance (Medium): Regulatory constraints, compliance requirements, emerging deployments
Healthcare (Emerging): High stakes, human oversight required, FDA/HIPAA compliance

Flow Engineering: AlphaCodium’s structured, multi-stage pipeline approach. Break a complex task into stages, then iteratively generate code, run tests, and repair failures.
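The generate-test-repair cycle can be sketched in a few lines. This is a minimal illustration, not AlphaCodium's actual pipeline: `run_tests`, `flow_loop`, and `stub_generate` are hypothetical names, and the stub stands in for an LLM call so the example runs offline.

```python
from typing import Callable, List, Optional, Tuple

Test = Tuple[tuple, object]  # (positional args, expected result)


def run_tests(source: str, func_name: str, tests: List[Test]) -> List[str]:
    """Load candidate source and return a list of failure messages."""
    namespace: dict = {}
    try:
        # Assumption: trusted toy input. A real agent would sandbox this step.
        exec(source, namespace)
        func = namespace[func_name]
    except Exception as exc:
        return [f"load error: {exc!r}"]
    failures = []
    for args, expected in tests:
        try:
            got = func(*args)
        except Exception as exc:
            failures.append(f"{func_name}{args} raised {exc!r}")
            continue
        if got != expected:
            failures.append(f"{func_name}{args} returned {got!r}, expected {expected!r}")
    return failures


def flow_loop(
    generate: Callable[[List[str]], str],
    func_name: str,
    tests: List[Test],
    max_rounds: int = 3,
) -> Tuple[Optional[str], int]:
    """Generate a candidate, test it, and feed failures back until tests pass."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        source = generate(feedback)
        feedback = run_tests(source, func_name, tests)
        if not feedback:
            return source, round_no
    return None, max_rounds


def stub_generate(feedback: List[str]) -> str:
    """Hypothetical stand-in for a model: buggy first draft, fixed after feedback."""
    if not feedback:
        return "def add(a, b):\n    return a - b\n"
    return "def add(a, b):\n    return a + b\n"


TESTS: List[Test] = [((1, 2), 3), ((0, 0), 0)]
solution, rounds = flow_loop(stub_generate, "add", TESTS)
print(rounds)
```

In a real agent, `generate` would prompt a model with the task description plus the failure messages from the previous round; the loop structure stays the same.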

SWE-bench Performance: The best agents resolve roughly 50% of the benchmark's real GitHub issues, which is significant but still far from human-level performance.
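To make the "resolved" figure concrete, here is a minimal sketch of how a resolve rate could be computed from per-issue test outcomes. It assumes the criterion that an issue counts as resolved only when the tests targeted by the fix now pass and the previously passing tests still pass. The `InstanceResult` class, the instance IDs, and the test names are hypothetical, and the four results are made-up illustration data, not benchmark numbers.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class InstanceResult:
    instance_id: str
    fail_to_pass: Dict[str, bool]  # tests the fix is expected to make pass
    pass_to_pass: Dict[str, bool]  # tests that must keep passing


def is_resolved(result: InstanceResult) -> bool:
    """Resolved = at least one target test, all targets pass, no regressions."""
    return (
        bool(result.fail_to_pass)
        and all(result.fail_to_pass.values())
        and all(result.pass_to_pass.values())
    )


def resolve_rate(results: List[InstanceResult]) -> float:
    """Fraction of instances resolved; 0.0 for an empty list."""
    if not results:
        return 0.0
    return sum(is_resolved(r) for r in results) / len(results)


# Illustration data only (hypothetical IDs and outcomes).
results = [
    InstanceResult("issue-1", {"test_fix": True}, {"test_old": True}),
    InstanceResult("issue-2", {"test_fix": True}, {"test_old": False}),  # regression
    InstanceResult("issue-3", {"test_fix": False}, {"test_old": True}),  # not fixed
    InstanceResult("issue-4", {"test_a": True, "test_b": True}, {}),
]
rate = resolve_rate(results)
print(f"{rate:.0%}")
```

The regression check matters: a patch that makes the target test pass by breaking existing behavior does not count as a fix.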

Cross-Domain Patterns: Verification intensity should match domain risk level.
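One way to express this pattern is a policy table that adds verification steps as risk rises. The sketch below is illustrative only: the risk levels assigned to each domain and the check names (`automated_tests`, `compliance_rules`, `human_review`) are assumptions drawn loosely from the maturity landscape above, not a prescribed standard.

```python
from enum import IntEnum
from typing import Dict, List, Tuple


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


# Assumed mapping for illustration; a real project would justify each level.
DOMAIN_RISK: Dict[str, Risk] = {
    "code": Risk.LOW,
    "finance": Risk.MEDIUM,
    "healthcare": Risk.HIGH,
}

# Each check applies to its minimum risk level and everything above it.
CHECKS: List[Tuple[Risk, str]] = [
    (Risk.LOW, "automated_tests"),
    (Risk.MEDIUM, "compliance_rules"),
    (Risk.HIGH, "human_review"),
]


def required_checks(domain: str) -> List[str]:
    """Return the verification steps an agent output must pass in this domain."""
    risk = DOMAIN_RISK[domain]
    return [name for min_risk, name in CHECKS if risk >= min_risk]


for domain in DOMAIN_RISK:
    print(domain, required_checks(domain))
```

Because the checks are cumulative, a higher-risk domain never skips a step that a lower-risk one requires.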

Build a domain-specific agent for one of:

Code Generation: Flow engineering approach with test-driven iteration
Research Assistant: Literature search with citation verification
Financial Analysis: Market data analysis with compliance guardrails
Clinical Decision Support: Evidence-based recommendations with human oversight
