Week 11: Domain Applications
Code, finance, and healthcare agents with domain-specific constraints
Week 11 of 12
Learning Objectives
- Define SWE-bench, code agent, FinAgent, and clinical decision support
- Explain domain-specific requirements for agent deployment
- Implement a code agent using flow engineering
- Compare agent architectures across different domains
- Assess regulatory and safety requirements for each domain
- Design a domain-specific agent with appropriate safeguards
Topics Covered
- Domain maturity landscape (code, finance, healthcare)
- Code agents and SWE-bench performance
- AlphaCodium flow engineering methodology
- FinAgent multimodal trading architecture
- Healthcare agent regulatory constraints (FDA, HIPAA)
Resources
Required Readings
| Paper | Authors | Year | Link |
|---|---|---|---|
| SWE-bench: Real-World GitHub Issues | Jimenez et al. | 2024 | arXiv |
| AlphaCodium: Flow Engineering for Code | Ridnik et al. | 2024 | arXiv |
| FinAgent: Multimodal Trading Agent | Zhang et al. | 2024 | arXiv |
Reading Guide: Domain Applications
Study of domain-specific agents for code, finance, and healthcare
Exercise: Domain Agent
Build a domain-specific agent with specialized tools
Learning Objectives
- Create: Build domain-specific agents
- Apply: Handle regulatory constraints
- Integrate: Connect specialized tools to the agent
Tasks
| Task | Points | Description |
|---|---|---|
| Domain Analysis | 25 | Analyze domain requirements |
| Agent Implementation | 45 | Build specialized agent |
| Evaluation | 30 | Evaluate on domain tasks |
Key Concepts
Domain Maturity Landscape:
- Code (High): Clear success criteria (tests pass), sandboxed execution, active deployment
- Finance (Medium): Regulatory constraints, compliance requirements, emerging deployments
- Healthcare (Emerging): High stakes, human oversight required, FDA/HIPAA compliance
Flow Engineering: AlphaCodium’s structured, multi-stage pipeline approach: break a complex task into stages, then generate and run tests iteratively, revising the code after each failed run.
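The generate-test-revise loop at the core of flow engineering can be sketched in a few lines. This is a simplified illustration, not AlphaCodium's actual pipeline: the names `run_tests`, `flow`, and `stub_generate` are hypothetical, and `stub_generate` stands in for an LLM call so the example runs without any model or API key.

```python
from typing import Callable, List, Optional, Tuple

# A test case is (positional args, expected result).
TestCase = Tuple[tuple, object]


def run_tests(candidate: Callable, tests: List[TestCase]) -> List[str]:
    """Run the candidate on every test and return a list of failure messages."""
    failures = []
    for args, expected in tests:
        try:
            actual = candidate(*args)
        except Exception as exc:
            failures.append(f"{args}: raised {exc!r}")
            continue
        if actual != expected:
            failures.append(f"{args}: expected {expected!r}, got {actual!r}")
    return failures


def flow(
    generate: Callable[[List[str]], Callable],
    tests: List[TestCase],
    max_rounds: int = 3,
) -> Tuple[Optional[Callable], int]:
    """Generate a candidate, test it, and feed failures back until tests pass."""
    feedback: List[str] = []
    for round_no in range(1, max_rounds + 1):
        candidate = generate(feedback)       # stage 1: propose code
        feedback = run_tests(candidate, tests)  # stage 2: execute tests
        if not feedback:                     # stage 3: stop when all pass
            return candidate, round_no
    return None, max_rounds


def stub_generate(feedback: List[str]) -> Callable:
    """Hypothetical stand-in for a model: buggy first, fixed once it sees failures."""
    if not feedback:
        return lambda xs: max(xs)            # crashes on an empty list
    return lambda xs: max(xs) if xs else None


TESTS: List[TestCase] = [(([3, 1, 2],), 3), (([],), None)]
solution, rounds = flow(stub_generate, TESTS)
print(f"tests passed after {rounds} round(s)")
```

In a real agent, `generate` would prompt a model with the task description plus the failure messages, and the tests would run in a sandbox rather than in the agent's own process.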
SWE-bench Performance: As of this writing, the best agents resolve roughly 50% of the benchmark's real GitHub issues. That is significant progress, but still far from human-level.
Cross-Domain Patterns: Verification intensity should match the domain's risk level: the higher the stakes, the more checks and human oversight an agent's actions need.
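One way to make this pattern concrete is a small policy table that maps each domain to a risk tier and derives the required safeguards from that tier. The sketch below is illustrative only: the `Risk` tiers, the `Policy` fields, and the tier assigned to each domain are assumptions made for this example, not regulatory guidance.

```python
from dataclasses import dataclass
from enum import IntEnum


class Risk(IntEnum):
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass(frozen=True)
class Policy:
    automated_checks: bool  # e.g. tests or schema validation
    audit_log: bool         # record every action for later review
    human_approval: bool    # a person must sign off before the agent acts


# Assumed tiers for illustration, loosely following the landscape above.
DOMAIN_RISK = {
    "code": Risk.LOW,
    "finance": Risk.MEDIUM,
    "healthcare": Risk.HIGH,
}


def policy_for(domain: str) -> Policy:
    """Derive safeguards from the domain's risk tier; stricter tiers add checks."""
    risk = DOMAIN_RISK[domain]
    return Policy(
        automated_checks=True,
        audit_log=risk >= Risk.MEDIUM,
        human_approval=risk >= Risk.HIGH,
    )


def may_act_autonomously(domain: str, checks_passed: bool) -> bool:
    """The agent acts alone only if checks pass and no human sign-off is required."""
    return checks_passed and not policy_for(domain).human_approval


for name in DOMAIN_RISK:
    print(name, policy_for(name), may_act_autonomously(name, checks_passed=True))
```

Deriving safeguards from a tier, rather than hard-coding them per domain, keeps the policy monotonic: moving a domain to a higher tier can only add verification steps.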
Exercise
Build a domain-specific agent for one of:
- Code Generation: Flow engineering approach with test-driven iteration
- Research Assistant: Literature search with citation verification
- Financial Analysis: Market data analysis with compliance guardrails
- Clinical Decision Support: Evidence-based recommendations with human oversight
Discussion Questions
- How do domain constraints affect agent architecture design?
- What safety measures are essential for high-stakes domains like healthcare?
- When should agents defer to humans vs act autonomously?
- How does regulatory compliance (SEC, FDA, HIPAA) constrain agent capabilities?
- What is the appropriate verification intensity for each domain?