Research Papers
A curated collection of foundational and cutting-edge papers in agentic AI with summaries.
Key Papers with Summaries
ReAct: Synergizing Reasoning and Acting (Yao et al., 2023)
Core Idea: Interleave reasoning traces and actions in LLMs, allowing models to reason about tasks (Thought) and interact with external environments (Action) to gather information (Observation).
Key Contribution: Shows that combining reasoning and acting outperforms either approach alone on question-answering and decision-making tasks.
Why It Matters: Foundational paradigm for most modern LLM agents. The Thought-Action-Observation loop is now standard in agent frameworks.
arXiv | Week 1
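The Thought-Action-Observation loop can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `llm` and `search` are scripted stand-ins for a real model call and a real search tool, chosen so the loop can run end to end.

```python
def llm(prompt):
    # Scripted stand-in for a real model call.
    if "Observation: Colorado orogeny" in prompt:
        return ("Thought: I now know the elevation range.\n"
                "Action: finish[1,800 to 7,000 ft]")
    return ("Thought: I should search for the Colorado orogeny.\n"
            "Action: search[Colorado orogeny]")

def search(query):
    # Hypothetical tool; a real agent would call a search API here.
    return "Colorado orogeny: the eastern sector's elevation range is 1,800 to 7,000 ft."

def react_agent(question, max_steps=5):
    prompt = f"Question: {question}"
    for _ in range(max_steps):
        step = llm(prompt)                       # Thought + Action
        action = step.split("Action: ")[1].strip()
        if action.startswith("finish["):
            return action[len("finish["):-1]     # final answer
        tool, arg = action.split("[", 1)
        observation = {"search": search}[tool](arg[:-1])
        prompt += f"\n{step}\nObservation: {observation}"  # feed back
    return None
```

The essential pattern is that each tool result is appended to the prompt as an Observation, so the next Thought can condition on it.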
Chain-of-Thought Prompting (Wei et al., 2022)
Core Idea: Providing few-shot exemplars with intermediate reasoning steps dramatically improves LLM performance on complex reasoning tasks; the zero-shot variant (Kojima et al., 2022) simply appends "Let's think step by step."
Key Contribution: Demonstrates emergent reasoning abilities in large models when prompted to show intermediate steps.
Why It Matters: Enables agents to break down complex problems and explain their reasoning, improving both accuracy and interpretability.
arXiv | Week 2
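A few-shot CoT prompt is just a worked exemplar prepended to the question. The sketch below assumes the model ends its completion with a "The answer is N." marker (the paper's exemplar style); the exemplar text is illustrative.

```python
COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 cans of 3 balls each. "
    "How many balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question):
    # Prepend a worked exemplar so the model imitates step-by-step reasoning.
    return COT_EXEMPLAR + f"Q: {question}\nA:"

def extract_answer(completion):
    # Pull the final answer after the "The answer is" marker.
    return completion.rsplit("The answer is", 1)[1].strip(" .")
```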
Toolformer (Schick et al., 2023)
Core Idea: Train LLMs to decide when and how to use external tools (calculators, search, etc.) by self-supervised learning on API calls.
Key Contribution: Shows LLMs can learn tool use without explicit supervision: the model generates candidate API calls and keeps only those whose results reduce its loss on the tokens that follow.
Why It Matters: Foundational work for function calling and tool-augmented LLMs used in modern APIs.
arXiv | Week 3
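The self-supervised filter can be sketched as follows. This is a simplification, not the paper's training code: `loss_fn` is a hypothetical stand-in for measuring the model's loss on the continuation, and the inline `[Calculator(...) -> result]` markup mirrors the paper's annotation format.

```python
def keep_api_call(loss_without, loss_with, threshold=0.1):
    # Keep the call only if conditioning on its result lowers the loss
    # on subsequent tokens by more than a margin.
    return (loss_without - loss_with) > threshold

def annotate(text, call, result, position, loss_fn):
    # Insert "[Calculator(400/1400) -> 0.29]"-style markup only when useful.
    candidate = text[:position] + f"[{call} -> {result}] " + text[position:]
    if keep_api_call(loss_fn(text), loss_fn(candidate)):
        return candidate
    return text
```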
Reflexion (Shinn et al., 2023)
Core Idea: Agents learn from verbal self-reflection on failures, storing insights in episodic memory to avoid repeating mistakes.
Key Contribution: Introduces a framework for agents to improve through natural language feedback rather than gradient updates.
Why It Matters: Enables agents to learn from experience within a session, critical for complex multi-step tasks.
arXiv | Week 4
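The Reflexion loop is small enough to sketch directly. Here `actor`, `evaluator`, and `reflect` are hypothetical stand-ins for model calls; the point is the control flow: act, evaluate, verbally reflect on failure, store the reflection, retry with memory in context.

```python
def reflexion_loop(task, actor, evaluator, reflect, max_trials=3):
    memory = []                      # episodic memory of past reflections
    for _ in range(max_trials):
        attempt = actor(task, memory)            # act, conditioned on memory
        if evaluator(attempt):                   # success: stop here
            return attempt, memory
        memory.append(reflect(task, attempt))    # verbalize the failure
    return None, memory
```

Note that learning happens entirely in natural language: the "gradient" is a stored sentence, not a weight update.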
AutoGen (Wu et al., 2023)
Core Idea: Framework for building multi-agent systems through conversational interactions between specialized agents.
Key Contribution: Demonstrates that complex tasks can be solved by having agents with different roles collaborate through natural conversation.
Why It Matters: Pioneered the conversational multi-agent paradigm used in many modern agent frameworks.
arXiv | Week 5
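The conversational pattern can be sketched framework-agnostically. This is not AutoGen's actual API: two scripted agents simply exchange messages until one emits a termination signal, which is the core loop the framework builds on.

```python
def two_agent_chat(agent_a, agent_b, opening, max_turns=6):
    history = [("A", opening)]
    speakers, names = [agent_b, agent_a], ["B", "A"]  # B replies first
    for turn in range(max_turns):
        reply = speakers[turn % 2](history)
        history.append((names[turn % 2], reply))
        if "TERMINATE" in reply:      # conversational stop signal
            break
    return history

# Scripted "coder" and "reviewer" roles standing in for LLM-backed agents.
coder = lambda h: "def add(a, b): return a + b"
reviewer = lambda h: ("Looks correct. TERMINATE" if "def add" in h[-1][1]
                      else "Please write a function add(a, b).")
```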
Self-RAG (Asai et al., 2023)
Core Idea: Train LLMs to adaptively retrieve information and self-reflect on generated content using special tokens.
Key Contribution: Shows models can learn when to retrieve, what to retrieve, and how to critique their outputs for factuality.
Why It Matters: Improves RAG accuracy by making retrieval decisions dynamic rather than always-on.
arXiv | Week 7
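The adaptive-retrieval control flow can be sketched as below. The bracketed token names mirror the paper's special tokens, but all model calls (`decide`, `retrieve`, `generate`, `critique`) are hypothetical stand-ins, and the real system scores multiple candidate segments rather than branching once.

```python
def self_rag(query, decide, retrieve, generate, critique):
    if decide(query) == "[Retrieve]":
        passages = retrieve(query)
        draft = generate(query, passages)
        # Keep the draft only if the critique token says it is supported.
        if critique(draft, passages) == "[Fully Supported]":
            return draft
        return generate(query, passages=None)   # discard unsupported draft
    return generate(query, passages=None)       # no retrieval needed
```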
GraphRAG (Edge et al., 2024)
Core Idea: Use knowledge graphs to structure document relationships, enabling community-based summarization and multi-hop reasoning.
Key Contribution: Outperforms naive RAG on global sensemaking queries that require synthesizing information across documents.
Why It Matters: Addresses RAG limitations for complex queries requiring broad understanding of a corpus.
Microsoft | Week 8
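The pipeline shape can be sketched with two simplifications: communities are found here by connected components over an entity co-occurrence graph (GraphRAG actually uses hierarchical Leiden clustering), and `summarize` / `reduce_answers` are hypothetical stand-ins for model calls in the map-reduce answering step.

```python
def connected_components(edges):
    # Union-find over entity co-occurrence edges; each component is one
    # stand-in "community".
    parent = {}
    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x
    for a, b in edges:
        parent[find(a)] = find(b)
    groups = {}
    for node in parent:
        groups.setdefault(find(node), set()).add(node)
    return list(groups.values())

def global_query(query, communities, summarize, reduce_answers):
    # Map: one partial answer per community summary; reduce: global answer.
    partials = [summarize(query, c) for c in communities]
    return reduce_answers(query, partials)
```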
Chain-of-Verification (Dhuliawala et al., 2023)
Core Idea: Reduce hallucinations by having the model generate verification questions, answer them independently, and revise based on inconsistencies.
Key Contribution: Provides a systematic approach to fact-checking generated content without external knowledge bases.
Why It Matters: Critical technique for building trustworthy agents that can verify their own claims.
arXiv | Week 9
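The four-stage pipeline can be sketched as a single function. The stage functions (`draft_fn`, `plan_fn`, `answer_fn`, `revise_fn`) are hypothetical stand-ins for model calls; the key design choice is that verification questions are answered without the draft in context, so the draft's errors are not copied forward.

```python
def chain_of_verification(query, draft_fn, plan_fn, answer_fn, revise_fn):
    draft = draft_fn(query)                           # 1. baseline answer
    questions = plan_fn(query, draft)                 # 2. verification plan
    # 3. Answer each question independently of the draft.
    verifications = [(q, answer_fn(q)) for q in questions]
    return revise_fn(query, draft, verifications)     # 4. revised answer
```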
AgentBench (Liu et al., 2023)
Core Idea: Comprehensive benchmark for evaluating LLMs as agents across 8 distinct environments (web, DB, OS, games, etc.).
Key Contribution: First systematic evaluation framework for agent capabilities, revealing a significant gap between top commercial API models and open-source models.
Why It Matters: Enables standardized comparison of agent capabilities and identifies areas for improvement.
arXiv | Week 10
Generative Agents (Park et al., 2023)
Core Idea: Simulate believable human behavior in a sandbox environment using memory, reflection, and planning architectures.
Key Contribution: Demonstrates emergent social behaviors from simple agent architectures, including information spreading and relationship formation.
Why It Matters: Opens possibilities for agent-based simulations of complex social systems.
arXiv | Week 12
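The memory-retrieval score that drives these agents is a weighted sum of recency, importance, and relevance. The sketch below keeps that structure but simplifies the details: relevance is word overlap rather than the paper's embedding similarity, and the decay constant and weights are illustrative.

```python
def retrieval_score(memory, query_words, now, decay=0.995):
    recency = decay ** (now - memory["last_access"])   # exponential decay
    relevance = (len(query_words & set(memory["text"].split()))
                 / max(len(query_words), 1))           # word-overlap stand-in
    importance = memory["importance"] / 10             # rated 1-10
    return recency + importance + relevance            # equal weights here

def retrieve(memories, query, now, k=2):
    words = set(query.split())
    return sorted(memories, key=lambda m: retrieval_score(m, words, now),
                  reverse=True)[:k]
```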
Complete Paper List
Agent Architectures
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| ReAct: Synergizing Reasoning and Acting | Yao et al. | 2023 | arXiv | 1 |
| A Survey on LLM-based Autonomous Agents | Wang et al. | 2024 | arXiv | 1 |
| The Rise of LLM-Based Agents | Xi et al. | 2023 | arXiv | 1 |
Reasoning and Prompting
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| Chain-of-Thought Prompting | Wei et al. | 2022 | arXiv | 2 |
| Tree of Thoughts | Yao et al. | 2023 | arXiv | 2 |
| Self-Consistency Improves CoT | Wang et al. | 2023 | arXiv | 2 |
Tool Use
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| Toolformer | Schick et al. | 2023 | arXiv | 3 |
| Gorilla: LLM Connected with APIs | Patil et al. | 2023 | arXiv | 3 |
| ToolLLM | Qin et al. | 2024 | arXiv | 3 |
Planning and Reflection
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| Reflexion | Shinn et al. | 2023 | arXiv | 4 |
| LATS: Language Agent Tree Search | Zhou et al. | 2024 | arXiv | 4 |
| Plan-and-Solve Prompting | Wang et al. | 2023 | arXiv | 4 |
Multi-Agent Systems
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| AutoGen | Wu et al. | 2023 | arXiv | 5 |
| MetaGPT | Hong et al. | 2023 | arXiv | 5 |
| ChatDev | Qian et al. | 2024 | arXiv | 5 |
| Multi-Agent Collaboration Survey | Tran et al. | 2025 | arXiv | 5 |
Retrieval-Augmented Generation
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| Self-RAG | Asai et al. | 2023 | arXiv | 7 |
| Corrective RAG | Yan et al. | 2024 | arXiv | 7 |
| RAPTOR | Sarthi et al. | 2024 | arXiv | 7 |
| RAG Survey | Gao et al. | 2024 | arXiv | 7 |
Knowledge Graphs
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| GraphRAG | Edge et al. | 2024 | Microsoft | 8 |
| Graph of Thoughts | Besta et al. | 2024 | arXiv | 8 |
| HippoRAG | Gutierrez et al. | 2024 | arXiv | 8 |
Hallucination and Safety
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| Chain-of-Verification | Dhuliawala et al. | 2023 | arXiv | 9 |
| FActScore | Min et al. | 2023 | arXiv | 9 |
| Self-Refine | Madaan et al. | 2023 | arXiv | 9 |
| Hallucination Survey | Ji et al. | 2023 | arXiv | 9 |
Evaluation and Benchmarks
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| AgentBench | Liu et al. | 2023 | arXiv | 10 |
| WebArena | Zhou et al. | 2024 | arXiv | 10 |
| GAIA Benchmark | Mialon et al. | 2024 | arXiv | 10 |
| SWE-bench | Jimenez et al. | 2024 | arXiv | 10 |
Domain Applications
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| AlphaCodium | Ridnik et al. | 2024 | arXiv | 11 |
| MDAgents | Kim et al. | 2024 | arXiv | 11 |
| FinAgent Survey | Li et al. | 2024 | arXiv | 11 |
Research Frontiers
| Paper | Authors | Year | Link | Week |
|---|---|---|---|---|
| Generative Agents | Park et al. | 2023 | arXiv | 12 |
| Voyager | Wang et al. | 2023 | arXiv | 12 |
| Constitutional AI | Bai et al. | 2022 | arXiv | 12 |
Reference Management
Zotero Collection
Import our curated paper collection directly into Zotero:
Agentic AI Course Papers
A shared Zotero collection with all course readings, organized by week.
Join the group to sync papers to your library and add notes.
BibTeX Export
Download all citations in BibTeX format for your papers:
Coming soon - bibliography.bib file with all course papers
Reading Tips
- Start with abstracts - Get the main idea before deep diving
- Focus on methods - Understanding the approach is more valuable than memorizing results
- Take notes - Write summaries in your own words
- Discuss with peers - Different perspectives help understanding
- Implement key ideas - The best way to learn is by doing
Citation Format
When citing papers in your work, use the following format:

@article{yao2023react,
  title={ReAct: Synergizing Reasoning and Acting in Language Models},
  author={Yao, Shunyu and others},
  journal={arXiv preprint arXiv:2210.03629},
  year={2023}
}