Week 8: GraphRAG and Knowledge Graphs
From vector search to knowledge graphs for structured retrieval
Week 8 of 12
Learning Objectives
- Define knowledge graphs, entities, relations, and communities
- Explain how GraphRAG enhances retrieval with structure
- Build a knowledge graph from unstructured text using LLMs
- Compare vector-only vs graph-enhanced retrieval strategies
- Assess when GraphRAG provides value over standard RAG
- Design a hybrid retrieval system combining vectors and graphs
Topics Covered
- Limitations of vector-only RAG
- Knowledge graph fundamentals (entities, relations, communities)
- GraphRAG architecture and community detection
- Query routing (local vs global search)
- Entity extraction with LLMs
Resources
Required Readings
| Paper | Authors | Year | Link |
|---|---|---|---|
| From Local to Global: A GraphRAG Approach | Edge et al. | 2024 | arXiv |
| Unifying LLMs and Knowledge Graphs: A Roadmap | Pan et al. | 2024 | arXiv |
| Graph of Thoughts | Besta et al. | 2024 | arXiv |
Reading Guide: GraphRAG and Knowledge Graphs
Exploration of graph-based retrieval and knowledge graph construction
Primary Paper
Edge, D., Trinh, H., Cheng, N., et al. (2024). From Local to Global: A GraphRAG Approach. Microsoft Research.
Exercise: GraphRAG
Build knowledge graphs and graph-based retrieval
Learning Objectives
- Create: Build knowledge graphs from text
- Apply: Implement graph-based retrieval
- Analyze: Compare graph vs vector retrieval
Tasks
| Task | Points | Description |
|---|---|---|
| Entity Extraction | 30 | Extract entities and relationships |
| Graph Construction | 35 | Build and index knowledge graph |
| Query Implementation | 35 | Implement local and global queries |
Key Concepts
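Knowledge Graphs: Structures whose nodes are entities and whose edges are typed relationships, making relationships that are only implicit in text explicit and queryable. Each fact is represented as a triple: (subject, predicate, object), for example (Marie Curie, born_in, Warsaw).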
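A minimal sketch of the triple representation, using only the Python standard library. The example triples and the helper names `build_index` and `follow` are illustrative assumptions, not part of any GraphRAG library. It shows why making relationships explicit helps: a multi-hop question becomes a chain of edge lookups.

```python
from collections import defaultdict

# Hypothetical example triples; entity and predicate names are illustrative only.
TRIPLES = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Marie Curie", "won", "Nobel Prize in Physics"),
    ("Warsaw", "capital_of", "Poland"),
]


def build_index(triples):
    """Map each subject to its outgoing (predicate, object) edges."""
    index = defaultdict(list)
    for subject, predicate, obj in triples:
        index[subject].append((predicate, obj))
    return index


def follow(index, start, predicates):
    """Follow a chain of predicates from `start`; return every entity reached."""
    frontier = {start}
    for predicate in predicates:
        frontier = {
            obj
            for entity in frontier
            for pred, obj in index.get(entity, [])
            if pred == predicate
        }
    return frontier


index = build_index(TRIPLES)
# Two-hop question: Marie Curie was born in the capital of which country?
country = follow(index, "Marie Curie", ["born_in", "capital_of"])
```

A graph database such as Neo4j or a library such as NetworkX would replace the hand-rolled index in practice; the dictionary here only keeps the sketch dependency-free.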
GraphRAG: Microsoft’s approach combining knowledge graphs with retrieval-augmented generation. Enables both local (entity-specific) and global (thematic) queries.
Community Detection: Clustering densely connected entities into communities with the Leiden algorithm. Each community is summarized at several levels of the hierarchy, and these summaries are what global queries draw on.
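The sketch below does not implement Leiden itself. It computes modularity, a quality score that community detection algorithms of this family commonly optimize, so you can see what makes one clustering "better" than another. The function name `modularity` and the toy graph (two triangles joined by one bridge edge) are assumptions chosen for illustration.

```python
from collections import defaultdict


def modularity(edges, communities):
    """Newman modularity of a partition of an undirected, unweighted graph.

    Q = sum over communities c of (L_c / m) - (d_c / (2m))^2, where m is the
    number of edges, L_c the edges inside c, and d_c the total degree of c.
    """
    m = len(edges)
    membership = {node: i for i, group in enumerate(communities) for node in group}
    internal = defaultdict(int)  # L_c per community
    degree = defaultdict(int)    # d_c per community
    for u, v in edges:
        degree[membership[u]] += 1
        degree[membership[v]] += 1
        if membership[u] == membership[v]:
            internal[membership[u]] += 1
    return sum(internal[c] / m - (degree[c] / (2 * m)) ** 2 for c in degree)


# Toy graph: triangle a-b-c and triangle d-e-f, joined by the bridge c-d.
EDGES = [("a", "b"), ("b", "c"), ("a", "c"),
         ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]

q_triangles = modularity(EDGES, [{"a", "b", "c"}, {"d", "e", "f"}])
q_mixed = modularity(EDGES, [{"a", "d"}, {"b", "e"}, {"c", "f"}])
q_single = modularity(EDGES, [{"a", "b", "c", "d", "e", "f"}])
```

Grouping each triangle together scores higher than the mixed partition or a single all-inclusive community, which matches the intuition that communities should be dense inside and sparse between.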
Query Routing: Classifying each query as local (about specific entities) or global (about corpus-wide themes) and sending it to the retrieval strategy suited to that type.
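One deliberately naive way to route queries is a keyword heuristic, sketched below. The cue-word list, the function name `route_query`, and the default of falling back to local search are all assumptions made for illustration; a production system might use an LLM or a trained classifier instead.

```python
import re

# Hypothetical cue words that suggest a corpus-wide (global) question.
GLOBAL_CUES = {"overall", "themes", "trends", "summarize", "summary", "main", "across"}


def route_query(query, known_entities):
    """Return "local" or "global" for a query.

    A query that names a known entity goes to local search. Otherwise a cue
    word sends it to global search. With neither signal, default to local.
    """
    lowered = query.lower()
    if any(entity.lower() in lowered for entity in known_entities):
        return "local"
    words = set(re.findall(r"[a-z]+", lowered))
    return "global" if words & GLOBAL_CUES else "local"


entities = {"Marie Curie", "Warsaw"}
local_route = route_query("What did Marie Curie discover?", entities)
global_route = route_query("What are the main themes across the corpus?", entities)
```

Checking for entity mentions first is a design choice: a question such as "Summarize Marie Curie's work" is anchored to one entity, so entity lookup is likely the cheaper and more precise path.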
Exercise
Build a domain knowledge graph system:
- Extract entities and relationships from documents using an LLM
- Construct a queryable knowledge graph with Neo4j or NetworkX
- Implement community detection with the Leiden algorithm
- Build both local search (entity lookup) and global search (community summaries)
- Compare performance with vector-based retrieval on multi-hop queries
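As a starting point for the extraction step, the sketch below parses an LLM's JSON output into deduplicated triples. It assumes you have prompted the model to return a JSON list of objects with `subject`, `predicate`, and `object` keys; that schema, the hard-coded `LLM_OUTPUT` string standing in for a real model response, and the `ALIASES` table are all hypothetical.

```python
import json

# Stand-in for an LLM response; a real pipeline would get this string from a model call.
LLM_OUTPUT = """
[
  {"subject": "Microsoft Research", "predicate": "published", "object": "GraphRAG"},
  {"subject": "MSR", "predicate": "published", "object": "GraphRAG"},
  {"subject": "GraphRAG", "predicate": "uses", "object": "Leiden algorithm"},
  {"subject": "GraphRAG", "predicate": "uses"}
]
"""

# Hypothetical alias table mapping lowercase surface forms to a canonical name.
ALIASES = {"msr": "Microsoft Research"}


def canonical(name):
    """Collapse whitespace and resolve known aliases (case-insensitive)."""
    cleaned = " ".join(name.split())
    return ALIASES.get(cleaned.lower(), cleaned)


def parse_triples(raw):
    """Parse LLM JSON into deduplicated (subject, predicate, object) triples.

    Records missing any of the three fields are skipped rather than guessed.
    """
    triples = []
    seen = set()
    for record in json.loads(raw):
        try:
            triple = (
                canonical(record["subject"]),
                record["predicate"].strip(),
                canonical(record["object"]),
            )
        except (KeyError, AttributeError, TypeError):
            continue  # malformed record
        if triple not in seen:
            seen.add(triple)
            triples.append(triple)
    return triples


triples = parse_triples(LLM_OUTPUT)
```

Here the "MSR" record collapses into the "Microsoft Research" one and the incomplete record is dropped, leaving two triples. A fixed alias table is the simplest form of entity disambiguation; embedding similarity or an LLM-based merge step are common alternatives when aliases are not known in advance.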
Discussion Questions
- When is graph-based retrieval better than vector search? What query patterns benefit most?
- How do you handle entity disambiguation when the same entity appears with different names?
- What are the scaling challenges of knowledge graphs for large corpora?
- How does the upfront indexing cost of GraphRAG compare to the query benefits?
- When would a hybrid vector + graph approach be preferred over pure GraphRAG?