Hybrid Retrieval
Hybrid retrieval is a search strategy that combines two complementary retrieval methods — dense vector search (semantic) and sparse keyword search (typically BM25) — and merges their results before passing context to a language model. The combination consistently outperforms either method used in isolation because the two approaches fail on different query types.
How It Works
In a hybrid retrieval system, every query triggers two parallel searches:
- Dense retrieval: the query is embedded into a vector and compared to precomputed document vectors, typically by cosine similarity, returning semantically relevant chunks even when the exact words differ.
- Sparse retrieval: the query is matched against an inverted index using BM25 (a probabilistic ranking function based on term frequency, inverse document frequency, and document length), returning documents that share exact or near-exact (for example, stemmed) terms with the query.
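As a rough illustration of the two scoring functions, here is a minimal sketch in plain Python. It assumes documents are already tokenized and embeddings are plain lists of floats. The function names, the `k1` and `b` defaults, and the smoothed IDF variant are illustrative choices, not a reference to any particular search engine's implementation.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def bm25_scores(query_terms, docs, k1=1.5, b=0.75):
    """Return one BM25 score per document.

    docs is a list of token lists. k1 and b are common default
    values (assumed here, tune them for a real corpus).
    """
    if not docs:
        return []
    n_docs = len(docs)
    avg_len = sum(len(d) for d in docs) / n_docs

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for d in docs:
        df.update(set(d))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue  # terms absent from the document add nothing
            # Smoothed IDF that stays non-negative.
            idf = math.log((n_docs - df[term] + 0.5) / (df[term] + 0.5) + 1)
            numer = tf[term] * (k1 + 1)
            denom = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * numer / denom
        scores.append(score)
    return scores
```

In this sketch a query for the token `4092b` scores zero against any document that does not contain it, which is the exact-match behavior the sparse side contributes.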
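Results from both searches are then merged using a rank fusion technique, commonly Reciprocal Rank Fusion (RRF), which scores each document by summing the reciprocal of its rank in each result list and so rewards documents that rank highly in both. Alternatively, the two scores can be combined with a weighted linear combination after normalizing them to a common scale, since BM25 scores and cosine similarities are not directly comparable.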
The merged, re-ranked list is passed to the language model as retrieved context.
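The fusion step can be sketched as follows. This is a minimal Reciprocal Rank Fusion example, assuming each retriever returns an ordered list of document IDs. The function name is illustrative, and `k=60` is the smoothing constant commonly used with RRF rather than a value taken from this article.

```python
def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of document IDs with RRF.

    Each document receives 1 / (k + rank) from every list it
    appears in (rank starts at 1). Returns (doc_id, score)
    pairs sorted from best to worst.
    """
    scores = {}
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


dense_hits = ["a", "b", "c"]   # hypothetical dense result order
sparse_hits = ["b", "d", "a"]  # hypothetical sparse result order
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
print([doc_id for doc_id, _ in fused])  # ['b', 'a', 'd', 'c']
```

Because RRF uses only ranks, it needs no score normalization. Document `b` wins here because it ranks near the top of both lists, while `c` and `d` each appear in only one.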
Why Hybrid Outperforms Single-Method Retrieval
Dense-only retrieval handles paraphrase and concept matching well but struggles with exact entity names, product codes, and rare technical terms that appeared seldom, if at all, in the embedding model's training data. Sparse-only retrieval matches exact terms precisely but misses semantically related documents that use different vocabulary.
A hybrid system covers both failure modes:
| Query type | Dense | Sparse | Hybrid |
|---|---|---|---|
| "sales productivity software" (concept) | Strong | Weak | Strong |
| "error code 4092B" (exact term) | Weak | Strong | Strong |
| "how to improve pipeline velocity" (conceptual + specific) | Good | Partial | Best |
Common Use Cases
- Enterprise RAG systems: many production-grade RAG pipelines use hybrid retrieval as the default, since enterprise knowledge bases contain both conceptual documents and exact-term records.
- Semantic search with entity constraints: combining semantic meaning with exact filters for product IDs, person names, or dates.
- CRM and contract search: users may query by concept ("liability clause") or by specific term ("Section 8.3(b)").
Hybrid Retrieval vs. Pure Vector Search
Pure vector search is simpler to implement and is sufficient when queries are almost always conceptual. Hybrid retrieval adds operational complexity (maintaining both a vector index and an inverted index, and running a fusion step) but is usually worth the cost in any system where users or agents query by specific terms as well as by concepts.
Knowlee's Approach
Knowlee's retrieval layer applies hybrid search when querying the knowledge graph for account context: semantic similarity surfaces conceptually relevant signals while keyword matching ensures exact company names, product terms, and identifiers are reliably retrieved. The fusion step is weighted based on query type — agent-generated analytical queries lean on dense retrieval; lookup queries prioritize sparse scores. This is part of the broader architecture described in The Enterprise Knowledge Graph Moat.
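To make the idea of query-type-dependent weighting concrete, here is a hedged sketch of a weighted linear fusion. It is not Knowlee's implementation: the query-type labels, the weight values, and the function names are all hypothetical, and min-max normalization is just one simple way to put dense and sparse scores on a common scale.

```python
# Hypothetical dense weights per query type (illustrative values only).
DENSE_WEIGHT_BY_QUERY_TYPE = {"analytical": 0.7, "lookup": 0.3}


def min_max_normalize(scores):
    """Rescale a {doc_id: score} map to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def weighted_fusion(dense_scores, sparse_scores, query_type):
    """Blend normalized dense and sparse scores for one query.

    A document missing from one result set contributes 0 for
    that side. Returns (doc_id, score) pairs, best first.
    """
    alpha = DENSE_WEIGHT_BY_QUERY_TYPE[query_type]
    dense = min_max_normalize(dense_scores)
    sparse = min_max_normalize(sparse_scores)
    fused = {
        doc_id: alpha * dense.get(doc_id, 0.0)
        + (1 - alpha) * sparse.get(doc_id, 0.0)
        for doc_id in set(dense) | set(sparse)
    }
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

With these assumed weights, a document that only the dense retriever favors ranks first for an "analytical" query, while a document that only the sparse retriever favors ranks first for a "lookup" query.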