OmniRetrieval — Source-native query dispatch
The news. On May 29, 2026, the OmniRetrieval paper (arXiv:2605.29250) proposed a retrieval framework that accepts a natural-language query and routes it to the appropriate knowledge source (unstructured text, relational tables, or graphs), dispatching each source's native query to its own execution engine rather than flattening everything into a single embedding index. The authors report evaluating across 13 datasets and 309 distinct knowledge bases, and exceeding single-source retrieval baselines. Read the paper →
Picture the reference desk again. A question comes in — "which suppliers shipped to the Berlin warehouse in Q3, and who introduced them?" The clerk doesn't translate the question into one bland house dialect and shout it at the whole building. They split it: the archivist searches the prose contracts, the accountant runs the numbers in their ledgers, and the genealogist walks the introductions graph. Each specialist answers in their own language, keeping the structure that makes their corner useful — the ledger's columns, the family tree's lines.
The animation above is that desk. In the first beat the query flows query → router → one flat vector index: every source is shredded into the same uniform grid of vectors, and the table's JOIN arc and the graph's edges go dashed and grey — flattened into embedding space where structure can't be queried. Then the router flips from flatten to dispatch: the same three sources light up in their native forms, and three typed queries fan out — text search, table query · JOIN, graph traversal — with the JOIN arc and graph edges restored in green.
The mechanism is a router plus per-source engines. Instead of an embed-then-ANN lookup over one homogenised store, OmniRetrieval generates a query in each source's own language and runs it on that source's engine, then unifies the heterogeneous results for the generator. Because the table never left its relational form, a JOIN still composes rows by key; because the graph never left its node-edge form, a traversal still follows edges. Those are exactly the structural affordances a single similarity vector erases.
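The router-plus-engines shape described above can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the paper's implementation: the keyword router, the `ROUTING_HINTS` table, the three stub engines, and the canned rows they return are all hypothetical stand-ins. A real system would presumably use a learned or model-driven router and real full-text, SQL, and graph backends.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SourceResult:
    source: str        # which engine answered
    native_query: str  # the query in that engine's own language
    rows: List[str]    # toy results; a real engine would return typed rows


# Hypothetical stub engines. Each one "generates" a source-native query
# (here a fixed template) and returns canned rows instead of executing it.
def text_engine(question: str) -> SourceResult:
    native = '"Berlin warehouse" AND contract'
    return SourceResult("text", native, ["contract_0042.txt"])


def table_engine(question: str) -> SourceResult:
    native = ("SELECT s.name FROM suppliers s "
              "JOIN shipments sh ON sh.supplier_id = s.supplier_id "
              "WHERE sh.warehouse = 'Berlin' AND sh.quarter = 'Q3'")
    return SourceResult("table", native, ["Acme"])


def graph_engine(question: str) -> SourceResult:
    native = "MATCH (p:Person)-[:INTRODUCED]->(s:Supplier) RETURN p.name, s.name"
    return SourceResult("graph", native, ["Dana -> Acme"])


ENGINES: Dict[str, Callable[[str], SourceResult]] = {
    "text": text_engine,
    "table": table_engine,
    "graph": graph_engine,
}

# Assumed keyword hints standing in for a real routing model.
ROUTING_HINTS = {
    "text": ("contract", "clause", "says"),
    "table": ("shipped", "how many", "q3"),
    "graph": ("introduced", "connected", "reports to"),
}


def route(question: str) -> List[str]:
    """Pick the sources whose hints appear in the question; default to text."""
    q = question.lower()
    chosen = [name for name, hints in ROUTING_HINTS.items()
              if any(hint in q for hint in hints)]
    return chosen or ["text"]


def dispatch(question: str) -> List[SourceResult]:
    """Run each chosen engine and keep results tagged by source."""
    return [ENGINES[name](question) for name in route(question)]


question = ("Which suppliers shipped to the Berlin warehouse in Q3, "
            "and who introduced them?")
for result in dispatch(question):
    print(result.source, "|", result.native_query)
```

The design point is that `dispatch` never converts the results into a shared vector form: each `SourceResult` carries the native query and rows from its own engine, and unification happens only afterwards.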
Why flattening loses the answer
Hold the Berlin question fixed and walk the two paths. Say the answer needs a JOIN of a 5,000-row suppliers table with a 40,000-row shipments table on supplier_id. The relational path evaluates the key match exactly: every shipment resolves to its supplier, and a traversal of the introductions graph then follows two hops to the people who introduced them. The flat-index path instead embeds each row into a vector (768 dimensions, for illustration) and returns only the top-k nearest chunks to the query (k = 20, for illustration). A space of two hundred million potential supplier–shipment pairings (5,000 × 40,000 = 200,000,000, illustrative) collapses into 20 fuzzy neighbours chosen by cosine distance, and "who introduced them" is an edge that was never embedded as an edge. The JOIN and the traversal are not slow in the flat index; they are absent. That is the RAG failure mode source-native dispatch is built to remove.
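The relational half of that walk-through can be tried on a toy scale with Python's built-in `sqlite3` module. The sketch below is illustrative only: the three suppliers, four shipments, the `warehouse` and `quarter` columns, the `INTRODUCED_BY` adjacency map, and the `within_hops` helper are all invented for the example, and only the `supplier_id` join key and the 5,000 × 40,000 figures come from the text.

```python
import sqlite3

# The illustrative pairing count from the text.
candidate_pairs = 5_000 * 40_000  # 200,000,000

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE suppliers (supplier_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE shipments (
    shipment_id INTEGER PRIMARY KEY,
    supplier_id INTEGER,
    warehouse   TEXT,
    quarter     TEXT
);
INSERT INTO suppliers VALUES (1, 'Acme'), (2, 'Borealis'), (3, 'Cobalt');
INSERT INTO shipments VALUES
    (10, 1, 'Berlin', 'Q3'),
    (11, 2, 'Munich', 'Q3'),
    (12, 3, 'Berlin', 'Q2'),
    (13, 3, 'Berlin', 'Q3');
""")

# Exact key match: every shipment resolves to its supplier via supplier_id.
rows = conn.execute("""
    SELECT DISTINCT s.name
    FROM suppliers AS s
    JOIN shipments AS sh ON sh.supplier_id = s.supplier_id
    WHERE sh.warehouse = 'Berlin' AND sh.quarter = 'Q3'
    ORDER BY s.name
""").fetchall()
berlin_q3 = [name for (name,) in rows]

# Hypothetical introductions graph: node -> nodes that introduced it.
INTRODUCED_BY = {
    "Acme": ["Dana"],
    "Dana": ["Eli"],
    "Cobalt": ["Farah"],
}


def within_hops(graph: dict, start: str, max_hops: int) -> set:
    """Collect every node reachable from start in at most max_hops edges."""
    seen, frontier = set(), {start}
    for _ in range(max_hops):
        frontier = {nxt for node in frontier
                    for nxt in graph.get(node, [])} - seen - {start}
        seen |= frontier
    return seen


for supplier in berlin_q3:
    print(supplier, sorted(within_hops(INTRODUCED_BY, supplier, 2)))
```

Both steps operate on structure directly: the JOIN matches on the key, and the traversal follows stored edges. Neither step involves a similarity score.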
Flat vector index vs. source-native dispatch
| Property | Unified vector index | Source-native dispatch |
|---|---|---|
| Index built upfront | one embedding pass + ANN index over all sources | each source keeps its native engine (full-text, SQL, graph) |
| Query form | one nearest-neighbour lookup over shared space | a source-native query per source, chosen by the router |
| Structure preserved | no — JOINs and edges averaged into a vector | yes — JOINs compose, edges traverse |
| Best when | fuzzy topical match over homogeneous prose | answer spans tables / graphs, needs exact relations |
| Reported scope | — | 13 datasets · 309 KBs, beats single-source baselines (OmniRetrieval) |
The table isn't an argument that embeddings are obsolete — fuzzy topical recall over prose is exactly what a vector index is good at. It's that a single representation can't be the right one for text, tables, and graphs at once. Routing to a source-native query lets each source answer with the structure it actually has, and only then unifies — so the generator receives composed relations, not a bag of nearest neighbours.
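To make "a bag of nearest neighbours" concrete, here is a small pure-Python sketch of a top-k similarity lookup. Everything in it is an assumption made for illustration: the bag-of-words `embed` function stands in for a real embedding model, the 60 synthetic rows are invented, and a linear scan replaces an ANN index. It shows one narrow point: even when the ranking is perfect, a top-k lookup returns at most k rows, whereas an exact filter returns every match.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector, not a real model."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(query: str, rows: list, k: int) -> list:
    """Rank rows by similarity to the query and keep only the first k."""
    q = embed(query)
    ranked = sorted(rows, key=lambda row: cosine(q, embed(row)), reverse=True)
    return ranked[:k]


# 30 rows that genuinely match the question, plus 30 that do not.
rows = [f"supplier {i} shipped to berlin in q3" for i in range(30)]
rows += [f"supplier {i} shipped to munich in q1" for i in range(30, 60)]

hits = top_k("suppliers shipped to berlin in q3", rows, k=20)
exact = [row for row in rows if "berlin" in row and "q3" in row]

print(len(hits), len(exact))  # prints: 20 30
```

Every one of the 20 hits is a correct Berlin row, so the similarity ranking did its job. The ten missing rows are lost to the k cutoff, which is the kind of gap an exact relational query does not have.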
Related explainers
- Is Grep All You Need? — Grep vs vector retrieval for agentic search — another result that the vector index is one option among many, not the default
- CDD — Context-Driven Decomposition for RAG knowledge conflict — what to do once retrieval returns sources that disagree