Vector database: the search backend behind RAG

A vector database stores embeddings, the numeric vectors that represent the meaning of text or other data, and finds the ones nearest to a query vector. That nearest-neighbour search by meaning is what lets a retrieval system surface relevant passages, which is why it sits at the centre of most RAG setups.

At a glance

What it stores: embeddings (numeric vectors that capture meaning)
What it does: finds the vectors nearest to a query, by similarity
Where it sits: the search backend behind retrieval-augmented generation (RAG)
Also called: a vector store

How a query finds its passages

A question becomes a vector, the store finds the nearest stored vectors, and their source passages come back as context.

1. Query becomes an embedding: the question is turned into a vector.
2. Vector database searches: it finds the nearest stored vectors by similarity.
3. Relevant passages return: their source text is handed back as context.

What is a vector database for?

Ordinary search matches words. A vector database matches meaning. It works on embeddings, which are lists of numbers that an embedding model produces to represent a piece of text, so that passages with similar meaning end up close together in that numeric space. Ask it a question and it turns the question into a vector too, then finds the stored vectors nearest to it. The source passages behind those vectors come back as your results.

That is why it is the engine behind retrieval-augmented generation (RAG): the step where a system pulls relevant context out of a document pile before the model answers is a nearest-neighbour search, and that is exactly what a vector database does. Some people call it a vector store. Same idea.
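
The nearest-neighbour search described above can be sketched in a few lines of plain Python. This is a minimal illustration, not how any particular vector database is implemented: the three-number vectors are hand-made stand-ins for real embeddings, the names `cosine_similarity`, `nearest` and `store` are hypothetical, and cosine similarity is assumed as the similarity measure. The search is also a brute-force scan over every stored vector.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, store, k=2):
    """Return the k (score, passage) pairs most similar to query_vec."""
    scored = [(cosine_similarity(query_vec, vec), passage) for passage, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]

# Hand-made 3-dimensional "embeddings" (an assumption for illustration);
# a real embedding model would produce far longer vectors.
store = [
    ("How to reset your password", [0.9, 0.1, 0.0]),
    ("Recovering a locked account", [0.8, 0.3, 0.1]),
    ("Quarterly revenue summary", [0.0, 0.2, 0.9]),
]

# Pretend this is the embedding of a question about a forgotten login.
query_vec = [1.0, 0.1, 0.0]

results = nearest(query_vec, store, k=2)
for score, passage in results:
    print(f"{score:.3f}  {passage}")
```

The two account-related passages come back first because their vectors point in nearly the same direction as the query vector. The revenue passage points elsewhere and is left out.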

Where does it fit, and where does it fall short?

The vector database is one stage in a pipeline, not the whole thing. Before it, documents are split into passages (chunking) and turned into embeddings. The store searches those embeddings. After it, a reranker often reorders the hits, because nearest in vector space is not always the most relevant to the question.
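
As a rough sketch of those stages in order, the toy pipeline below chunks a document, embeds the chunks, searches them, and reranks the hits. Everything in it is an assumption for illustration: `chunk` splits naively on sentences, `embed` is a word-count stand-in for a real embedding model (so it matches words rather than meaning), and `rerank` is a simple word-coverage score where a real reranker would usually be a separate model.

```python
import math
import re
from collections import Counter

def chunk(document):
    """Split a document into one passage per sentence (a deliberately naive chunker)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]

def embed(text):
    """Toy stand-in for an embedding model: word counts, which match words, not meaning."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def search(query, index, k=2):
    """Return the k passages whose vectors are nearest to the query vector."""
    query_vec = embed(query)
    ranked = sorted(index, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [passage for passage, _ in ranked[:k]]

def rerank(query, passages):
    """Toy reranker: reorder hits by the share of query words each passage contains."""
    query_words = set(embed(query))
    def coverage(passage):
        return len(query_words & set(embed(passage))) / (len(query_words) or 1)
    return sorted(passages, key=coverage, reverse=True)

document = (
    "Password reset. "
    "To reset a password on mobile, open settings, tap account, choose security, "
    "and follow every prompt shown. "
    "Invoices are emailed every month."
)
question = "reset the password on mobile"

chunks = chunk(document)                  # before the store: chunking
index = [(c, embed(c)) for c in chunks]   # before the store: embedding
hits = search(question, index, k=2)       # the store: nearest-neighbour search
reranked = rerank(question, hits)         # after the store: reranking

print(hits)
print(reranked)
```

In this toy run the short "Password reset." chunk is nearest by cosine similarity, and the reranker moves the longer passage that actually explains the steps to the top.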

So its quality is borrowed. Feed it good chunks and good embeddings and it returns useful passages fast. Feed it badly split text or a weak embedding model and it returns near-misses with full confidence. It finds what is closest, not what is correct. The surrounding steps are what make closest worth having.
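
To make "closest, not correct" concrete, here is a small sketch with hand-made vectors (hypothetical stand-ins for real embeddings, with cosine similarity assumed as the measure). A nearest-neighbour lookup always returns some top hit, even when nothing stored has anything to do with the query. The similarity score is the only hint that the match is weak.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))

# Hypothetical two-passage store with hand-made 3-dimensional vectors.
store = {
    "Resetting a password": [0.9, 0.1, 0.0],
    "Changing a billing address": [0.1, 0.9, 0.0],
}

def top_hit(query_vec):
    """Return the single nearest passage and its similarity score."""
    passage = max(store, key=lambda p: cosine(query_vec, store[p]))
    return passage, cosine(query_vec, store[passage])

on_topic = top_hit([1.0, 0.0, 0.0])   # query vector close to the password passage
off_topic = top_hit([0.0, 0.1, 1.0])  # query vector unlike anything stored

print(on_topic)
print(off_topic)
```

Both calls return a passage. The second is only the least distant of two unrelated passages, and its much lower score is the sole sign of that.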

A vector database is for

  • Finding text by meaning, not by exact keyword match
  • Powering the retrieval step of a RAG pipeline
  • Searching large collections of embedded passages fast
  • Returning the closest matches, ranked by similarity

It is not

  • A substitute for a reranker; nearest is not always most relevant
  • A general database for transactions and structured records
  • Better than the chunking and embeddings you feed it
  • A guarantee of correctness; it returns close, not true

Reviewed: June 2026