What does a reranker do?
The first step of a retrieval system is built for speed. It searches a large collection of embeddings and returns the passages whose vectors sit nearest the query, but nearness in that numeric space is only a rough proxy for relevance. Some of what comes back is genuinely on point, and some of it is a near-miss that happened to land close.
Reranking is the cleanup pass. A reranker, a model that scores how well a passage answers a specific query, takes the candidates the first search returned and gives each one a fresh relevance score against the actual question. Then it reorders them. The truly useful passages rise to the top, and the lucky near-misses fall. The first pass casts a wide, fast net; the reranker is the careful read of what the net caught.
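To make the score-then-reorder step concrete, here is a minimal sketch in Python. The scorer, toy_relevance, is a hypothetical stand-in that counts shared words; a real reranker would be a trained model that reads the query and passage together. The function names and the sample passages are assumptions made up for illustration.

```python
from typing import Callable, List, Tuple


def toy_relevance(query: str, passage: str) -> float:
    """Hypothetical stand-in scorer: fraction of query words found in the passage.

    A real reranker is a trained model; this only illustrates the interface.
    """
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    if not query_words:
        return 0.0
    return len(query_words & passage_words) / len(query_words)


def rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float] = toy_relevance,
) -> List[Tuple[float, str]]:
    """Score every candidate against the query, then sort best-first."""
    scored = [(score(query, passage), passage) for passage in candidates]
    # Python's sort is stable, so tied passages keep their first-pass order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


query = "how to reset router password"
# Pretend this is the order the first-pass search returned.
candidates = [
    "router firmware update steps",
    "to reset the router password hold the button",
    "password tips for email accounts",
]
reranked = rerank(query, candidates)
```

In this toy run the second candidate moves to the front because it shares the most words with the query, while the two weaker matches keep their original relative order.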
When is it worth the extra step?
Reranking costs time. It runs a slower model over every candidate, per query, so it earns its place only when ordering matters and the first pass gets it wrong often enough to notice. If you retrieve many candidates and hand only the top few to a model with a limited context window, a reranker makes those few count.
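The trade-off can be sketched in a few lines: the scorer runs once per candidate, and only the top few survive to reach the downstream model. This is an illustrative sketch, not a reference implementation; top_k_after_rerank and word_overlap are hypothetical names, and word_overlap is a crude placeholder for a real scoring model.

```python
from typing import Callable, List, Tuple


def word_overlap(query: str, passage: str) -> float:
    """Hypothetical placeholder scorer: number of distinct query words in the passage."""
    return float(len(set(query.lower().split()) & set(passage.lower().split())))


def top_k_after_rerank(
    query: str,
    candidates: List[str],
    score: Callable[[str, str], float],
    top_k: int,
) -> Tuple[List[str], int]:
    """Rerank all candidates, keep the best top_k, and report how many scorer calls it took."""
    scorer_calls = 0
    scored = []
    for passage in candidates:
        scored.append((score(query, passage), passage))
        scorer_calls += 1  # one slow-model call per candidate, for every query
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [passage for _, passage in scored[:top_k]], scorer_calls


candidates = [
    "embedding models map text to vectors",
    "reranker latency grows with candidate count",
    "the cost of a reranker is paid per query",
    "chunking splits documents into passages",
]
kept, calls = top_k_after_rerank("reranker latency cost", candidates, word_overlap, top_k=2)
```

Here four candidates cost four scorer calls and two passages are kept. Retrieving more candidates gives the reranker more to work with, but the per-query cost grows with the size of that list.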
It has a hard limit worth stating plainly: a reranker can only reorder what it was given. If the passage you needed never came back from the first search, because of bad chunking or a weak embedding model, no reranking will conjure it. It sharpens a good shortlist. It does not rescue a bad one.
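One way to check which side of that limit you are on is to measure how often the needed passage appears in the first-pass candidates at all. The sketch below assumes you have a small hand-labeled set pairing each query with the passage it needs; first_pass_recall and the sample data are hypothetical.

```python
from typing import Dict, List


def first_pass_recall(
    needed_by_query: Dict[str, str],
    candidates_by_query: Dict[str, List[str]],
) -> float:
    """Fraction of queries whose needed passage is present in the first-pass candidates.

    Reranking only permutes the candidates, so it cannot raise this number.
    """
    if not needed_by_query:
        return 0.0
    hits = sum(
        1
        for query, needed in needed_by_query.items()
        if needed in candidates_by_query.get(query, [])
    )
    return hits / len(needed_by_query)


# Hypothetical labeled examples: query -> the passage that actually answers it.
needed = {
    "reset router password": "hold the reset button for ten seconds",
    "update router firmware": "download the firmware from the admin page",
}
# Hypothetical first-pass results for the same queries.
retrieved = {
    "reset router password": [
        "choose a strong password",
        "hold the reset button for ten seconds",
    ],
    "update router firmware": [
        "router placement tips",
        "choose a strong password",
    ],
}
recall = first_pass_recall(needed, retrieved)
```

In this made-up data the second query's answer never came back, so no reordering of its candidates can surface it. A low number here points at chunking or the embedding model, not at the reranker.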