Vector Search for Ecommerce Elasticsearch: When It Works, When It's Overhead

A vendor pitched vector search for our catalog. 40,000 SKUs. Elasticsearch already running — fuzzy configured, synonyms tuned, function_score boosting by margin and stock.

My first question: what problem does the current setup not solve?

They couldn't answer. We didn't add vector search to our Elasticsearch setup — and the catalog didn't suffer for it.

This isn't skepticism toward the technology. It's a method.

What vector search actually does — and why vendors push it

Classic Elasticsearch runs BM25: it scores by term frequency and term rarity, normalizes by document length, and matches only on the exact terms the analyzer produces. Fast: 5–10ms. Predictable. Easy to debug when results look wrong.
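A minimal sketch of that scoring, assuming whitespace tokenization and the textbook BM25 formula with k1=1.2 and b=0.75. The toy corpus and the function name are illustrative; this is not how Lucene implements it internally.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook BM25."""
    tokenized = [d.lower().split() for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    df = Counter(term for t in tokenized for term in set(t))
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue  # no exact term match, no contribution
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Longer-than-average documents are penalized via b.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "ergonomic office chair with lumbar support",
    "wooden dining chair",
    "standing desk with cable tray",
]
scores = bm25_scores("comfortable office chair", docs)
```

The word "comfortable" appears in none of the documents, so it adds nothing to any score. The first document wins only because "office" and "chair" match literally.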

Vector search is different. Text (a query or a product card) gets encoded into a numeric vector by an embedding model (SBERT or CLIP for dense vectors; Elastic ELSER is a sparse variant that works without HNSW). For dense vectors, Elasticsearch then finds the nearest neighbors via an HNSW index. It surfaces semantically similar results even when no word matches.
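To make "nearest neighbors" concrete, here is a brute-force sketch using cosine similarity. The three-dimensional vectors are invented by hand for illustration; real embeddings come from a model and have hundreds of dimensions, and HNSW approximates this lookup instead of comparing against every product.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def nearest(query_vec, product_vecs, k=2):
    """Return the k product names whose vectors are closest to the query."""
    ranked = sorted(
        product_vecs,
        key=lambda name: cosine(query_vec, product_vecs[name]),
        reverse=True,
    )
    return ranked[:k]

# Hand-made toy vectors, not output of any real model.
product_vecs = {
    "ergonomic seating": [0.9, 0.1, 0.0],
    "office desk": [0.5, 0.5, 0.1],
    "garden hose": [0.0, 0.1, 0.9],
}
query_vec = [0.8, 0.2, 0.0]  # stand-in for "comfortable office chair"
top = nearest(query_vec, product_vecs)
```

No word overlap is involved anywhere: the ranking depends only on how close the vectors sit to each other.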

That's the pitch: "the user searches for 'comfortable office chair' and gets ergonomic seating even if the word 'comfortable' isn't in the product description."

It works. In specific scenarios.

Three scenarios where vector search earns its place

First — semantic intent queries. "Gift for mom," "chair for long desk sessions," "fabric for kids' clothing." Queries where the user doesn't have a keyword for the product they want. BM25 fails here because the query words don't match the catalog terms.

Second — multilingual or cross-language search. If the catalog is in Russian and some queries arrive in English or transliteration, a multilingual embedding model bridges that gap better than synonym dictionaries.

Third — "similar products" recommendations. Not a search query — a nearest-neighbor lookup by product vector. This is where vector search lives naturally.

All three are about fuzzy intent, not exact lookups.

When BM25 + fuzzy + synonyms is already enough

SKU "WH-1000XM5" means nothing to an embedding model: the tokenizer splits it into subword fragments that carry no useful semantics. kNN search on that query returns mostly noise. BM25 finds the exact match in 5ms.

For catalogs under 100,000 SKUs, a well-configured BM25 with fuzziness: AUTO, a synonym dictionary, and a language analyzer covers 80–90% of real search problems. We saw it on a 28,000-SKU project: zero-results rate dropped from 22% to 11% after configuring fuzzy search and adding 180 synonym pairs — without a single line of vector search code.
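A sketch of what that configuration can look like, written as Python dicts that mirror the Elasticsearch request bodies. The field name title, the filter name catalog_synonyms, and the analyzer name catalog_search are assumptions for illustration; swap the standard analyzer for a language analyzer that fits your catalog.

```python
def build_index_settings(synonym_rules):
    """Index body with a search-time synonym filter on a text field."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "catalog_synonyms": {
                        "type": "synonym_graph",
                        "synonyms": synonym_rules,
                    }
                },
                "analyzer": {
                    "catalog_search": {
                        "tokenizer": "standard",
                        "filter": ["lowercase", "catalog_synonyms"],
                    }
                },
            }
        },
        "mappings": {
            "properties": {
                "title": {
                    "type": "text",
                    "analyzer": "standard",
                    # Synonyms are applied to the query, not the index.
                    "search_analyzer": "catalog_search",
                }
            }
        },
    }

def build_fuzzy_query(user_query):
    """Match query that tolerates typos via fuzziness AUTO."""
    return {
        "query": {
            "match": {"title": {"query": user_query, "fuzziness": "AUTO"}}
        }
    }

settings = build_index_settings(["sofa, couch", "tv, television"])
query = build_fuzzy_query("ofice chair")
```

One caveat worth checking against the match query docs for your version: fuzzy expansion is not applied to terms that have synonyms, so a typo inside a synonym term may still return nothing.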

Latency: BM25 runs at 5–10ms. kNN adds 20–50ms depending on index size and HNSW parameters. On mobile with a slow connection, that's noticeable.

Vector search also adds operational weight. It needs:

an embedding pipeline that vectorizes queries and products
re-indexing on every description, price, or stock update
a separate model-serving service or paid Elastic ELSER

On a 28k-SKU catalog with basic config, 384-dimensional dense vectors add roughly 400MB of memory (Elasticsearch keeps HNSW data mostly in the filesystem cache rather than on the JVM heap). Not critical, but worth accounting for.
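A back-of-the-envelope check on the raw vectors alone, assuming float32 storage (4 bytes per dimension). This counts only the vector values. HNSW graph links, replicas, segment merges, and any in-cluster model come on top, so treat the result as a lower bound, not a reproduction of the figure above.

```python
def raw_vector_bytes(num_docs, dims, bytes_per_value=4):
    """Size of the raw vector values, ignoring graph and index overhead."""
    return num_docs * dims * bytes_per_value

# 28,000 products x 384 dimensions x 4 bytes
size_mb = raw_vector_bytes(28_000, 384) / 1_000_000
```

For this catalog the raw values come to about 43MB, so measure the full footprint on your own cluster before sizing nodes.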

The cost they don't mention in the pitch

The vendor demo runs on 1,000 products with cherry-picked examples. Production with 40,000 SKUs looks different.

Batch embedding at initial indexing: 40k documents × 50ms per external API call = 33 minutes single-threaded. Faster with parallelism, but that needs infrastructure.
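The arithmetic, as a small sketch. The workers parameter assumes perfect parallel scaling with no rate limits or retries, which an external API rarely gives you, so read the parallel figure as a best case.

```python
def embedding_minutes(num_docs, ms_per_call, workers=1):
    """Wall-clock minutes to embed num_docs, one API call per document."""
    total_seconds = num_docs * ms_per_call / 1000 / workers
    return total_seconds / 60

single = embedding_minutes(40_000, 50)               # single-threaded
parallel = embedding_minutes(40_000, 50, workers=8)  # idealized 8 workers
```

Single-threaded this lands at roughly 33 minutes; eight idealized workers bring it to a little over 4.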

Incremental re-embedding: every change to the embedded text has to recompute the vector, and every product update (price, stock, description) rewrites the document and its entry in the HNSW index. That's a batch pipeline you have to write, test, and monitor.
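One way to keep that pipeline cheaper, sketched under the assumption that only title and description feed the embedding model: fingerprint the embedded text and skip the model call when a price or stock change leaves it untouched. The field names and helper functions are hypothetical, and the document itself still gets reindexed either way.

```python
import hashlib

def embedding_text(product):
    """The text that is sent to the embedding model (assumed fields)."""
    return f"{product['title']}\n{product['description']}"

def text_fingerprint(product):
    """Stable hash of the embedded text, stored next to the vector."""
    return hashlib.sha256(embedding_text(product).encode("utf-8")).hexdigest()

def needs_reembedding(product, stored_fingerprint):
    """True only when the embedded text changed since the last run."""
    return text_fingerprint(product) != stored_fingerprint

product = {
    "title": "Office chair",
    "description": "Mesh back",
    "price": 199,
    "stock": 12,
}
stored = text_fingerprint(product)

product["price"] = 179  # price change: the old vector can be reused
price_only = needs_reembedding(product, stored)

product["description"] = "Mesh back, lumbar support"  # text change
text_changed = needs_reembedding(product, stored)
```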

kNN latency under load: HNSW isn't O(1). At high traffic with 40k docs, p95 can push past 100ms. BM25 under the same conditions stays at 10–20ms.

None of this makes vector search wrong. It makes honest evaluation non-optional.

Five questions before you commit

Run through these before the ticket goes in the backlog.

Is your zero-results rate above 15%? If not — fix fuzzy and synonyms first. That's almost certainly where the problem is.
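To see where fuzzy and synonyms should start, list the most frequent zero-result queries. A minimal sketch, assuming a search log where each entry has a query string and a hits count; that log shape is hypothetical.

```python
from collections import Counter

def top_zero_result_queries(search_log, limit=3):
    """Most frequent queries that returned nothing, case-insensitive."""
    counts = Counter(
        entry["query"].strip().lower()
        for entry in search_log
        if entry["hits"] == 0
    )
    return counts.most_common(limit)

log = [
    {"query": "ofice chair", "hits": 0},
    {"query": "Ofice chair", "hits": 0},
    {"query": "couch", "hits": 0},
    {"query": "office chair", "hits": 42},
]
worst = top_zero_result_queries(log)
```

In this toy log the typo "ofice chair" is a fuzziness problem and "couch" is a synonym candidate. Neither needs an embedding model.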

What share of queries are exact SKU or model number lookups? If it's above 40%, vector search won't help those.
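A rough way to estimate that share from a query log. The regex is a heuristic assumption: it treats any single token that mixes letters and digits as SKU-like. Tune it to your own SKU format before trusting the number.

```python
import re

# Heuristic: one token, letters and digits mixed, hyphens allowed.
SKU_LIKE = re.compile(r"^(?=.*[A-Za-z])(?=.*\d)[A-Za-z0-9-]+$")

def is_sku_like(query):
    """True when the whole query looks like a SKU or model number."""
    return bool(SKU_LIKE.match(query.strip()))

def sku_share(queries):
    """Fraction of queries that look like exact SKU lookups."""
    if not queries:
        return 0.0
    return sum(is_sku_like(q) for q in queries) / len(queries)

queries = ["WH-1000XM5", "office chair", "gift for mom", "sm-g991b", "iphone 15"]
share = sku_share(queries)
```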

Do you have a labeled query dataset? Without one you can't run a proper A/B test — and you won't know if things actually improved.
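With labels in hand, an offline metric can compare configurations before any live traffic is involved. A sketch using mean reciprocal rank, one common choice among several. The product IDs, the labeled set, and the canned results are made up; search_fn stands in for whatever calls your search backend.

```python
def reciprocal_rank(ranked_ids, relevant_ids):
    """1/position of the first relevant result, 0.0 if none is found."""
    for position, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant_ids:
            return 1.0 / position
    return 0.0

def mean_reciprocal_rank(labeled_queries, search_fn):
    """Average reciprocal rank over all labeled queries."""
    total = sum(
        reciprocal_rank(search_fn(query), relevant)
        for query, relevant in labeled_queries.items()
    )
    return total / len(labeled_queries)

# Query -> set of product IDs a human marked as relevant.
labeled = {
    "office chair": {"p1"},
    "gift for mom": {"p7", "p9"},
}
# Canned results standing in for a real search call.
fake_results = {
    "office chair": ["p1", "p2", "p3"],
    "gift for mom": ["p4", "p9", "p7"],
}
mrr = mean_reciprocal_rank(labeled, lambda q: fake_results[q])
```

Run the same labeled set against BM25 alone and against the candidate setup, then compare the two numbers.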

What's your latency budget? With a budget under 20ms per search request, fitting vector search in isn't realistic, since kNN alone adds 20–50ms.

Who owns the embedding pipeline? It's a separate service with its own SLA, monitoring, and deploy process. If the answer is "we'll figure it out" — that's a risk, not a plan.

If the client already bought the idea

Don't fight it. You'll lose.

Propose hybrid search instead: BM25 + vector with Reciprocal Rank Fusion (RRF). Elasticsearch supports this natively: via sub_searches in earlier 8.x releases, and via the rrf retriever in newer ones. The client gets semantic coverage where it matters; BM25 stays as the anchor for exact queries. Latency goes up, so tell them upfront.
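What RRF does with the two result lists, as a standalone sketch. It uses a rank constant of 60, which is a common default for RRF; the SKU IDs are invented. Elasticsearch runs this fusion server-side, so this is only to show the mechanics.

```python
def rrf_fuse(rankings, k=60):
    """Fuse ranked lists: each document scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores, key=scores.get, reverse=True)

bm25_ranking = ["sku-1", "sku-2", "sku-3"]
knn_ranking = ["sku-3", "sku-1", "sku-4"]
fused = rrf_fuse([bm25_ranking, knn_ranking])
```

Documents that appear in both lists rise to the top, and only ranks are combined, so BM25 scores and vector similarities never need to be put on the same scale.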

Lock in a baseline before anything changes: zero-results rate, search CTR, search-to-cart conversion. No "before" numbers means no honest "after" conversation.
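A sketch of capturing those three numbers in one snapshot. The session fields (hits, clicked, added_to_cart) are assumptions about what your analytics export looks like, and each session here stands for one search.

```python
def search_baseline(sessions):
    """Zero-results rate, search CTR, and search-to-cart conversion."""
    total = len(sessions)
    if total == 0:
        return {"zero_results_rate": 0.0, "ctr": 0.0, "search_to_cart": 0.0}
    return {
        "zero_results_rate": sum(s["hits"] == 0 for s in sessions) / total,
        "ctr": sum(s["clicked"] for s in sessions) / total,
        "search_to_cart": sum(s["added_to_cart"] for s in sessions) / total,
    }

sessions = [
    {"hits": 12, "clicked": True, "added_to_cart": True},
    {"hits": 0, "clicked": False, "added_to_cart": False},
    {"hits": 30, "clicked": True, "added_to_cart": False},
    {"hits": 5, "clicked": False, "added_to_cart": False},
]
baseline = search_baseline(sessions)
```

Store the snapshot with its date range so the "after" numbers are computed the same way over a comparable period.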

If the A/B test doesn't show movement after a month, that's not failure. That's data. And data makes the next conversation with the client an engineering discussion, not a political one.

The short version

Vector search in an ecommerce Elasticsearch setup isn't an upgrade. It's a different tool for a different job.

A catalog under 100k SKUs with properly configured fuzzy and synonyms usually isn't hitting the ceiling vector search is meant to break. It's hitting operational details: wrong analyzer, missing synonyms, zero results on a common typo.

Start with zero-results rate. Then search CTR. If both are in reasonable shape, hold off on kNN until you hit a real problem you can name.

You'll know when you need it. When semantic intent is the actual bottleneck, and no synonym dictionary can close that gap, it'll be obvious. That's when this conversation is worth having.