Elasticsearch Mapping Is Architecture, Not Configuration

For three weeks we couldn't figure out why the price filter wasn't working. Elasticsearch was up. Queries were coming in. Results were returning. But the price range — say, $15 to $50 — was being ignored completely. Products at $180 and $3 showed up side by side.

We found it by accident. The price field was indexed as text.

Not because someone made a typo. Because the index was created with dynamic: true, and Elasticsearch inferred the type from the first document it saw — where price arrived as a string from an XML export. One import, one string instead of a number, and the search engine quietly decided price was text.

You can't fix that without a reindex. The type of an existing field in Elasticsearch is immutable. Not "difficult to change in-place": there is no in-place. You need a new index with the correct mapping, a full re-import of 28,000 documents, and an alias switch. That cost us four hours of work plus 90 minutes of reindex time.

That's why Elasticsearch mapping design is an architectural decision, not a configuration task.

What immutable mapping actually means

When you change a schema in a relational database, you write ALTER TABLE. Expensive, sometimes risky, but doable without recreating the table. In Elasticsearch, changing a field's type in-place isn't possible. You can add a new field. But changing an existing one means a full reindex.

For 28k SKUs, that's 80–90 minutes in production. For an e-commerce store with evening peak load, that's real exposure.

The decision is made once, before the first PUT /my_index. After that, the only way to change a field's type is a new index and an alias swap.
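The difference between "add a field" and "change a field" can be made concrete with a small check. This is a minimal sketch, not part of any Elasticsearch client: classify_mapping_change is a hypothetical helper that compares two top-level "properties" dicts and reports which fields are merely new and which would change type and therefore need a new index.

```python
def classify_mapping_change(old_props, new_props):
    """Split a proposed mapping change into additive fields and type changes.

    Both arguments are the "properties" dicts of a mapping (top-level fields only).
    """
    # Fields that exist only in the new mapping can be added in place.
    added = sorted(f for f in new_props if f not in old_props)
    # Fields whose type differs cannot be changed in place.
    retyped = sorted(
        f for f in new_props
        if f in old_props and new_props[f].get("type") != old_props[f].get("type")
    )
    return {"added": added, "needs_reindex": retyped}


old = {"name": {"type": "text"}, "price": {"type": "text"}}
new = {"name": {"type": "text"}, "price": {"type": "double"}, "brand": {"type": "keyword"}}

result = classify_mapping_change(old, new)
print(result)  # {'added': ['brand'], 'needs_reindex': ['price']}
```

A check like this could run in CI against the mapping file, so a type change shows up in review instead of in production.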

Start with dynamic: false

By default, Elasticsearch runs dynamic: true. Any new field in a document gets automatically added to the mapping, type inferred from the value.

Fine for tutorials. Not for production.

Three things I've hit with dynamic: true:

First — the type sticks from the first document. If document #1 has price: "1200" (a string from XML), the mapping locks in text. The range filter stops working, and you find out three weeks later.

Second — surprise fields accumulate. Bitrix sometimes adds internal service fields to its API response. They land in your index, bloat the mapping, and make GET /my_index/_mapping unreadable.

Third — field count creeps up. After six months you have 40+ fields and you're only using 12.
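The first point is easy to reproduce with a toy model. The sketch below is not Elasticsearch code: infer_dynamic_type and build_dynamic_mapping are hypothetical helpers that imitate the basic JSON-type rules of dynamic mapping (string to text, integer to long, and so on). Date detection and numeric detection for strings are deliberately left out.

```python
def infer_dynamic_type(value):
    """Toy version of dynamic type inference for a single JSON value.

    Simplified on purpose: date detection and numeric detection for
    strings are not modelled here.
    """
    if isinstance(value, bool):   # check bool first: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"             # real dynamic mapping also adds a keyword sub-field
    if isinstance(value, dict):
        return "object"
    raise ValueError(f"value not covered by this toy model: {value!r}")


def build_dynamic_mapping(docs):
    """The first document that carries a field decides its type."""
    mapping = {}
    for doc in docs:
        for field, value in doc.items():
            mapping.setdefault(field, infer_dynamic_type(value))
    return mapping


docs = [
    {"sku": "A-1", "price": "1200"},  # string, as it came from the XML export
    {"sku": "A-2", "price": 1400},    # the numeric value arrives too late
]
mapping = build_dynamic_mapping(docs)
print(mapping)  # {'sku': 'text', 'price': 'text'}
```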

dynamic: false for production. New fields don't enter the mapping automatically: they stay in _source but are not indexed or searchable until you add them to the mapping explicitly and reindex the documents that carry them. Annoying, yes. But six months later you'll know exactly what's in your index.
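As an illustration, here is what an explicit index body might look like, built as a Python dict and printed as the JSON you would send with PUT /my_index. The field list (name, brand, price) and the choice of double for price are assumptions for the example, not the schema from the incident above.

```python
import json

# Hypothetical product index: every field is declared up front.
index_body = {
    "mappings": {
        "dynamic": False,  # unknown fields stay in _source but are not indexed
        "properties": {
            "name": {"type": "text"},
            "brand": {"type": "keyword"},
            "price": {"type": "double"},  # numeric, so range filters work
        },
    }
}

print(json.dumps(index_body, indent=2))
```

With this body in place, a document that arrives with price as a string no longer decides the field's type, because the type is already fixed as numeric.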

text vs keyword: a contract, not a type

When you pick a field type, you're not deciding how to store data. You're deciding what operations are possible.

text — runs through an analyzer: tokenization, stemming, optional normalization. Full-text search works. Exact filtering, aggregations, and sorting don't.

keyword — stored as-is. Exact filtering (term, terms), aggregations (facets), sorting work. Full-text search doesn't.

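Classic mistake: mark brand as text. Then try to build a brand facet, and either the aggregation is rejected (fielddata is disabled on text fields by default) or, with fielddata enabled, you get broken bucket counts because the analyzer split "Samsung Galaxy" into two tokens.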
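The broken buckets are easy to see in a toy model. The sketch below assumes fielddata is enabled on the text field and uses toy_analyze, a hypothetical stand-in for an analyzer that only lowercases and splits on whitespace. Real analyzers do more, but the effect on facets is the same.

```python
from collections import Counter


def toy_analyze(value):
    """Stand-in for an analyzer: lowercase and split on whitespace."""
    return value.lower().split()


brands = ["Samsung Galaxy", "Samsung Galaxy", "Samsung"]

# keyword: one term per document, the value exactly as it was sent
keyword_buckets = Counter(brands)

# text: every token becomes its own bucket, counted once per document
text_buckets = Counter(tok for b in brands for tok in set(toy_analyze(b)))

print(dict(keyword_buckets))  # {'Samsung Galaxy': 2, 'Samsung': 1}
print(dict(text_buckets))
```

The keyword buckets match what a shopper expects in a brand filter. The text buckets show "samsung" with 3 documents and "galaxy" with 2, neither of which is a brand value anyone sent.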

Fields you need for exact filtering and facets go keyword. Fields you need for search go text. When you need both, use fields:

"name": {
  "type": "text",
  "analyzer": "russian",
  "fields": {
    "keyword": { "type": "keyword" }
  }
}

Anything that needs to be both searched and faceted — do it this way.
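A request against such a field could look like the sketch below: full-text matching goes to name, while the facet and the sort go to name.keyword. build_search and the aggregation name by_name are hypothetical, chosen only for this example.

```python
def build_search(query_text, facet_size=20):
    """Build a search body that uses both halves of the multi-field."""
    return {
        # analyzed field: tokenized and stemmed at index and query time
        "query": {"match": {"name": query_text}},
        # keyword sub-field: exact values for buckets and ordering
        "aggs": {"by_name": {"terms": {"field": "name.keyword", "size": facet_size}}},
        "sort": [{"name.keyword": "asc"}],
    }


body = build_search("телефон")
print(body["query"])
```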

nested vs object: when hierarchy costs you

If a product has multiple variants (color + size → price), you want to store them as an array of objects:

"variants": [
  { "color": "red", "size": "M", "price": 1200 },
  { "color": "blue", "size": "L", "price": 1400 }
]

With type object, Elasticsearch flattens this: variants.color: [red, blue], variants.price: [1200, 1400]. The relationship between fields within a single variant is lost. A query for "red variant under $15" can return a document where red costs $18 and the cheaper one is blue.
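The lost relationship can be reproduced in a few lines. This is a toy model of the flattening, not Elasticsearch internals: flatten, matches_flattened and matches_per_variant are hypothetical helpers, and the query "red variant priced at 1300 or more" is chosen to fit the two variants listed above.

```python
variants = [
    {"color": "red", "size": "M", "price": 1200},
    {"color": "blue", "size": "L", "price": 1400},
]


def flatten(objs):
    """Mimic object-type storage: one multi-valued list per sub-field."""
    flat = {}
    for obj in objs:
        for key, value in obj.items():
            flat.setdefault(key, []).append(value)
    return flat


def matches_flattened(objs, color, min_price):
    """Each condition is checked against its own list, independently."""
    flat = flatten(objs)
    return color in flat["color"] and any(p >= min_price for p in flat["price"])


def matches_per_variant(objs, color, min_price):
    """Both conditions must hold inside the same variant (nested behaviour)."""
    return any(o["color"] == color and o["price"] >= min_price for o in objs)


print(matches_flattened(variants, "red", 1300))    # True: red exists, 1400 exists
print(matches_per_variant(variants, "red", 1300))  # False: red costs 1200
```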

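nested preserves the relationships. But each nested object is indexed as a separate hidden document. On 28,000 products with 5 variants each, that's 140,000 hidden documents. Performance takes a hit, and there are limits to watch: index.mapping.nested_fields.limit (50 nested fields per index by default) and index.mapping.nested_objects.limit (10,000 nested objects per document by default).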

Use nested only when you need to filter by combinations of fields within an object. For everything else, use object or put variants in a separate index.
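For the case where combinations do matter, a nested mapping and a matching query might look like this. It is a sketch under assumptions: the sub-field types and the variant_filter helper are illustrative and not taken from the project described here.

```python
# Mapping fragment for the "properties" section of the index.
variants_mapping = {
    "variants": {
        "type": "nested",
        "properties": {
            "color": {"type": "keyword"},
            "size": {"type": "keyword"},
            "price": {"type": "double"},
        },
    }
}


def variant_filter(color, max_price):
    """Both conditions are evaluated inside the same nested object."""
    return {
        "query": {
            "nested": {
                "path": "variants",
                "query": {
                    "bool": {
                        "filter": [
                            {"term": {"variants.color": color}},
                            {"range": {"variants.price": {"lte": max_price}}},
                        ]
                    }
                },
            }
        }
    }


query = variant_filter("red", 1300)
print(query["query"]["nested"]["path"])  # variants
```

Note that fields inside a nested object are only reachable through a nested query like this one; a plain term query on variants.color at the top level will not match them.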

Russian morphology: search field ≠ filter field

For Russian search you need a custom analyzer with a morpheme or russian stemmer. It lets you match "телефоны" (phones, plural) on the query "телефон" (phone, singular).

But that same analyzer will break facets and exact filters. "Samsung" is indexed as the token "samsung" (lowercase), and a term query is not analyzed, so term: { "brand": "Samsung" } matches nothing.

Solution: one analyzed field with a raw keyword sub-field.

"name": {
  "type": "text",
  "analyzer": "ru_morpheme",
  "fields": {
    "raw": { "type": "keyword" }
  }
}

Search runs on name (with morphology). Facets and filters run on name.raw (exact value).

Same pattern anywhere: the analyzed main field for search, a raw keyword sub-field for everything else.
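The snippet above refers to an analyzer named ru_morpheme without showing its definition. One possible way to define it, using only built-in token filters, is sketched below. The composition (standard tokenizer, lowercase, Russian stop words, Russian stemmer) and the filter names ru_stop and ru_stemmer are assumptions; a setup based on a morphology plugin would swap the stemmer for that plugin's filter.

```python
index_body = {
    "settings": {
        "analysis": {
            "filter": {
                # built-in filter types, configured for Russian
                "ru_stop": {"type": "stop", "stopwords": "_russian_"},
                "ru_stemmer": {"type": "stemmer", "language": "russian"},
            },
            "analyzer": {
                "ru_morpheme": {
                    "type": "custom",
                    "tokenizer": "standard",
                    "filter": ["lowercase", "ru_stop", "ru_stemmer"],
                }
            },
        }
    },
    "mappings": {
        "dynamic": False,
        "properties": {
            "name": {
                "type": "text",
                "analyzer": "ru_morpheme",               # search with morphology
                "fields": {"raw": {"type": "keyword"}},  # facets and exact filters
            }
        },
    },
}

# Sanity check: the analyzer used in the mapping is defined in settings.
defined = index_body["settings"]["analysis"]["analyzer"]
used = index_body["mappings"]["properties"]["name"]["analyzer"]
print(used in defined)  # True
```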

How to change mapping when it's already wrong

If you're already in production with the wrong types, here's how to fix it without downtime:

  1. Create a new index with the correct mapping: PUT /my_index_v2
  2. Run POST /_reindex from the old index to the new one. On 28k SKUs, about 80 minutes
  3. Before switching, verify the count: GET /my_index_v2/_count == GET /my_index/_count
  4. Switch the alias: POST /_aliases with remove: my_index_alias → my_index, add: my_index_alias → my_index_v2
  5. New requests route to v2 automatically. Keep the old index as fallback for 24 hours

The application should only know about the alias, never the versioned index name. This is the only way to do mapping changes without downtime.
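The steps above can be written down as a plain list of REST calls. The sketch below only builds the requests and does not send them. plan_reindex and safe_to_switch are hypothetical helpers, and the transport (curl, an official client, a deploy script) is left open. Putting remove and add into one _aliases request makes the switch atomic.

```python
def plan_reindex(old_index, new_index, alias, new_mappings):
    """Return the REST calls for the mapping change, in execution order."""
    return [
        ("PUT", f"/{new_index}", {"mappings": new_mappings}),
        ("POST", "/_reindex", {
            "source": {"index": old_index},
            "dest": {"index": new_index},
        }),
        ("GET", f"/{old_index}/_count", None),
        ("GET", f"/{new_index}/_count", None),
        # remove + add in a single request, so the alias never points nowhere
        ("POST", "/_aliases", {"actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]}),
    ]


def safe_to_switch(old_count, new_count):
    """Gate for the alias switch: document counts must match."""
    return old_count == new_count


steps = plan_reindex(
    "my_index", "my_index_v2", "my_index_alias",
    {"dynamic": False, "properties": {"price": {"type": "double"}}},
)
for method, path, _body in steps:
    print(method, path)
```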

Mapping checklist before the first PUT /my_index

  • dynamic: false set explicitly
  • All facet fields typed keyword
  • All full-text search fields typed text with an analyzer
  • Fields needed for both search and filtering — fields with a keyword sub-field
  • Russian morphology via a custom analyzer, not standard
  • nested only where filtering by field combinations inside an object is actually needed
  • Index accessed only through an alias, never by versioned name directly
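Part of this checklist can be automated. The sketch below is a hypothetical lint_mapping helper that checks three of the items (explicit dynamic setting, keyword for facets, text with an analyzer for search) against a "mappings" dict. It covers top-level fields only and is meant as a starting point, not a complete validator.

```python
def lint_mapping(mappings, facet_fields, search_fields):
    """Return a list of checklist violations for a "mappings" dict."""
    problems = []
    if mappings.get("dynamic") not in (False, "false", "strict"):
        problems.append("dynamic is not disabled explicitly")
    props = mappings.get("properties", {})
    for field in facet_fields:
        spec = props.get(field, {})
        sub_types = [s.get("type") for s in spec.get("fields", {}).values()]
        # a facet needs keyword either as the field type or as a sub-field
        if spec.get("type") != "keyword" and "keyword" not in sub_types:
            problems.append(f"{field}: no keyword type or keyword sub-field for facets")
    for field in search_fields:
        spec = props.get(field, {})
        if spec.get("type") != "text" or "analyzer" not in spec:
            problems.append(f"{field}: not a text field with an explicit analyzer")
    return problems


mappings = {
    "properties": {
        "brand": {"type": "text"},  # wrong: used as a facet
        "name": {
            "type": "text",
            "analyzer": "russian",
            "fields": {"raw": {"type": "keyword"}},
        },
    }
}
problems = lint_mapping(mappings, facet_fields=["brand", "name"], search_fields=["name"])
for p in problems:
    print(p)
```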

Mapping gets set once. Change it later and you're looking at a full reindex. That's the kind of constraint that belongs in the project kickoff conversation, not in a post-incident retrospective.

For how search mapping connects to conversion, see Elasticsearch is a UX tool, not a fast database. For fuzzy search on top of correctly mapped fields, see the typo-tolerance case study.