Oct 1, 2025

Risky Operations in Elasticsearch and OpenSearch

Max Irwin
17 min read

At Bonsai, we run search clusters for thousands of customers, and some of them have billions of documents. These clusters require dozens of large instances totaling thousands of CPUs, petabytes of disk, and terabytes of RAM.

At this scale, lots can go wrong. We’re here to make sure everything runs smoothly so our customers can focus on delivering business value to their customers, and not worry about the intricacies of keeping such a large cluster healthy.

We’re also keen on making sure our customers can be flexible in what they do. We don’t lock everything down, but we’ve found that most people don’t know about the dangerous side of running certain operations. Some things can get particularly unsavory when run against a gigantic cluster that ingests millions of documents per day, and executes even more queries.

So, dear reader, welcome to the scary side of search. In this post we outline several features that exist in Elasticsearch and OpenSearch that can outright ruin your week if you’re not careful.

While it doesn’t cover absolutely everything, this is a good start to learning more about the many potential footguns in your engine. We’ve broken them up into some groups: Dangerous, Destructive, Heavy Read/Write/Compute, Spatial, and Mappings/Settings/Aggregations. Enjoy!

Dangerous

These operations are called out first because they are non-obvious one-liners that will make things bad for everyone if you call them without realizing the implications. You should probably never use these unless you have a very specific reason, and then only with careful understanding and planning.

Name/Operation Description & Risks Docs
Force Merge
POST [/{index}]/_forcemerge
Force segment merges to reduce segment count (optionally expunge deletes).

Heavy CPU/IO; can create very large segments and can run for DAYS.

Elasticsearch: Force a merge
OpenSearch: Force Merge API
Clear Cache
POST [/{index}]/_cache/clear
Clear request/query/fielddata caches for one or more indices.

Drops hot caches. Immediate latency/CPU spikes as caches rebuild.

Elasticsearch: Clear Cache
OpenSearch: Clear Cache
Refresh
POST [/{index}]/_refresh
Explicitly refresh one or more indices to make recent writes searchable.

Synchronous and resource-intensive; rely on the periodic index.refresh_interval instead.

Elasticsearch: Refresh API
OpenSearch: Refresh Index API
Flush
POST [/{index}]/_flush
Force Lucene commit; fsync segments; rotate translog.

Burst I/O + segment churn if run broadly/frequently. It's usually unnecessary, and Lucene will do this for you.

Elasticsearch: Flush API
OpenSearch: Flush API
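If you genuinely must run one of these, scope it as narrowly as you can. As a sketch in request format (the index name logs-2024.12 is hypothetical), here is a force merge restricted to a single index that no longer receives writes, and a cache clear limited to just the fielddata cache:

```
# Only force merge an index that is no longer being written to,
# and cap the resulting segment count:
POST /logs-2024.12/_forcemerge?max_num_segments=1

# Clear only the cache you actually need to drop, on one index,
# rather than all caches everywhere:
POST /logs-2024.12/_cache/clear?fielddata=true
```

Scoping to one index and one cache type limits the blast radius, but the CPU/IO cost warnings above still apply.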

Destructive

This section outlines ways to say goodbye to data. Obviously, you can delete an entire index in a single curl command, and that is typically run in very purposeful scenarios. But others are more devious. For example, you can delete by query - but triple check the query without the delete first and don’t YOLO your way into a week-long recovery effort!
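As a sketch of that triple-check habit (the products index and expired_at field are hypothetical), count the matches with the exact same query body before you point _delete_by_query at it:

```
# Step 1: see how many documents the query matches:
GET /products/_count
{
  "query": { "range": { "expired_at": { "lt": "now-90d" } } }
}

# Step 2: only once the count looks right, run the destructive version:
POST /products/_delete_by_query
{
  "query": { "range": { "expired_at": { "lt": "now-90d" } } }
}
```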

Name/Operation Description & Risks Docs
Delete Index
DELETE /{index}
Delete an index (irreversible unless you have snapshots).

Destructive; can break aliases and dependent apps; requires careful RBAC.

Elasticsearch: Delete Index
OpenSearch: Delete Index
Delete By Query
POST /_delete_by_query
Delete all docs matching a query.

Full/large scan; can delete massive volumes; heavy merges; difficult to roll back.

Elasticsearch: Delete By Query
OpenSearch: Delete By Query
Close Index
POST /{index}/_close
Close index (no read/write/search; frees some resources).

Operational risk: apps fail against a closed index, and ingestion will fail.

Elasticsearch: Close Index
OpenSearch: Close Index

Heavy Read/Write/Compute

When you’re serving lots of queries in a live environment, and decide you want to run some ad-hoc reports, get some comprehensive stats, take a snapshot, or reindex, you can slow things down for your customer application. Likewise, if you’ve got a new large dataset that exceeds your typical daily ingest that you want to toss into the index, you should have careful planning before doing so.
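A reindex, for example, can be throttled and run as a background task so it doesn't starve live traffic. A sketch (index names are hypothetical; tune requests_per_second to your cluster):

```
# Throttle to ~500 docs/sec, parallelize across shards, and run as a
# background task so the HTTP call returns immediately:
POST /_reindex?slices=auto&requests_per_second=500&wait_for_completion=false
{
  "source": { "index": "old-products" },
  "dest":   { "index": "new-products" }
}
```

The task ID in the response can then be polled via the Tasks API, and the whole job cancelled if it starts hurting production latency.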

Name/Operation Description & Risks Docs
Reindex
POST /_reindex
Copy documents from one index/alias/data stream to another (optionally filtered/transformed).

Very heavy read+write workload; can saturate I/O, heap, and network; can create version conflicts; best throttled or run off-peak.

Elasticsearch: Reindex API
OpenSearch: Reindex API
Update By Query
POST /{index}/_update_by_query
Scan + script-update matching docs in-place.

Expensive full/large scan; scripts execute per-hit; version conflicts; large translog and segment churn.

Elasticsearch: Update By Query
OpenSearch: Update By Query
Create a Snapshot
POST /_snapshot/{repository}/{snapshot}
Filesystem/object-store snapshot of indices/cluster state.

Heavy I/O and repository load; long-running; can contend with indexing.

Elasticsearch: Create Snapshot
OpenSearch: Create Snapshot
Restore a Snapshot
/_snapshot/.../_restore
Restore indices/cluster metadata from snapshot.

Cluster-wide writes and shard allocations; can overwhelm nodes and disrupt routing.

Elasticsearch: Restore Snapshot
OpenSearch: Restore Snapshot
Disk Usage API
/{index}/_disk_usage?run_expensive_tasks=true
Analyze per-field on-disk footprint.

Expensive offline analysis; can be very resource intensive on large indices.

Elasticsearch: Analyze Index Disk Usage
OpenSearch: No Equivalent
Scroll Search
/_search?scroll=...
Long-lived search contexts to page large result sets.

Holds resources per context; forgetting to clear can leak heap & file handles.

Elasticsearch: Scroll API
OpenSearch: Scroll API
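If you do use scrolls, release the context the moment you're done paging (the scroll ID below is a placeholder):

```
# Release a single scroll context:
DELETE /_search/scroll
{
  "scroll_id": "DXF1ZXJ5QW5kRmV0Y2gB..."
}

# Or drop every open scroll context on the cluster:
DELETE /_search/scroll/_all
```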

Spatial (Disk & RAM consumption)

Have you ever run out of disk or memory? Talk about fun: if you’re bored, these are a great way to find lots of work to do for the next 24 hours. Some operations produce far more data than you realize, sometimes double your index size or more. If you don’t have enough room, you’ll find out when the machines start complaining.

Name/Operation Description & Risks Docs
Reindex
/_reindex
Copy documents from one index/alias/data stream to another (optionally filtered/transformed).

Creates a full second copy of your documents on disk; plan for roughly double the space before you start, plus transient overhead from segment merges.

Elasticsearch: Reindex API
OpenSearch: Reindex API
Shrink Index
/_shrink
Rewrites an index into fewer primary shards.

Requires read-only state; creates new index; heavy reindex-like copy and segment rewrite.

Elasticsearch: Shrink Index
OpenSearch: Shrink Index
Split Index
/_split
Rewrites an index into more primary shards (multiples only).

Requires read-only source; full rewrite; heavy disk/CPU.

Elasticsearch: Split Index
OpenSearch: Split Index
Clone Index
/_clone
Clone an index to a new one (same shard count).

Faster than reindex, but still creates a full copy and transient resource spikes.

Elasticsearch: Clone Index
OpenSearch: Clone Index
Point In Time
/_pit
Consistent snapshot for paginating searches.

Keeps segments pinned; too many/long keep_alive PITs consume disk/heap.

Elasticsearch: Point-in-Time API
OpenSearch: Point-in-Time API
Restore a Snapshot
/_snapshot/.../_restore
Restore indices/cluster metadata from snapshot.

Requires you to close, delete, or rename the existing index first. The first two may result in data loss; renaming requires enough free space for both copies.

Elasticsearch: Restore Snapshot
OpenSearch: Restore Snapshot
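For PITs in particular, keep the keep_alive short and close them explicitly when you're done. Elasticsearch syntax is shown below (OpenSearch uses _search/point_in_time endpoints instead); the index name and PIT ID are placeholders:

```
# Open a PIT with a short keep_alive:
POST /my-index/_pit?keep_alive=1m

# Close it as soon as you're finished paginating,
# so its pinned segments can be released:
DELETE /_pit
{
  "id": "46ToAwMDaWR5BXV1aWQy..."
}
```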

Mappings, Settings, and Aggregations

This section doesn't cover explicit operations, but rather things that you can add to mappings and settings that might result in things you don't want.

First things first: don't rely on dynamic mappings. Dynamic mapping kicks in when a document contains data with no corresponding property/field in the mapping, and the auto-generated types are poorly optimized. Always declare your data explicitly in mappings with full coverage.
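One way to enforce that (the index and field names here are hypothetical) is to set dynamic to strict, so documents with unmapped fields are rejected instead of silently growing the mapping:

```
PUT /events
{
  "mappings": {
    "dynamic": "strict",
    "properties": {
      "message":   { "type": "text" },
      "timestamp": { "type": "date" }
    }
  }
}
```

With this in place, indexing a document with an unexpected field fails loudly, which is far easier to debug than a mapping that has quietly exploded to thousands of fields.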

Name/Operation Description & Risks Docs
Add a document without Mappings
POST /{index}/_doc/{id}
Automatic dynamic mapping updates.

Field explosion & poor types from dynamic mapping; mapping growth increases heap and slows queries.

Elasticsearch: Dynamic mappings
OpenSearch: Dynamic mappings
Update Mappings
PUT /{index}/_mapping
Update index mappings (add fields, parameters).

Some changes are irreversible without a reindex.

Elasticsearch: Put mapping
OpenSearch: Put mapping
Update Settings
PUT /{index}/_settings
Change the settings for an index.

Some changes can alter infrastructure layout and impact runtime performance.

Elasticsearch: Update Settings
OpenSearch: Update Settings

Mapping footguns

Some mapping parameters enable incredible features, like highlighting and aggregations (more on aggs later). Here are some specific properties you should use sparingly.

Property Description & Risks Docs
term_vectors
"term_vector": "with_positions_offsets"
Term vectors are used with position offsets to enable highlighting, and can also be used to enable payloads.

Enabling with_positions_offsets will increase disk and heap use for the field on which it is enabled by a significant factor.

Elasticsearch: term_vector
OpenSearch: term_vector
copy_to
"copy_to": ["other field", ...]
copy_to allows you to duplicate a field into another field for an alternate index and query analysis configuration.

Using copy_to on large text fields, to multiple destinations, or with unoptimized analysis can grow your index significantly.

Elasticsearch: copy_to
OpenSearch: copy_to
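To make that concrete, here's a sketch of a mapping (index and field names are hypothetical) where two source fields are duplicated into a combined field; every document now indexes that text twice:

```
PUT /articles
{
  "mappings": {
    "properties": {
      "title":    { "type": "text", "copy_to": "all_text" },
      "body":     { "type": "text", "copy_to": "all_text" },
      "all_text": { "type": "text" }
    }
  }
}
```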

Get to know all your mapping field parameters!

Settings footguns

Index Settings are vast. There are many of them which you should just leave as defaults unless you know what you are doing. Some settings can be changed on live indices to trigger infrastructure changes with deep implications.

As a general guideline, I encourage you to read through your respective engine's guide:

Property Description & Risks Docs
number_of_replicas
"number_of_replicas": integer
This sets the number of replicas for each primary shard of your index.

Can trigger mass shard movement and recovery (network + IO heavy), degrading search/indexing.

Elasticsearch: Index Settings
OpenSearch: Index Settings
refresh_interval
"refresh_interval": time value
Sets the interval (a time value such as "1s") at which the engine makes recently added documents available for search.

The default is 1 second, but in large clusters with high ingest rates, consider raising it to a higher value, up to about 10 seconds.

Elasticsearch: Index Settings
OpenSearch: Index Settings
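Both settings can be changed on a live index. As a sketch (the index name is hypothetical), here's how you'd raise the refresh interval during a heavy ingest window:

```
PUT /logs-write/_settings
{
  "index": {
    "refresh_interval": "10s"
  }
}
```

Changing number_of_replicas through the same endpoint works too, but remember that it kicks off the shard copying described above.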

Settings is also where you configure field analysis that your mapping properties will use. In general, avoid ngrams and shingles unless you need them for a specific purpose, as they will significantly increase spatial requirements of the fields using them.

Property Description & Risks Docs
N-gram token filter
"type": "ngram"
Breaks words down into smaller pieces to assist with partial matching and fuzzy search.

This will grow your index vocabulary in size, impacting the disk and memory requirements for the field.

Elasticsearch: N-gram Token Filter
OpenSearch: N-gram Token Filter
Shingle token filter
"type": "shingle"
Generates word n-grams ("shingles") which assists in phrase search.

With output_unigrams set to true (the default), your index vocabulary will grow in size, impacting the disk and memory requirements for the field.

Elasticsearch: Shingle Token Filter
OpenSearch: Shingle Token Filter

Get to know all your analysis types!

Aggregations

I saved the best for last, because this is a query-time footgun that I see all too often. Aggregations are complex counting operations with high I/O and CPU use, and when used carelessly they will saturate your CPU and make latency terrible for everyone. When serving large corpora at high query loads, be sure to optimize your aggregations and use them only where necessary. Also couple them with strict matching/filter criteria in your query to ensure you're not aggregating across too much data.
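A sketch of that shape (index and field names are hypothetical): "size": 0 skips fetching hits entirely, and the filter clause narrows the document set before the aggregation runs over it:

```
GET /orders/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        { "term":  { "status": "completed" } },
        { "range": { "created_at": { "gte": "now-7d" } } }
      ]
    }
  },
  "aggs": {
    "sales_by_region": {
      "terms": { "field": "region", "size": 10 }
    }
  }
}
```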

Conclusion

Well, that's all for now. Remember, Bonsai is here to take away the pain. Stay green, stay happy, and stay safe out there folks!
