How Bonsai Manages Watermarks

Elasticsearch is constantly checking on each node’s disk availability to make decisions about where to allocate and move shards.
Last updated: June 20, 2023
Note: This article applies to Business and Enterprise tiers with dedicated single-tenant clusters. These watermark alerts do not apply to Sandbox, Staging, or Standard tiers on shared multitenant clusters; those tiers are instead notified once an overage is detected on at least one metered metric.

Elasticsearch protects itself from data loss with the disk-based shard allocator. This mechanism tries to balance two goals: minimizing disk usage across all nodes and minimizing the number of shard reallocations. The aim is for every node to have as much disk headroom as possible, with minimal impact on cluster performance.

Elasticsearch is constantly checking on each node’s disk availability to make decisions about where to allocate and move shards. There are four important “stages” that help dictate this decision-making process:

  1. Alert watermark. When a node in a single-tenant cluster reaches 70% disk used or higher, Bonsai sends an alert to the user.
  2. Low watermark. When a node reaches its low watermark stage (Bonsai defaults to 70% disk used), the cluster will no longer allocate new shards to this node.
  3. High watermark. When a node reaches its high watermark stage (Bonsai defaults to 75% disk used), the cluster will actively try to move shards off the node.
  4. Flood stage. When a node reaches its flood stage watermark (Bonsai uses 95% disk used), the cluster will put all open indices into a state that only allows reads and deletes.
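
The four stages above can be sketched as a simple threshold check. This is an illustrative sketch only, not Bonsai's or Elasticsearch's implementation: the function name `stages_reached` and the constants are hypothetical, the percentages are the ones quoted in this article, and treating "reaches" as greater-than-or-equal is an assumption.

```python
from typing import List

# Thresholds (percent of disk used) as quoted in this article.
# The names are hypothetical and chosen for illustration only.
ALERT_PCT = 70
LOW_PCT = 70
HIGH_PCT = 75
FLOOD_PCT = 95


def stages_reached(disk_used_pct: float) -> List[str]:
    """Return every stage a node has reached at the given disk usage.

    Assumption: a stage is "reached" when usage is >= its threshold.
    """
    thresholds = [
        ("alert", ALERT_PCT),
        ("low", LOW_PCT),
        ("high", HIGH_PCT),
        ("flood", FLOOD_PCT),
    ]
    return [name for name, pct in thresholds if disk_used_pct >= pct]


if __name__ == "__main__":
    for usage in (50, 70, 80, 95):
        print(usage, stages_reached(usage))
```

Because the alert and low watermark thresholds are both 70% here, a node that crosses 70% reaches both stages at once.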

When a node reaches the low or high watermark stage, Business and Enterprise tiers receive an emailed notification from the Bonsai Support team with suggestions for scaling disk capacity. Clusters over the 75% threshold can start to experience performance issues, including but not limited to increased bulk queue times, higher CPU usage and/or load, and slower-than-normal searches.
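
To see which nodes are past the 75% threshold, one option is to inspect per-node disk usage. The sketch below assumes rows shaped like the JSON output of Elasticsearch's `_cat/allocation` API, with a `node` field and a `disk.percent` field returned as a string. The helper name `nodes_over_high_watermark` and the sample rows are hypothetical and are not real cluster data.

```python
from typing import Dict, List, Optional

HIGH_WATERMARK_PCT = 75  # threshold quoted in this article


def nodes_over_high_watermark(rows: List[Dict[str, Optional[str]]]) -> List[str]:
    """Return the names of nodes whose disk usage is over the threshold.

    Assumption: each row has "node" and "disk.percent" keys, and
    "disk.percent" may be None for rows without disk figures.
    """
    over = []
    for row in rows:
        pct = row.get("disk.percent")
        if pct is None:
            # Skip rows that carry no disk usage (e.g. unassigned shards).
            continue
        if float(pct) > HIGH_WATERMARK_PCT:
            over.append(row["node"])
    return over


# Hypothetical sample rows, for illustration only.
sample_rows = [
    {"node": "node-1", "disk.percent": "62"},
    {"node": "node-2", "disk.percent": "81"},
    {"node": "UNASSIGNED", "disk.percent": None},
]

if __name__ == "__main__":
    print(nodes_over_high_watermark(sample_rows))
```

Any node this returns is a candidate for the scaling suggestions described above.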
