When multiple nodes across different Availability Zones of a Bonsai cluster fail, what is the likely impact on system availability, and what are the possible recovery steps?


Last updated
June 17, 2023

All production Bonsai clusters are deployed to a minimum of three nodes, both for redundancy and to prevent stalemates in leader election. Each node is deployed to a separate AWS Availability Zone, which provides data center isolation as well.
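To illustrate why three nodes avoid election stalemates, here is a minimal sketch of generic majority-vote arithmetic. It assumes a simple strict-majority quorum; the function names are hypothetical, and this is not a description of how Bonsai configures its clusters.

```python
def quorum_size(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def can_elect_leader(total_nodes: int, surviving_nodes: int) -> bool:
    """True if the surviving nodes still form a majority (assumed quorum model)."""
    return surviving_nodes >= quorum_size(total_nodes)


for total in (2, 3):
    tolerated = total - quorum_size(total)
    print(f"{total} nodes: quorum={quorum_size(total)}, tolerated failures={tolerated}")
# 2 nodes: quorum=2, tolerated failures=0
# 3 nodes: quorum=2, tolerated failures=1
```

Under this model a two-node cluster cannot lose either node and still reach a majority, while a three-node cluster can lose one.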

When a Bonsai cluster loses a node, Elasticsearch or OpenSearch automatically reroutes the primary and replica shards to the nodes that are still up and running. In the background, AWS Auto Scaling Groups immediately begin spinning up a replacement instance, which auto-bootstraps with your cluster's Elasticsearch or OpenSearch configuration and version. Once the node has been provisioned, it joins the cluster, and Elasticsearch or OpenSearch then moves the relocated shards back onto the new, empty machine.
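One way to watch this recovery from the client side is the `_cluster/health` endpoint, which both Elasticsearch and OpenSearch expose. The sketch below is illustrative only: the helper names `fetch_cluster_health` and `describe_recovery` are hypothetical, and it assumes your cluster URL carries basic-auth credentials.

```python
import base64
import json
import urllib.request
from urllib.parse import urlsplit


def fetch_cluster_health(cluster_url: str) -> dict:
    """GET <cluster_url>/_cluster/health, sending basic auth if the URL has credentials."""
    parts = urlsplit(cluster_url)
    host = parts.hostname + (f":{parts.port}" if parts.port else "")
    request = urllib.request.Request(f"{parts.scheme}://{host}/_cluster/health")
    if parts.username:
        credentials = f"{parts.username}:{parts.password or ''}".encode()
        request.add_header("Authorization", "Basic " + base64.b64encode(credentials).decode())
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


def describe_recovery(health: dict) -> str:
    """Summarize a _cluster/health response in terms of shard recovery."""
    status = health["status"]
    moving = health["relocating_shards"] + health["initializing_shards"]
    unassigned = health["unassigned_shards"]
    if status == "green" and moving == 0:
        return "recovered: all shards assigned"
    if status == "red":
        return f"degraded: at least one primary shard is unassigned ({unassigned} unassigned)"
    return f"recovering: {moving} shards moving, {unassigned} unassigned"
```

For example, `describe_recovery(fetch_cluster_health(url))` could be polled until it reports a recovered state, where `url` is your own cluster URL.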

An event like this is handled as a Severity 1 incident.

A complete loss of two AWS data centers does mean downtime for a Bonsai cluster, lasting until the primary shards are restored on the last remaining node. To mitigate this risk on an Enterprise cluster, we can discuss a setup that includes multi-region deployment.
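The following sketch shows why losing two zones at once can leave some shards with no live copy. It assumes a hypothetical layout of three shards, each with one primary and one replica, spread over three nodes; the node names, shard names, and replica count are illustrative and not taken from any real Bonsai cluster.

```python
def shards_without_copies(allocation: dict[str, set[str]], lost_nodes: set[str]) -> list[str]:
    """Return shards whose every copy (primary or replica) lived on a lost node."""
    return sorted(shard for shard, nodes in allocation.items() if nodes <= lost_nodes)


# Hypothetical layout: each shard has two copies on two different nodes.
allocation = {
    "shard-0": {"node-a", "node-b"},
    "shard-1": {"node-b", "node-c"},
    "shard-2": {"node-a", "node-c"},
}

print(shards_without_copies(allocation, {"node-a"}))            # []
print(shards_without_copies(allocation, {"node-a", "node-b"}))  # ['shard-0']
```

With one node lost, every shard still has a surviving copy. With two nodes lost, any shard whose copies both sat on those nodes is unavailable until it is restored.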
