All production Bonsai clusters are deployed to a minimum of three nodes, both for redundancy and to prevent stalemates in leader election. Each node in the cluster is deployed to a separate AWS Availability Zone, which provides data center isolation as well.
A Bonsai cluster can lose an entire AWS data center and still continue to operate. This makes Bonsai clusters extremely fault-tolerant.
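To illustrate why three nodes avoid election stalemates, here is a minimal sketch of majority-based quorum arithmetic. It is a simplified model, not the actual election logic in Elasticsearch or OpenSearch, and the function names `quorum` and `survives_loss` are hypothetical names chosen for this example.

```python
def quorum(voting_nodes: int) -> int:
    """Smallest number of nodes that forms a strict majority."""
    return voting_nodes // 2 + 1


def survives_loss(voting_nodes: int, lost_nodes: int) -> bool:
    """True if the nodes that remain can still form a majority."""
    return voting_nodes - lost_nodes >= quorum(voting_nodes)


# Simplified model: three nodes, one per Availability Zone.
print(quorum(3))            # majority needed out of three nodes
print(survives_loss(3, 1))  # one zone lost
print(survives_loss(2, 1))  # a two-node cluster losing one node
```

Under this model, a three-node cluster that loses one node still has two of three votes, which is a majority. A two-node cluster that loses one node has only one of two votes, so no leader can be elected.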
When a Bonsai cluster does lose a node, Elasticsearch or OpenSearch automatically reroutes the primary and replica shards to the machines that are still up and running. In the background, AWS Auto Scaling Groups immediately begin spinning up a replacement instance, which auto-bootstraps with your cluster's configured Elasticsearch or OpenSearch settings and version. Once the node has been provisioned, it joins the cluster, and Elasticsearch or OpenSearch then moves the relocated shards back onto the new, empty machine.
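The recovery stages described above can be observed from the outside through the cluster health API (`GET /_cluster/health`), which Elasticsearch and OpenSearch both expose. The sketch below interprets an already-parsed health response. The function name `describe_recovery`, the stage labels, the default of three expected nodes, and the sample values are all assumptions made for illustration, not Bonsai tooling or real cluster output.

```python
def describe_recovery(health: dict, expected_nodes: int = 3) -> str:
    """Map a parsed _cluster/health response to a rough recovery stage."""
    moving = health["relocating_shards"] + health["initializing_shards"]
    if health["number_of_nodes"] < expected_nodes:
        # The replacement instance has not joined the cluster yet.
        return "node missing"
    if moving > 0 or health["unassigned_shards"] > 0:
        # The node is back, but shards are still being moved onto it.
        return "rebalancing"
    if health["status"] == "green":
        return "recovered"
    return "degraded"


# Illustrative values only, shaped like a _cluster/health response.
sample = {
    "status": "yellow",
    "number_of_nodes": 2,
    "relocating_shards": 0,
    "initializing_shards": 1,
    "unassigned_shards": 2,
}
print(describe_recovery(sample))
```

The node count is checked first because shard movement is expected both while a node is missing and while the replacement is being filled, so the node count is what separates the two stages.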
An event like this is handled as a Severity 1 incident.