Documentation

Welcome to the Bonsai Documentation. This is the place to learn about how to integrate, setup, and manage your Bonsai account

Upgrading Your Bonsai Cluster

Overview

Upgrading your Bonsai cluster to a more powerful one is a straightforward process, designed to minimize downtime and maintain the integrity of your data. This document outlines the steps involved, the expected downtime, and the effort required for a successful upgrade.

Upgrade Process

Bonsai utilizes a two-phase snapshot and restore process to migrate your data from one hardware cluster to another. A snapshot of your data is taken from the current hardware and restored to the new new hardware. There is no downtime or production impact to your cluster during this phase. This phase of the upgrade can take time, depending on how much data you have on your existing hardware.

Once this is completed, a second snapshot-restore operation is performed. This operation is simply a delta — covering only the data that has been changed since the phase-1 shapshot was taken. Usually this second phase lasts a few seconds, or maybe a minute. During this phase, your cluster is placed into a read-only mode. Search traffic is not impacted, but writes are blocked until the restore completes.

Once this final restore has completed, the cluster will be running entirely on the new hardware, with no data loss.

Downtime and Effort

Zero Downtime for Searches:
- Your search operations will not experience any downtime during the upgrade process. This ensures continuous availability for read operations.
Short Window for Writes:
- There will be a brief period where write operations are paused. This typically lasts less than a minute, depending on the amount of data being transferred.
- An error message will be returned during this pause: 403 Forbidden along with a JSON response indicating that Cluster is currently read-only for maintenance. Please try your request again in a few minutes. See [status.bonsai.io](<http://status.bonsai.io/>) or contact [[email protected]](<mailto:[email protected]>) for updates.. It's essential to handle this gracefully in your application.
Data Volume Considerations:
- The time required for the upgrade depends on the size of your data and the rate of updates. Larger datasets may take slightly longer, but the write pause is generally kept under a minute.
Using a Queue and Buffering:
- As a general rule, we highly recommend using a queue to buffer write operations wherever possible. This could be something as simple as a Redis queue, kafka or similar that implements retries and exponential backoff for failed writes to the cluster. This ensures that transient network issues or other connection problems don’t cause lost writes.
- During the upgrade process, this kind of queue can be paused to ensure that any writes attempted during the brief readonly phase are retried and successfully processed once the cluster is upgraded.

‍

Let’s discuss how to change a cluster’s plan the Bonsai’s Cluster Dashboard.

‍
To begin, navigate to the cluster’s Plan under Settings on the cluster dashboard.

If you haven’t added billing information yet, please do so at Account Billing. Detailed steps can be found in Add a Credit Card.

Once there is billing information listed on the account, you will see the different options for changing the cluster’s plan. Select the plan you would like to change to and click the green Change to … Plan button. In this example, this `documentation_cluster` is on a Sandbox plan and it will be upgraded to a Standard Micro plan.

A successfully changed plan will notify you at the top with Plan scheduled for update. The Plan has been upgraded to the Standard Micro plan in this example.

Can't Downgrade?

Please note that downgrading a plan may fail to update if it puts the cluster in an Overage State for the new plan. For example, downgrading a Standard Micro plan to a Sandbox plan will fail if there is 11 shards on the cluster (Sandbox plan has a shard limit of 10).

‍

Upgrading and Changing a Cluster’s Plan

Bonsai has created and supports a Terraform Provider to enable creation and management of Bonsai Elasticsearch and OpenSearch Clusters from within your infrastructure as code.

‍

Terraform is an infrastructure as code (IaC) tool designed to define and manage both cloud and on-premises resources through human-readable configuration files. These files can be versioned, reused, and shared, enabling streamlined collaboration and iterative development. Bonsai has created and supports a Terraform Provider to enable creation and management of Bonsai Elasticsearch and OpenSearch Clusters from within your infrastructure as code.

Create an Elasticsearch or OpenSearch cluster with the Bonsai.io Terraform Provider

1. Create API Key and Secret

The Bonsai Terraform Provider works by integrating with the Bonsai REST API via the (also newly created) Bonsai Cloud Go API Client. To create resources with the Bonsai API, you'll need a way to authenticate your requests. Currently, both the Bonsai Go API Client and the Bonsai Terraform Provider support HTTP Basic Authentication over TLS, using an API Key and Secret Token combination.

‍

To generate an API Key and Secret Token, head over to your account's API Tokens overview page.

‍

and click the "Generate Token" button.

‍

‍

More in-depth instructions are available at the Bonsai Documentation site!

2. Configure the Terraform Provider

Next, you'll need to configure the Bonsai Terraform Provider with the API Key and Token pair generated in the previous step.

‍

There are two options here:

‍

Implicitly configure the provider by setting the BONSAI_API_KEY and BONSAI_API_TOKEN environment variables in the environment that will execute your Terraform code, or,
Explicitly configure the provider by setting the api_key and api_token provider variables in terraform, as detailed below.

‍

Choose an option that best fits your use case, and you're ready to start creating Elasticsearch and OpenSearch clusters!

‍

# provider.tf

# Configure the Bonsai Provider using the required_providers stanza.
terraform {
  required_providers {
    bonsai = {
      source  = "omc/bonsai"
      version = "~> 1.0"
    }
  }
}

provider "bonsai" {
  # Default: The Bonsai Terraform Provider will fetch the api_key configuration
  # value from the BONSAI_API_KEY environment variable.
  #
  # Uncomment the following line to set directly via variable.
  # api_key = var.bonsai_api_key

  # Default: The Bonsai Terraform Provider will fetch the api_token configuration
  # value from the BONSAI_API_TOKEN environment variable.
  #
  # Uncomment the following line to set directly via variable.
  # api_token = var.bonsai_api_token
}

3. Configure the Terraform Resources

Configuring a Bonsai Elasticsearch or OpenSearch cluster is easy, we only need to make 4 decisions, reflected as Terraform Resource Attributes for our cluster:

‍

name: The human-readable name of the cluster.
plan: The subscription plan.
space: The space describing the server group and geographic location of the cluster.some text
- For AWS, these follow the format of omc/bonsai/$REGION/common, where $REGION is an AWS Region.
- For GCP, these follow the format of omc/bonsai-gcp/$REGION/common, where $REGION is a GCP Region.
release: The version of Elasticsearch or OpenSearch to be deployed.

‍

Details about the plans, spaces, and releases available to your account are available either within the Bonsai.io control plane, via the API (we recommend using the Bonsai Cloud Go API Client), or via the Terraform Data Sources.

‍

Here's an example of listing all available Plans, Releases, and Spaces.

‍

Related note! You really only need to output the bonsai_plans data list, as each Plan object will embed a list of the spaces it's available in, and the various Elasticsearch/OpenSearch releases that the plan supports.

# data.tf

// Fetch all Available Bonsai Plans
data "bonsai_plans" "list" {}

// Fetch all Available Bonsai Spaces
data "bonsai_spaces" "list" {}

// Fetch all Available Bonsai Releases
data "bonsai_releases" "list" {}

// Output collected data
output "bonsai_spaces" {
  value = data.bonsai_spaces.list
}

output "bonsai_release" {
  value = data.bonsai_releases.list
}

output "bonsai_plans" {
    value = data.bonsai_plans.list
}

With that output, we can create an optimized OpenSearch cluster that matches our needs.

Related note! We've built an awesome (if we do say so ourselves...) interactive tool to help you estimate the best plan for your use-case! We're also always happy to help your team navigate the sizing process, reach out to us via the Support button in your console!

‍

# main.tf

resource "bonsai_cluster" "test" {    
    name = "Terraform Created Cluster"  
    plan = {  
        slug = "sandbox"  
    }  
      
    space = {  
        path = "omc/bonsai/us-east-1/common"  
    }  
      
    release = {  
       slug = "opensearch-2.6.0-mt"  
    }  
}

4. Configure the Terraform Outputs

And finally, we'll configure our infrastructure data output, so that we can reference our newly created cluster directly in the rest of our infrastructure.

Related note! If you're wondering what the sensitive = true configuration does below, it's required because the cluster's user and password are treated as sensitive outputs by the provider.

If you don't mark their output as sensitive, you'll receive an error to reduce the risk of accidentally exporting sensitive information as plain text!

‍

We also recommend encrypting your Terraform state at rest, and if possible, before storage.

output "bonsai_cluster_id" {  
    value = bonsai_cluster.test.id  
}  
  
output "bonsai_cluster_name" {  
    value = bonsai_cluster.test.name  
}  
  
output "bonsai_cluster_host" {  
    value = bonsai_cluster.test.access.host  
}  
  
output "bonsai_cluster_port" {  
    value = bonsai_cluster.test.access.port  
}  
  
output "bonsai_cluster_scheme" {  
    value = bonsai_cluster.test.access.scheme  
}  
  
output "bonsai_cluster_url" {  
    value = bonsai_cluster.test.access.url  
}  
  
output "bonsai_cluster_slug" {  
    value = bonsai_cluster.test.slug  
}  
  
output "bonsai_cluster_user" {  
    value = bonsai_cluster.test.access.user  
    sensitive = true  
}  
  
output "bonsai_cluster_password" {  
    value = bonsai_cluster.test.access.password  
    sensitive = true  
}  
  
output "bonsai_cluster_state" {  
    value = bonsai_cluster.test.state  
}  
  
output "bonsai_cluster_stats" {  
    value = bonsai_cluster.test.stats  
}  
  
output "bonsai_cluster_plan" {  
    value = bonsai_cluster.test.plan  
}  
  
output "bonsai_cluster_release" {  
    value = bonsai_cluster.test.release  
}  
  
output "bonsai_cluster_space" {  
    value = bonsai_cluster.test.space  
}

Next Steps & Additional Resources

‍

For additional details on managing your Bonsai resources with Terraform, check out our blog post, "Introducing Bonsai's Terraform Provider for Elasticsearch and OpenSearch".

‍

Check out the Provider on GitHub and the official HashiCorp Terraform Registry!

‍

Using Terraform with Bonsai

The Problem

One of the biggest hurdles a search developer comes across is how to get data from one cluster into a new one. In a perfect world we would have fast and reliable reindexing scripts to quickly teardown and/or rebuild indices. A good example of this pattern is in the elasticsearch-rails gem’s import tasks. See also a more in-depth example of an Indexer class in the search for Jekyll gem, searchyll.

Sometimes the best case is not always possible, either from accumulated tech debt or contextual constraints. For those in on our Sandbox and Standard plans, this problem is compounded in that in an effort to keep these plans accessible, the Snapshot API is not available on demand. Read more in our write up on this here: https://bonsai.io/docs/snapshots-on-bonsai. Particularly for those on non-production plans such as our Sandbox plan, backups aren’t taken regularly. In this case, what options are available? Let’s explore a couple of strategies.

Possible Solutions

There are two solutions to reindexing and/or migrating your cluster in a situation where both the Snapshots API isn’t available. The first is to use the elasticsearch-dump library, and the second is to manage it with a custom solution. Regardless of which way you chose, you’ll need to follow this larger process:

Download your mappings.
Download a copy of your old cluster data, or design and implement indexing scripts to do it on your own.
Re-create your indices and the mappings on your new cluster.
Index your data on the new cluster.

elasticsearch-dump

elasticsearch-dump is a mature javascript library that has been around through nearly every release of Elasticsearch. It can download data and mappings, migrate between clusters directly, and do all sorts of imports and exports necessary for the search engineer’s workflow.

The process for getting started is simple:

Download the library:

npm install elasticdump

Copy over mappings and data into a new cluster, either through a download and reindex or via urls.

Here’s an example of what a migration might look like:

# Backup index data to a file:
elasticdump \
  --input=https://key:[email protected]:443/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping

# Index the data into your cluster with the file:
elasticdump \
  --input=/data/my_index.json \
  --output=https://key:[email protected]:443/my_index \
  --type=data

You’ll need to use your cluster credentials to access your index from a terminal session. See our docs on Cluster Credentials here: https://bonsai.io/docs/credential-management.

Managing your own reindex

Much of what elasticdump does can be manually written if necessary, using curl or whatever language you prefer. For example, downloading mappings can be done using curl:

curl -XGET "https://key:[email protected]:443/_mapping?pretty=true" > mappings.json

And later, with a new cluster, you can PUT your new mappings to its corresponding index:

curl -XPUT "https://key:[email protected]:443/index_name/_mapping" \
  -H 'Content-Type: application/json' \
  -d @mappings.json

It’s important to note that the downloaded mappings will have to be edited or pieced apart to PUT to the new indices. In the case of managing your reindex yourself, either dump the index data with elasticdump above, or create scripts to reindex straight from your database.

Further Resources

Depending on how many versions you are upgrading you’ll need to navigate breaking changes between versions, like the drop of _doc types in v6.x. There is extensive coverage of breaking changes in the Elasticsearch documentation. See also our guides on moving from major versions:

We’ve seen it all and are here to help. Please reach our to [email protected] and we’ll point you in the right direction. Cheers!

How to Download Elasticsearch and OpenSearch Data Without Snapshots

Bonsai Elasticsearch can be removed via the command line or Heroku’s app Dashboard.

Danger!

This will destroy all associated data and cannot be undone!

Using the Command Line

The Bonsai Add-on can be removed from the application via Heroku’s command line tool:

This will destroy the cluster and all of its data instantly.

Using the Heroku Dashboard

To remove the Bonsai Add-on from the Heroku Dashboard, log in to your Heroku account and click on the app the Add-on is attached to. In this example, that Add-on is <span class="inline-code"><pre><code>mycoolelasticsearchapp</code></pre></span>:

Next, click on the Resources tab:

Find the Bonsai Add-on in the list of application Add-ons, then click the carot menu:

Select “Delete Add-on” from the list of options. You will need to type in the app’s name to confirm the removal of the Bonsai Add-on:

Click on “Remove Add-on” to complete the process.

‍

How to remove Bonsai from your Heroku app

When it’s time to scale, you can easily upgrade your shared plan, or migrate to a dedicated cluster.

How to Update Your Cluster Plan

Note on plan changes

Most plan changes take effect instantly. However, if you’re trying to upgrade or downgrade across an architecture class, there will be a delay in processing.

Updating your cluster plan with Heroku is fairly simple. You can do this in either of two ways:

Through the Heroku CLI tool
Through the Heroku UI - your app dashboard

Using the Command Line

You can view all of our available plans for Heroku users on the Bonsai Heroku Add-on page:

From there, choose the plan you’d like to change your cluster to and note the plan slug. For example, our <span class="inline-code"><pre><code>Standard SM</code></pre></span> has a plan slug of <span class="inline-code"><pre><code>standard-sm</code></pre></span>, our <span class="inline-code"><pre><code>Private Compute LG</code></pre></span> has a slug of <span class="inline-code"><pre><code>private-compute-lg</code></pre></span>, etc. Open up your project in a terminal and run:

If you have several applications with Bonsai Add-ons, you can specify which one you would like to upgrade or downgrade with the <span class="inline-code"><pre><code>-a</code></pre></span> flag:

Using the Heroku app Dashboard

Log into your Heroku account and open your app dashboard. In this example, the app is called <span class="inline-code"><pre><code>mycoolelasticsearchapp</code></pre></span>:

Click on the Resources tab to view your Add-ons:

You’ll see a list of your Add-ons, with a carot menu on the far right side.

Clicking on the carot icon will open up a dropdown menu. Select <span class="inline-code"><pre><code>Modify Plan</code></pre></span> to open a modal, and you can choose a new plan for your cluster:

How Heroku Handles Billing

Heroku handles all of the billing, and prorates by the second.

A contrived example: a customer signs up for a $50/mo plan at 00:00:00, then decides to upgrade to a $150 plan at 00:10:30, then downgrades back down to a $50 plan at 01:00:00, then destroys it at 02:45:00. The customer’s bill would be roughly calculated as:

630s on a $50/mo plan = $0.012
2970s on a $150/mo plan = $0.17
6300s on a $50/mo plan = $0.12
Total: $0.30

This amount would be added to the customer’s next invoice from Heroku.

Migrating Across Architectures

Bonsai offers two main architecture classes: multi-tenant and single tenant. The multi-tenant class – sometimes called “shared” – is designed to allow clusters to share hardware resources while still being sandboxed from one another. This allows us to provide unparalleled performance per dollar in a way that’s also extremely affordable.

The single tenant class – sometimes called “dedicated” – maps one cluster to a private set of hardware resources. Because these resources are not shared with any other cluster, single tenant configurations provide maximum performance for a slightly higher price.

Plan changes within the multi-tenant class take place instantly. However, moving a cluster from one class to another will take some time. This is because data will need to be moved across a network boundary and search resources may need to be created.

In other words, if a customer upgrades from a “Shared” plan to a “Dedicated,” the Bonsai app will need to provision and configure new private servers, wait for them to come online and pass health checks, then migrate their cluster’s data out of the multi-tenant class and into their new single tenant class.

Alternately, if a customer downgrades from a “Dedicated” plan to a “Shared” plan, a data migration will need to be performed, and the old servers torn down.

The time required for these migrations can vary. If you have questions or concerns, please send us a note at [email protected]

‍

Scale your cluster in Heroku

Our Heroku Add-on

We offer a free Hobby plan for development and testing on Heroku. A list of all available plans and capacities can be found at https://elements.heroku.com/addons/bonsai.

There are two ways to add Bonsai to your Heroku app:

1. Through the Heroku CLI tool

2. Through the Heroku app dashboard

Using the Heroku CLI tool

You can add Bonsai to your app with this command:

You can verify that the operation was successful by running <span class="inline-code"><pre><code>heroku config:get BONSAI_URL</code></pre></span>:

If this value is null, then Bonsai was not properly added to your account. Try again and look for error messages. If you still have problems, give us a shout.

Once the Add-on has been successfully added, you can check on the status of your cluster by running <span class="inline-code"><pre><code>$ heroku addons:open bonsai -a</code></pre></span>

Specifying a Search Engine

Bonsai supports multiple search engines. The default is Elasticsearch, which will be used if nothing is specified on creation. Users can specify which search engine to use with the <span class="inline-code"><pre><code>--engine</code></pre></span> parameter. To use it on the command line, you would use something like this:

Allowed values are: "elasticsearch" and "opensearch".

Adding a specific version of a Search Engine

You have a fair amount of flexibility to choose the search engine version your cluster can run. There is a command line flag for specifying which version to use. This flag is called <span class="inline-code"><pre><code>--version</code></pre></span>, and it can be invoked in a couple different ways. To use it on the command line, you would use something like this:

Bonsai only supports certain versions of a given search engine, so users cannot provision any version. If you request a version that is not available, Bonsai will default to the latest available version.

There is a list of available versions documented here: Supported Elasticsearch Versions. In short, the options available will be determined by three factors:

Region. Your cluster will be provisioned in the same AWS region where your Heroku app is located. However, some regions support more versions than other regions, based on demand.
Plan. Versions are rolled out to Hobby and Staging tier plans in advance of Production grade plans. Additionally, older versions of Elasticsearch may only be supported on Business tier plans.
Age. Bonsai will occasionally deprecate older versions of Elasticsearch. If you need a version of Elasticsearch that is no longer supported, the only way to get it will be to upgrade to a Business plan.

The <span class="inline-code"><pre><code>version</code></pre></span> and <span class="inline-code"><pre><code>engine</code></pre></span> parameters also work in your app.json, if you use PR apps and Heroku pipelines:

To view a list of our current supported regions, see our documentation here:

Bonsai Supported Search Engine Versions

For further help with your app.json and addons, see Heroku’s documentation here:

Heroku App JSON Schema

Using the Heroku dashboard

To add Bonsai through the Heroku UI, open up your application in the Heroku dashboard

Click on either the Resources tab, or the “Configure Add-ons” link. This menu will have a search bar for various Add-ons. Begin typing “bonsai,” and the autocomplete will find the Bonsai Elasticsearch Add-on:

Click on the Add-on to add it to your application. This will bring up a new screen where you will be able to select a payment plan for your new cluster. See the Bonsai Heroku Add-ons page for details about what each plan offers.

When you are ready, click on “Provision.” Your new cluster will be instantly created. Your dashboard will now show this:

When Bonsai is added to your application, a new environment variable called BONSAI_URL is created and initialized with the URL to your cluster. This is the URL that you will need in order to interact with your cluster.

Your application should be configured to read this URL directly from the environment, and should not be hard-coded or shared with others. If you would like to confirm that Elasticsearch is up and running, you can retrieve the URL by clicking no the Settings tab in the Heroku Dashboard.

There will be a section called “Config Vars”:

Click on “Reveal Config Vars” to see your environment variables. This will reveal the value of each variable like so:

You can copy the contents of the BONSAI_URL to your clipboard and paste it into a browser or curl command to see the response from Elasticsearch:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">$ curl https://username:[email protected]
{
  "name" : "PvRcoFq",
  "cluster_name" : "elasticsearch",
  "cluster_uuid" : "DNlbVYS0TIGYwbQ6CUNwTw",
  "version" : {
    "number" : "5.4.3",
    "build_hash" : "eed30a8",
    "build_date" : "2017-06-22T00:34:03.743Z",
    "build_snapshot" : false,
    "lucene_version" : "6.5.1"
  },
  "tagline" : "You Know, for Search"
}</code></pre>
</div>
</div>

HTTP 401: Authentication required

If you’re seeing this message, then you’re not including the correct authentication credentials in your request. Double check that the request includes the credentials shown in your dashboard and try again.

Keep Your URL Secret

Because the URL includes authentication information, anyone with the fully-qualified URL can access your cluster. Treat the URL like a password: don’t hard-code it into applications, don’t check it into source control, and for Pete’s sake, never ever paste it into a StackOverflow question. If you need to share your cluster URL with us for support purposes, the auth-less host name is all you need:

Bad: “My cluster is https://username:[email protected]”

Good: “My cluster is somehost-1234567”

If your URL is leaked somehow, you can regenerate the credentials.

Adding Bonsai to your app automatically creates a cluster for you. It also adds a variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> to your application’s environment, which will contain the canonical URL and credentials to your cluster. This automation makes it fast and easy to get started with your new cluster.

Add Bonsai to your Heroku app

If such a failure happens, Bonsai’s staff will work with your team to understand where you will be relocating your application and can then initiate a restore process into a cluster in the same AWS Region while maintaining your existing DNS connections. Alternatively, clusters on an Enterprise plan have the option to re-provision to a nearby AWS Region.

An event like this is handled as a Severity 1 incident.

How much time will it take to restore?

‍There are several factors to calculate the time to restore: lead time for a support request or Bonsai’s internal alerting system to our Platform team, plus primary data restoration time from the AWS S3 system.For example, recovering 1 TB of primary data at 1,000 MB/second would take just over two hours. Performance may vary. With 5 TB of primary data, that can approximately take over 10 hours to restore.

When an entire Bonsai cluster fails to come up (for example with 5 TB of disk usage), how does Bonsai restore the cluster from a backup and how much time will it take?

All production Bonsai Clusters are deployed to minimum of three nodes for redundancy and to prevent stalemates in leadership election. Each node in the cluster will be deployed to a separate AWS Availability Zone, giving us data center isolation as well.

When a Bonsai cluster does experience a node loss, Elasticsearch and OpenSearch will automatically reroute the primary and replica shards to machines that are up and running. In the background, AWS Auto Scaling Groups will immediately begin spinning up the replacement instance that will auto-bootstrap into your configured Elasticsearch or OpenSearch configuration and version. Once the node has successfully provisioned, it will join the cluster, and then Elasticsearch or OpenSearch will offload the relocated shards back to the empty machine.

An event like this is handled as a Severity 1 incident.

A Bonsai cluster that experiences a complete loss of two AWS data centers does represent downtime for a cluster until the primary shards are restored on the last remaining node. To mitigate this downtime on an Enterprise cluster, we can discuss a setup that includes multi-region deployment.

When multiple nodes across different Availability Zones for a Bonsai cluster falters for some reason, what is the likely impact to availability of the system and what are the possible recovery steps?

A Bonsai cluster could experience a complete loss of one AWS data center, and the cluster will still continue to operate. This makes Bonsai clusters extremely fault-tolerant.

An event like this is handled as a Severity 1 incident.

When a single node on a Bonsai cluster falters for some reason, what is the likely impact to availability of the system and what are the possible recovery steps?

The Bonsai cluster’s availability will not be impacted by these two changes. Ideally, these are scheduled during off-peak hours for your cluster. Our operators are notified if something were to go awry during the process.

When increasing disk capacity by adding new nodes or changing an instance type on a Bonsai cluster, what is the likely impact to availability of the system and what are the possible recovery steps?

There are 2 options for a minor version upgrade.

‍Option 1 - Enterprise Plans

‍Bonsai clusters on Enterprise plans can send in a request to [email protected] for our team to perform an in-place minor version upgrade on your behalf. The impact of an in-place minor version upgrade is the same as a rolling restart: there will be a few minutes of read-only mode as it restarts.

‍Option 2 - Non-Enterprise Plans

‍Check out our documentation on Upgrading Major Versions that we also recommend for minor version upgrades. By following our variation of a blue-green strategy outlined in the documentation and assuming you can pause or buffer your updates:

Version upgrades happen instantaneously
Can be performed with zero downtime
Offers recovery steps depending on your use case’s approach

When upgrading a Bonsai cluster’s Elasticsearch or OpenSearch minor version, what is the likely impact to availability of the system and what are the possible recovery steps?

There are 2 options for a major version upgrade.

‍Option 1 - Enterprise Plans

‍Schedule a time with our team for us to perform a snapshot-restore process to upgrade a major version. During the process, you can expect a short period (depending on your cluster’s disk capacity) of read-only mode. In most cases it takes a few seconds.TitanIAM users will find that it handles retries, so as far as their application is concerned, this would be close to zero-downtime.

The read-only mode would end with an atomic update to the routing layer, sending incoming traffic to the new hardware. This gives our team a fallback path if something goes awry. However, any writes applied between the routing update and fallback are simply lost.

Option 2 - Non-Enterprise Plans

‍Check out our documentation on Upgrading Major Versions. By following our recommended variation of a blue-green strategy outlined in the documentation and assuming you can pause or buffer your updates:

Version upgrades happen instantaneously
Can be performed with zero downtime
Offers recovery steps depending on your use case’s approach

When upgrading a Bonsai cluster’s Elasticsearch or OpenSearch major version, what is the likely impact to availability of the system and what are the possible recovery steps?

Yes.

Does Bonsai Support HTTP/2?

No. HTTP Pipelining is not supported by Bonsai's platform, and instead Bonsai supports HTTP 2 instead.

Does Bonsai Support HTTP Pipelining?

The Groovy scripting language was introduced in Elasticsearch in version 1.4 as a replacement for MVEL. This replacement was supposed to address a number of security vulnerabilities in MVEL (among other things). However, Groovy also introduced some serious vulnerabilities that led to some high profile attacks.

While those specific vulnerabilities were ultimately patched, the decision was made that Groovy was not a safe option for multitenant configurations. As a result, Groovy scripting is only enabled for single tenant plans.

‍Groovy Errors

‍If your app returns errors, containing <span class="inline-code"><pre><code>ExpressionScriptCompilationException</code></pre></span>, it’s likely that you need Groovy Scripting enabled on your cluster. This is a feature available on Dedicated and Enterprise plans.Users with a Business or Enterprise plan should contact us to get groovy enabled. Users on our multitenant plans can still use dynamic scripting with the faster (but somewhat limited) Lucene Expressions language.

Does Bonsai Support Groovy Scripting?

Yes! CORS is enabled for all clusters by default. Users will not need to perform any additional configuration on their clusters in order to set it up.

Does Bonsai Provide CORS Support?

Do you need to test a certain analyzer or a new Elasticsearch feature? Testing locally is usually the fastest way to make iterative changes before pushing them to staging or production. Please download an Elasticsearch version that is compatible with your plan. Find your OS system below, and follow the instructions to get Elasticsearch running locally and connect to it.

Windows

Windows users can download Elasticsearch as a ZIP file. Simply extract the contents of the ZIP file, and run <span class="inline-code"><pre><code>bin/elasticsearch.bat</code></pre></span> to start up an instance. Note that you’ll need Java installed and configured on your system in order for Elasticsearch to run properly.

Elasticsearch can also be run as a service in Windows.

Linux

There are many Linux distributions out there, so the exact method of getting Elasticsearch installed will vary. Generally, you can download a tarball of Elasticsearch, and extract the compressed contents to a folder. It should have all of the proper executable permissions set, so you can just run <span class="inline-code"><pre><code>bin/elasticsearch</code></pre></span> to spin up an instance. Note that if you’re managing Elasticsearch in Linux without a package manager, you’ll need to ensure all the dependencies are met. Java 7+ is a hard requirement, and there may be others.

Arch Linux

Some distributions have preconfigured Elasticsearch binaries available through repositories. Arch Linux, for example, offers it through the community repo, and can be easily installed via pacman:

This package also comes with a systemd service file for starting/stopping Elasticsearch with <span class="inline-code"><pre><code>sudo systemctl elasticsearch.service</code></pre></span>.

One caveat with Arch: packages are bleeding edge, which means updates are pushed out as they become available. Bonsai is not a bleeding edge service, so you’ll need to be careful to version lock the Elasticsearch package to whatever version you’re running on Bonsai. You may also need to edit the PKGBUILD and elasticsearch.install files to ensure you’re running the same version locally and on Bonsai.

Ubuntu and Debian-flavors

Other distros can use the DEB and RPM files that Elasticsearch offers on the download page. Debian-based Linux distributions can use <span class="inline-code"><pre><code>dpkg</code></pre></span> to install Elasticsearch (note that this doesn’t handle configuring dependencies like Java):

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript"># Update the package lists
$ sudo apt-get update

# Make sure Java is installed and working:
$ java -version

# If the version of Java shown is not 7+ (1.7+ if using OpenJDK),
# or it doesn't recognize java at all, you need to install it:
$ sudo apt-get install openjdk-7-jre

# Download the DEB from Elasticsearch:
$ wget https://download.elastic.co/elasticsearch/elasticsearch/elasticsearch-X.Y.Z.deb

# Install the DEB:
$ sudo dpkg -i elasticsearch-1.7.2.deb
</code></pre>
</div>
</div>

This approach will install the configuration files to <span class="inline-code"><pre><code>/etc/elasticsearch/</code></pre></span> and will add init scripts to <span class="inline-code"><pre><code>/etc/init.d/elasticsearch</code></pre></span>.

Red Hat / Suse / Fedora / RPM

Elasticsearch does provide an RPM file for installing Elasticsearch on distros using rpm:

<span class="inline-code"><pre><code>rpm</code></pre></span> should handle all of the dependency checks as well, so it will tell you if there is something missing.

Testing the Install

Once you have Elasticsearch installed and running on your local machine, you can test to see that it’s up and running with a tool like curl. By default, Elasticsearch will be running on port 9200. Typically the machine will have a name like <span class="inline-code"><pre><code>localhost</code></pre></span>. If that doesn’t work, you can always use the machine’s local IP address (typically 127.0.0.1).

The <span class="inline-code"><pre><code>curl</code></pre></span> request and Elasticsearch response should look something like this:

If you see this response, then your local Elasticsearch cluster is up and running! If not, review the documentation for getting Elasticsearch up and running on your operating system and try again.

Once Elasticsearch is running locally, you can configure a local instance of your application to connect to <span class="inline-code"><pre><code>localhost:9200</code></pre></span> (this is probably the default for your Elasticsearch client anyway). Now you can test out your application’s Elasticsearch integration locally!

‍

Testing Elasticsearch Locally

Private Networks

The information provided in this section pertains to Bonsai clusters which have been provisioned on public networks (accessible across the Internet). Bonsai clusters that are provisioned in a Vault or Heroku Private Space can not be reached in this way.

This is by design -- why put a cluster on a private network if you want to access it from anywhere? Connecting to Bonsai clusters in private networks will require a user to establish a remote connection to the VPC before being able to interact with Elasticsearch.

Every Bonsai cluster is created with a unique URL designed for secure, authenticated access to an Elasticsearch cluster. This URL allows a wide array of platforms and application clients to communicate with Elasticsearch. This URL looks something like this:

This URL has the following parts:

Protocol (ex: https). The protocol to use for communicating with Elasticsearch. This can be either HTTP or HTTPS. The primary difference between the two is that HTTPS is encrypted, whereas HTTP is not. Bonsai defaults to HTTPS to ensure that your data can not be read in transit.
Authentication Credentials (ex: a1b2c3d4e:5f6g7h8i9). This is a randomly-generated username/password pair. By default, you will need to include these credentials with every request to your cluster, otherwise Bonsai will return an <span class="inline-code"><pre><code>HTTP 401: Authorization Required</code></pre></span> message.
Hostname (ex: something-1234567). The hostname will be a concatenated string generated by your cluster's name, and a random number. This provides some security through obscurity by making guesses and enumeration extraordinarily difficult, as well as avoiding name collisions.
Region (ex: us-east-1). This is the region in which the cluster is provisioned. There are a number of possible regions for this, and we use the AWS naming conventions to indicate where the cluster is located.
Domain (ex: bonsaisearch.net). Bonsai clusters can be accessed via one of two domains: bonsai.io and bonsaisearch.net. This is to avoid the rare case of a TLD outage; if the .io or .net TLD is down for some reason, the cluster can still be accessed.

Let's examine these in greater detail.

Protocols and Ports

All Bonsai clusters default to secure HTTPS using recent versions of TLS for encrypted communication. These connections should use port 443, the standard for SSL/TLS.

We also support unencrypted HTTP access over ports 80 and 9200.

You can see a table describing the protocols and ports available here:

<table>
<thead>
<tr>
<th>Port</th><th>Protocol</th><th>Notes</th>
</tr>
</thead>
<tbody>
<tr>
<td>80</td><td>HTTP</td><td>Unencrypted</td>
</tr>
<tr>
<td>443</td><td>HTTPS</td><td>Default; recommended</td>
</tr>
<tr>
<td>9200</td><td>HTTP</td><td>Unencrypted</td>
</tr>
<tr>
<td>9300</td><td>Elasticsearch Native Binary Protocol</td><td>Not supported</td>
</tr>
</tbody>
</table>

Unfortunately we do not support the native binary protocol on 9300 at this time. If your application strongly depends on the binary protocol, please contact our sales team at [email protected] to design a custom cluster deployment.

Authentication Credentials

When your cluster is created, it is provided with a randomly generated set of credentials. In the example URL above, the API Key is <span class="inline-code"><pre><code>a1b2c3d4e</code></pre></span> and the API Secret is <span class="inline-code"><pre><code>5f6g7h8i9</code></pre></span>. These are often supplied to various HTTP or Elasticsearch clients as the HTTP username and password, and encoded according to the Basic Authorization scheme. (Cf. RFC 2617 Section 2.)

Credentials are Random

The cluster credentials are not the same as what you would use to log into your Bonsai/Heroku account. These are randomly generated when the cluster is created.

HTTP 401: Authentication required

If you're seeing this message, then you're not including the correct authentication credentials in your request. Double check that the request includes the credentials shown in your dashboard and try again.

Keep Your URL Secret

Because the URL includes authentication information, anyone with the fully-qualified URL can access your cluster. Treat the URL like a password: don't hard-code it into applications, don't check it into source control, and for Pete's sake, never ever paste it into a StackOverflow question. If you need to share your cluster URL with us for support purposes, the auth-less host name is all you need:

Bad: "My cluster is https://username:[email protected]"

Good: "My cluster is somehost-1234567"

If your URL is leaked somehow, you can regenerate the credentials. See the dashboard documentation for more information.

Hostname

Each Bonsai cluster has a unique hostname. In the example above, this is <span class="inline-code"><pre><code>somehost-1234567</code></pre></span>. The hostname is a blend of your cluster's name along with a unique identifier. The hostname is immutable: once the cluster has been created, the hostname can not be changed. Keep this in mind when creating your cluster.

Hostnames and Support

When asking questions via our support channels, you may provide the unique identifier portion of your hostname (e.g., <span class="inline-code"><pre><code>somehost-1234567</code></pre></span>) to help us cross-reference with your account and your cluster's logs.

Region

All Bonsai clusters have a region slug included in the URL. Some example slugs might be:

us-east-1
us-west-2
eu-west-1
... and so on.

These slugs are based on the AWS Regions and AZs and Google Cloud Regions and Zones, and indicate where on the planet your cluster is running. Ideally, this will be as close as possible to your application servers, to minimize latency.

Domain

All Bonsai clusters can be accessed via either of two domains: bonsai.io or bonsaisearch.net. In other words, both of these URLs will work:

This is a failover option to accommodate for the rare, but not unheard of, TLD outage. If there is an outage impacting .net or .io domains, users can point their application to whichever TLD is operational.

Connecting to Elasticsearch via Curl

One way to connect to Elasticsearch is with a command line tool called `curl`. Curl is easy to use and comes preinstalled on many operating systems (and is widely available for download and installation). Curl can send and receive data from your Bonsai Elasticsearch cluster like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">curl https://a1b2c3d4e:[email protected]
{
"name" : "PvRcoFq",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "DNlbVYS0TIGYwbQ6CUNwTw",
"version" : {
"number" : "5.4.3",
"build_hash" : "eed30a8",
"build_date" : "2017-06-22T00:34:03.743Z",
"build_snapshot" : false,
"lucene_version" : "6.5.1"
},
"tagline" : "You Know, for Search"
}
</code></pre>
</div>
</div>

Curl can be used to create and remove indices, add data, check on cluster state and more:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript"># Create an index called 'test'
curl -s -XPUT https://a1b2c3d4e:[email protected]/test {"acknowledged":true,"shards_acknowledged":true}

# Add a document to the 'test' index
curl -s -XPUT https://a1b2c3d4e:[email protected]/test/doc/1 -d '{"title":"test doc"}' {"_index":"test","_type":"doc","_id":"1","_version":1,"result":"created","_shards":{"total":2,"successful":2,"failed":0},"created":true}

# Delete the 'test' index
curl -s -XDELETE https://a1b2c3d4e:[email protected]/test {"acknowledged":true}
</code></pre>
</div>
</div>

Connecting to Elasticsearch via Browser

Another way to connect to Elasticsearch is via your browser of choice. Simply copy the full URL from the Bonsai dashboard, and paste it into the browser's address bar. It should look something like this:

HTTP 401: Authorization required When Using a Browser

Some browsers, as a security feature, will strip the authentication credentials from the URL to prevent them from being displayed in the address bar. The credentials are stored in a local session. However, if you subsequently copy/paste the URL from the address bar, it will drop the credentials. There are some other conditions that can cause the credentials to be lost from the active window.

If you're using a browser to interact with Elasticsearch and you are seeing HTTP 401 errors, try to copy/paste the credentials back into the address bar.

This method is good for reading the cluster state, but not so great for creating indices, updating data, etc.

Connecting to Elasticsearch via Interactive Console

The Bonsai dashboard offers an interactive console which allows users to engage with their cluster. This console simplifies the process of communicating with the cluster, and it obviates the need for dealing with credentials altogether

It should be noted that Bonsai offers Private Spaces and when a space is private the Interactive Console is not accessible.

‍

Connecting to Bonsai

Creating an index on Elasticsearch is the first step towards leveraging the awesome power of Elasticsearch. While there is a wealth of resources online for creating an index on Elasticsearch, if you’re new to it, make sure to check out the definition of an index in our Elasticsearch core concepts.

If you’re already familiar with the basics, we have a blog post / white paper on The Ideal Elasticsearch Index, which has a ton of information and things to think about when creating an index.

Note that many Elasticsearch clients will take care of creating an index for you. You should review your client’s documentation for more information on its index usage conventions. If you don’t know how many indexes your application needs, we recommend creating one index per model or database table.

Important Note on Index Auto-Creation

By default, Elasticsearch has a feature that will automatically create indices. Simply pushing data into a non-existing index will cause that index to be created with mappings inferred from the data. In accordance with Elasticsearch best practices for production applications, we’ve disabled this feature on Bonsai.

However, some popular tools such as Kibana and Logstash do not support explicit index creation, and rely on auto-creation being available. To accommodate these tools, we’ve whitelisted popular time-series index names such as <span class="inline-code"><pre><code>logstash*</code></pre></span>, <span class="inline-code"><pre><code>requests*</code></pre></span>, <span class="inline-code"><pre><code>events*</code></pre></span>, <span class="inline-code"><pre><code>.kibana*</code></pre></span> and <span class="inline-code"><pre><code>kibana-int*</code></pre></span>.

For the purposes of this discussion, we’ll assume that you don’t have an Elasticsearch client that can create the index for you. This guide will proceed with the manual steps for creating an index, changing settings, populating it with data, and finally destroying your index.

Manually Creating an Index

There are two main ways to manually create an index in your Bonsai cluster. The first is with a command line tool like <span class="inline-code"><pre><code>curl</code></pre></span> or <span class="inline-code"><pre><code>httpie</code></pre></span>. Curl is a standard tool that is bundled with many *nix-like operating systems. OSX and many Linux distributions should have it. It can even be installed on Windows. If you do not have curl, and don’t have a package manager capable of installing it, you can download it here.

The second is through the Interactive Console. The Interactive Console is a feature provided by Bonsai and found in your cluster dashboard.

Let’s create an example index called <span class="inline-code"><pre><code>acme-production</code></pre></span> from the command line with curl.

Authentication required

All Bonsai clusters have a randomly generated username and password. By default, these credentials need to be included with all requests in order to be processed. If you’re seeing something an HTTP 401 Error like this:

then your credentials were not supplied. You can view the fully-qualified URL in your cluster dashboard. It will look like this: <span class="inline-code"><pre><code>http://user:[email protected]</code></pre></span>

Updating the Index Settings

We can inspect the index settings with a GET call to <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> like so:

The <span class="inline-code"><pre><code>?v</code></pre></span> at the end of the URL tells Elasticsearch to return the headers of the data it’s returning. It’s not required, but it helps explain the data.

In the example above, Elasticsearch shows that the <span class="inline-code"><pre><code>acme-production</code></pre></span> index was created with one primary shard and one replica shard. It doesn’t have any documents yet, and is only a few bytes in size.

Let’s add a replica to the index:

Now, when we re-query the <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> endpoint, we can see that there are now two replicas, where before there was only one:

Similarly, if we wanted to remove all the replicas, we could simply modify the JSON payload like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">$ curl -XPUT http://user:[email protected]/acme-production/_settings -d '{"index":{"number_of_replicas":0}}'
{"acknowledged":true}

$ curl -XGET http://user:[email protected]/_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open acme-production 1 0 0 0 318b 159b</code></pre>
</div>
</div>

Adding Data to Your Index

Let’s insert a “Hello, world” test document to verify that your new index is available, and to highlight some basic Elasticsearch concepts.

Every document prior to Elasticsearch 7.x should specify a <span class="inline-code"><pre><code>type</code></pre></span>, and preferably an <span class="inline-code"><pre><code>id</code></pre></span>. You may specify these values with the <span class="inline-code"><pre><code>_id</code></pre></span>_id and the <span class="inline-code"><pre><code>_type</code></pre></span> keys, or Elasticsearch will infer them from the URL. If you don’t explicitly provide an id, Elasticsearch will create a random one for you.

In the following example, we use POST to add a simple document to the index which specifies a <span class="inline-code"><pre><code>_type</code></pre></span>_type of <span class="inline-code"><pre><code>test</code></pre></span> and an <span class="inline-code"><pre><code>_id</code></pre></span>_id of 1. You should replace the sample URL in this document with your own index URL to follow along:

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">$ curl -XPOST http://user:password@redwood-12345/acme-production/test/1 -d '{"title":"Hello world"}'
{
"_index" : "acme-production",
"_type" : "test",
"_id" : "1",
"_version" : 1,
"_shards" : {
"total" : 3,
"successful" : 3,
"failed" : 0
},
"created" : true
}</code></pre>
</div>
</div>

Because we haven’t explicitly defined a mapping (schema) for the acme-production index, Elasticsearch will come up with one for us. It will inspect the contents of each field it receives and attempt to infer a data structure for the content. It will then use that for subsequent documents.

Dynamic Mapping Can Be Dangerous

Elasticsearch’s ability to generate mappings on the fly is a really nice feature, but it has some drawbacks. One is that the contents of the first field it sees determines how it will interpret the rest.

For example, there have been cases where users attempt to index geospatial data, and Elasticsearch interprets the field as being a float type. Certain documents then fail later in the indexing process with an HTTP 400 error. Or everything succeeds, but geospatial filtering is broken.

It’s a best practice to explicitly create your mappings before indexing into a new index, if you’re planning to power a production application. Today, most clients and frameworks are pretty good about handling this automatically, but it’s a subtle “gotcha” that has made its way into the support queues from time to time.

We can see the mapping that Elasticsearch generated by using the <span class="inline-code"><pre><code>_mapping</code></pre></span> API:

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">$ curl -XGET http://user:password@redwood-12345/acme-production/_mapping
{"acme-production":{"mappings":{"test":{"properties":{"title":{"type":"string"}}}}}}</code></pre>
</div>
</div>

The inspection of the index mapping shows that Elasticsearch has generated a schema from our sample JSON, and that it has decided that documents in the “acme-production” index of type “test” will have a string body in the “title” field. This is reasonable, so we’ll leave it alone.

Next, you may view this document by accessing it directly. In the example below, note the ?pretty parameter at the end of the URL. This tells Elasticsearch to pretty print the results, making them more legible:

<div class="code-snippet-container">
<a fs-copyclip-element="click-10" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-10" class="hljs language-javascript">$curl -XGET 'http://user:password@redwood-12345/acme-production/test/1?pretty'
{
"_index" : "acme-production",
"_type" : "test",
"_id" : "1",
"_version" : 1,
"found" : true,
"_source" : {
"title" : "Hello world"
}
}</code></pre>
</div>
</div>

Alternatively, you can see it in the search results with the <span class="inline-code"><pre><code>_search</code></pre></span> endpoint:

<div class="code-snippet-container">
<a fs-copyclip-element="click-11" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">$curl -XGET 'http://user:password@redwood-12345/acme-production/_search?pretty'
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
},
"hits" : {
"total" : 3,
"max_score" : 1.0,
"hits" : [ {
"_index" : "acme-production",
"_type" : "test",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"title" : "Hello world"
}
}
}
}</code></pre>
</div>
</div>

Check the _source

Note the <span class="inline-code"><pre><code>_source</code></pre></span> key, which contains a copy of your original document. Elasticsearch makes an excellent general-purpose document store, although should never be used as a primary store. Use something ACID-compliant for that.

The _source field does also add some overhead. It can be disabled in the mappings. See the Elasticsearch documentation for more details.

To learn more about about the operations supported by your index, you should read the Elasticsearch Index API documentation. Note that some operations mentioned in the documentation (such as “Automatic Index Creation”) are restricted on Bonsai for technical reasons.

Destroying Your Index

When you have decided you no longer need the “acme-production” index, you can destroy it with a one liner:

<div class="code-snippet-container">
<a fs-copyclip-element="click-12" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-12" class="hljs language-javascript">$ curl -XDELETE http://user:password@redwood-12345/acme-production
{"acknowledged":true}</code></pre>
</div>
</div>

The <span class="inline-code"><pre><code>DELETE</code></pre></span> verb will delete one or more indices in your cluster. If you have several indices to delete, you can still perform the action in one line by concatenating the indices with a comma, like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript">$ curl -XDELETE http://user:password@redwood-12345/acme-production-1,acme-production-2,acme-production-3</code></pre>
</div>
</div>

There’s No ‘Undo’ Button

Destroying an index can not be undone, unless you restore it from a snapshot (if one exists). Do not delete indices without fully understanding the consequences. If there is a chance that your cluster is supporting a production application, be very careful before taking this kind of action. Accidental deletes are a major reason Bonsai doesn’t support <span class="inline-code"><pre><code>_all</code></pre></span> or wildcard (*) destructive actions.

‍

Creating Your First Index

General Solutions

From 1.x

‍

Upgrading to ES 2

Elasticsearch 5 brings with it the removal of the <span class="inline-code"><pre><code>string</code></pre></span> type and has been replaced with <span class="inline-code"><pre><code>text</code></pre></span> and <span class="inline-code"><pre><code>keyword</code></pre></span>.

General Solutions

Analyze your mappings and look for <span class="inline-code"><pre><code>string</code></pre></span> mapping types. You will want to review how you use those fields to determine if you should use <span class="inline-code"><pre><code>text</code></pre></span> or <span class="inline-code"><pre><code>keyword</code></pre></span> going forward.

<span class="inline-code"><pre><code>text</code></pre></span>: you want full text search analysis on

<span class="inline-code"><pre><code>keyword</code></pre></span>: you want to search for the value exactly as is, a PO Number, or url for example

From 2.x

From 1.x

‍

Upgrading to ES 5

Elasticsearch 6 has a stand-out breaking change, which is the limitation of only one mapping type per index. This is due to the upcoming removal of mapping types. Before this version all versions of Elasticsearch supported the concept of multiple mapping types, which would let you store different "types" of data in one index. Elastic is removing this feature in ES 7 after announcing its deprecation in ES 6.

General Solutions

Rename all remaining mapping types to <span class="inline-code"><pre><code>_doc</code></pre></span> to align with the future default

Use 1 index per document type.

This has been the standard recommendation for the best performance in Elasticsearch since its inception. Rather than index multiple types of documents into a single index, simply use one index per document type. Many tools and frameworks supporting Elasticsearch will do this for you automatically.

Custom Type Field

For those times when it simply doesn't make since to have one document type per index, you can add a dedicated field to the document that declares a document's given type. Then when you query on that index you will simply pass in which type you want the query limited to as well.

From 5.x

After working through the General Solutions above, we recommend you do the following:

upgrading to at least v5.6 and then turning on <span class="inline-code"><pre><code>index.mapping.single_type: true</code></pre></span> to verify you haven't missed anything.

From 2.x

There are no specific tricks for this upgrade process, outside of the General Solution listed above. We would simply recommend you follow our traditional process for testing your searches on a new version of Elasticsearch. In short, this involves provisioning a new 6.x cluster and validating it in a development environment. Once it has been validated,

From 1.x

‍

Upgrading to ES 6

Elasticsearch 7 has a stand out breaking change which is the removal of mapping types. Before this version all versions of Elasticsearch supported the concept of mapping types, which would let you store different "types" of data in one index. Elastic is removing this feature after announcing its deprecation in ES 6.

General Solutions

Rename all remaining mapping types to <span class="inline-code"><pre><code>_doc</code></pre></span> to align with the future default

Use 1 index per document type.

Custom Type Field

From 6.x

After working through the General Solutions above, we recommend you do the following:

When creating indices in v6 start setting <span class="inline-code"><pre><code>include_type_name=false</code></pre></span> to get ready for the ES7 behavior

From 5.x

After working through the General Solutions above, we recommend you do the following:

upgrading to at least v5.6 and then turning on <span class="inline-code"><pre><code>index.mapping.single_type: true</code></pre></span> to verify you haven't missed anything.

From 2.x

From 1.x

There are no specific tricks for this upgrade process, outside of the General Solution listed above. We would simply recommend you follow our bonsai.io/docs/upgrading-major-versions your searches on a new version of Elasticsearch.

‍

Upgrading to ES 7

Bonsai has been designed from day one to be resilient against service disruptions resulting from any number of crises. This document lays out some important information about how we maintain our customers’ uptime during large-scale emergencies.

Do you have a Business Continuity Plan implemented?

Yes. In the face of a global event we have designed and built a platform which is highly available and resilient to outages. We also have a distributed, cross-trained and multidisciplinary team with a succession plan. Bonsai is designed to gracefully handle a loss of infrastructure, personnel, or both.

How does your platform prevent service interruption due to catastrophic events?

First and foremost, the vast majority of our platform is automated. We also use multiple systems to monitor our fleet in real time, and have a rotating team of engineers providing 24/7/365 coverage in the event of problems big and small.

Additionally, all Bonsai customers’ search resources are balanced across at least three data centers. In order for a cluster to suffer complete disruption, there would need to be simultaneous outages across an entire region. Even in that event, we take regular, encrypted offsite backups which can be used to restore a cluster to a new region if an entire region were to go offline.

How does your team handle catastrophic events, such as a natural disaster or global pandemic ?

One More Cloud has always been a remote-first company, and our team is distributed around the USA. We have the infrastructure and culture in place to facilitate productive work from anywhere on the planet. Further, we have ensured that our team is cross-trained in our operational tooling, so that the platform will continue to be managed even if multiple team members are unreachable due to some calamitous event.

Finally, we are constantly assessing economic changes and potential threats, and revising our company policies policies accordingly. This includes, but is not limited to: cancelling all non-essential travel, enforcing social distancing, staggering travel plans/accommodations, requiring updated training, and so on.

If you have any questions about any of this, please don’t hesitate to reach out to [email protected].

‍

Bonsai's Business Continuity Plans

On Thursday, January 14th, Elastic announced that Elasticsearch 7.11 and on will be dual-licensed under the Elastic proprietary license, and the Server Side Public License (SSPL). Formerly, Elasticsearch had been released under the Elastic and Apache 2.0 licenses. The switch to SSPL is meaningful for several reasons, and there is a lot of uncertainty about what it means for us and our customers. You can read Elastic's FAQ on the subject here.

Last updated: 20-Jan-2021

What's the Bottom Line for Me as a Bonsai Customer?

Bonsai customers have nothing to worry about. Bonsai has multiple ways of continuing to be an Elasticsearch host under the SSPL, and we're exploring which option makes the most sense for everyone involved. The switch to SSPL is noteworthy, and it's worth reading Elastic's posts outlining their reasoning and answering questions. But the short version is that Bonsai customers are not facing any impact.

What is the SSPL?

Tl;dr - The SSPL is a software license that allows the software to be managed as a service by a vendor, but the vendor must publish their service's complete source code under the SSPL. Simply, a company like Bonsai that offers Elasticsearch-as-a-service must release all of the code we use to manage Elasticsearch clusters, if we want to remain compliant with the terms of the license.

What Does This Mean to Me as a Bonsai Customer?

Nothing. The SSPL does not prohibit Bonsai from continuing operations, nor does it prohibit us from supporting new versions of Elasticsearch as they become available. The responsibility is on our team to work out how to remain in compliance with the terms of the new license, and we are committed to doing so.

Is Bonsai Going to Shut Down / Drop Elasticsearch?

No. Bonsai has been operational for over a decade, we’re profitable, and we have always had a good relationship with Elastic. There are no plans to shut down, change to some Elasticsearch competitor, raise prices, version-pin, or anything else.

Will Bonsai Slowly Become Obsolete?

No. We are working out the details for our service’s continued growth and development alongside Elasticsearch.

Why Is Elastic Doing This?

Elastic has published a blog entry outlining why they're making this change. In short, it has to do with trademark disputes (among other things) with Amazon. Bonsai is not involved in those disputes, and we have always been conscientious about respecting Elastic's IP and trademarks in our branding and marketing.

I Have Another Question / Concern

Feel free to reach out to us if you have a question or concern not answered here. We will answer as best as we can, but please remember that this transition is expected to take time and be complex. Quick, accurate answers may not be immediately available.

‍

Bonsai SSPL FAQ

In the Supreme Court case South Dakota v Wayfair, it was determined that U.S. states may charge tax on online purchases made from out-of-state sellers, even if that seller has no physical presence in the state. This has significantly impacted eCommerce companies and SaaS providers like Bonsai. If you sell online to various U.S. states, this impacts your company too.

Because of this ruling, Bonsai will now be charging sales tax to customers in a variety of states. Here’s what you should know:

Is my state going to charge me sales tax?

As of the date when this article was published, 36 states and the District of Columbia have chosen to charge online sales tax. We anticipate that more states will soon begin to charge these taxes. To see up-to-date information on whether or not your state charges sales tax, please reference the Sales Tax Institute’s Nexus State Guide.

Does this apply to all Bonsai plans sold in these states?

Our Hobby Tier will remain free. Standard, Business, and Enterprise plans will all be subject to sales tax.

Plans sold through Heroku, or other third party channels will be taxed through those providers. Please follow up with your reseller for more information.

When does Bonsai plan to charge these taxes?

These taxes will be collected starting in January 2020.

Will these taxes be charged retroactively?

There is no indication that this will be necessary.

How do I opt-in to get charged for my state's sales tax?

Follow our step by step guide for Account Settings Billing.

If I have questions, who should I ask?

‍

Bonsai's Policy on Collecting Sales Tax

Credits

Customers may downgrade a cluster's plan or destroy it at any time to receive a prorated service credit. This credit will be automatically applied to the next billing cycle.

For example, suppose a customer creates a cluster on a $50/month plan, then downgrades to a $20/month plan after 14 days. The customer would receive a $25 service credit for the unused time at the $50/mo rate, less $10 for the remainder of the billing period at the $20/mo rate. This leaves the customer with a credit of $15.

At the end of the first billing period, they would be charged $5 for the next month, as the $15 credit would be deducted from their normal rate of $20/mo. In the following month, the customer would pay the normal rate of $20.

Credits are not issued for service and performance related reasons on our free, Staging and Standard subscriptions. However, there are special SLAs available to Business and Enterprise customers, wherein service disruptions may result in a service credit according to a predetermined schedule negotiated directly with our sales and legal teams prior to using Bonsai.

Refunds

Per our Terms of Service, One More Cloud, Inc. is generally not required to refund a subscription fee under any circumstances. One More Cloud is also not liable for damages incurred through use of the service.

Extenuating Circumstances?

Customers with extenuating circumstances should reach out to [email protected] and describe their situation for consideration.

‍

Refund & Credit Policy

If you would like to request a SOC 2 Type 2 Report for SOC 2 compliance, please email us at [email protected].

Once we recieve your request, a PDF of the report will be emailed.

‍

How to get a SOC 2 Type 2 Report

Any Bonsai employee who suspects that a data breach has occurred must immediately notify the CTO and CEO. The CTO will form a response team to handle the incident.

The Bonsai response team will investigate all reported data breaches to confirm if a data exfiltration has occurred. Once confirmed, the response team will perform the following steps:

Notify impacted customers within an hour of confirmation by email
Coordinate with customers to identify the correct immediate course of action
Identify the root cause of the incident, and then work to remediate the issue

To report a data breach or suspicious activity, email us at [email protected] with a description of events and observations.

‍

Bonsai's Policy on Handling Data Breaches

The General Data Protection Regulation (GDPR) is a legal framework which went into effect on May 25, 2018. It is designed to give EU citizens more control over their personal data. The GDPR is regulates how internet companies track and store customer information. Bonsai is compliant with the GDPR for all customers, regardless of whether or not they are citizens of the EU.

How is Bonsai GDPR Compliant?

Bonsai has never sold email addresses or private information, does not track users' activity across the web, and does not otherwise spy on users. However, in order to be fully compliant and extend the privacy protections of GDPR to customers around the globe, Bonsai took the following measures prior to the GDPR's effective date:

Performed a software audit focused on collection and use of customer data, and purged some metrics deemed unnecessary for business.
Removed Facebook tracking pixels and replaced with GDPR-compliant frontend components.
Removed third party integrations that do not comply with GDPR.
Created a process for our European customers to sign a Data Processing Agreement (DPA) with Bonsai. To sign a DPA, please reach out to [email protected].

Bonsai's Privacy Policy

The GDPR requires internet companies to outline their privacy policies for their customers in plain English. Bonsai's privacy policy can be read in full here: https://bonsai.io/privacy. A few highlights that pertain to the GDPR are:

Any changes to our policies will be alerted to you via email.
Personal Information Section. This outlines how we use PII (personal identifiable information).
Purposes of Personal Information Processing Activities Section. This section identifies where we use your information to denote GDPR compliance.
How We Share Your Personal Information with Third Parties. We do not share your personal information unless there is a legitimate business interest to do so (Article 6.1). Historically this has always been true at Bonsai, but now it is explicitly stated in our policy.
How You Can Access or Change Your Personal Information. This section explains how to update or remove your information.
Cookies. This section outlines what cookies are, what kind we use, and why Bonsai uses them. It also provides resources for removing cookies from your browsers and opting out of advertisements.
Data Security, Integrity and Retention. This is an updated and renamed section from the previously named "Protection of PII."
Retention of Personal Information. Specifies the length of time Bonsai retains your information.
Third-Party Links on Our Websites. Covers any case in which Bonsai might link out to another site that may not have the same level of protection of personal information as Bonsai.
Children’s Personal Information. Clarifies that Bonsai does not collect information on minors, and only allow the use of our Services for individuals over 18.
International Transfer of Information. Bonsai's adherence to privacy laws in the countries where our users and their data is located.
International Users & Data. This is a big section that accounts for a lot of the changes for GDPR. It outlines the rights of EEA citizens, how Bonsai will adhere to them, and covers some abnormal cases that would prevent fulfilling those rights - such as legal proceedings or investigations, which are highly unlikely.

More Information

As always, we’re here to address any questions you may have, so please email us at [email protected] if you need any clarification.

‍

GDPR

If you would like to request a Data Protection Agreement (DPA) for GDPR compliance, please email us at [email protected] with the full name and email address of the person who will be counter-signing the document.

Once we receive your request, a DPA will be emailed via HelloSign for review and signature.

‍

How to get a DPA

If you would like to request a Business Associate's Agreement (BAA) for HIPAA compliance, please email us at [email protected].

Once we receive your request, our team will follow up to review your Bonsai usage and verify appropriate credentials. Upon confirming the necessary details, a BAA will be emailed via HelloSign for review and signature.

‍

How to get a BAA

An HTTP 404: Not Found error is returned by the API when a requested resource can not be found. This can happen for one of several reasons:

You are attempting to hit an endpoint that does not exist (like https://api.bonsai.io/covfefe).
You are attempting to access some information that is not available to your account.

Example

An HTTP 404: Not Found error may look something like this:

The <span class="inline-code"><pre><code>"status": 404</code></pre></span> key confirms the error is an HTTP 404: Not Found error.

Troubleshooting

The first thing to look at when troubleshooting an HTTP 404: Not Found error is the error message array which is returned from the API. Also make sure to check for any typos that may have ended up in the request.

If you're still unsure why you are receiving the error, then please shoot us an email.

‍

API Error 404: Not Found

An HTTP 403: Forbidden error can occur for one of several reasons. Generally, it communicates that the server understood the request, but is refusing to authorize it. This is distinct from an authentication error ( HTTP 401), in that the authentication credentials are correct, but there is some other reason the request is not authorized.

Some examples of situations where a user might see an HTTP 403: Forbidden response from the API:

We have detected activity on your account that appears fraudulent or highly suspicious, and have blocked access to the API until further notice.
We have identified a Terms of Service violation and are blocking access until it is resolved.
The API may be down for maintenance.

Example

A call to the API which results in an HTTP 403: Forbidden response may look something like this:

The <span class="inline-code"><pre><code>"status": 403</code></pre></span> key indicates that the error is indeed an HTTP 403: Forbidden error.

Troubleshooting

The first step for troubleshooting the error is to examine the error messages in detail. If there is a problem that merits contacting support, then you will want to reach out to us for further discussion. Also check your email inbox and spam folders for anything that we may have already sent.

If the error indicates a temporary interruption, such as maintenance mode, then check out our Twitter account for updates, or shoot us an email.

‍

API Error 403: Forbidden

An HTTP 402: Payment Required error occurs when your account is past due and you try to make a request to the API. Bonsai only provides API access to accounts which are up to date on payments.

If you are receiving an HTTP 402: Payment Required error, then there is a balance due on your account. You can update your billing information in your account profile. If you run into any issues, there is documentation available here.

Example

A call to the API which results in an HTTP 402: Payment Required error may look something like this:

‍

The <span class="inline-code"><pre><code>"status": 402</code></pre></span> key indicates the HTTP 402: Payment Required error.

Troubleshooting

The first thing to do is navigate to your billing profile and make sure your account is up to date. You can update your credit card if needed, and review any recent invoices. Additionally, you should check your inbox and spam folders for any billing-related notices from Bonsai.

If everything seems correct with your account, you can always contact support and we will be glad to assist.

‍

API Error 402: Payment Required

An HTTP 422: Unprocessable Entity error occurs when a request to the API can not be processed. This is a client-side error, meaning the problem is with the request itself, and not the API.

If you are receiving an HTTP 422: Unprocessable Entity error, there are several possibilities for why it might be occurring:

You are sending in a payload that is not valid JSON
You are sending HTTP headers, such as <span class="inline-code"><pre><code>Content-Type</code></pre></span> or <span class="inline-code"><pre><code>Accept</code></pre></span>, which specify a value other than <span class="inline-code"><pre><code>application/json</code></pre></span>
The request may be valid JSON, but there is something wrong with the content. For example, instructing the API to upgrade a cluster to a plan which does not exist.

Example

A call to the API that results in an HTTP 422: Unprocessable Entity error may look something like this:

The <span class="inline-code"><pre><code>"status": 422</code></pre></span> key indicates the HTTP 422: Unprocessable Entity error.

Troubleshooting

The first step in troubleshooting this error is to carefully inspect the response from the API. It will often provide valuable information about what went wrong with the request. For example, if there was a problem creating a cluster because a plan slug was not recognized, you might see something like this:

If all else fails, you can always contact support and we will be glad to assist.

‍

API Error 422: Unprocessable Entity

An HTTP 429: Too Many Requests error occurs when an API token is used to make too many requests to the API in a given period. Bonsai throttles the number of API calls that can be made by any given token in order to maintain a high level of service and prevent DoS scenarios.

If you are receiving an HTTP 429: Too Many Requests error, then you are hitting the API too frequently. The API documentation introduction describes which limits are in place.

Example

A call to the API that results in an HTTP 429: Too Many Requests error may look something like this:

Troubleshooting

The first step is to look at how often you are polling the API. If you're checking the API while waiting for a cluster to provision or update, then every 3-5 seconds should be adequate.

If you have an adequate amount of sleep time in between API calls, then the next thing to check would be whether you have multiple jobs checking the API using the same token. If you're using some kind of CI system to spin up and tear down clusters during testing and continuous integration, then and they all share the same token, then they're likely interfering with each other.

Note that when the API returns an HTTP 429: Too Many Requests error, it will include a header called <span class="inline-code"><pre><code>Retry-After</code></pre></span>, which indicates how many seconds you will need to wait to make your next request. So you could add in a check for this into your scripts. That way an HTTP 429 does not cause things to fail, but rather informs the necessary sleep duration in order to proceed.

If all else fails, you can always contact support and we will be glad to assist.

‍

API Error 429: Too Many Requests

An HTTP 401: Unauthorized error occurs when a request to the API could not be authenticated. All requests to API resources must use some authentication scheme to prove access rights to the resource.

If you are receiving an HTTP 401: Unauthorized error, there are several possibilities for why it might be occurring:

The authentication credentials are missing
The authentication credentials are incorrect
The authentication credentials belong to a token that has been revoked

Check that the authentication credentials you are passing along in the request are correct and belong to an active token.

Example

A call to the API that results in an HTTP 401: Unauthorized error may look something like this:

The <span class="inline-code"><pre><code>"status": 401</code></pre></span> key indicates the HTTP 401: Unauthorized error.

Troubleshooting

The first thing to do is to carefully read the list of errors returned by the API. This will often include some hints about what is happening:

If that doesn't help, then check is that the credentials you're sending are correct. You can view the credentials in your account dashboard and cross-reference this with the credentials you're passing to the API.

If you're sure that the credentials are correct, then you may want to try isolating the problem. Try making a curl call to the API and see what happens. For example, using Basic Auth:

If the request succeeds, then you have eliminated the API Token as the source of the problem, and it's likely an issue with how the application is making the call to the API.

If the request still fails, then you should consult the documentation for the authentication scheme you're using to determine which HTTP headers are needed in the request. You can also use the -vvv flag in curl to see which headers and values are being passed with the request.

If all else fails, you can always contact support and we will be glad to assist.

‍

API Error 401: Unauthorized

Alpha Stage

The Bonsai API supports HTTP Basic Authentication over TLS >= 1.2 as one means for authenticating requests. The authentication protocol utilizes an Authorization header, with the contents: Basic .

Many tools, such as curl will construct this header automatically from credentials in a URL. For example, curl will translate https://user:[email protected]/ in to https://api.bonsai.io/ with the header Authorization: Basic dXNlcjpwYXNz.

For Basic Auth, the token key corresponds to the“user” parameter and the token secret corresponds to the“password” parameter.

‍

Using Basic Authentication

Alpha Stage

The Bonsai API supports two methods of authenticating requests: HTTP Basic Auth, and HMAC. The former is a widely-adopted standard supported by most HTTP clients, but requires an encrypted connection for safe transmission. The latter is an older and slight more complicated method, but offers some security over unencrypted connections.

Which Scheme Should I Use?

If you’re connecting to the API via https (as most people are), then Basic Auth is fine. The header containing the credentials is encrypted using industry-standard protocols before being sent over the Internet. TLS allows you to authenticate the receiving party(the API) using a trusted certificate authority, rendering MITM attacks highly unlikely. Basic Auth is not secure over unencrupted connections, however. Your credentials could be leaked and read by a third party.

If you can’t use https for some reason, then consider using HMAC. This protocol involves passing along some special headers with your API requests, with the expectation that a 3rd party can access the transmission. It’s slightly more complicated to configure, but it involves a private key-signed time-based nonce, mitigating against MITM and replay attacks. A third party could see the data you send/receive with the API, but would not be able to steal your API credentials and interact with the API on your behalf.

Failed Authentication

Requests that do not have the proper authentication will receive an HTTP 401: Not Authorized response. This can happen for a variety of reasons, including(but not limited too):

The token itself has been revoked
The token key can not be found(perhaps due to a typo)
The token secret does not match the secret provided
One or more HTTP headers have been miscalculated
(When using HMAC) the X-BonsaiApi-Time timestamp deviates more than 60 seconds from the server time
Some other filtering rule has been violated

If you are having trouble authenticating your requests to the API, please reach out to [email protected].

‍

Authentication

Alpha Stage

HMAC

The Bonsai API supports a hash-based message authentication code protocol for authenticating requests. This scheme allows the API to simultaneously verify both the integrity and the authenticity of a user’s request.

This authentication protocol requires that all API requests include three HTTP headers:

X-BonsaiApi-Time. The current Unix time – seconds since epoch. This value helps to guarantee uniqueness over time, and must be within one minute of our server time to prevent replay attacks.

X-BonsaiApi-Key. The API token’s key, as generated within the Bonsai application. This key will also have a corresponding secret.

X-BonsaiApi-Hmac. The hexadecimal HMAC-SHA1 digest of your shared secret and the concatenation of the above time and public key.

For example, in Ruby, the X-BonsaiApi-Auth header can be computed as: OpenSSL::HMAC.hexdigest('sha1', token_secret, "#{time}#{token_key}"), where the token_secret is the API key’s secret.

‍

Using HMAC Authentication

Alpha Stage

The Releases API provides users a method to explore the different versions of Elasticsearch available to their account. This API supports the following actions:

View all available releases for your account
View a single release for your account

All calls to the Releases API must be authenticated with an active API token.

The Bonsai Release Object

The Bonsai API provides a standard format for Release objects. A Release object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td><td>A String representing the name for the release.</td>
</tr>
<tr>
<td>slug</td><td>A String representing the machine-readable name for the deployment. </td>
</tr>
<tr>
<td>version</td><td>A String representing the version of the release.</td>
</tr>
<tr>
<td>multitenant</td><td>A Boolean representing whether or not the release is available on multitenant deployments.</td>
</tr>
</tbody>
</table>

View all available releases

The Bonsai API provides a method to get a list of all releases available to your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/releases</code></pre></span> endpoint, and Bonsai will return a JSON list of Release objects.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/releases</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the releases available to your account:

View a single release

The Bonsai API provides a method to get information about a single release available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/releases/[:slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Release object:

Releases API Introduction

Alpha Stage

The Spaces API provides users a method to explore the server groups and geographic regions available to their account, where clusters may be provisioned. This API supports the following actions:

View all available spaces for your account
View a single space for your account

All calls to the Spaces API must be authenticated with an active API token.

The Bonsai Space Object

The Bonsai API provides a standard format for Space objects. A Space object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>path</td><td>A String representing a machine-readable name for the server group.</td>
</tr>
<tr>
<td>private_network</td><td>A Boolean indicating whether the space is isolated and inaccessible from the public Internet. A VPC connection will be needed to communicate with a private cluster.</td>
</tr>
<tr>
<td>cloud</td><td>An Object containing details about the cloud provider and region attributes:

provider. A String representing a machine-readable name for the cloud provider in which this space is deployed.
region. A String representing a machine-readable name for the geographic region of the server group.

</td>
</tr>
</tbody>
</table>

View all available spaces

The Bonsai API provides a method to get a list of all available spaces on your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/spaces</code></pre></span> endpoint, and Bonsai will return a JSON list of Space objects.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/spaces</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the spaces available to your account:

View a single space

The Bonsai API provides a method to get information about a single space available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/spaces/[:path]</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Space object:

Spaces API Introduction

Alpha Stage

The Plans API gives users the ability to explore the different cluster subscription plans available to their account. This API supports the following actions:

View all plans available for your account.
View a single plan available for your account.

All calls to the Plans API must be authenticated with an active API token.

The Bonsai Plan Object

The Bonsai API provides a standard format for Plan objects. A Plan object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>slug</td><td>A String representing a machine-readable name for the plan. </td>
</tr>
<tr>
<td>name</td><td>A String representing the human-readable name of the plan.</td>
</tr>
<tr>
<td>price_in_cents</td><td>An Integer representing the plan price in cents.</td>
</tr>
<tr>
<td>billing_interval_in_months</td><td>An Integer representing the plan billing interval in months.</td>
</tr>
<tr>
<td>single_tenant</td><td>A Boolean indicating whether the plan is single-tenant or not. A value of false indicates the Cluster will share hardware with other Clusters. Single tenant environments can be reached via the public Internet. Additional documentation here.</td>
</tr>
<tr>
<td>private_network</td><td>A Boolean indicating whether the plan is on a publicly addressable network. Private plans provide environments that cannot be reached by the public Internet. A VPC connection will be needed to communicate with a private cluster.</td>
</tr>
<tr>
<td>available_releases</td><td>An Array with a collection of search release slugs available for the plan. Additional information about a release can be retrieved from the Releases API.</td>
</tr>
<tr>
<td>available_spaces</td><td>An Array with a collection of Space paths available for the plan. Additional information about a space can be retrieved from the Spaces API.</td>
</tr>
</tbody>
</table>

View all plans

The Bonsai API provides a method to get a list of all plans available to your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/plans</code></pre></span> endpoint, and Bonsai will return a JSON list of Plan objects.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/plans</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the Plans available to your account:

View a single plan

The Bonsai API provides a method to retrieve information about a single Plan available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/plans/[:plan-slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Plan object:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
"slug": "sandbox-aws-us-east-1",
"name": "Sandbox",
"price_in_cents": 0,
"billing_interval_in_months": 1,
"single_tenant": false,
"private_network": false,
"available_releases": [
"elasticsearch-7.2.0"
],
"available_spaces": [
"omc/bonsai-gcp/us-east4/common",
"omc/bonsai/ap-northeast-1/common",
"omc/bonsai/ap-southeast-2/common",
"omc/bonsai/eu-central-1/common",
"omc/bonsai/eu-west-1/common",
"omc/bonsai/us-east-1/common",
"omc/bonsai/us-west-2/common"
]
}
</code></pre>
</div>
</div>

Plans API Introduction

Alpha Stage

The Clusters API provides a means of managing clusters on your account. This API supports the following actions:

View all clusters in your account
View a single cluster in your account
Create a new cluster
Update a cluster in your account
Destroy a cluster in your account

All calls to the Clusters API must be authenticated with an active API token.

The Bonsai Cluster Object

The Bonsai API provides a standard format for Cluster objects. A Cluster object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>slug</td><td>A string representing a unique, machine-readable name for the cluster. A cluster slug is based its name at creation, to which a random integer is concatenated.</td>
</tr>
<tr>
<td>name</td><td>A string representing the human-readable name of the cluster.</td>
</tr>
<tr>
<td>uri</td><td>A URI to get more information about this cluster.</td>
</tr>
<tr>
<td>plan</td><td>An Object with some information about the cluster's current subscription plan. This hash has two keys:

slug. The unique, machine-readable name of the plan.
uri. A URI to retrieve more information about this plan.

You can see more details about the plan by passing the slug to the Plans API.
</td>
</tr>
<tr>
<td>release</td><td>An Object with some information about the cluster's current release. This hash has five keys:

version. The version of the release this cluster is running on.
slug. The unique slug of the release.
package_name. The package name of the release.
service_type. The name of the search service.
uri. A URI to retrieve more information about this plan.

You can see more details about the release by passing the slug to the Releases API.
</td>
</tr>
<tr>
<td>space</td><td>An Object with some information about where the cluster is running. This has three keys:

path. The path to the space. This string maps to a geographic region or data center.
region. The geographic region in which the cluster is running.
uri. A URI with more information about the space

You can see more details about the space by passing the path to the Spaces API.
</td>
</tr>
<tr>
<td>stats</td><td>An Object with a collection of statistics about the cluster. This hash has four keys:

docs. The number of documents in the index.
shards_used. The number of shards the cluster is using.
data_bytes_used. Integer representing the number of bytes the cluster is using on disk.

This attribute should not be used for real-time monitoring! Stats are updated every 10-15 minutes. To monitor real-time metrics, monitor your cluster directly, via the Index Stats API.
</td>
</tr>
<tr>
<td>access</td><td>An Object containing information about connecting to the cluster. This hash has several keys:

host. The host name of the cluster.
port. The HTTP port the cluster is running on.
scheme. The HTTP scheme needed to access the cluster (defaults to "https")

</td>
</tr>
<tr>
<td>state</td><td>A String representing the current state of the cluster. This indicates what the cluster is doing at any given moment. There are 8 defined states:

DEPROVISIONED. The cluster has been destroyed.
DEPROVISIONING. The cluster is in the process of being destroyed.
DISABLED. The cluster has been disabled.
MAINTENANCE. The cluster is in maintenance mode.
PROVISIONED. The cluster has been created and is ready for use.
PROVISIONING. The cluster is in the process of being created.
READONLY. The cluster is in read only mode.
UPDATING PLAN. The cluster's plan is being updated.

</td>
</tr>
</tbody>
</table>

View all clusters

The Bonsai API provides a method to get a list of all active clusters on your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/clusters</code></pre></span> endpoint, and Bonsai will return a JSON list of Cluster objects. This API call will not return deprovisioned clusters. This call uses pagination, so you may need to make multiple requests to fetch all clusters.

Supported Parameters (Query String Parameters)

<table>
<thead>
<tr>
<th>Param</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>q</td><td>Optional. A query string for filtering matching clusters. This currently works on name</td>
</tr>
<tr>
<td>tenancy</td><td>Optional. A string which will constrain results to parent or child cluster. Valid values are: parent, child</td>
</tr>
<tr>
<td>location</td><td>Optional. A string representing the account, region, space, or cluster path where the cluster is located. You can get a list of available spaces with the Spaces API. Space path prefixes work here, so you can find all clusters in a given region for a given cloud.</td>
</tr>
</tbody>
</table>

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/clusters</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an HTTP 200: OK code, along with a JSON list representing the clusters on your account:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
  "pagination": {
    "page_number": 1,
    "page_size": 2,
    "total_records": 2
  },
  "clusters": [
    {
      "slug": "first-testing-cluste-1234567890",
      "name": "first_testing_cluster",
      "uri": "https://api.bonsai.io/clusters/first-testing-cluste-1234567890",
      "plan": {
        "slug": "sandbox-aws-us-east-1",
        "uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"
      },
      "release": {
        "version": "7.2.0",
        "slug": "elasticsearch-7.2.0",
        "package_name": "7.2.0",
        "service_type": "elasticsearch",
        "uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"
      },
      "space": {
        "path": "omc/bonsai/us-east-1/common",
        "region": "aws-us-east-1",
        "uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"
      },
      "stats": {
        "docs": 0,
        "shards_used": 0,
        "data_bytes_used": 0
      },
      "access": {
        "host": "first-testing-cluste-1234567890.us-east-1.bonsaisearch.net",
        "port": 443,
        "scheme": "https"
      },
      "state": "PROVISIONED"
    },
    {
      "slug": "second-testing-clust-1234567890",
      "name": "second_testing_cluster",
      "uri": "https://api.bonsai.io/clusters/second-testing-clust-1234567890",
      "plan": {
        "slug": "sandbox-aws-us-east-1",
        "uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"
      },
      "release": {
        "version": "7.2.0",
        "slug": "elasticsearch-7.2.0",
        "package_name": "7.2.0",
        "service_type": "elasticsearch",
        "uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"
      },
      "space": {
        "path": "omc/bonsai/us-east-1/common",
        "region": "aws-us-east-1",
        "uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"
      },
      "stats": {
        "docs": 0,
        "shards_used": 0,
        "data_bytes_used": 0
      },
      "access": {
        "host": "second-testing-clust-1234567890.us-east-1.bonsaisearch.net",
        "port": 443,
        "scheme": "https"
      },
      "state": "PROVISIONED"
    }
  ]
}</code></pre>
</div>
</div>

View a single cluster

The Bonsai API provides a method to retrieve information about a single cluster on your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Cluster object:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
"cluster": {
"slug": "second-testing-clust-1234567890",
"name": "second_testing_cluster",
"uri": "https://api.bonsai.io/clusters/second-testing-clust-1234567890",
"plan": {
"slug": "sandbox-aws-us-east-1",
"uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"
},
"release": {
"version": "7.2.0",
"slug": "elasticsearch-7.2.0",
"package_name": "7.2.0",
"service_type": "elasticsearch",
"uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"
},
"space": {
"path": "omc/bonsai/us-east-1/common",
"region": "aws-us-east-1",
"uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"
},
"stats": {
"docs": 0,
"shards_used": 0,
"data_bytes_used": 0
},
"access": {
"host": "second-testing-clust-1234567890.us-east-1.bonsaisearch.net",
"port": 443,
"scheme": "https"
},
"state": "PROVISIONED"
}
}</code></pre>
</div>
</div>

Create a new cluster

The Bonsai API provides a method to create new clusters on your account. An HTTP POST call is made to the <span class="inline-code"><pre><code>/clusters</code></pre></span> endpoint, and Bonsai will create the cluster.

Supported Parameters

<table>
<thead>
<tr>
<th>Param</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td><td>Required. A String representing the name for your new cluster.</td>
</tr>
<tr>
<td>plan</td><td>A String representing the slug of the new plan for your cluster. You can get a list of available plans via the Plans API.</td>
</tr>
<tr>
<td>space</td><td>A String representing the Space slug where the new cluster will be created. You can get a list of available spaces with the Spaces API.</td>
</tr>
<tr>
<td>release</td><td>A String representing the search service release to use. You can get a list of available versions with the Releases API.</td>
</tr>
</tbody>
</table>

HTTP Request

An HTTP POST call is made to /clusters along with a JSON payload of the supported parameters.

HTTP Response

Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 202: Accepted</code></pre></span> code, along with a short message and details about the cluster that was created:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">{
"message": "Your cluster is being provisioned.",
"monitor": "https://api.bonsai.io/clusters/test-5-x-3968320296",
"access": {
"user": "utji08pwu6",
"pass": "18v1fbey2y",
"host": "test-5-x-3968320296",
"port": 443,
"scheme": "https",
"url": "https://utji08pwu6:[email protected]:443"
},
"status": 202
}</code></pre>
</div>
</div>

Error

An <span class="inline-code"><pre><code>HTTP 422: Unprocessable Entity</code></pre></span> error may arise if you are trying to create one too many Sandbox clusters on your account:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">{
"errors": [
"The requested plan is not available for provisioning. Solution: Please use the plans endpoint for a list of available plans.",
"Your request could not be processed. "
],
"status":422
}</code></pre>
</div>
</div>

If you are not creating a Sandbox cluster, please refer to the API Error 422: Unprocessable Entity documentation.

Update a cluster

The Bonsai API provides a method to update the name or plan of your cluster. An HTTP PUT call is made to the <span class="inline-code"><pre><code>/clusters</code></pre></span> endpoint, and Bonsai will update the cluster.

Supported Parameters

<table>
<thead>
<tr>
<th>Param</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td><td>A String representing the new name for your cluster. Changing the cluster name will not change its URL.</td>
</tr>
<tr>
<td>plan</td><td>A String representing the slug of the new plan for your cluster. Updating the plan may trigger a data migration. You can get a list of available plans via the Plans API.</td>
</tr>
</tbody>
</table>

HTTP Request

To make a change to an existing cluster, make an HTTP PUT call to <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> with a JSON body for one or more of the supported params.

HTTP Response

Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 202: Accepted</code></pre></span> code, along with short message:

Destroy a cluster

The Bonsai API provides a method to delete a cluster from your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP DELETE call is made to the <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> endpoint.

HTTP Response

Bonsai will respond with an HTTP 202: Accepted code, along with a short message:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">{
"message": "Your cluster is being deprovisioned.",
"monitor": "https://api.bonsai.io/clusters/[:slug]",
"status": 202
}</code></pre>
</div>
</div>

Cluster API Introduction

Alpha Stage

This introduction to the Bonsai API includes the following sections:

Introduction: What is the API and what is required to make it work?
Request Data. Sending data to the API.
Response Data. Receiving data from the API.
Error Handling. What to expect if something fails.
API Access Limitations. How often can you hit the API?
Breaking Changes. How Bonsai handles changes to the API.
Questions, Problems, Feature Requests. Providing feedback about the API.

Introduction

Bonsai provides a REST API at https://api.bonsai.io for managing clusters, exploring plans, and checking out versions and available regions. This API allows customers to create, view, manage and destroy clusters via HTTP calls instead of through the dashboard. The API supports four endpoints:

Clusters API. Create, view, manage and destroy clusters on your account.
Plans API. View subscription plans available to your account.
Spaces API. View locations where your account may provision new clusters.
Releases API. View versions of Elasticsearch available to your account.

To interact with the API, users must create an API Token. You can read more about creating those tokens here. An API token will have a key and a secret. The API supports multiple ways of authenticating requests with an API token. All calls to the API using an API token are logged for auditing purposes. Additional constraints on API tokens are in development.

The API generally conforms to RESTful principals. Users interact with their clusters using standard HTTP verbs such as GET, PUT, POST, PATCH and DELETE. The Bonsai API accepts and returns JSON payloads. No other formats are supported at this time.

Request Data

The Bonsai API accepts request bodies in JSON format only. Request bodies that are not in proper JSON will receive an HTTP 422: Unprocessable Entity response code, along with a JSON body containing messages about the problem.

A <span class="inline-code"><pre><code>Content-Type: application/json</code></pre></span> HTTP header is preferred, but not required. Requests may also provide an <span class="inline-code"><pre><code>Accept</code></pre></span>Accept header, as either <span class="inline-code"><pre><code>Accept: */*</code></pre></span> or <span class="inline-code"><pre><code>Accept: application/json</code></pre></span>. Any other accept type will receive an HTTP 422 error.

Response Data

The Bonsai API responds with standard HTTP response codes. All HTTP message bodies will be in JSON format. The API documentation for the call will describe the response bodies that a client should expect.

Error Handling

In the event that one or more errors are raised, the API will return a JSON response detailing the problem. The response will have a status code, and an array of error messages. For example:

Error codes for the API are documented in the API Error Codes section.

API Access Limitations

The Bonsai API limits any given token to 60 requests per minute. Provision requests are limited to 5 per minute. Making too many requests in a short amount of time will result in a temporary period of HTTP 429 responses.

Additionally, access to the API may be blocked if:

Your Bonsai account is cancelled (HTTP 404).
Your Bonsai account is suspended due to non-payment (HTTP 402).
Your shared secret is revoked (HTTP 401).
If fraudulent or bot activity is suspected (HTTP 403).
Any violation of our Terms of Service (HTTP 403).

Breaking Changes

The Bonsai team considers the following changes to be backwards compatible, and can be made without advance notice:

Adding new API endpoints
Non-breaking changes to existing API endpoints:
Adding new optional parameters
Adding additional attributes to response bodies
Adding new, optional features
Adding support for new formats(such as XML)
Adding support for additional authentication options

Questions, Problems, Feature Requests

The Bonsai team is committed to providing the best, most-reliable platform for deploying and managing Elasticsearch clusters. If you have a question, issue, or just want to submit a feature request, please reach out to our support team at [email protected].

‍

Introduction to the API

Alpha Stage

An API Token is required in order to access the API. Tokens are associated to a user within a given account. To create a credential, navigate to your account and click on the API Tokens tab:

Click on the Generate Token button to create a token:

Whenever a token is submitted to the API, a log entry is generated for that token. The logs will indicate which token was used, how it was authenticated, who it belongs to (and the IP address of the requester), and a host of other information:

You can see even more details about a request by clicking on its Details button:

If there is some security concern with the request, there will be a flash message indicating that it was not made safely.

To revoke a token, click on the Revoke button. This will bring up a confirmation dialog. Confirm the request, and the token will be revoked. Once a token is revoked, it can no longer be used to access the API. Requests to the API using a revoked token will result in a HTTP 401: Authorization error.

‍

Creating an API token

Pagination Query Parameters

All APIs supporting recognizes the following request parameters for pagination:

<table>
<thead>
<tr>
<th>Parameter</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>page</td><td>The page number, starting at 1</td>
</tr>
<tr>
<td>size</td><td>The size of each page, with a max of 100</td>
</tr>
</tbody>
</table>

Pagination Response Fragment

All API responses which support pagination will include a top level pagination fragment in the JSON response body, which looks like:

With the above information, you can infer how many pages there are and iterate until the list is exhausted.

API Result Pagination

An HTTP 503: Service Unavailable error indicates a problem with a server somewhere in the network. It is most likely related to a node restart affecting your primary shard(s) before a replica can be promoted.

The easiest solution is to simply catch and retry HTTP 503's. If you've seen this several times in a short period of time, please send us an email and we will investigate.

‍

HTTP 503: Service Unavailable

An HTTP 502: Bad Gateway error is rare, but when it does happen, there are really only two root causes: a problem with the load balancer, or Elasticsearch is returning a high number of deprecation warnings.

If you are seeing occasional, generic messages about HTTP 502 errors, then the most likely cause is the load balancer. The short explanation is that there are a few cases where the proxy software hits an OOM error and is restarted. This causes the load balancer to send back an HTTP 502. The error message will be very generic, and it will not say anything about Bonsai.io. The easiest solution is to simply catch and retry these HTTP 502's.

If you are seeing frequent, repeated HTTP 502 messages, and those messages say something like "A problem occurred with the Elasticsearch response. Please check status.bonsai.io or contact [email protected] for assistance", then it's likely due to Elasticsearch's response overwhelming the load balancer with HTTP headers. These headers might look something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"><script> console.log('hello'); </script>"The [string] field is deprecated, please use [text] or [keyword] instead on [my_field]"</code></pre>
</div>

This usually happens when the client sends a request using a large number of deprecated references. Elasticsearch responds with an HTTP header for each one. If there is a large number (many thousands) of headers, the load balancer will simply close the connection and respond with an HTTP 502 message. The solution is to review your client and application code, and either: A) use smaller bulk requests, or B) update the code so that it's no longer using deprecated features.

If you are having trouble with this resolving this issue, please let us know.

‍

HTTP 502: Bad Gateway

An HTTP 400 Bad Request can be caused by a variety of problems. However, it is generally a client-side issue. An HTTP 400 implies the problem is not with Elasticsearch, but rather with the request to Elasticsearch.

For example, if you have a mapping that expects a number in a particular field, and then index a document with some other data type in that field, Elasticsearch will reject it with an HTTP 400:

<div class="code-snippet-container">
<a fs-copyclip-element="click" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript"># Create a document with a field called "views" and the number 0POST /myindex/mytype/1?pretty -d '{"views":0}'
{
"_index" : "myindex",
"_type" : "mytype",
"_id" : "1",
"_version" : 1,
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"created" : true
}

# Elasticsearch has automagically determined the "views" field to be a long (integer) data type:GET /myindex/_mapping?pretty
{
"myindex" : {
"mappings" : {
"mytype" : {
"properties" : {
"views" : {
"type" : "long"
}
}
}
}
}
}

# Try to create a new document with a string value instead of a long in the "views" field:POST /myindex/mytype/2?pretty -d '{"views":"zero"}'
{
"error" : {
"root_cause" : [ {
"type" : "mapper_parsing_exception",
"reason" : "failed to parse [views]"
} ],
"type" : "mapper_parsing_exception",
"reason" : "failed to parse [views]",
"caused_by" : {
"type" : "number_format_exception",
"reason" : "For input string: \"zero\""
}
},
"status" : 400
}</code></pre>
</div>
</div>

The way to troubleshoot an HTTP 400 error is to read the response carefully and understand which part of the request is raising the exception. That will help you to identify a root cause and remediate.

HTTP 400: Bad Request

The HTTP 501 Not Implemented error means that the requested feature is not available on Bonsai. Elasticsearch offers a handful of API endpoints that are not exposed on Bonsai for security and performance reasons. You can read more about these in the Unsupported API Endpoints documentation.

HTTP 501: Not Implemented

The "Cluster not found"-variant HTTP 404 is distinct from the "Index not found" message. This error message indicates that the routing layer is unable to match your URL to a cluster resource. This can be caused by a few things:

A typo in the URL. If you're seeing this in the command line or terminal, then it's possible the hostname is wrong due to a typo or incomplete copy/paste.
The cluster has been destroyed. If you deprovision a cluster, it will be destroyed instantly. Further requests to the old URL will result in an HTTP 404 Cluster Not Found response.
The cluster has not yet been provisioned. There are a couple cases in which clusters take a few minutes to come online. Attempting to access the cluster before it's operational can lead to an HTTP 404 response.

If you have confirmed that: A) the URL is correct (see Connecting to Bonsai for more information), B) the cluster has not been destroyed, and C) the cluster should be up and running, and D) you're still receiving HTTP 404 responses from the cluster, then send us an email and we'll investigate.

HTTP 404: Cluster Not Found

This error is raised when an update request is sent to a cluster that has been placed into read-only mode. Clusters can be placed into read-only mode for one of several reasons, but the most common reason is due to an overage.

If you're seeing this error, check on your cluster status and address any overages you see. You can find more information about this in our Metering on Bonsai documentation, specifically the section "Checking on Cluster Status". If you're not seeing any overages and the cluster is still set to read-only, please contact us and let us know.

‍

HTTP 403: Cluster Read-only

This error is raised when a request is sent to a cluster that has been disabled. Clusters can be disabled for one of several reasons, but the most common reason is due to an overage.

‍

HTTP 403: Cluster Disabled

All Bonsai clusters are provisioned with a randomly generated set of credentials. These must be supplied with every request in order for the request to be processed. An HTTP 401 response indicates the authentication credentials were missing from the request.

To elaborate on this, all Bonsai cluster URLs follow this format:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://username:[email protected]
</code></pre>
</div>

The username and password in this URL are not the credentials used for logging in to Bonsai, but are randomly generated alphanumeric strings. So your URL might look something like:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"><script> console.log('hello'); </script>https://kjh4k3j:[email protected]
</code></pre>
</div>

The credentials <span class="inline-code"><pre><code>kjh4k3j:lv9pngn9fs</code></pre></span> must be present with all requests to the cluster in order for them to be processed. This is a security precaution to protect your data (on that note, we strongly recommend keeping your full URL a secret, as anyone with the credentials can view or modify your data).

Not All APIs are Available

It's possible to get an HTTP 401 response when attempting to access one of the Unsupported API Endpoints. If you're trying to access server level tools, restart a node, etc, then the request will fail, period. Please read the documentation on unavailable APIs to determine whether the failing request is valid.

I'm including credentials and still getting a 401!

Please ensure that the credentials are correct. You can find this information on your cluster dashboard. Note that there is a tool for rotating credentials. So it's entirely possible to be using an outdated set of credentials.

Heroku users should also inspect the contents of the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> config variable. This can be found in the Heroku app dashboard, or by running <span class="inline-code"><pre><code>heroku config:get BONSAI_URL</code></pre></span>. The contents of this variable should match the URL shown in the Bonsai cluster dashboard exactly.

If you're sure that the credentials are correct and being supplied, send us an email and we will investigate.

HTTP 401: Authorization Required

In some rare cases, the Bonsai Ops Team will put a cluster into maintenance mode. There are a lot of reasons this may happen:

Load shedding
Data migrations
Rolling restarts
Version upgrades
... and more.

Maintenance mode blocks updates to the cluster, but not searches. If you're seeing this message, it will be temporary; it rarely lasts for more than a minute or two. If your cluster has been in a maintenance state for more than a few minutes, please contact support.

HTTP 403: Maintenance

The HTTP 500 Internal Server Error is both rare and often difficult to reproduce. It generally indicates a problem with a server somewhere. It may be Elasticsearch, but it could also be a node in the load balancer or proxy. A process restarting is typically the root cause, which means it will often resolve itself within a few seconds.

The easiest solution is to simply catch and retry HTTP 500's. If you've seen this several times in a short period of time, please send us an email and we will investigate.

‍

HTTP 500: Internal Server Error

This error indicates that the request body to Elasticsearch exceeded the limits of the Bonsai proxy layer. This can be caused by a few things:

A request larger than 40MB. Elasticsearch's Query DSL can be fairly verbose JSON, particularly when queries are complex. The 40MB cap is meant to be a safety mechanism to prevent runaway queries from overwhelming the routing layer, while still being an order of magnitude higher than 99.9% of request bodies.
Indexing too many documents at once. The Elasticsearch Bulk API allows applications to index groups of documents in a single request. Sending a single batch of millions of documents could easily trigger the HTTP 413 message.
Lots of request headers. Metadata about a request can be passed to Elasticsearch in the form of request headers. Bonsai allows up to 16KB for request headers; this should be enough for whatever CORS and content-type specification needs to occur. Note that the TLS and authentication headers in the request are not counted towards this limit.
Indexing large files. When Elasticsearch indexes a rich text file like a PDF or Word document, it converts the file into a Base64 string to compress it for transit. Still, it's possible that this string is longer than 40MB, which could trigger the HTTP 413 error.

If you're seeing this error, check that your queries are sane and not 40MB of flattened JSON. Ensure you're not explicitly sending lots of headers to your cluster.

If you're seeing this message during bulk indexing, then decrease your batch sizes by half and try again. Repeat until you can reindex without receiving an HTTP 413.

Finally, if it is indeed a large file causing the problem, then the odds are good that metadata and media in the file are resulting in its huge size. You may need to use a some file editing tool to remove the media (images, movies, sounds) and possibly the metadata from the file and then try again. If the files are user-submitted, consider capping the file size your users are able to upload.

Customers who are unable to change their application to accommodate smaller request bodies or payloads should reach out to us. We we lift this limit for customers on Business and Enterprise plans, subject to some caveats on performance.

‍

HTTP 413: Request Too Large

The proximate cause of HTTP 429 errors occur is that an app has exceeded its concurrent connection limits for too long. This is often due to a spike in usage -- perhaps a new feature as been deployed, a service is growing quickly, or possibly a regression in the code.

It can also happen when reindexing (for example: when engineers want to push all the data into Elasticsearch or OpenSearch as quickly as possible, which means lots of parallelization). Unusually expensive requests, or other unusual latency and performance degradation within Elasticsearch itself can also cause unexpected queuing and result in 429 errors.

In most cases, 429 errors can be solved by upgrading your plan to a plan with higher connection limits; new connection limits are applied immediately. If that's not viable, then you may need to perform additional batching of your updates (such as queuing and bulk updating) or searches (with multi-search API as one example). We have some suggestions for optimizing your requests that can help point you in the right direction.

HTTP 429: Too Many Requests

This response is distinct from the "Cluster not found" message. This message indicates that you're trying to access an index that is not registered with Elasticsearch. For example:

There are a couple reasons you might see this:

Race condition. You or your app may be trying to access an index before it was created.
Typo. You may have misspelled or only partially copy/pasted the name of the index you're trying to access.
The index has been deleted. Trying to access an index that has been deleted will return an HTTP 404 from Elasticsearch.

Important Note on Index Auto-Creation

However, some popular tools such as Kibana and Logstash do not support explicit index creation, and rely on auto-creation being available. To accommodate these tools, we've whitelisted popular time-series index names such as <span class="inline-code"><pre><code>logstash*</code></pre></span>, <span class="inline-code"><pre><code>requests*</code></pre></span>, <span class="inline-code"><pre><code>events*</code></pre></span>, <span class="inline-code"><pre><code>kibana*</code></pre></span>. and <span class="inline-code"><pre><code>kibana-int*</code></pre></span>.

The solution to this error message is to confirm that the index name is correct. If so, make sure it is properly created (with all the mappings it needs), and try again.

HTTP 404: Index Not Found

The HTTP 504 Gateway Timeout error is returned when a request takes longer than 60 seconds to process, regardless of whether the process is waiting on Elasticsearch or sitting in a connection queue. This can sometimes be due to network issues, and sometimes it can occur when Elasticsearch is IO-bound and unable to process requests quickly. Complex requests are more likely to receive an HTTP 504 error in these cases.

For more information on timeouts, please see our recommendations on Connection Management.

‍

HTTP 504: Gateway Timeout

A HTTP 402 response indicates a cluster’s account is behind on payments. All requests to the cluster will return the following message:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
"code": 402,
"message": "Cluster has been disabled due to non-payment. Please update billing info or contact [email protected] for further details."
}</code></pre>
</div>

The cluster cannot be used until an overdue payment is successfully processed. If an account remains in an unpaid state for more than a week, it will be destroyed and any active clusters will be terminated and all data destroyed.

Please refer to this article on how to check an account’s billing information. If you’ve updated the account’s billing information and confirmed that the credit card on file is working, please contact us and let us know that this response still persists.

‍

HTTP 402: Payment Required

Troubleshooting and resolving connection issues can be time consuming and frustrating. This article aims to reduce the friction of resolving connection problems by offering suggestions for quickly identifying a root cause.

First Steps

If you're seeing connection errors while attempting to reach your Bonsai cluster, the first step is Don't Panic. Virtually all connection issues can be fixed in under 5 minutes once the root cause is identified.

The next step is to review the documentation on Connecting to Bonsai, which contains plenty of information about creating connections and testing the availability of your Bonsai cluster.

If the basic documentation doesn't help resolve the problem, the next step is to read the error message very carefully. Often the error message contains enough information to explain the problem. For example, something like this may show up in your logs:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Faraday::ConnectionFailed (Connection refused - connect(2) for "localhost" port 9200):</code></pre>
</div>

This message tells us that the client tried to reach Elasticsearch over localhost, which is a red flag (discussed below). It means the client is not pointed at your Bonsai cluster.

Next, look for an HTTP error code. Bonsai provides an entire article dedicated to explaining what different HTTP codes mean here: HTTP Error Codes. Like the error message, the HTTP error code often provides enough diagnostic information to identify the underlying problem.

Last, don't make any changes until you understand the error message and HTTP code (if one is present), and have read the relevant documentation. Most of these errors occur during the initial set up and configuration step, so also read the relevant Quickstart guide if one exists for your language / framework.

Common Connection Issues

There are issues which do not return HTTP error codes because the request can not be made at all, or because the request is not recognized by Elasticsearch. These issues are commonly one of the following:

Connection Refused
No handler found for URI
Name or service not known
HTTP 401 / Missing Authentication Headers
Missing SSL Root Certificate

Connection Refused

This error indicates that a request was made to a server that is not accepting connections.

This error most commonly occurs when your Elasticsearch client has not been properly configured. By default, many clients will try to connect to something like `localhost:9200`. This is a problem because Bonsai will never be running on your localhost or network, regardless of your platform.

What is Localhost?

Simply, localhost can be interpreted as "this computer." Which computer is "this" computer is determined by whatever machine is running the code. If you're running the code on your laptop, then localhost is the machine in front of you; if you're running the code on AWS or in Heroku, then the localhost is the node running your application.

Bonsai clusters run in a private network that does not include any of your infrastructure, which is why trying to reach it via localhost will always fail.

Why Port 9200?

By default, Elasticsearch runs on port 9200. While you can access your Bonsai cluster over port 9200, this is not recommended due to lack of encryption in transit.

All users will all need to ensure their Elasticsearch client is pointed at the correct URL and ensure that the Elasticsearch client is properly configured.

Ruby / Rails

If you’re using the elastisearch-rails client, simply add the following gem to your Gemfile:

The bonsai-elasticsearch-rails gem is a shim that configures your Elasticsearch client to load the cluster URL from an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>. You can read more about it on the project repository.

If you'd prefer to keep your Gemfile sparse, you can initialize the client yourself like so:

If you opt for this method, make sure to add the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> to your environment. It will be automatically created for Heroku users. Users managing their own application environment will need to run something like:

Other Languages / Frameworks

The Elasticsearch client is probably using the default <span class="inline-code"><pre><code>localhost:9200</code></pre></span> or <span class="inline-code"><pre><code>127.0.0.1:9200</code></pre></span> (127.0.0.1 is the IPv4 equivalent of "localhost"). You'll need to make sure that the client is configured to use the correct URL for your cluster, and that this configuration is not being overwritten somewhere.

No handler found for URI

The Elasticsearch API has a variety of endpoints defined, like <span class="inline-code"><pre><code>/_cat/indices</code></pre></span>. Each of these endpoints can be called with a specific HTTP method. This error simply indicates a request was made to an endpoint using the wrong method.

Here are some examples:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">
# Request to /_cat/indices with GET method:
$ curl -XGET https://user:[email protected]/_cat/indices
green open test1 1 1 0 0 318b 159b
green open test2 1 1 0 0 318b 159b

# Request to /_cat/indices with PUT method:
$ curl -XPUT https://user:[email protected]/_cat/indices
No handler found for uri [/_cat/indices] and method [PUT]</code></pre>
</div>
</div>

The solution for this issue is simple: use the correct HTTP method. The Elasticsearch documentation will offer guidance on which methods pertain to the endpoint you're trying to use.

Name or service not known

The Domain Name System (DNS) is a worldwide, decentralized and distributed directory service which translates human-readable domains like www.example.com in to network addresses like 93.184.216.119. When a client makes a request to a URL like "google.com," the application's networking layer will use DNS to translate the domain to an IP address so that it can pass the request along to the right server.

The "Name or service not known" error indicates that there has been a failure in determining an IP address for the URL's domain. This typically implicates one of several root causes:

There is a typo in the URL.
There is a TLD outage.
The Internet Service Provider (ISP) handling your application's traffic may have lost a DNS server, or is having another outage.

The first troubleshooting step is to carefully read the error and double-check that the URL (particularly the domain name) is spelled correctly. If this is your first time accessing the cluster, then a typo is almost certainly the problem.

If this error arises during regular production use, then there is probably a DNS outage. DNS outages are outside of Bonsai's control. There are a couple types of outages, but the ones that have affected our users before are:

TLD outage

A TLD is something like ".com," ".net," ".org," etc. A TLD outage affects all domains under the TLD. So an outage of the ".net" TLD would affect all domains with the ".net" suffix.

Fortunately, all Bonsai clusters can be accessed via either of two domains:

bonsai.io
bonsaisearch.net

If there is a TLD outage, you should be able to restore service by switching to the other domain. In other words, if your application is sending traffic to <span class="inline-code"><pre><code>https://user:[email protected]</code></pre></span>, and the ".io" TLD goes down, you can switch over to <span class="inline-code"><pre><code>https://user:[email protected]</code></pre></span>, and it will fix the error.

ISP Outage

Many ISPs operate their own DNS servers. This way, requests made from a node in their network can get low-latency responses for IP addresses. Most ISPs also have a fleet of DNS servers, at minimum a primary and a secondary. However, this is not a requirement, and there have been instances where an ISP's entire DNS service is taken offline.

There are also multitenant services which have Software Defined Networks (SDN) running Internal DNS (iDNS). Regressions in this software can also lead to application-level DNS name resolution problems.

If you've already confirmed the domain is correct and swapped to another TLD for your cluster, and you're still having issues, then you are probably dealing with an ISP/DNS or SDN/iDNS outage. One way to confirm this is to try making requests to other common domains like google.com. A name resolution error on google.com or example.com points to a local DNS problem.

If this happens, there is basically nothing you can do about it as a user, aside from complaining to your application sysadmin.

Missing Authentication Headers

Users who are seeing persistent HTTP 401 Error Codes may be using a client that is not handling authentication properly. As explained in the error code documentation, as well as in Connecting to Bonsai, all Bonsai clusters have a randomly-generated username and password which must be present in the request in order for it to be accepted.

What's less clear from this documentation is that including the complete URL in your Elasticsearch client may not be enough to create a secure connection.

This is due to how HTTP Basic Access Authentication works. In short, there needs to be a request header present. This header has an "Authorization" field and a value of <span class="inline-code"><pre><code>Basic</code></pre></span> . The base64 string is a base64 representation of the username and password, concatenated with a colon (":").

Most clients handle the headers for you automatically and in the background. But not all do, especially if the client is part of a bleeding edge language or framework, or if it's something homebrewed/forked/patched.

Here is a basic example in Ruby using <span class="inline-code"><pre><code>Net::HTTP</code></pre></span>, demonstrating how a URL with auth can still receive an HTTP 401 response:

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">require 'base64'
require 'net/http'

# URL with credentials:
uri = URI("https://randomuser:[email protected]")

# Net::HTTP does not automatically detect the presence of
# authentication credentials and insert the proper Authorization header.req = Net::HTTP::Get.new(uri)

# This request will fail with an HTTP 401, even though the credentials
# are in the URI:
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
http.request(req)}

# The proper header must be added manually:
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp

# The request now succeeds
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
http.request(req)
}</code></pre>
</div>
</div>

From the Ruby example, it's clear that there are some cases where the credentials are simply ignored instead of being automatically put in to a header. This causes the Basic authentication to fail, and receive the HTTP 401.

Simply, if you're seeing HTTP 401 responses even while including the credentials in the URL, and you've confirmed that the credentials are entered correctly and have not been expired, then the problem is probably a missing header. You can detect this with tools like socat or wireshark if you're familiar with network traffic inspection. Or, you can try adding the headers manually.

Here are some examples of calculating the base64 string and adding the request header in several different languages:

Android

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript"> public static Map getHeaders() {
Map headers = new HashMap<>();
String credentials = "randomuser:randompass";
String auth = "Basic " + Base64.encodeToString(credentials.getBytes(), Base64.NO_WRAP);
headers.put("Authorization", auth);
headers.put("Content-type", "application/json");
return headers;
}</code></pre>
</div>
</div>

Java

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-java">public static Map getHeaders() {
Map headers = new HashMap<>();
String credentials = "randomuser:randompass";
String auth = "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes());
headers.put("Authorization", auth);
headers.put("Content-type", "application/json");
return headers;
}</code></pre>
</div>
</div>

‍

Ruby

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-ruby">require 'base64'
require 'net/http'

uri = URI("https://something-12345.us-east-1.bonsaisearch.net")
req = Net::HTTP::Get.new(uri)
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
http.request(req)
}</code></pre>
</div>
</div>

Missing SSL Root Certificate

As mentioned in the Security documentation, all Bonsai clusters support SSL/TLS. This enables your traffic to be encrypted over the wire. In some rare cases, users may see something like this when trying to access their cluster over HTTPS:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">SSL: CERTIFICATE_VERIFY_FAILED</code></pre>
</div>

This is almost certainly due to a server level misconfiguration.

A complete examination of how SSL works is well outside the scope of this article, but in short: it utilizes cryptographic signatures to facilitate a chain of trust from a root certificate issued by a certificate authority (CA) to a certificate deployed to a server. The latter is used to prove the server's ownership before initiating public key exchange.

Think of a certificate authority as a mediator of sorts. A company like Bonsai goes to the CA and provides proof of identity and ownership of a domain. The CA issues Bonsai a unique certificate cryptographically signed by the CA's root certificate. Then Bonsai configures its servers to respond to SSL/TLS requests using the certificates signed by the CA's root.

When your application reaches out to Bonsai over HTTPS, the Bonsai server will respond with this certificate. The application can then inspect the certificate it receives from Bonsai and determine whether the CA's cryptographic signature is valid. If it's valid, then the application knows it really is talking to Bonsai and not an impersonator. It then performs a key exchange to open an encrypted connection and then sends/receives data.

One weakness of this system is that the CA has to be recognized by both parties; if your application doesn't have a copy of the CA's root certificate, it can't validate that the server it's talking to is genuine. You may start seeing errors like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Could not verify the SSL certificate for...
SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failedpeer certificate won't be verified in this SSL session
SSL certificate problem, verify that the CA cert is OK
SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed</code></pre>
</div>

These are just a sample of the errors you might see, but it all points to the application's inability to verify Bonsai's certificate.

It's possible that this is caused by an active man in the middle (MITM) attack, but the greater probability is that the application does not have access to the right certificates. There are a few things to check:

1. Make sure the CA root certificate for Bonsai is available

Bonsai's nodes are using Amazon Trust Services as a certificate authority. These CA's need to be added to your trust store. They should be automatically installed in most browsers / operating systems. If you're seeing SSL issues, make sure that you have the correct root certificates in your trust store.

This is sometimes easier said than done. Typically a sysadmin will need to handle this. If you're using Heroku, you'll need to open a request to their support team to verify that the proper root certificate is deployed to their systems. There are some hackish workarounds that involve bundling the certificate in your deploy and configuring the application to read it from somewhere other than the OS' trust store.

2. Make sure the certificate(s) for Bonsai are up to date

Certificate Authorities typically do not issue permanent certificates. Most certificates expire every 1-2 years and need to be renewed. Some browsers and applications will cache SSL certificates to improve latency. It's possible to be using a cached certificate that has passed its expiration date, thus rendering it invalid. Flush your caches and make sure that the Bonsai certificate is up to date.

Connection Issues

Bonsai implements connection management as a metered plan resource. Our connection management proxy is designed to help model and encourage best practices for using Elasticsearch efficiently at different cluster sizes.

This document outlines several additional strategies that users can take to maximize the performance and throughput for their cluster.

Implement HTTP Keep-alive

Normally when an application makes a request to the cluster, it needs to perform a lot of steps. Very generally, it looks like this:

Set up the connection locally
Reach the target server
Establish a secure communication protocol via SSL/TLS (this step is actually a dozen smaller steps, see The SSL/TLS Handshake for an infographic)
Send the authentication credentials and the request
Wait for a response
Receive the response
Tear down the connection
Return the response to the application

As you can see, there is quite a bit of overhead to accomplish a simple search request. There is an optimization to be made here, called HTTP keep-alive. Essentially, the application opens a persistent connection to the cluster and may send and receive multiple search requests over it without the need to perform redundant tasks like negotiating a cryptographic protocol.

Persistent connections are “free” on Bonsai; the concurrency allowances only apply to active, in-flight connections. Otherwise idle HTTP keep-alive connections are not tracked toward this limit. Users should maintain a reasonably sized pool of keep-alive connections to reduce the setup cost of each connection and reduce the average latency.

This strategy saves a significant amount of overhead and reduces overall request latencies substantially. Some benchmarks of applications in production have seen a 20-25% improvement in throughput (requests per second) by implementing HTTP keep-alive.

In the ideal implementation, the application would have access to a local pool of persistent connections, from which it would draw to perform search requests. The local pool would also queue requests when all persistent connections are in use. This implementation would allow the developer to fine-tune the local pool size to minimize latency without exceeding the concurrency allowance for the cluster’s plan.

Consult the documentation for your Elasticsearch client for more details on implementing this feature.

Utilize a Durable Message Queue

For best performance and reliability, updates to data should be pushed into a durable message queue for asynchronous grouping and processing in batches. Some examples of message queues are: Apache Kafka, AWS Kinesis, IronMQ, Disque, Redis, Rescue, and RabbitMQ.

These services are resistant to network partitions and random node outages. There may be times, particularly on multi tenant class clusters, where a node is randomly restarted. Consider a case where an application pushes updates directly in to Elasticsearch. Such an interruption, even for a few seconds, could lead to error logs and emails about HTTP 503 responses.

But with a message queue acting as an intermediary, temporary connection issues are resolved automatically, and the application does not need to bubble up errors to the user or administrators. A message queue can also avoid the overhead of single-document updates by passing updates in batches. This strategy is one way to avoid an HTTP 429 response.

Prefer Bulk Updates Over Single-Document Updates

It’s worth noting that Bonsai distinguishes Bulk API updates from other writes. This is because the Bulk API is the more efficient means of data ingress into Elasticsearch, and is the preferred means of data ingest to the cluster. Single-document updates are helpful for maintaining synchronization over time, but should be avoided as a means for mass ingest.

Batch sizes should be determined experimentally based on your document size. A good heuristic is to aim for a batch size that takes approximately 1 second to complete. If requests are slower than one second, reduce the batch size. If requests are consistently faster than one-second, add more documents to the batch size. Repeat until bulk updates average about 1 second.

‍

Improving Throughput and Performance

A shard is the basic unit of work in Elasticsearch. If you haven’t read the Shards section in Elasticsearch Core Concepts, that would be a good place to start. Bonsai meters on the total number of shards in a cluster. That means both primary and replica shards count towards the limit.

The relationship between shard scheme and your cluster’s usage can sometimes not be readily apparent. For example, if you have an index with a 3x2 sharding scheme (3 primaries, 2 replicas), that’s not 5 or 6 shards, it’s 9. If this is confusing, read our Shard Primer documentation for some nice illustrations.

Managing shards is a basic skill for any Elasticsearch user. Shards carry system overhead and potentially stale data. Keeping your cluster clean by pruning old shards can both improve performance and reduce your server costs.

In this guide we cover a few different ways of reducing shard usage:

What is a Shard Overage?
Delete Unneeded Indices
Use a Different Sharding Scheme
Reduce replication
Data Collocation
Upgrade the Subscription

This guide will make frequent references to the Elasticsearch API using the command line tool <span class="inline-code"><pre><code>curl</code></pre></span>. Interacting with your own cluster can also be done via <span class="inline-code"><pre><code>curl</code></pre></span>, or via a web browser. You can also use the interactive Bonsai Console in your cluster dashboard.

What is a Shard Overage?

A shard overage occurs when your cluster has more shards than the subscription allows. In practice, this usually means one of three things:

Extraneous indices.
Sharding scheme is not optimal.
Replication too high

It is also possible that the cluster and its data are already configured according to best practices. In that case, you may need to get creative with aliases and data collocation in order to remain on your current subscription.

Delete Unneeded Indices

There are some cases where one or more indices are created on a cluster for testing purposes, and are not actually being used for anything. These will count towards the shard limits; if you’re getting overage notifications, then you should delete these indices.

There are also some clients that will use aliases to roll out changes to mappings. This is a really nice feature that allows for zero-downtime reindexing and immediate rollbacks if there’s a problem, but it can also result in lots of stale indices and data accumulating in the cluster.

To determine if you have extraneous indices, use the <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> endpoint to get a list of all the indices you have in your cluster:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"><script> console.log('hello'); </script>GET /_cat/indices
green open test1 1 1 0 0 318b 159b
green open test2 1 1 0 0 318b 159b
green open test3 1 1 0 0 318b 159b
green open test4 1 1 0 0 318b 159b
green open test5 1 1 0 0 318b 159b
green open test6 1 1 0 0 318b 159b</code></pre>
</div>

If you see indices that you don’t need, you can simply delete them:

Use a Different Sharding Scheme

It’s possible that for some reason one or more indices were created with far more shards than necessary. For example, a check of <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> shows something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /_cat/indices
green open test1 5 2 0 0 318b 159b
green open test2 5 2 0 0 318b 159b</code></pre>
</div>

That 5x2 shard scheme results in 15 shards per index, or 30 total for this cluster. This is probably really overprovisioned for these indices. Choosing a more conservative shard scheme of 1x1 would reduce this cluster’s usage from 30 shards down to 4.

Unfortunately, the number of primary shards can not be changed once an index has been created. To fix this, you will need to manually create a new index with the desired shard scheme and reindex the data. If you have not read The Ideal Elasticsearch Index, it has some really nice information on capacity planning and sizing. Check out the sections on Intelligent Sharding and Benchmarking for some tips on what scheme would make more sense for your particular use case.

Reduce replication

Multitenant subscription plans (Hobby and Standard tiers) are required to have at least 1 replica shard in use per index. Indices on these plans cannot have a replica count of 0.

For most use-cases, a single replica is perfectly sufficient for redundancy and load capacity. If any of your indices have been created with more than one replica, you can reduce it to free up shards. An index with more than one replica might look like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"><script> console.log('hello'); </script>GET /_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open test1 5 2 0 0 318b 159b
green open test2 5 2 0 0 318b 159b</code></pre>
</div>

Note that the <span class="inline-code"><pre><code>rep</code></pre></span> column has a 2? That means there are actually 3 copies of the data: one primary shard, and its two replicas. Replicas are a multiplier against primary shards, so if an index has a 5×2 configuration (5 primary shards with 2 replicas), reducing replication to 1 will free up five shards, not just one. See the Shard Primer for more details.

Fortunately, reducing the replica count for the index is a small JSON body to the <span class="inline-code"><pre><code>_settings</code></pre></span> endpoint:

That simple request shaved 10 shards off of this cluster’s usage.

Replication == Availability and Redundancy

It might seem like a good money-saving idea to simply set all replicas to 0 so as to fit as many indices into your cluster as possible. However, this is not advisable. This means that your primary data has no live backup. If a node in your cluster goes offline, data loss is basically guaranteed.

Data can be restored from a snapshot, but this is messy and not a great failover plan. The snapshot could be an hour or more old, and any updates to your data since then either need to be reindexed or are lost for good. Additionally, the outage will last much longer.

Having replication of at least 1 mitigates against all these problems.

Data Collocation

Another solution for reducing shard usage involves using aliases and custom routing rules to collocate different data models onto the same group of shards.

What is data collocation? Many Elasticsearch clients use an index per model paradigm as the default for data organization. This is analogous to, say, a Postgres database with a table for each type of data being indexed.

Sharding this way makes sense most of the time, but in some rare cases users may benefit from putting all of the data into a single namespace. In the Postgres analogy, this would be like putting all of the data into a single table instead of a table for each model. An attribute (i.e., table column) is then used to filter searches by class.

For example, you might have a cluster that has three indices: <span class="inline-code"><pre><code>videos</code></pre></span>, <span class="inline-code"><pre><code>images</code></pre></span> and <span class="inline-code"><pre><code>notes</code></pre></span>. If each of these has a conservative 1x1 sharding scheme, it would require 6 shards. But this data could potentially be compacted down into a single index, <span class="inline-code"><pre><code>production</code></pre></span>, where the mapping has a <span class="inline-code"><pre><code>type</code></pre></span> field of some kind to indicate whether the document belongs to the <span class="inline-code"><pre><code>video</code></pre></span>, <span class="inline-code"><pre><code>image</code></pre></span> or <span class="inline-code"><pre><code>note</code></pre></span> class.

The latter configuration with the same 1x1 scheme would only require two shards (one primary, one replica) instead of six:

There are several major downsides to this approach. One is that field name collisions become an issue. For example, if there is a field called <span class="inline-code"><pre><code>published</code></pre></span> in two models, but one is defined as a boolean and the other is a datetime, it will create a conflict in the mapping. One will need to be renamed.

Another downside is that it is a pretty large refactor for most users, and may be more trouble than simply upgrading the plan. Overriding the default behavior in the application’s Elasticsearch client may require forking the code and relying on other hacks/workarounds.

There are others. Data collocation is mentioned here as a possibility, and one that only works for certain users. It is by no means a recommendation.

Upgrade the Subscription

If you find that you’re unable to reduce shards through the options discussed here, then you will need to upgrade to the next plan.

‍

Reducing Shard Usage

Throughout the Bonsai documentation, we use a standard terminology for discussing Elasticsearch-related subjects. This terminology is described in the Elasticsearch documentation, but we’ll highlight and discuss some very common terms here.

Node

A node is simply a running instance of Elasticsearch. While it’s possible to run several instances of Elasticsearch on the same hardware, it really is a best practice to limit a server to a single running instance of Elasticsearch. On Bonsai, a node is technically a virtual server running in the cloud. Bonsai follows the best practice of one Elasticsearch instance per server.

Cluster

A cluster is a collection of nodes running Elasticsearch. From the Elasticsearch documentation: “A cluster consists of one or more nodes which share the same cluster name. Each cluster has a single master node which is chosen automatically by the cluster and which can be replaced if the current master node fails.”

Index

In Elasticsearch parlance, the word “index” can either be used as a verb or a noun. This can sometimes be confusing for users new to Elasticsearch, and especially for users for whom English is not their first language. The intended meaning is usually understood through syntax and context clues.

Index (noun)

From the Elasticsearch documentation: “An index is like a table in a relational database… [It] is a logical namespace which maps to one or more primary shards and can have zero or more replica shards.”

To add to this, an index is sort of an abstraction because the shards (discussed in the section on Shards) are the “real” search engines. Queries to an index’s contents are routed to its shards, each of which is actually a Lucene instance. Because this happens in the background, it can appear that the index is performing the actual work, and the shards are more akin to a Heroku dyno or perhaps are simply there to add compute resources.

“How Many Indices Can I Put on a Shard?”

Because an Elasticsearch index is both abstract and opaque with regards to data storage and retrieval, sometimes new users miss the connection between indices and shards. A common support question that arises is “how do I put more indices on my shard?” The answer is that the index is composed of shards, not vice-versa.

In other words, an index can have many shards, but each shard can only belong to one index. This relationship precludes collocating multiple indices on a single shard.

Index (verb)

The process of populating an Elasticsearch index (noun) with data. It is most commonly used as a transitive verb with the data as the direct object, rather than the index (noun) being populated. In other words, the process is performed on the data, so that you would say: “I need to index my data,” and not “I need to index my index.”

Indexing is a critical step in running a search engine. Without indexing your content, you will not be able to query it using Elasticsearch, or take advantage of any of the powerful search features Elasticsearch offers.

Elasticsearch does not have any mechanism for extracting the contents of your website or database on its own. Indexing is something that must be managed by the application and/or the Elasticsearch client (defined below). Refer to the documentation for your platform as necessary for details.

Elasticsearch is not a Crawler

Elasticsearch is a search engine, not a content crawler. If your use case involves scanning domains or websites, you’ll need to use a crawler like Apache Nutch. In this case, the crawler would be responsible for scraping content and indexing it into your cluster.

Shards

From the Elasticsearch documentation: “Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard.”

To add to this, each shard is technically a standalone search engine. There are two types of shards: primary and replica. We have some documentation on what shards are and where they come from in What are shards and replicas?, but we’ll also elaborate on some of the differences here.

Primary Shards

According to the Elasticsearch documentation: “Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard.”

Another way to think about primary shards is “the number of ways your data is split up.” One reason to have an index composed of X primary shards is that each shard will contain 1/X of your data. Queries can then be performed in parallel. This is useful and can definitely improve performance if you have millions of documents.

So Why Not a Kagillion Shards?

Elasticsearch has an entire article dedicated to the tradeoffs and limitations of a large number of primary shards. The really short version is that primary shards offer diminishing returns while the overhead increases sharply.

The takeaway is that you never want to use a large number of primary shards without some serious capacity planning and testing. If you are wondering how many primary shards you’ll need, you can check out The Ideal Elasticsearch Index (specifically the benchmarking section), or simply shoot us an email.

Primary Shards Can Not Be Added/Removed Later

From one of our more popular blog entries: “Elasticsearch uses a naive hashing algorithm to route documents to a given primary shard. This design choice allows documents to be randomly distributed in a reproducible way. This avoids “hot spots” that affect performance and overallocation. However, it has one major downside, which is that the number of primary shards can not be changed after an index has been created. Replicas can be added and removed at will, but the number of primary shards is basically written in stone.

Replica Shards

The Elasticsearch definition for replica shards sums it up nicely:

A replica is a copy of the primary shard, and has two purposes:

Increase failover: a replica shard can be promoted to a primary shard if the primary fails
Increase performance: get and search requests can be handled by primary or replica shards. By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard will never be started on the same node as its primary shard.

Another way to think about replica shards is “the number of redundant copies of your data.” If your index has 1 primary shard and 2 replica shards, then you can think of the cluster as having 3 total copies of the data. If the primary shard is lost – for example, if the server running it dies, or there is a network partition – then the cluster can recover automatically by using one of the replicas.

Replication is a Best Practice

If you’re running a production application, all of your indices should have a replication factor of at least 1. Otherwise, you’re exposed to data loss if anything unexpected happens.

Replicas are Easy

In contrast to primary shards, which can not be added/removed after the index is created, replicas can be added and removed at any time.

Document(s)

From the Elasticsearch documentation: “A document is a JSON document which is stored in Elasticsearch. It is like a row in a table in a relational database.”

As referenced, an analogue for an Elasticsearch document would be a database record. It is a collection of related fields and values. For example, it might look something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
"title" : "Hello, World!",
"author" : "John Doe",
"description" : "This is a JSON document!"
}</code></pre>
</div>

You may be thinking to yourself: “My data is sitting in Postgres/MySQL/whatever, which is most certainly not in a JSON format! How do I turn the contents of a DB table into something Elasticsearch can read?”

Normally this work is handled automatically by a client (see below). Most users will never need to worry about translating the database contents into the Elasticsearch documents, as it will be handled automatically and “behind the scenes,” so to speak.

What about Word/Excel/PDF…?

Sometimes new users are confused about the term “document,” because their mental model (and possibly even the data they want to index) involves file formats like Word, PDF, Excel, RTF, PPT, and others. In Elasticsearch terminology, these formats are sometimes called “rich text documents,” and are completely different from Elasticsearch documents.

One reason this can be confusing is that Elasticsearch can index and search rich text documents. There are some plugins – the mapper-attachment and ingest-attachment (both supported on Bonsai) – which use the Apache Tika toolkit for extracting the contents of rich text documents and pushing it into Elasticsearch.

That said, if the data you’re trying to index resides in a rich text format, you can still index it into Elasticsearch. But when reading the documentation, keep in mind that “document” should refer to the JSON representation of the content, and some variant of “rich text” will refer to the file you’re trying to index.

Elasticsearch Client

A client is software that sits between your application and Elasticsearch cluster. It’s used to facilitate communication between your application and Elasticsearch. At a minimum, it will take data from the application, translate it into something Elasticsearch can understand, and then push that data into the cluster.

Most clients will also handle a lot of other Elasticsearch-related tasks, such as:

Automatically creating indices with the correct mappings/settings
Handling search queries and responses
Setting up and managing aliases

It is unlikely that you will need to create your own client. Elasticsearch maintains a list of language-specific clients that are well-documented and in wide use. Clients also exist for popular frameworks and content management systems, like:

Ruby on Rails (elasticsearch-rails and searchkick)
Django
Drupal
Wordpress
… and many more.

In short, there is probably already an open sourced, well-documented client available.

‍

Elasticsearch Core Concepts

For people new to Elasticsearch, shards can be a bit of a mystery. Why is it possible to add or remove replica shards on demand, but not primary shards? What’s the difference? How many indices can fit on a shard?

This article explains what shards are, how they work, and how they can best be used.

Where do shards come from?

First, a little bit of background: Elasticsearch is built on top of Lucene, which is a data storage and retrieval engine. What are called “shards” in Elasticsearch parlance are technically Lucene instances. Elasticsearch manages these different instances, spreading data across different instances, and automatically balancing those instances across different nodes.

Whenever an Elasticsearch index is created, that index will be composed of one or more shards. This means that an Elasticsearch index can spread data across several Lucene instances. This architecture is useful for both redundancy and parallelization purposes.

Shards play one of two roles: primary or replica. Primary shards are a logical partitioning of the data in the index, and are fixed at the time that the index is created. Primary shards are useful for parallelization; when a large amount of data is split across several primary shards, a node can run a query on several Lucene instances in parallel, reducing the overall time of the job.

Primary Shards Can Not Be Added/Removed Later

In contrast, replica shards are simply extra copies of the data. They are useful for redundancy or to handle extra search traffic, and can be added and removed on demand.

You can specify how many primary shards and replicas are used when creating a new index.

Replicas and High-Availability

Replica shards are not only capable of serving search traffic, but they also provide a level of protection against data loss. If a node hosting a primary shard is taken offline for some reason, Elasticsearch will promote its replica to a primary role, if a replica exists. However, if the index’s replication is set to 0, then it is not in a High-Availability configuration. In the event of a data loss incident, the data will simply be lost.

Replicas are a multiplier on the primary shards, and the total is calculated as primary * (1+replicas). In other words, if you create an index with 3 primary shards and 2 replicas, you will have 9 total shards, not 5 or 6.

Measuring your cluster’s index and shard usage

Elasticsearch offers some API endpoints to explore the state of your indices and shards. The <span class="inline-code"><pre><code>_cat</code></pre></span> APIs are helpful for human interaction. You can view your index states by visiting <span class="inline-code"><pre><code>/_cat/indices</code></pre></span>, which will show index names, primary shards and replicas. You can also inspect individual shard states and statistics by visiting <span class="inline-code"><pre><code>/_cat/shards</code></pre></span>. See example output below:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ curl -s https://user:[email protected]/_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open images 1 0 0 0 130b 130b
green open videos 1 0 0 0 130b 130b
green open notes 1 0 0 0 130b 130b

$ curl -s https://user:[email protected]/_cat/shards?v
index shard pri rep state docs store ip node
images 0 p STARTED 0 130b XXX.XXX.XXX.XXX Sugar Man
notes 0 p STARTED 0 130b XXX.XXX.XXX.XXX Sugar Man
videos 0 p STARTED 0 130b XXX.XXX.XXX.XXX Sugar Man</code></pre>
</div>

We’ve also made this easy by creating a live interactive console for you. Just visit your cluster’s dashboard console, chose <span class="inline-code"><pre><code>GET</code></pre></span> from the dropdown, and run <span class="inline-code"><pre><code>/_cat/indices?v</code></pre></span> or <span class="inline-code"><pre><code>/_cat/shards?v</code></pre></span>.

Frequently Asked Questions About Shards

Q. How Many Indices Can I Put On A Shard?

Short Answer: An index can have many shards, but any given shard can only belong to one index.

Long Answer: It is possible to “fake” multiple indices per shard by using “type” fields and index aliases. There is a section of the Elasticsearch Guide dedicated to this approach. Essentially, if you have several data models, you could put two or more into a single index instead of separate indices, and use aliases to perform automated filtering of queries.

The downside to this approach is that it requires a fair amount of work. Most frameworks and content management systems with integrated Elasticsearch clients will not take this approach by default. Developers will need to create and manage the index settings manually instead of relying on automated tools.

Q. How Many Shards Do I Need?

Answer: It depends. Because replicas can be added/removed at will, the real question is how many primary shards are needed. Increasing the number of primary shards for an index is one way to improve performance because it allows the query to be processed in parallel. But there are a lot of reasons to not have a kajillion primary shards.

Generally, we recommend that if you don’t expect data to grow significantly, then:

One primary shard is fine if you have less than 100K documents
One primary shard per node is good if you have over 100K documents
One primary shard per CPU core is good if you have at least a couple million documents

If you anticipate lots of growth – orders of magnitude within a short time – then the problem is a little more complicated. Take a read through the Ideal Elasticsearch Index post, especially the benchmarking section. Or shoot an email to [email protected].

Q. How Do I Reduce The Number of Shards I’m Using?

Answer: There is an article dealing with this very subject: Reducing Shard Usage.

‍

Shard Primer

Bonsai makes upgrading major versions of Elasticsearch as painless as possible. There is no need to manage the operational details of deploying software upgrades to nodes or spinning up new servers. With Bonsai, version upgrades are instant and can be performed with zero downtime.

In this document, we’ll cover some general best practices and offer some Bonsai-specific guidelines for migrating your app to a new version of Elasticsearch.

Protip: Use one Elasticsearch cluster per environment

This means having one cluster for production, and another for staging, another for development, and so on. Some users like to put staging indices alongside production indices on the same cluster in order to ensure identical behaviors between staging and production applications. First, this is a terrible idea in general; you should never run staging/dev applications on the same resources as production. This is a recipe for disaster.

Second, separating out your environments allows for upgrading them one at a time. While this may sound tedious, it is the most prudent approach and allows you to discover potential problems before they impact production.

Step 1: Read the Release Notes Carefully

Make sure to perform your due diligence by reading the release notes and breaking changes that accompany the version you’re targeting for the upgrade.

Another thing to investigate is whether your application’s Elasticsearch client supports the upgrade candidate. There have been cases where a popular client or framework was several support versions behind the official Elasticsearch release. Some of these have resulted in hours of down time for users who upgraded the production Elasticsearch cluster beyond the version supported by their Elasticsearch client.

This is one of many reasons we recommend upgrading non-production environments first.

Step 2: Validate In Development

Upgrading across major versions sometimes comes with breaking changes, new dependencies, and tweaks in behavior. It is important to validate that the upgrade is safe before pushing it out to production. We advise starting by upgrading the least critical environment first. A variation of the blue-green deployment strategy is useful here.

The process looks like this:

1. The application communicates with the old Elasticsearch cluster via a supported Elasticsearch client. Upgrading the cluster to a newer version of Elasticsearch will also generally require upgrading the client:

2. A new cluster running a later version of Elasticsearch is provisioned for the application. This cluster is not initially connected to the application via the client, because the client needs to be upgraded too:

3. The Elasticsearch client is upgraded, and the new version is updated to point to the new cluster. The application then performs a reindex, which populates the new cluster with data.

4. If the reindex worked as expected, and the application is behaving normally, then the old cluster can be destroyed. If there are any issues, then the client can be downgraded and configured to point back to the old cluster.

Ensure it works as expected, and make any changes as needed. Deploy those changes to the next least critical environment and then upgrade the cluster for that environment. Continue in this fashion until reaching the production environment. By that point, you should be fairly confident that the application and search will work as expected.

Make sure to validate that searches will work as expected in terms of relevancy. Also make sure to test full deletion and reindexing in the least critical environments before upgrading production. Reindexing is something you should be familiar with anyway as a part of normal usage of Elasticsearch, such as changing analyzer settings, backfilling a new field, or - in this case - upgrading to a new major version.

Step 3: Upgrade the Production Cluster

Once you are satisfied that the candidate version will work in production as expected, the final step is to take it live. This last step is usually complicated by the constraint that search must not go down at all, and data loss is unacceptable. Because of this constraint, planning and possibly additional infrastructure (like message queues), are required to ensure a zero-downtime switch and a fallback path in case something breaks.

Of course, if you're fortunate enough to have a use case where the production app can be put into maintenance mode while the new cluster is repopulated, then you can simply use the same process as outlined in the previous step.

For everyone else, the basic process is to have the old application and cluster serve traffic while the new application is populating the new cluster from the source database. Once the new cluster is populated with the same data as the old cluster, the new application is promoted to a production role and begins serving traffic. This strategy allows developers to quickly roll back to a known working state in the event that there is a serious issue with the new system.

The exact steps will vary considerably by application and use case. A typical strategy for this is outlined below:

3. A fork is made of the production application. This fork will eventually be promoted to the production role. The fork is identical to the production application, except for the Elasticsearch client and any associated dependencies, which are updated to match the new version of Elasticsearch.

The fork is configured to read from the production DB and the client is configured to point to the new cluster:

4. The new cluster is indexed from the production DB. This allows the new cluster to "catch up" with the old cluster. If the production application handles a high volume of updates, it may be necessary to push updates into a queue of some kind instead of saving to the DB.

At the end of this process, the old and new clusters will have the same data:

5. The forked application is promoted to production. If updates were queued, those changes should be flushed to each cluster. Each cluster now has the same data, and the new Elasticsearch cluster is handling searches. No data was lost in the transition:

6. The old production application is still connected to the old cluster, but is not handling traffic. It exists only as an emergency fallback option. If the new production application and Elasticsearch version introduce a regression, the old production app can be promoted back to the production role.

In that event, updates would again need to be queued, and the old production app would index any updates made since the initial cutover. When that is done, the queue is again flushed, ensuring that no data has been lost:

7. If everything worked as expected, and a rollback is not needed, the old cluster and the old production application can be destroyed. The application is now running the latest version of Elasticsearch without data loss or downtime.

You will need to adapt this to your specific use case when planning out your blue-green strategy.

Ask Support

Upgrading major versions of Elasticsearch while running a production application can be tricky. If you’re unsure of what to do, are concerned about an edge case or special circumstance, or simply want to sanity check a plan, please do not hesitate to reach out to [email protected]. We’re here to help ensure the smoothest upgrade possible.

‍

Upgrading Major Versions

Capacity planning is the process of estimating the resources you’ll need over short and medium term timeframes. The result is used to size a cluster and avoid the pitfalls of inadequate resources (which cause performance, stability and reliability problems), and overprovisioning, which is a waste of money. The goal is to have just enough overhead to support the cluster’s growth at least 3-6 months into the future.

This document discusses how to estimate the resources you will need to get up and running on Elasticsearch. The process is roughly as follows:

Estimating Shard Requirements
Estimating Disk Requirements
Planning for Memory
Planning for Traffic

We have included a section with some sample calculations to help tie all of the information together.

Estimating Shard Requirements

Determining the number of shards that your cluster will need is one of the first steps towards estimating how many resources you will need. Every index in Elasticsearch is comprised of one or more shards. There are two types of shards: primary and replica, and how many total shards you need will depend not only on how many indices you have, but how those indices are sharded.

If you’re unfamiliar with the concept of shards, then you should read the brief, illustrated Shard Primer section first.

Why Is This Important?

Each shard in Elasticsearch is a Lucene instance, which often creates a large number of file descriptors. The number of open file descriptors on a node grows exponentially with shard counts. Left unchecked, this can lead to a number of serious problems.

Beware of High Shard Counts

There are a fixed number of file descriptors that can be opened per process; essentially, a large number of shards can crash Elasticsearch if this limit is reached, leading to data loss and service interruptions. This limit can be increased on a server, but there are implications for memory usage. There are a few strategies for managing these types of clusters, but that discussion is out of scope for preliminary capacity planning.

Some clusters can be designed to specifically accommodate a large number of shards, but that’s something of a specialized case. If you have an application that generates lots of shards, you will want your nodes to have plenty of buffer cache. For most users, simply being conscientious of how and when shards are created will suffice.

Full Text Search

The main question for this part of the planning process is how you plan to organize data. Commonly, users are indexing a database to get access to Elasticsearch’s full text search capabilities. They may have a Postgres or MySQL database with products, user records, blog posts, etc, and this data is meant to be indexed by Elasticsearch. In this use case, there is a fixed number of indices and the cluster will look something like this:

GET /_cat/indices
green open products 1 1 0 0 100b 100b
green open users 1 2 0 0 100b 100b
green open posts 3 2 0 0 100b 100b

In this case, the total number of shards is calculated by adding up all the shards used for all indices. The number of shards an index uses is the number of primary shards, p, times one the number of replica shards, r. Expressed mathematically, the total number of shards for an index is calculated as:

shards = p * (1 + r)

In the sample cluster above, the products index will need 1x(1+1)=2 shards, the users index will require 1x(1+2)=3 shards, and the posts index will require 3x(1+2)=9 shards. The total shards in the cluster is then 2+3+9=14 shards. If you need help deciding how many shards you will need for an index, check out the blog article The Ideal Elasticsearch Index, specifically the section called “Benchmarking is key” for some general guidelines.

Time Series Applications

Another common use case is to index time-series data. An example might be log entries of some kind. The application will create an index per hour/day/week/etc to hold these records. These use cases require a steadily-growing number of indices, and might look something like this:

GET /_cat/indices
green open events-20180101 1 1 0 0 100b 100b
green open events-20180102 1 1 0 0 100b 100b
green open events-20180103 1 1 0 0 100b 100b

With this use case, each index will likely have the same number of shards, but new indices will be added at regular intervals. With most time series applications, data is not stored indefinitely. A retention policy is used to remove data after a fixed period of time.

In this case, the number of shards needed is equal to the number of shards per index times the number of indices per time unit, times the retention policy. In other words, if an application is creating 1 index, with 1 primary and 1 replica shard, per day, and has a 30 day retention policy, then the cluster would need to support 60 shards:

1x(1+1) shards/index * 1 index/day * 30 days = 60 shards

Estimating Disk Requirements

The next characteristic to estimate is disk space. This is the amount of space on disk that your cluster will need to hold the data. If you’re a customer of Bonsai, this is the only type of disk usage you will be concerned with. If you’re doing capacity planning for another host (or self-hosting), you’ll want to take into account the fact that the operating system will need additional disk space for software, logs, configuration files, etc. You’ll need to factor in a much larger margin of safety if planning to run your own nodes.

Allocation

Benchmarking for Baselines

The best way to establish a baseline estimate for the amount of disk space you’ll need is to perform some benchmarking. This does not need to be a complicated process. It simply involves indexing some sample data into an Elasticsearch cluster. This can even be done locally. The idea is to collect some information on:

How many indices are needed?
How many shards are needed?
What is the average document size per index?

Database Size is a Bad Heuristic

Sometimes users will estimate their disk needs by looking at the size of a database (Postgres, MySQL, etc). This is not an accurate way to estimate Elasticsearch’s data footprint. ACID-compliant data stores are meant to be durable, and come with far more overhead than data in Lucene (what Elasticsearch is based on). There is a ton of overhead that will never make it into your Elasticsearch cluster. A 5GB Postgres database may only require a few MB in Elasticsearch. Benchmarking some sample data is a far greater tool for estimation.

Attachments Are Also a Bad Heuristic

Some applications are indexing rich text files, like DOC/DOCX, PPT, and PDFs. Users may look at the average file size (maybe a few MB) and multiply this by the total number of files to be indexed to estimate disk needs. This is also not accurate. Rich text files are packed with metadata, images, formatting rules and so on, bits that will never be indexed into Elasticsearch. A 10MB PDF may only take up a few KB of space once indexed. Again, benchmarking a random sample of your data will be far more accurate in estimating total disk needs.

Suppose you have a development instance of the application running locally, a local instance of Elasticsearch, and a local copy of your production data (or a reasonable corpus of test data). After indexing 10% of the production data into Elasticsearch, a call to /_cat/indices shows the following:

curl localhost:9200/_cat/indices
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open users-production 1 1 500 0 2.4mb 1.2mb
green open posts-production 1 1 1500 0 62.4mb 31.2mb
green open notes-production 1 1 300 0 11.6mb 5.8mb

In this example, there are 3 indices. Each index has one primary shard and one replica shard, for a total of 2 shards per index.

We can also see that users-production has 1.2MB of primary data occupied by 500 documents. This means one of these documents is 2.4KB on average. Similarly, posts-production documents average 20.8KB and notes-production documents average 19.3KB.

We can also estimate the disk footprint for each index populated with 100% of its data. users-production will require ~12MB, posts-production will require around 312MB and notes-production will require ~58MB. Thus, the baseline estimate is ~382MB for 100% of the production data.

The last piece of information to determine is the impact of replica shards. A replica shard is a copy of the primary data, hosted on another node to ensure high availabilty. The total footprint of the cluster data is equal to the primary data footprint times (1 + number_of_replicas).

So if you have a replication factor of 1, as in the example above, the baseline disk footprint would be 382MB x (1 + 1) = 764MB. If you wanted an additional replica, to keep a copy of the primary data on all 3 nodes, the footprint requirement would be 382MB x (1 + 2) = 1.1GB. (Note: if this is confusing, check out the Shard Primer page).

Last, it is a good idea to add a margin of safety to these estimates to account for larger documents, tweaks to mappings, and to provide some “cushion” for the operating system. Roughly 20% is a good amount; in the example above, this would give a baseline estimate of about 920MB disk space.

Medium-term Projections

The next step is to determine how quickly you’re adding data to each index. If your database creates timestamps when records are created, this information can be used to estimate the monthly growth in the number of records in each table. Suppose this analysis was performed on the sample application, and the following monthly growth rates were found:

users-production: +5%/mo
posts-production: +12%/mo
notes-production: +7%/mo

In a 6 month period, we would expect users-production to grow ~34% from its baseline, posts-production to grow ~97% from its baseline, and notes-production to grow ~50% from its baseline. Based on this, we can guess that in 6 months, the data will look like this:

GET /_cat/indices
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open users-production 1 1 6700 0 32.0mb 16.0mb
green open posts-production 1 1 29607 0 1.23gb 615mb
green open notes-production 1 1 4502 0 174mb 86.9mb

Based on this, the cluster should need at least 1.44GB. Add the 20% margin of safety for an estimate of ~1.75GB.

These calculations for the sample cluster show that we should plan on having at least 1.75GB of disk space available just for the cluster data. This amount will suffice for the initial indexing of the data, and should comfortably support the cluster as it grows over the next 6 months. At the end of that interval, resource usage can be re-evalutated, and resources added (or removed) if necessary.

Time Series Data

Some use cases involve time-series data, in which new indices are created on a regular basis. For example, log data may be partitioned into daily or hourly indices. In this case, the process of estimating disk needs is roughly the same, but instead of looking at document sizes, it’s better to look at the average index footprint.

Consider this sample cluster:

GET /_cat/indices
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open logs_20180101 1 1 27954 0 194mb 97.0mb
green open logs_20180102 1 1 29607 0 207mb 103mb
green open logs_20180103 1 1 28764 0 201mb 100.7mb

One could estimate that the average daily index requires 200MB of disk space. In six months, that would lead to around 36.7GB disk usage. With the margin of safety, a cluster with 45GB of disk allocated to the cluster is needed.

There are two caveats to add: first, time series data usually does not have a six month retention policy. A more accurate estimate would be to multiply the average daily index size by the number of days in the retention policy. If this application had a 30 day retention policy, the disk need would be closer to 7.2GB.

The second caveat is too many shards can be a problem (see Estimating Shard Usage for some discussion of why). Creating two shards every day for 6 months would lead to around 365 shards in the cluster, each with a lot of overhead in terms of open file descriptors. This could lead to crashes, data loss and serious service interruptions if the OS limits are too low, and memory problems if those limits are too high.

In any case, if the retention policy creates a demand for large numbers of open shards, the cluster needs to be sized not just to support the data, but the file descriptors as well. On Bonsai, this is not something users need to worry about, as these details are handled for you automatically.

Planning for Memory

Memory is an important component of a high-performing cluster. Efficient use of this resource helps to reduce the CPU cycles needed for a given search in several ways. First, matches that have been computed for a query can be cached in memory so that subsequent queries do not need to be computed again. And servers that have been sized with enough RAM can avoid the CPU overhead of working in virtual and swap memory. Saving CPU cycles with memory optimizations reduces request latency and increases the throughput of a cluster.

However, memory is a complicated subject. Optimizing system memory, cache invalidation and garbage collection are frequent subjects of Ph.D. theses in computer science. Fortunately, Bonsai handles server sizing and memory management for you. Our plans are sized to accommodate a vast majority of Elasticsearch users.

“I want enough RAM to hold all of my data!”

This is a common request, and it makes sense in principal. RAM and disk (usually SSD on modern servers) are both physical storage media, but RAM is several orders of magnitude faster at reading and writing than high-end SSD units. If all of the Elasticsearch data could fit into RAM, then we would expect an order of magnitude improvement in latency, right?

This tends to be reasonable for smaller clusters, but becomes less practical as a cluster scales. System RAM availability offers diminishing returns on performance improvements. Beyond a certain point, only a very specific set of use cases will benefit and the costs will necessarily be much higher.

Furthermore, Elasticsearch creates a significant number of in-memory data structures to improve search speeds, some of which can be fairly large (see the documentation on fielddata for an example). So if your plan is to base the memory size on disk footprint, you will need to not only need to measure that footprint, but also add enough for the OS, JVM, and in-memory data structures.

For all the breadth and depth of the subject, 95% of users can get away with using a simple heuristic: estimate needing 10-30% of the total data size for memory. 50% is enough for >99% of users. Note that because Bonsai manages the deployment and configuration of servers, this heuristic does not include memory available to the OS and JVM. Bonsai customers do not need to worry about these latter details.

So where does that heuristic break down? When do you really need to worry about memory footprint? If your application makes heavy use of any of the following, then memory usage will likely be a factor:

Highlighting
Aggregations (a.k.a faceting)
Dynamic scripting
Geospatial search
Deep pagination. This is when you have hundreds or thousands of pages of results; accessing results 9900 to 10000 requires substantially more overhead than results 1-100.
Sorting large numbers of documents. This is similar to, but distinct from, deep pagination. See Efficient sorting of geo distances in Elasticsearch for a real world example.
Frequent cache evictions (ex: caching timestamps by second or minute, instead of hourly or daily)

If your application is using one or more of these features, plan on needing more memory. If you would like to see the exact types of memory that Bonsai meters against, check out the Metering on Bonsai article.

Planning for Traffic

Capacity planning for traffic requirements can be tricky. Most applications do not have consistent traffic demands over the course of a day or week. Traffic patterns are often “spiky,” which complicates the estimation process. Generally, the greater the variance in throughput (as measured in requests per second, rps), the more capacity is needed to safely handle load.

Estimating Traffic Capacity

Users frequently base their estimate on some average number of requests: “Well, my application needs to serve 1M requests per day, which averages to 11-12 requests per second, so that’s what I’ll need.” This is a reasonable basis if your traffic is consistent (with a very low variance). But it is considerably inaccurate if your variance is more than ~10% of the average.

Consider the following simplified examples of weekly traffic patterns for two applications. The plots show the instantaneous throughput over the course of 7 days:

In each of these examples, the average throughput is the same, but the variance is markedly different. If they both plan on needing capacity for 5 requests per second, Application 1 will probably be fine because of Bonsai’s connection queueing, while Application 2 will be dramatically underprovisioned. Application 2 will be consistently demanding 1.5-2x more traffic than what it was designed to handle.

You’ll need to estimate your traffic based on the upper bounds of demand rather than the average. Some analysis will be necessary to determine the “spikyness” of your application’s search demands.

Traffic Volume and Request Latencies

There is a complex economic relationship between IO (as measured by CPU load, memory usage and network bandwidth) and maximum throughput. A given cluster of nodes has only a finite supply of resources to respond to the application’s demands for search. If requests come in faster than the nodes can respond to them, the requests can pile up and overwhelm the nodes. Everything slows down and eventually the nodes crash.

Simply: complex requests that perform a lot of calculations, require a lot of memory to complete, and consume a lot of bandwidth will lead to a much lower maximum throughput than simpler requests. If the nodes are not sized to handle peak loads, they can crash, restart and perform worse overall.

With multitenant class clusters, resources are shared among users and ensuring a fair allocation of resources is paramount. Bonsai addresses this complexity with the metric of concurrent connections. There is an entire section devoted to this in Metering on Bonsai. But essentially, all clusters have some allowance for the maximum number of simultaneous active connections.

Under this scheme, applications with low-IO requests can service a much higher throughput than applications with high-IO requests, thereby ensuring fair, stable performance for all users.

Estimating Your Concurrency Needs

A reasonable way to estimate your needs is using statistics gleaned from small scale benchmarking. If you are able to determine a p95 or p99 time for production requests during peak expected load, you can calculate the maximum throughput per connection.

For example, if your benchmarking shows that under load, 99% of all requests have a round-trip time of 50ms or less, then your application could reasonably service 20 requests per second, per connection. If you have also determined that you need to be able to service a peak of 120 rps, then you could estimate the number of concurrent connections needed by dividing: 120 rps / 20rps/connection = 6 connections.

In other words, a Bonsai plan with a search concurrency allowance of at least 6 will be enough to handle traffic at peak load. A few connections over this baseline should be able to account for random fluctuations and offer some headroom for growth.

Beware the Local Benchmarking

Users will occasionally set up a local instance of Elasticseach and perform a load test on their application to determine how many requests per second it can sustain. This is a problem because it neglects network effects.

Requests to a local Elasticsearch cluster do not have the overhead of negotiating an SSL connection over the Internet and transmitting payloads over this connection. These steps introduce a non-trivial latency to all requests, increasing round trip times and reducing throughput ceilings.

It’s better to set up a remote cluster, perform a small scale load test, and use the statistics to estimate the upper latency bounds at scale.

Another possibility with Bonsai is to select a plan with far more resources than needed, launch in production, measure performance, and then scale down as necessary. Billing is prorated, so the cost of this approach is sometimes justified by the time savings of not setting up performing and validating small-scale benchmarking.

Sample Calculations

Suppose the developer of online store decides to switch the application’s search function from ActiveRecord to Elasticsearch. She spends an afternoon collecting information:

She wants to index three tables: users, orders and products
There are 12,123 user records, which are growing by 500 a month
There are 8,040 order records, which are growing by 1,100 a month
There are 101,500 product records, which are growing by 2% month over month
According to the application logs, users are averaging 10K searches per day, with a peak of 20 rps

Estimating Shard Needs

She reads The Ideal Elasticsearch Index and decides that she will be fine with a 1x1 sharding scheme for the users and orders indices, but will want a 3x2 scheme for the products index, based on its size, growth, and importance to the application’s revenue.

This configuration means she will need 1x(1+1)=2 shards for the users index, 1x(1+1)=2 shards for the orders index, and 3x(1+2)=9 shards for the products index. This is a total of 13 shards, although she may eventually want to increase replication on the users and orders indices. She plans for 13-15 shards for her application.

Estimating Disk Needs

She sets up a local Elasticsearch cluster and indexes 5000 random records from each table. Her cluster looks like this:

GET /_cat/indices
green open users 1 1 5000 0 28m 14m
green open orders 1 1 5000 0 24m 12m
green open products 3 2 5000 0 540m 60m

Based on this, she determines:

The users index occupies 14MB / 5000 docs = 2.8KB per document.
The orders index occupies 12MB / 5000 docs = 2.4KB per document.
The products index occupies 60MB / 5000 docs = 12KB per document.

She uses this information to calculate a baseline:

She will need 2.8KB/doc x 12,123 docs x 2 copies = 68MB of disk for the users data
She will need 2.4KB/doc x 8,040 docs x 2 copies = 39MB of disk for the orders data
She will need 12KB/doc x 101,500 docs x 3 copies = 3654MB of disk for the products data
The total disk needed for the existing data is 68MB + 39MB + 3654MB = ~3.8GB.

She then uses the growth measurements to estimate how much space will be needed within 6 months:

The users index will have ~15,123 records. At 2.8KB/doc and a 1x1 shard scheme, this is 85MB.
The orders index will have ~14,640 records. At 2.4KB/doc and a 1x1 shard scheme, this is 70MB.
The products index will have ~114,300 records. At 12KB/doc and a 3x2 shard scheme, this is 4115MB.
The total disk needed in 6 months will be around 4.27GB.

Adding some overhead to account for unexpected changes in growth and mappings, she estimates that 5GB of disk should suffice for current needs and foreseeable future.

She also uses this to estimate her memory needs, and decides to estimate a memory footprint of up to 20% of the primary data, give or take. She estimates that 1.0GB should be sufficient for memory.

Estimating Traffic Needs

She knows from the application logs that her users hit the site with a peak of 20 requests per second. She creates a free Bonsai.io cluster, indexes some sample production data to it, and performs a small scale load test to determine what kinds of request latencies she can expect her application to experience while handling user searches with a cloud service.

She finds that 99.9% of all search traffic completes the round trip in less than 80ms. This gives her a conservative estimate of 12-13 requests per second per connection (1000ms per second / 80ms per request = 12.5 rps). With a search concurrency allowance of 2, she would be able to safely service around 25 connections, which is a little more than her current need for 20 rps.

Conclusion

Based on her tests and analysis, she decides that she will need a cluster with:

Capacity for at least 13-15 shards
A search concurrency of at least 2
1 GB allocated for memory
5 GB of disk to support the growth in data over the next 3-6 months

She goes to https://app.bonsai.io/pricing and finds a plan. She decides that at this stage, a multitenant class cluster offers the best deal, and finds that the $50/plan meets all of these criteria (and then some), so that’s what she picks.

‍

Capacity Planning

Bonsai has a variety of cost controls that allow us to provide a quality free tier of service. Among these is the periodic deletion of unused free clusters.

Free clusters that have not been used for several weeks are added to a schedule for deletion. Warnings are sent out to users several days in advance of deletion. If the cluster receives traffic before its scheduled termination date, then the cluster is automatically removed from the schedule.

If you are an infrequent user of your Bonsai cluster, but wish to avoid it from being terminated, simply upgrade to any paid plan.

‍

Periodic Cluster Cleanup

When you have successfully provisioned a Private Space-accessible cluster via Bonsai, you might be interested in looking at it via Kibana. However there is one problem: our Kibana server doesn't have network access to your private cluster. This is by design, as you don't want just anyone to be able to access your cluster from the public internet.

So what are your options?

We recommend deploying a free dyno in your Heroku Private Space that can then be configured to access your cluster. It can do this securely because the Kibana dyno will run in your Heroku Private Space.

Kibana-maker

Kibana maker is a simple script that we wrote that helps bootstrap a Heroku Application by configuring a small git repository and pushing that up to a heroku git repository.

You can pull down a copy of kibana-maker using the following command:

Steps

Identify the slug of your private space in the Heroku App
Identify the URL of your cluster in the Bonsai App
Note the version of Elasticsearch you are using
Using kibana-maker, run the following command: <span class="inline-code"><pre><code>cd /tmp && kibana-maker --name= --url= --space= --version=</code></pre></span>
You should now have a functioning Kibana instance running in your Heroku Private space

Kibana and Heroku Private Spaces

Heroku Private Spaces are network-isolated application containers available to Heroku Enterprise customers. Private Spaces allow organizations to host applications within a secure, HIPAA-compliant environment. They are ideal for apps that handle PII and other legally regulated types of data.

When third party add-ons are included in your build, additional steps need to be taken in order for your data to maximize the benefit of a private space. Most add-ons are operated outside of Heroku’s VPC, which means your Private Space application will be communicating across the public internet. For some use cases this is unacceptable, which is why Bonsai proudly supports joining our networks together allowing your traffic to travel on the private backbone of AWS. Joining these networks together securely requires some careful networking, called peering.

Fortunately, both Heroku and Bonsai run on AWS infrastructure, which offers a service called VPC Peering. VPC Peering is a network connection between two VPCs that allows appliances within each VPC to communicate as though they were in a single network.

Be aware of your security model

Bonsai clusters come in one of two architectures: multitenant or single tenant. Clusters in the single tenant architectures (Business and Enterprise tiers) are running on private, sandboxed nodes. Clusters in the multitenant architecture (the Standard tier) are in an environment resources are shared among multiple users. As a result, shared tier clusters are not available for VPC Peering.

This may or may not be acceptable for the data you plan to index. The rest of this guide assumes you are running on a single tenant architecture.

VPC Peering can be set up between a Heroku Private Space and a Bonsai cluster on an Enterprise plan. This configuration will ensure maximum isolation and protection of your data.

Common Issues

Running in a Private Space has some additional implications that may not be immediately obvious. The main points are:

The web Console will not be available, due to the fact that the Javascript is running outside of the private space. If you want to interface with your cluster directly, you'll need to connect to the VPC first.
The Kibana dashboard will also be unavailable because the proxy is outside of the private space. Kibana can be set up within a private space, but it requires additional client-side configuration. Contact [email protected] for more details.
Connecting to the cluster directly will need to happen within the VPC itself, so users with a Private Space-based Bonsai cluster will need to first access their VPC before calls to Elasticsearch will succeed. See Connecting to Bonsai for more information.

If you run into any issues with your Private Space-based Elasticsearch cluster, please reach out to us at [email protected].

Gather your Heroku Peering Network Settings

In your Heroku Private Space you’ll need to navigate to the Network tab, and make a note of some settings under the Peering sub-section of the page. We need the AWS Account ID, the AWS VPC ID, and the AWS VPC CIDR. We will use this data to initiate a peering connection with your Heroku Space.

Finding Your Private Space URL

Your Private Space URL will look something like:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://dashboard.heroku.com/teams//spaces//network</code></pre>
</div>

Enter your details into the Bonsai dashboard

Log into the Bonsai cluster dashboard by running <span class="inline-code"><pre><code>$ heroku addons:open bonsai -a</code></pre></span> and enter your details in the form provided:

Accept our Peering Request

Once Bonsai has the above data, we will initiate a peering request to your space which will show up in the Network tab and under the Peering subsection. It should look like this:

Network changes are not instantaneous

The lead time for this to show up ranges from 30 minutes to a few hours.

When you accept the invitation the UI should change to look like:

Once the request has been accepted, you will be able to use the cluster URL provided in the Bonsai dashboard.

Private Means Private!

The DNS entry for your cluster will be pointing to private internal IP addresses, which means you will not be able to access this cluster except from within the Heroku Space. Browsers and <span class="inline-code"><pre><code>curl</code></pre></span> commands will not work.

‍

Heroku Private Spaces and VPC Peering

The Snapshot and Restore feature is a really nice set of tools for backing up and restoring your data. Because Bonsai is a managed service that offers a multitenant architecture, there are some limits on how it can be used.

How Does Bonsai Manage Snapshots?

Bonsai takes regular, automatic backups of all paid clusters, and stores them in an encrypted S3 bucket in the same region as the cluster. These snapshots are taken at the start of every hour and are retained for two weeks.

HTTP 400: Cannot delete indices that are being snapshotted

In some rare cases, you may see an error like this when attempting to alter an index:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Cannot delete indices that are being snapshotted: [[my_index/dPzLciT9RlmS8OtGGm61IQ]]. Try again after snapshot finishes or cancel the currently running snapshot.
</code></pre>
</div>

This is happening because the action is taking place during the regular snapshot. Snapshots happen at the start of every hour (00:00, 01:00, 02:00, etc), and can take anywhere from a few seconds to a few minutes. If you attempt to delete your index during this time, the action will be blocked.

The solution is to wait a minute or two and try again.

In the unlikely event that your cluster suffers unrecoverable data loss (for example: a node hosting primary data is lost, and there’s no replica), then we will use the most recent successful snapshot to restore the data. Any data updates from the time of the snapshot to the time of its restoration will need to be reindexed.

Can I Take My Own Snapshots?

Not at this time. The technical explanation is that snapshot and restore operations can be extremely demanding on IO, and Elasticsearch will only allow one snapshot/restore operation to occur at a time, with subsequent calls sitting in a queue.

We’re Always Improving

We’re working on ways of making snapshot and restore features safer and easier to use on the Bonsai platform. If you have an idea or use case you’d like to see supported, shoot us an email and we’ll evaluate adding it to our development pipeline.

If you’ve read the article on Architecture Classes, specifically the section on Multi Tenant Class clusters, you’ll understand why this is problematic. Users with unrestrained access to that API could inadvertently take down a group of nodes with some ill-timed calls. Or, an impatient user wondering why his/her snapshot isn’t processing right away may repeat the call multiple times, populating the queue. It’s also plausible that a less experienced user may attempt taking a snapshot a minute.

Something to consider is the purpose of the snapshot. If your desire is simply to have an up to date backup, then Bonsai is already handling that with hourly snapshots. If the desire is to restore the snapshot locally, or to another cluster for testing/dev purposes, we may be able to accommodate through our support channels.

I Need More Frequent Snapshots / Longer Retention!

We can provide custom SLA’s – including more frequent snapshots and longer retention times for users on Enterprise plans. Send an email to support with your requirements and we’ll put together a quote.

I Have My Own S3 Bucket – Can You Just Use That?

The encrypted buckets we use are set up and managed automatically. It’s possible to have snapshots added to a different bucket, but only for Enterprise subscriptions. If you’re on an Enterprise tier cluster, please send us a request.

‍

Snapshots on Bonsai

Concurrent connections are effective number of connections a cluster can handle before it returns an HTTP 429 response. Bonsai meters on concurrency as one level of defense against Denial of Service (DoS) and noisy neighbor situations, which helps ensure users are not taking more than their fair share of resources.

This guide will cover concurrent connection limits and some strategies for avoiding them:

What are concurrent connection limits?
Upgrade to a plan with a higher concurrency allowance
Reduce the request overhead
Reduce the traffic rate

If reading over these suggestions doesn’t provide enough ideas for resolving the issue, you can always email our support team at [email protected] to discuss options.

What Are Concurrent Connection Limits?

Bonsai distinguishes between types of connections for concurrency metering purposes:

Searches. Search requests. The concurrency allowance for search traffic is generally much higher than for updates.
Updates. Adding, modifying or deleting a given document. Generally has the lowest concurrency allowance due to the high overhead of single document updates.This includes _bulk requests.

Bonsai also uses connection pools to account for short-lived bursts of traffic.

The effective number of connections the cluster can handle is the pool size, plus the number of "active" connections. Active connections are being served by the cluster and not sitting in the queue.

If all active connections are in use, subsequent connections are put into a FIFO queue and served when a connection becomes available. If all connections are in use and the request pool is exhausted, then Bonsai will return an immediate HTTP 429.

Concurrency is NOT Throughput!

Concurrency allowances are a limit on total active connections, not throughput (requests per second). A cluster with a search concurrency allowance of 10 connections can easily serve hundreds of requests per second. A typical p50 query time on Bonsai is ~10ms. At this latency, a cluster could serve a sustained 100 requests per second, per connection. That would give the hypothetical cluster a maximum throughput of around 1,000 requests per second before the queue would even be needed. In practice, throughput may be closer to 500 rps (to account for bursts, network effects and longer-running requests).

To put these numbers in perspective, a sustained rate of 500 rps is over 43M requests per day. StackExchange – the parent of sites like StackOverflow, ServerFault and SuperUser (and many more) is performing 34M daily Elasticsearch queries. By the time your application demands exceed StackExchange by 25%, you will likely already be on a single tenant configuration with no concurrency limits.

Queued Connections Have a 60s TTL

If the connection queue begins to fill up with connection requests, those requests will only be held for up to 60 seconds. After that time, Bonsai will return a HTTP 504 response. If you are seeing HTTP 429 and 504 errors in your logs, that is an indicator that your cluster has a high volume of long-running queries.

Upgrade

Concurrency allowances on Bonsai scale with the plan level. Often the fastest way to resolve concurrency issues is to simply upgrade the plan. Upgrades take effect instantly.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

Reduce Overhead

The more time a request takes to process, the longer the connection remains open and active. Too many of these requests in a given amount of time will exhaust the connection pool and request queue, resulting in HTTP 429 responses. Reducing the amount of time a request takes to process adds overhead for more traffic.

Some examples of requests which tend to have a lot of overhead and take longer to process:

Aggregations on large numbers of results
Geospatial sorting on large numbers of results
Highlighting
Custom scripting
Wildcard matching

If you have requests that perform a lot of processing, then finding ways to optimize them with filter caches and better scoping can improve throughput quite a bit.

Reduce Traffic Volume

Sometimes applications are written without considering scalability or impact on Elasticsearch. These often result in far more queries being made to the cluster than is really necessary. Some minor changes are all that is needed to reduce the volume and avoid HTTP 429 responses in a way that still makes the application usable.

A common example is autocomplete / typeahead scripts. Often the frontend is sending a query to the Elasticsearch cluster each time a user presses a key. The average computer user types around 3-4 characters per second, and (depending on the query and typist speed) a search launched by one keypress may not even be returned before the next keypress is made. This results in a piling up of requests. More users searching the app exacerbate the problem. Initiating a search every ~500 milliseconds instead of every keypress will be much more scalable without impacting the user experience.

Another example might be a site that boosts relevancy scores by page view. Whenever a page is requested, the application updates the corresponding document in Elasticsearch by incrementing a counter for the number of times the page has been viewed.

This strategy will boost a document’s position in the results in real time, but it also means the cluster is updated every time a user visits any page, potentially resulting in a high volume of expensive single document updates. It would be better to write the page counts to the local database and use a queue and worker system (like Resque) to push out updated page counts using the Bulk API every minute or so. This would be a much cheaper and more scalable approach, and would be just as effective.

‍

Reducing Concurrency

Bonsai meters on the total amount of disk space a cluster can consume. This is for capacity planning purposes, and to ensure multitenant customers have their fair share of resources. Bonsai calculates a cluster’s disk usage by looking at the total data store size in bytes. This information can be found in the Index Stats API.

Resolving disk overages can be resolved in a couple different ways that we will cover in this documentation:

Remove Stale Data / Indices
Purge Deleted Documents
Reindex with Smaller Mappings
Upgrade the Subscription

Remove Stale Data / Indices

There are some cases where one or more indices are created on a cluster for testing purposes, and are not actually being used for anything. These will count towards the data limits; if you’re getting overage notifications, then you should delete these indices.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /_cat/indices
green open prod20180101 1 1 1015123 0 32M 64M
green open prod20180201 1 1 1016456 0 35M 70M
green open prod20180301 1 1 1017123 0 39M 78M
green open prod20180401 1 1 1018456 0 45M 90M
green open prod20180501 1 1 1019123 0 47M 94M
green open prod20180601 1 1 1020456 0 51M 102M</code></pre>
</div>

Removing the old and unneeded indices in the example above would free up 356MB. A single command could do it:

Purge Deleted Documents

Data in Elasticsearch is spread across lots of files called segments. Segments each contain some number of documents. An index could have dozens, hundreds or even thousands of segment files, and Elasticsearch will periodically merge some segment files into others.

When a document is deleted in Elasticsearch, its segment file is simply updated to mark the document as deleted. The data is not actually removed until that segment file is merged with another. Elasticsearch normally handles segment merging automatically, but forcing a segment merging will reduce the overall disk footprint of the cluster by eliminating deleted documents.

This can be done through the Optimize / Forcemerge API, but the same effect can be accomplished more efficiently by simply reindexing. Reindexing will cause the data to be refreshed, and no deleted documents will be tracked by Elasticsearch. This will reduce disk usage.

To check whether this will work for you, look at the <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> data. There is a column called <span class="inline-code"><pre><code>docs.deleted</code></pre></span>, which shows how many documents are sitting on the disk and are marked as deleted. This should give a sense of how much data could be freed up by reindexing. For example:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">health status index pri rep docs.count docs.deleted store.size pri.store.size
green open my_index 3 2 15678948 6895795 47.1G 15.7G</code></pre>
</div>

In this case, the <span class="inline-code"><pre><code>docs.deleted</code></pre></span> is around 30% of the primary store, or around 4.8G of primary data. With replication, this works out to something like 14.4GB of total disk marked for deletion. Reindexing would reduce the cluster’s disk footprint by this much. The result would look like this:

Protip: Queue Writes and Reindex in the Background for Minimal Impact

Your app’s search could be down or degraded during a reindex. If reindexing will take a long time, that may make this option unfeasible. However, you could minimize the impact by using something like Kafka to queue writes while reindexing to a new index.

Search traffic can continue to be served from the old index until its replacement is ready. Flush the queued updates from Kafka into the new index, then destroy the old index and use an alias to promote the new index.

The tradeoff of this solution is that you’ll minimize the impact to your traffic/users, but you’ll need to set up and manage the queue software. You’ll also have a lot of duplicated data for a short period of time, so your footprint could be way above the subscription limit for a short time.

To prevent the state machine from disabling your cluster, you might want to consider temporarily upgrading to perform the operation, then downgrading when you’re done. Billing is prorated, so this would not add much to your invoice. You can always email us to discuss options before settling on a decision.

Reindex with Smaller Mappings

Mappings define how data is stored and indexed in an Elasticsearch cluster. There are some settings which can cause the disk footprint to grow exponentially.

For example, synonym expansion can lead to lots of extra tokens to be generated per input token (if you’re using WordNet, see our documentation article on it, specifically Why Wouldn’t Everyone Want WordNet?). If you’re using lots of index-time synonym expansion, then you’re essentially inflating the document sizes with lots of data, with the tradeoff (hopefully) being improved relevancy.

Another example would be Ngrams. Ngrams are tokens generated from the parts of other tokens. A token like “hello” could be broken into 2-grams like “he”, “el”, “ll”, and “lo”. In 3-grams, it would be “hel”, “ell” and “llo”. And so on. The Elasticsearch Guide has more examples.

It’s possible to generate multiple gram sizes for a single, starting with values as low as 1. Some developers use this to maximize substring matching. But there is an exponential growth in the number of grams generated for a single token:

This relationship is expressed mathematically as:

In other words, a token with a length of 5 and a minimum gram size of 1 would result in <span class="inline-code"><pre><code>(1/2)*5*(5+1)=15</code></pre></span> grams. A token with a length of 10 would result in 55 grams. The grams are generated per token, which leads to an explosion in terms for a document.

As a sample calculation: if a typical document in your corpus has a field with ~1,000 tokens and a Rayleigh distribution of length with an average of ~5, you could plausibly see something like a 1,100-1,200% inflation in disk footprint using Ngrams of minimum size 1. In other words, if the non-grammed document would need 100KB on disk, the Ngrammed version would need over 1MB. Virtually none of this overhead would improve relevancy, and would probably even hurt it.

Nested documents are another example of a feature can also increase your data footprint without necessarily improving relevancy.

The point is that there are plenty of features available that lead to higher disk usage than one might think at first glance. Check on your mappings carefully: look for large synonym expansions, make sure you’re using Ngrams with a minimum gram size of 3 or more (also look into EdgeNGrams if you’re attempting autocomplete), and see if you can get away with fewer nested objects. Reindex your data with the updated mappings, and you should see a definite improvement.

Upgrade the Subscription

If you find that you’re unable to remove data, reindex, or update your mappings – or that these changes don’t yield a stable resolution – then you will simply need to upgrade to the next subscription level.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

‍

Reducing Data Usage

Bonsai meters on the number of documents in an index. There are several types of“documents” in Elasticsearch, but Bonsai only counts the live Lucene documents in primary shards towards the document limit. Documents which have been marked as deleted, but have not yet been merged out of the segment files do not count towards the limit.

In this guide we cover a few different ways of reducing document usage:

How Document Overages Occur
Remove Old Data
Remove an Index
Compact Your Mappings
Upgrade the Subscription

How Document Overages Occur

Users frequently report having "only" a few hundred documents, but the dashboard registers several thousand. This is due to how Elasticsearch counts nested documents. The Index Stats API is used to determine how many documents are in a cluster. The Index Stats API counts nested documents by including all associated documents. In other words, if you have a document with 2 nested documents, this is reported as 3 total documents.

Elasticsearch has several different articles on how nested documents work, but the simplest answer is that it is creating the illusion of complex object by quietly creating multiple hidden documents.

A common point of confusion is that the /_cat/indices endpoint will show one set of documents, while the /_stats endpoint shows a much larger count. This is because the Cat API is counting the“visible” documents, while the Index Stats API is counting all documents. The _stats endpoint is a more true representation of a cluster’s document usage, and is the most fair to all users for metering purposes.

Remove Old Data

If your index has a lot of old data, or“stale” data (documents which rarely show up in searches), then you could simply delete those documents. Deleted documents do not count against your limits.

Remove an Index

Occasionally users are indexing time series data, or database tables that are not actually being searched by the application. Audit usage by using the Interactive Console to check the /_cat/indices endpoint. If you find that there are old or unnecessary indices with data, then delete those.

You may also want to check out the Index Trimmer.

Compact Your Mappings

Changing your mappings to nest less information can greatly reduce your document usage. Consider this sample document:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
"title": "Spiderman saves child from well",
"body": "Move over, Lassie! New York has a new hero. But is he also a menace?",
"authors": [
{
"name": "Jonah Jameson",
"title": "Sr. Editor",
},
{
"name": "Peter Parker",
"title": "Photos",
}
],
"comments": [
{
"username": "captain_usa",
"comment": "I understood that reference!",
},
{
"username": "man_of_iron",
"comment": "Congrats on being slightly more useful than a ladder.",
}
],
"photos": [
{
"url": "https://assets.dailybugle.com/12345",
"caption": "Spiderman delivering Timmy back to his mother",
}
]
}</code></pre>
</div>

Note that it’s nesting data for authors, comments and photos. The mapping above would actually result in creating 6 documents. Removing the comments and photos(which usually don’t need to be indexed anyway) would reduce the footprint by 50%.

If you’re using nested objects, review whether any of the nested information could stand to be left out, and then reindex with a smaller mapping.

Upgrade the Subscription

If you find that you’re unable to reduce documents through the options discussed here, then you will need to upgrade to the next plan.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

‍

Reducing Document Usage

The term “metering” in Bonsai refers to the limits imposed on the resources allocated to a cluster. A cluster’s limits are determined by its subscription level; higher-priced plans yield higher limits. There are several metered resources that can result in an overage. These documents explains what those resources are and how to resolve related overages:

How Bonsai Handles Overages
Concurrent Connections
Shards
Disk
Documents

How Bonsai Handles Overages

Overages are indicated on the Cluster dashboard. If your cluster is over its subscription limits, the overage will be indicated in red like so:

Bonsai uses “soft limits” for metering. This approach does not immediately penalize or disable clusters exceeding the limits of the subscription. This is a much more gentle way of treating users who are probably not even aware that they’re over their limits, or why that may be an issue.

When an overages is detected, it triggers a state machine that follows a specially-designed process that takes increasingly firm actions. This process has several steps:

1. Initial notification (immediately). The owner and members of their team is notified that there is an overage, and provided information about how to address it.

2. Second notification (5 days). A reminder is sent and warns that the cluster is about to be put into read-only mode. Clusters in read-only mode will receive an HTTP 403: Cluster Read Only error message.

3. Read-only mode (10 days). The cluster is put into read-only mode. Updates will fail with a 403 error.

4. Disabled (15 days). All access to the cluster is disabled. Both searches and updates will fail with a HTTP 403: Cluster Disabled error.

Extreme Overages Skip the Process

Overages that are particularly aggressive are subject to being disabled immediately. Bonsai uses a fairly generous algorithm to determine whether an overage is severe and subject to immediate cut-off. This step is uncommon, but is a definite possibility.

Stale Data May Be Lost

Free clusters that have been disabled for a period of time may be purged to free up resources for other users.

The fastest way to deal with an overage is to simply upgrade your cluster’s plan. Upgrades take effect instantly and will unlock any cluster set to read-only or disabled.

If upgrading is not possible for some reason, then the next best option is to address the issue directly. This document contains information about how to address an overage in each metered resource.

Give it a minute!

The resource usage indicators are not real time displays. Cluster stats are calculated every 10 minutes or so. If you address an overage and the change isn’t immediately reflected on the display, don’t worry. The changes will be detected, the dashboard will be updated, and any sanction in place on the cluster will be lifted automatically. Please wait 5-10 minutes before emailing support.

Concurrent Connections

More on concurrency and how to solve concurrency overages in our Reducing Concurrent Connections documentation.

Shards

A shard is the basic unit of work in Elasticsearch. If you haven’t read the Core Concept on Shards, that would be a good place to start. Bonsai meters on the total number of shards in a cluster. That means both primary and replica shards count towards the limit.

If you have a shard overage you can look at out Reducing Shard Usage documentation to resolve it.

Disk

Disk overages can be resolved in a couple different ways that are explained in our Reducing Data Usage documentation.

Documents

Bonsai meters on the number of documents in an index. There are several types of “documents” in Elasticsearch, but Bonsai only counts the live Lucene documents in primary shards towards the document limit. Documents which have been marked as deleted, but have not yet been merged out of the segment files do not count towards the limit.

Document overages can be resolved in a couple different ways that are explained in our Reducing Document Usage documentation.

‍

Billing: How Bonsai handles overages

Datadog is a monitoring service that allows customers to see real time metrics related to their application and infrastructure, as well as receive alerts for predefined events. Datadog offers an Elasticsearch integration for monitoring clusters, and Bonsai supports this integration.

Getting Started
Adjust Dashboard
Reference: Metrics Available on Standard Subscriptions
Reference: Metrics Available on Business / Enterprise Subscriptions

Getting Started

Log into Datadog, navigate to <span class="inline-code"><pre><code>Integrations > APIs</code></pre></span> to get your current API Key

In the table of API keys, grab your current API key:

Navigate to the Integrations on your cluster dashboard under Data. There is a section marked Datadog Integration.

Enter your Datadog API key, select the API site, then press Activate Datadog. You should start to see request metrics loaded into Datadog within a few minutes.

Adjust How Metrics Are Shown In The Datadog Dashboard

Note to Enterprise tier customers who has Grafana dashboards set up with our team:

The metrics Bonsai sends to Datadog are identical to the metrics sent to Grafana. Metrics data can be processed and rendered in a variety of ways, so it’s possible that the metrics shown in Datadog differ from metrics displayed in the Bonsai dashboard or Grafana. This is because Datadog features (such as smoothing) can mask performance patterns like short-lived load spikes. Bonsai considers the metrics displayed in Grafana to be authoritative.

Datadog's configuration may change on how units are reported in the graphs. The request durations are ideally shown in milliseconds, and the following steps will get you back on track:

1. In the Datadog dashboard, select Metrics info from the "cog" dropdown menu on a graph like so:

2. Select <span class="inline-code"><pre><code>*.p50</code></pre></span> and it'll take you to the Metrics Summary search for p50. Click on the Metric Name then Edit button:

3. Under Metadata, change the Unit from minute to millisecond and leave per as <span class="inline-code"><pre><code>(None)</code></pre></span>. Click save:

4. Search for the other request duration times and edit their units to reflect millisecond. Once all three are edited, the graph should reflect the smallest time unit in milliseconds:

If you have any questions or issues, or are not seeing metrics loading into Datadog after a long time, please reach out and let us know at [email protected].

Reference: Metrics Available on Standard Subscriptions

Clusters on Standard plans have access to a variety of metrics:

<table>
<tbody>
<tr><td>bonsai.req.2xx
(gauge)</td><td>Number of requests with a 2xx(successful) response code
Shown as request</td></tr>
<tr><td>bonsai.req.4xx
(gauge)</td><td>Number of requests with a 4xx(client error) response code
Shown as request</td></tr>
<tr><td>bonsai.req.5xx
(gauge)</td><td>Number of requests with a 5xx(server error) response code
Shown as request</td></tr>
<tr><td>bonsai.req.max_concurrency
(gauge)</td><td>Peak concurrent requests
Shown as connection</td></tr>
<tr><td>bonsai.req.p50
(gauge)</td><td>The median request duration
Shown as minute</td></tr>
<tr><td>bonsai.req.p95
(gauge)</td><td>The 95th percentile request duration
Shown as minute</td></tr>
<tr><td>bonsai.req.p99
(gauge)</td><td>The 99th percentile request duration
Shown as minute</td></tr>
<tr><td>bonsai.req.queue_depth
(gauge)</td><td>Peak queue depth(how many requests are waiting due to concurrency limits)
Shown as connection</td></tr>
<tr><td>bonsai.req.reads
(gauge)</td><td>The number of requests which read data
Shown as request</td></tr>
<tr><td>bonsai.req.rx_bytes
(gauge)</td><td>The number of bytes sent to elasticsearch
Shown as byte</td></tr>
<tr><td>bonsai.req.total
(gauge)</td><td>The total number of requests
Shown as request</td></tr>
<tr><td>bonsai.req.tx_bytes
(gauge)</td><td>The number of bytes sent to client
Shown as byte</td></tr>
<tr><td>bonsai.req.writes
(gauge)</td><td>The total number of writes
Shown as request</td></tr>
</tbody>
</table>

Reference: Metrics Available on Business / Enterprise Subscriptions

Metrics are tagged on a per-cluster basis, so you can easily segment between your Elasticsearch instances. The tags look like: <span class="inline-code"><pre><code>cluster:my-cluster-slug</code></pre></span>

Users with Business and Enterprise subscriptions have access to additional metrics:

<table cellpadding="5" cellspacing="1" border="1">
<tr><td></td><td></td></tr>
<tr><td>cpu.usage_guest</td><td>float</td></tr>
<tr><td>cpu.usage_guest_nice</td><td>float</td></tr>
<tr><td>cpu.usage_idle</td><td>float</td></tr>
<tr><td>cpu.usage_iowait</td><td>float</td></tr>
<tr><td>cpu.usage_irq</td><td>float</td></tr>
<tr><td>cpu.usage_nice</td><td>float</td></tr>
<tr><td>cpu.usage_softirq</td><td>float</td></tr>
<tr><td>cpu.usage_steal</td><td>float</td></tr>
<tr><td>cpu.usage_system</td><td>float</td></tr>
<tr><td>cpu.usage_user</td><td>float</td></tr>
<tr><td>disk.free</td><td>integer</td></tr>
<tr><td>disk.inodes_free</td><td>integer</td></tr>
<tr><td>disk.inodes_total</td><td>integer</td></tr>
<tr><td>disk.inodes_used</td><td>integer</td></tr>
<tr><td>disk.total</td><td>integer</td></tr>
<tr><td>disk.used</td><td>integer</td></tr>
<tr><td>disk.used_percent</td><td>float</td></tr>
<tr><td>diskio.io_time</td><td>integer</td></tr>
<tr><td>diskio.iops_in_progress</td><td>integer</td></tr>
<tr><td>diskio.read_bytes</td><td>integer</td></tr>
<tr><td>diskio.read_time</td><td>integer</td></tr>
<tr><td>diskio.reads</td><td>integer</td></tr>
<tr><td>diskio.weighted_io_time</td><td>integer</td></tr>
<tr><td>diskio.write_bytes</td><td>integer</td></tr>
<tr><td>diskio.write_time</td><td>integer</td></tr>
<tr><td>diskio.writes</td><td>integer</td></tr>
<tr><td>elasticsearch_breakers.accounting_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.accounting_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.accounting_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.accounting_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_cluster_health.active_primary_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.active_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.active_shards_percent_as_number</td><td>float</td></tr>
<tr><td>elasticsearch_cluster_health.initializing_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.number_of_data_nodes</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.number_of_nodes</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.number_of_pending_tasks</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.relocating_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.status</td><td>string</td></tr>
<tr><td>elasticsearch_cluster_health.status_code</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.task_max_waiting_in_queue_millis</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.timed_out</td><td>boolean</td></tr>
<tr><td>elasticsearch_cluster_health.unassigned_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_clusterstats_indices.completion_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.docs_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.docs_deleted</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.fielddata_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.fielddata_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.filter_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.filter_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.id_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_current</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_memory_size</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_queries</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_total</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_cache_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_cache_size</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_hit_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_miss_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_total_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_doc_values_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_fixed_bit_set_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_index_writer_max_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_index_writer_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_max_unsafe_auto_id_timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_norms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_points_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_stored_fields_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_term_vectors_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_terms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_version_map_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_primaries_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_primaries_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_primaries_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_replication_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_replication_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_replication_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_shards_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_shards_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_shards_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_primaries</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_replication</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_total</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.store_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.store_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_client</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_coordinating_only</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_data</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_data_only</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_ingest</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_master</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_master_data</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_master_only</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_total</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.discovery_types_zen</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_io_op</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_io_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_read_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_reads</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_write_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_writes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_spins</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_max_uptime_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_mem_heap_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_mem_heap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_threads</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_bundled_jdk</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_using_bundled_jdk</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_vm_vendor</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_vm_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.network_types_http_types_netty4</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.network_types_transport_types_netty4</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_allocated_processors</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_available_processors</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_cache_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_cores_per_socket</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_mhz</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_model</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_total_cores</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_total_sockets</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_vendor</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_free_percent</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_used_percent</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.packaging_types_0_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.packaging_types_0_flavor</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.packaging_types_0_type</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_cpu_percent</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_open_file_descriptors_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_open_file_descriptors_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_open_file_descriptors_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.versions_0</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.versions_1</td><td>string</td></tr>
<tr><td>elasticsearch_fs.data_0_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_io_op</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_io_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_read_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_reads</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_write_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_writes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_read_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_read_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_write_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_write_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_read_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_read_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_write_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_write_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.least_usage_estimate_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.least_usage_estimate_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.least_usage_estimate_used_disk_percent</td><td>float</td></tr>
<tr><td>elasticsearch_fs.most_usage_estimate_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.most_usage_estimate_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.most_usage_estimate_used_disk_percent</td><td>float</td></tr>
<tr><td>elasticsearch_fs.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_io_op</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_io_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_read_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_reads</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_write_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_writes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_.current_open</td><td>float</td></tr>
<tr><td>elasticsearch_.total_opened</td><td>float</td></tr>
<tr><td>elasticsearch_indices.completion_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.docs_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.docs_deleted</td><td>float</td></tr>
<tr><td>elasticsearch_indices.fielddata_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.fielddata_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.filter_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.filter_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.flush_periodic</td><td>float</td></tr>
<tr><td>elasticsearch_indices.flush_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.flush_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_exists_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_exists_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_missing_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_missing_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.id_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_delete_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_delete_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_delete_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_failed</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_noop_update_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_current_docs</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_current_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_auto_throttle_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_docs</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_stopped_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_throttled_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_queries</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_cache_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_cache_size</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_hit_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_miss_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_total_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.recovery_current_as_source</td><td>float</td></tr>
<tr><td>elasticsearch_indices.recovery_current_as_target</td><td>float</td></tr>
<tr><td>elasticsearch_indices.recovery_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_external_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_external_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_listeners</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_hit_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_miss_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_fetch_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_fetch_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_fetch_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_open_contexts</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_query_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_query_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_query_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_scroll_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_scroll_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_scroll_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_suggest_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_suggest_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_suggest_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_doc_values_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_fixed_bit_set_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_index_writer_max_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_index_writer_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_max_unsafe_auto_id_timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_norms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_points_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_stored_fields_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_term_vectors_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_terms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_version_map_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.store_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.store_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.suggest_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.suggest_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.suggest_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_earliest_last_modified_age</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_operations</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_uncommitted_operations</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_uncommitted_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.warmer_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.warmer_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.warmer_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_direct_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_direct_total_capacity_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_direct_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_mapped_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_mapped_total_capacity_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_mapped_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.classes_current_loaded_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.classes_total_loaded_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.classes_total_unloaded_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_old_collection_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_old_collection_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_young_collection_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_young_collection_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_committed_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_used_percent</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_non_heap_committed_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_non_heap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_peak_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_peak_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_peak_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_peak_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_peak_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_peak_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.threads_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.threads_peak_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.uptime_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_cfs_period_micros</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_cfs_quota_micros</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_stat_number_of_elapsed_periods</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_stat_number_of_times_throttled</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_stat_time_throttled_nanos</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpuacct_usage_nanos</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_idle</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_load_average_15m</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_load_average_1m</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_load_average_5m</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_percent</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_stolen</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_sys</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_usage</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_user</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average_0</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average_1</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average_2</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_actual_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_actual_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_free_percent</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_used_percent</td><td>float</td></tr>
<tr><td>elasticsearch_os.swap_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.swap_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.swap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_os.uptime_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_percent</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_sys_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_total_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_user_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.max_file_descriptors</td><td>float</td></tr>
<tr><td>elasticsearch_process.mem_resident_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_process.mem_share_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_process.mem_total_virtual_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_process.open_file_descriptors</td><td>float</td></tr>
<tr><td>elasticsearch_process.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_threads</td><td>float</td></tr>
<tr><td>elasticsearch_transport.rx_count</td><td>float</td></tr>
<tr><td>elasticsearch_transport.rx_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_transport.server_open</td><td>float</td></tr>
<tr><td>elasticsearch_transport.tx_count</td><td>float</td></tr>
<tr><td>elasticsearch_transport.tx_size_in_bytes</td><td>float</td></tr>
<tr><td>haproxy.active_servers</td><td>integer</td></tr>
<tr><td>haproxy.addr</td><td>string</td></tr>
<tr><td>haproxy.backup_servers</td><td>integer</td></tr>
<tr><td>haproxy.bin</td><td>integer</td></tr>
<tr><td>haproxy.bout</td><td>integer</td></tr>
<tr><td>haproxy.cache_hits</td><td>integer</td></tr>
<tr><td>haproxy.cache_lookups</td><td>integer</td></tr>
<tr><td>haproxy.check_code</td><td>integer</td></tr>
<tr><td>haproxy.check_duration</td><td>integer</td></tr>
<tr><td>haproxy.check_fall</td><td>integer</td></tr>
<tr><td>haproxy.check_health</td><td>integer</td></tr>
<tr><td>haproxy.check_rise</td><td>integer</td></tr>
<tr><td>haproxy.check_status</td><td>string</td></tr>
<tr><td>haproxy.chkdown</td><td>integer</td></tr>
<tr><td>haproxy.chkfail</td><td>integer</td></tr>
<tr><td>haproxy.cli_abort</td><td>integer</td></tr>
<tr><td>haproxy.comp_byp</td><td>integer</td></tr>
<tr><td>haproxy.comp_in</td><td>integer</td></tr>
<tr><td>haproxy.comp_out</td><td>integer</td></tr>
<tr><td>haproxy.comp_rsp</td><td>integer</td></tr>
<tr><td>haproxy.conn_rate</td><td>integer</td></tr>
<tr><td>haproxy.conn_rate_max</td><td>integer</td></tr>
<tr><td>haproxy.conn_tot</td><td>integer</td></tr>
<tr><td>haproxy.connect</td><td>integer</td></tr>
<tr><td>haproxy.ctime</td><td>integer</td></tr>
<tr><td>haproxy.dcon</td><td>integer</td></tr>
<tr><td>haproxy.downtime</td><td>integer</td></tr>
<tr><td>haproxy.dreq</td><td>integer</td></tr>
<tr><td>haproxy.dresp</td><td>integer</td></tr>
<tr><td>haproxy.dses</td><td>integer</td></tr>
<tr><td>haproxy.econ</td><td>integer</td></tr>
<tr><td>haproxy.ereq</td><td>integer</td></tr>
<tr><td>haproxy.eresp</td><td>integer</td></tr>
<tr><td>haproxy.hanafail</td><td>integer</td></tr>
<tr><td>haproxy.http_response.1xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.2xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.3xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.4xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.5xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.other</td><td>integer</td></tr>
<tr><td>haproxy.iid</td><td>integer</td></tr>
<tr><td>haproxy.intercepted</td><td>integer</td></tr>
<tr><td>haproxy.last_chk</td><td>string</td></tr>
<tr><td>haproxy.lastchg</td><td>integer</td></tr>
<tr><td>haproxy.lastsess</td><td>integer</td></tr>
<tr><td>haproxy.lbtot</td><td>integer</td></tr>
<tr><td>haproxy.mode</td><td>string</td></tr>
<tr><td>haproxy.pid</td><td>integer</td></tr>
<tr><td>haproxy.qcur</td><td>integer</td></tr>
<tr><td>haproxy.qmax</td><td>integer</td></tr>
<tr><td>haproxy.qtime</td><td>integer</td></tr>
<tr><td>haproxy.rate</td><td>integer</td></tr>
<tr><td>haproxy.rate_lim</td><td>integer</td></tr>
<tr><td>haproxy.rate_max</td><td>integer</td></tr>
<tr><td>haproxy.req_rate</td><td>integer</td></tr>
<tr><td>haproxy.req_rate_max</td><td>integer</td></tr>
<tr><td>haproxy.req_tot</td><td>integer</td></tr>
<tr><td>haproxy.reuse</td><td>integer</td></tr>
<tr><td>haproxy.rtime</td><td>integer</td></tr>
<tr><td>haproxy.scur</td><td>integer</td></tr>
<tr><td>haproxy.sid</td><td>integer</td></tr>
<tr><td>haproxy.slim</td><td>integer</td></tr>
<tr><td>haproxy.smax</td><td>integer</td></tr>
<tr><td>haproxy.srv_abort</td><td>integer</td></tr>
<tr><td>haproxy.status</td><td>string</td></tr>
<tr><td>haproxy.stot</td><td>integer</td></tr>
<tr><td>haproxy.ttime</td><td>integer</td></tr>
<tr><td>haproxy.weight</td><td>integer</td></tr>
<tr><td>haproxy.wredis</td><td>integer</td></tr>
<tr><td>haproxy.wretr</td><td>integer</td></tr>
<tr><td>haproxy.wrew</td><td>integer</td></tr>
<tr><td>kernel.boot_time</td><td>integer</td></tr>
<tr><td>kernel.context_switches</td><td>integer</td></tr>
<tr><td>kernel.entropy_avail</td><td>integer</td></tr>
<tr><td>kernel.interrupts</td><td>integer</td></tr>
<tr><td>kernel.processes_forked</td><td>integer</td></tr>
<tr><td>mem.active</td><td>integer</td></tr>
<tr><td>mem.available</td><td>integer</td></tr>
<tr><td>mem.available_percent</td><td>float</td></tr>
<tr><td>mem.buffered</td><td>integer</td></tr>
<tr><td>mem.cached</td><td>integer</td></tr>
<tr><td>mem.commit_limit</td><td>integer</td></tr>
<tr><td>mem.committed_as</td><td>integer</td></tr>
<tr><td>mem.dirty</td><td>integer</td></tr>
<tr><td>mem.free</td><td>integer</td></tr>
<tr><td>mem.high_free</td><td>integer</td></tr>
<tr><td>mem.high_total</td><td>integer</td></tr>
<tr><td>mem.huge_page_size</td><td>integer</td></tr>
<tr><td>mem.huge_pages_free</td><td>integer</td></tr>
<tr><td>mem.huge_pages_total</td><td>integer</td></tr>
<tr><td>mem.inactive</td><td>integer</td></tr>
<tr><td>mem.low_free</td><td>integer</td></tr>
<tr><td>mem.low_total</td><td>integer</td></tr>
<tr><td>mem.mapped</td><td>integer</td></tr>
<tr><td>mem.page_tables</td><td>integer</td></tr>
<tr><td>mem.shared</td><td>integer</td></tr>
<tr><td>mem.slab</td><td>integer</td></tr>
<tr><td>mem.swap_cached</td><td>integer</td></tr>
<tr><td>mem.swap_free</td><td>integer</td></tr>
<tr><td>mem.swap_total</td><td>integer</td></tr>
<tr><td>mem.total</td><td>integer</td></tr>
<tr><td>mem.used</td><td>integer</td></tr>
<tr><td>mem.used_percent</td><td>float</td></tr>
<tr><td>mem.vmalloc_chunk</td><td>integer</td></tr>
<tr><td>mem.vmalloc_total</td><td>integer</td></tr>
<tr><td>mem.vmalloc_used</td><td>integer</td></tr>
<tr><td>mem.wired</td><td>integer</td></tr>
<tr><td>mem.write_back</td><td>integer</td></tr>
<tr><td>mem.write_back_tmp</td><td>integer</td></tr>
<tr><td>net.bytes_recv</td><td>integer</td></tr>
<tr><td>net.bytes_sent</td><td>integer</td></tr>
<tr><td>net.drop_in</td><td>integer</td></tr>
<tr><td>net.drop_out</td><td>integer</td></tr>
<tr><td>net.err_in</td><td>integer</td></tr>
<tr><td>net.err_out</td><td>integer</td></tr>
<tr><td>net.icmp_inaddrmaskreps</td><td>integer</td></tr>
<tr><td>net.icmp_inaddrmasks</td><td>integer</td></tr>
<tr><td>net.icmp_incsumerrors</td><td>integer</td></tr>
<tr><td>net.icmp_indestunreachs</td><td>integer</td></tr>
<tr><td>net.icmp_inechoreps</td><td>integer</td></tr>
<tr><td>net.icmp_inechos</td><td>integer</td></tr>
<tr><td>net.icmp_inerrors</td><td>integer</td></tr>
<tr><td>net.icmp_inmsgs</td><td>integer</td></tr>
<tr><td>net.icmp_inparmprobs</td><td>integer</td></tr>
<tr><td>net.icmp_inredirects</td><td>integer</td></tr>
<tr><td>net.icmp_insrcquenchs</td><td>integer</td></tr>
<tr><td>net.icmp_intimeexcds</td><td>integer</td></tr>
<tr><td>net.icmp_intimestampreps</td><td>integer</td></tr>
<tr><td>net.icmp_intimestamps</td><td>integer</td></tr>
<tr><td>net.icmp_outaddrmaskreps</td><td>integer</td></tr>
<tr><td>net.icmp_outaddrmasks</td><td>integer</td></tr>
<tr><td>net.icmp_outdestunreachs</td><td>integer</td></tr>
<tr><td>net.icmp_outechoreps</td><td>integer</td></tr>
<tr><td>net.icmp_outechos</td><td>integer</td></tr>
<tr><td>net.icmp_outerrors</td><td>integer</td></tr>
<tr><td>net.icmp_outmsgs</td><td>integer</td></tr>
<tr><td>net.icmp_outparmprobs</td><td>integer</td></tr>
<tr><td>net.icmp_outredirects</td><td>integer</td></tr>
<tr><td>net.icmp_outsrcquenchs</td><td>integer</td></tr>
<tr><td>net.icmp_outtimeexcds</td><td>integer</td></tr>
<tr><td>net.icmp_outtimestampreps</td><td>integer</td></tr>
<tr><td>net.icmp_outtimestamps</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype0</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype11</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype3</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype4</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype5</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype8</td><td>integer</td></tr>
<tr><td>net.icmpmsg_outtype0</td><td>integer</td></tr>
<tr><td>net.icmpmsg_outtype3</td><td>integer</td></tr>
<tr><td>net.icmpmsg_outtype8</td><td>integer</td></tr>
<tr><td>net.ip_defaultttl</td><td>integer</td></tr>
<tr><td>net.ip_forwarding</td><td>integer</td></tr>
<tr><td>net.ip_forwdatagrams</td><td>integer</td></tr>
<tr><td>net.ip_fragcreates</td><td>integer</td></tr>
<tr><td>net.ip_fragfails</td><td>integer</td></tr>
<tr><td>net.ip_fragoks</td><td>integer</td></tr>
<tr><td>net.ip_inaddrerrors</td><td>integer</td></tr>
<tr><td>net.ip_indelivers</td><td>integer</td></tr>
<tr><td>net.ip_indiscards</td><td>integer</td></tr>
<tr><td>net.ip_inhdrerrors</td><td>integer</td></tr>
<tr><td>net.ip_inreceives</td><td>integer</td></tr>
<tr><td>net.ip_inunknownprotos</td><td>integer</td></tr>
<tr><td>net.ip_outdiscards</td><td>integer</td></tr>
<tr><td>net.ip_outnoroutes</td><td>integer</td></tr>
<tr><td>net.ip_outrequests</td><td>integer</td></tr>
<tr><td>net.ip_reasmfails</td><td>integer</td></tr>
<tr><td>net.ip_reasmoks</td><td>integer</td></tr>
<tr><td>net.ip_reasmreqds</td><td>integer</td></tr>
<tr><td>net.ip_reasmtimeout</td><td>integer</td></tr>
<tr><td>net.packets_recv</td><td>integer</td></tr>
<tr><td>net.packets_sent</td><td>integer</td></tr>
<tr><td>net.tcp_activeopens</td><td>integer</td></tr>
<tr><td>net.tcp_attemptfails</td><td>integer</td></tr>
<tr><td>net.tcp_currestab</td><td>integer</td></tr>
<tr><td>net.tcp_estabresets</td><td>integer</td></tr>
<tr><td>net.tcp_incsumerrors</td><td>integer</td></tr>
<tr><td>net.tcp_inerrs</td><td>integer</td></tr>
<tr><td>net.tcp_insegs</td><td>integer</td></tr>
<tr><td>net.tcp_maxconn</td><td>integer</td></tr>
<tr><td>net.tcp_outrsts</td><td>integer</td></tr>
<tr><td>net.tcp_outsegs</td><td>integer</td></tr>
<tr><td>net.tcp_passiveopens</td><td>integer</td></tr>
<tr><td>net.tcp_retranssegs</td><td>integer</td></tr>
<tr><td>net.tcp_rtoalgorithm</td><td>integer</td></tr>
<tr><td>net.tcp_rtomax</td><td>integer</td></tr>
<tr><td>net.tcp_rtomin</td><td>integer</td></tr>
<tr><td>net.udp_ignoredmulti</td><td>integer</td></tr>
<tr><td>net.udp_incsumerrors</td><td>integer</td></tr>
<tr><td>net.udp_indatagrams</td><td>integer</td></tr>
<tr><td>net.udp_inerrors</td><td>integer</td></tr>
<tr><td>net.udp_noports</td><td>integer</td></tr>
<tr><td>net.udp_outdatagrams</td><td>integer</td></tr>
<tr><td>net.udp_rcvbuferrors</td><td>integer</td></tr>
<tr><td>net.udp_sndbuferrors</td><td>integer</td></tr>
<tr><td>net.udplite_ignoredmulti</td><td>integer</td></tr>
<tr><td>net.udplite_incsumerrors</td><td>integer</td></tr>
<tr><td>net.udplite_indatagrams</td><td>integer</td></tr>
<tr><td>net.udplite_inerrors</td><td>integer</td></tr>
<tr><td>net.udplite_noports</td><td>integer</td></tr>
<tr><td>net.udplite_outdatagrams</td><td>integer</td></tr>
<tr><td>net.udplite_rcvbuferrors</td><td>integer</td></tr>
<tr><td>net.udplite_sndbuferrors</td><td>integer</td></tr>
<tr><td>system.load1</td><td>float</td></tr>
<tr><td>system.load15</td><td>float</td></tr>
<tr><td>system.load5</td><td>float</td></tr>
<tr><td>system.n_cpus</td><td>integer</td></tr>
<tr><td>system.n_users</td><td>integer</td></tr>
<tr><td>system.uptime</td><td>integer</td></tr>
<tr><td>system.uptime_format</td><td>string</td></tr>
</table>

Using Datadog with Bonsai

The new Bonsai Business Plans come with more options than we have ever provided before. It may seem intimidating to choose, but there are a few simple guidelines that make it easy. Bonsai Business Plans also don’t require annual contracts. If your index or traffic changes periodically, you can change plans whenever you wish.

Let’s start by looking at the two plan types.

Choosing a plan type

Business Plans offer two main types - Compute and Capacity. The difference is inherent in their names: if your use-case necessitates a lot of data written to disk but less traffic (perhaps only a few per hour), Capacity allows you to get more bang for your buck in raw disk. As a contrast, Compute is designed for those that need a setup that can withstand from high traffic load or query complexity.

Planning for disk capacity

When you deploy an HA Elasticsearch cluster, you must provision enough disk for three things: 1.Your primary data 2. Your replica data, and 3. The normal maintenance routines performed by Lucene, the underlying search engine behind Elasticsearch.

Nobody likes using a search engine that doesn’t work. Failing to account for any of these factors will result in performance degradation, a.k.a. the infamous yellow or red cluster. 😫

How much primary data you can load into Elasticsearch, while still maintaining High Availability? This simple formula will help you calculate:

((number of nodes - 1) * the capacity of a single node) * 0.8 = the amount of data that can be loaded in your cluster

Let’s put this in a concrete example. A Business Capacity Large plan has a raw capacity of 150GB, with each of the three nodes contributing 50GB of disk. So the concrete numbers would be:

number of nodes = 3 per node capacity = 50GB total raw capacity = 3 * 50GB = 150GB Usable data = ((3-1) * 50GB) * 0.8 = 80GB

This means that if you have a total raw capacity of 150GB, you should only be planning to use 80GB of it for your search data usage. At first this seems like a huge gap in resources available versus resources usable (that’s a 53% drop!), but it’s a necessary plan to prevent getting paged at 3AM with red status clusters, poorly performing queries and/or data loss.

This formula, explained

Planning is key with distributed systems like Elasticsearch. When nodes inevitably go offline, it’s important to have replication in place for backup. The formula removes one node from your calculation (number of nodes - 1) so that your cluster will not lose any data. The additional failover nodes will maintain a green index status, even when a node goes offline for maintenance reasons. This ensures enough capacity for your primary data and a replica. Multiplying the total by 0.8 buffers your capacity by 20%, which accounts for Lucene’s maintenance routines.

Planning for computational requests and traffic

Now that we’ve covered raw capacity planning, let’s talk about handling traffic. Traffic and query computation strength maps to the size of the cluster. Larger clusters can handle a higher number of requests. You should consider three different numbers here:

1.) How many search requests will you be doing in any given minute? 2.) How many aggregations will you be doing in any given minute? 3.) Lastly, how many bulk updates will you perform each minute?

To ensure optimal performance, all three numbers should fit under the values in the table below.

<table>
<thead>
<tr><th>Aggregation Rate</th><th>Bulk Insert Rate</th><th>Search Rate</th><th>Ideal Plan/Size</th></tr>
</thead>
<tbody>
<tr><td>< 25 / minute</td><td>< 250 / minute</td><td>< 500 / minute</td><td>Large</td></tr>
<tr><td>< 50 / minute</td><td>< 500 / minute</td><td>< 1000 / minute</td><td>XLarge</td></tr>
<tr><td>< 100 / minute</td><td>< 1000 / minute</td><td>< 2000 / minute</td><td>2XLarge</td></tr>
</tbody>
</table>

Aggregation RateBulk Insert RateSearch RateIdeal Plan/Size< 25 / minute< 250 / minute< 500 / minuteLarge< 50 / minute< 500 / minute< 1000 / minuteXLarge< 100 / minute< 1000 / minute< 2000 / minute2XLarge

As a note: these numbers are conservative by design. Once you are up and running, you’ll be able to use the metrics panel in the Bonsai application to see the how your searches are performing in real time and be able to make sizing decisions, up or down, based on real data.

For those of you that want to dig even deeper, you can read our very thorough version of capacity planning as well.

Questions?

Our team has provisioned search engines that handle billions of requests each month. If you are still unsure about which plan is right for you, please contact us for a personalized consultation.

‍

Sizing Your Business Cluster

Bonsai’s support policy categorizes incident severity into three tiers. You can use these severity levels to help communicate the level of support needed.

24/7 Operational Coverage is Provided for All Clusters

Bonsai monitors its infrastructure 24 hours a day, 365 days a year. The Operations Team is automatically alerted to any problems in real time. One or more engineers will then respond to the event. These types of events include, but are not limited to:

Red Indexes
Failed EC2 Instances
Stuck GC
Failed backups
Network partitions / Split brain

This operational coverage is provided to all clusters, regardless of plan.

The severity classifications are defined as:

<table>
<thead>
<tr><th>Severity Level</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>An incident of downtime or service degradation is causing severe impact to customer production availability. Active risk of data loss, traffic completely or mostly failing.</td></tr>
<tr><td>Severity 2</td><td>Moderate to severe service interruption that can be temporarily worked around. May cause a moderate to major degradation in user experience.</td></tr>
<tr><td>Severity 3</td><td>Low rate of minor service errors or intermittent degradations. Minor regressions to response times. Should be investigated and repaired but is not visibly harming customer’s production systems.</td></tr>
</tbody></table>

Coverage Hours

Incidents not related to a problem Bonsai’s infrastructure are treated as support issues. Coverage for these events is the same as our support hours:

<table>
<thead>
<tr><th>Severity Level</th><th>Standard</th><th>Business</th><th>Enterprise</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>Business hours</td><td>Business hours</td><td>24/7</td></tr>
<tr><td>Severity 2</td><td>Business hours</td><td>Business hours</td><td>Business hours</td></tr>
<tr><td>Severity 3</td><td>Business hours</td><td>Business hours</td><td>Business hours</td></tr>
</tbody></table>

Response Time

We will respond to inquiries about active incidents within these time windows.

<table>
<thead>
<tr><th>Severity Level</th><th>Standard</th><th>Business</th><th>Enterprise</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>8 hrs</td><td>1 hr</td><td>1 hr</td></tr>
<tr><td>Severity 2</td><td>24 hrs</td><td>4 hrs</td><td>4 hrs</td></tr>
<tr><td>Severity 3</td><td>48 hrs</td><td>8 hrs</td><td>8 hrs</td></tr>
</tbody></table>

Incident Response Times

Elasticsearch has a menagerie of available plugins, including support for custom ones. We often get asked through our support channels whether we support a particular plugin and whether the user can install their own. Here you’ll find all the info you could ever want to know about plugins on Bonsai Elasticsearch.

Do you support plugin X?

Bonsai supports all of the plugins that ship with Elasticsearch, with some caveats. Plugins like X-Pack, Shield and Marvel require a license for commercial purposes, which we do not bundle into our subscriptions. We also don’t generally deploy unofficial plugins to the Shared level plans for stability and security reasons.

These are the plugins we currently deploy by default for all clusters:

Mapper Attachments (version 2.x and earlier)
Ingest Attachment (version 5.x and up)
ICU Analyzer
Kuromoji Analyzer
Phonetic analyzer
Smart Chinese Analyzer
Stempel Analyzer
Delete-by-Query (version 2.x and up)

Can I install my own plugins on Bonsai?

The answer depends on several things. If you’re on an single tenant plan, then we’re likely able to accommodate. We will need to review the plugin’s functionality, version support and maintenance status, and whether it’s under active development. Supported plugins that are not the property of the customer would require an open source license.

If your cluster is on one of our shared clusters, then it’s not possible to install a non-standard plugin. At Bonsai we’d love to be able to give you everything you want for your cluster, but we need to be opinionated about our shared cluster architecture to ensure security and performance for all other users.

Disaster recovery plan for all clusters on our platform

Whether you’re hosting Elasticsearch on your own or choosing a hosted provider like Bonsai, every search cluster should have a disaster recovery plan.

Bonsai was built from the ground up to be a highly available system. We leverage a variety of best practices from the industry to achieve this. It all starts with the choice of Elasticsearch, which is a highly available search engine with built-in clustering and sharding support. Bonsai deploys all Elasticsearch clusters in a multi-node, multi-data center configuration to guarantee that your data is safe and secure. To further improve the High Availability of your cluster, Bonsai deploys all clusters behind an AWS Application Load Balancer. This allows you to connect to a singular URL, and get access to every node of the cluster through our load balancing algorithm.

All production Bonsai Clusters are deployed to minimum of three nodes for redundancy, and to prevent stalemates in leadership election. Each node in the cluster will be deployed to a separate AWS Availability Zone, giving us data center isolation as well. A Bonsai cluster could experience a complete loss of two AWS data centers, and the cluster will still continue to operate. This makes Bonsai clusters extremely fault-tolerant.

When a Bonsai cluster does experience a node loss, Elasticsearch will automatically reroute the primary and replica shards to machines that are up and running. In the background, AWS Auto Scaling Groups will immediately begin spinning up the replacement instance that will auto-bootstrap into your configured Elasticsearch configuration and version. Once the node has successfully provisioned, it will join the cluster, and then Elasticsearch will offload the relocated shards back to the empty machine.

In the off chance that Bonsai (and much of the internet with it) experiences an entire loss of an AWS EC2 region, all of your cluster’s data is maintained in AWS’s S3 system, which has a reliability guarantee of 99.99% uptime and 99.999999999% durability. If such a failure happens, Bonsai’s staff will work with your team to understand where you will be relocating your application, and can then initiate a restore process into a cluster in the same AWS Region while maintaining your existing DNS connections.

‍

Elasticsearch Disaster Recovery

Heroku customers who would like to transition across a public-private network boundary (either from a public space to a private space, or from a private space to a public space) have limited options for doing so. The boundary represents a firewall that is designed to prevent access from the public space to the private space. Within the private space itself, there are security protocols in place to prevent the exfiltration of data. Migrations across the boundary are difficult by design; this difficulty is an artifact of the additional layer of security granted by private spaces.

The preferred method for migrating a cluster is to use a blue-green deployment strategy. There are several general steps to this approach:

Create a new Bonsai addon using the plan of your choice. This will create a new Bonsai cluster. If you are trying to migrate a cluster into a private space, make sure this addon is created within the private space.
Reindex your content into the new cluster.
Update your application to read and write to the new cluster.
Verify that the application is working as expected.
Tear down the old cluster.

This strategy will accommodate migrations across the boundary, regardless of whether you're going from public to private, or private to public.

If a blue-green deployment strategy is not possible for some reason, please contact us to set up a call. We will consider alternatives based on your specific needs.

‍

Migrating Between Public and Private Spaces in Heroku

Bonsai has been around for a long time and has supported every major release of Elasticsearch since 0.20 (late 2012). Over the years, Bonsai has rolled out support for a number of versions of Elasticsearch, and in the process developed a time-tested, standardized approach.

When a new version of Elasticsearch is released, we will subsequently deprecate the oldest public version. This is to ensure that we can provide excellent support and reliable operational responses to any incident that may arise.

Dedicated Clusters Exempt!

Clusters in single tenant environments — Business, Enterprise, Dedicated, etc — are not affected by this version upgrade policy. These clusters will remain operational and unaffected by new releases unless/until the owner wishes to upgrade.

A quick summary of Bonsai’s version support policy is as follows:

Bonsai supports the two most recent major versions on Hobby and Standard tier clusters.
Bonsai does not automatically upgrade existing clusters when new Elasticsearch versions are released, or when old versions are retired.
Bonsai offers long-term version support for older major versions of Elasticsearch for Business and Enterprise tier clusters.
Hobby tier clusters always default to the cutting edge when available.
Hobby tier clusters are the first to be upgraded.
Clusters on EOL nodes (see below) will not be serviced and will receive limited technical support.
Bonsai does not force upgrades, except in cases involving a major security issue. These are rare, and typically patch-level upgrades.

How Bonsai Handles New Version Rollouts

Bonsai will first deploy and validate the new version. Then, when our operations team is satisfied that it safe and reliable for production use, we will begin defaulting new clusters to the latest version.

Clusters running on our oldest supported version receive several notifications about next steps.

The general process is:

Deploy the new version and make it the default for Hobby clusters. Users with paid clusters can also opt in to the new version where available
Validate that the latest version is safe for production, and ensure our operational tooling is adequate to respond to any incident that may arise
Default all new clusters to the latest version
Deprecate the oldest version:
Send a notice several weeks in advance to all impacted users, alerting them of the deprecation and providing instructions and guidance for upgrading
Send a final notice several days in advance to any impacted user who still has not upgraded
Migrate remaining clusters to the next supported version
Terminate nodes running the deprecated version

There are some caveats and exceptions here, based on the cluster’s plan level:

Hobby Clusters

These clusters are generally reserved for non-production purposes, like for hobbyists and students, PR-apps and the like. As a result, they are always the first candidates for version upgrades, as well as the first to be mass migrated off of deprecated versions.

In other words, when a Bonsai deploys a new version of Elasticsearch, Hobby clusters will be the first to default to the new version where available. This is one way we can validate performance and safety with minimal risk to production applications.

In a version deprecation, these clusters are also the first to be migrated off of the deprecated version up to the next supported version. Data loss or mapping changes are not common during these events, but are a distinct possibility.

As unpaid resources, we are unable to accommodate special treatment regarding this process. For example, if a user is on a free cluster and requests an extension to prepare, we will not be able to accommodate.

Standard tier Clusters

These clusters are running in multitenant environments, meaning the system resources are shared among many users. We’re not able to selectively upgrade, downgrade or “pin” clusters to a version in these environments. Changing versions literally involves migrating the cluster to another set of servers.

Users who are not able to upgrade to a new version of Elasticsearch have the option of upgrading to a Business or Enterprise tier plan. With these plans, we will provision a new isolated environment running whatever version of Elasticsearch is required, and migrate the cluster there. This obviates the need to worry about future version deprecations.

One other possibility for users who can’t upgrade either their cluster’s plan or Elasticsearch version is to remain on EOL nodes. This approach comes with some risks, which are described below.

Business and Enterprise tier Clusters

Clusters on these plans are running in isolated environments and generally not affected by version rollouts and deprecations on Bonsai. Major version upgrades for these clusters are by request only. Minor version upgrades may be made in the event of a vulnerability disclosure.

For example, if an Enterprise cluster is running on version X.Y.0, and a critical vulnerability is disclosed, we will work with the customer to establish a service window, deploy the patched version, and then perform a rolling restart. Service impacts are generally trivial and without down time.

Otherwise, Business and Enterprise tier clusters are not affected by version deprecations on Bonsai.

What are EOL Nodes?

An EOL node is a server running a version of Elasticsearch that Bonsai no longer supports. It has reached its End Of Life (EOL) and may be terminated at any time, for a variety of reasons. When that happens, our Ops Team will not replace it or address any resulting service interruption or data loss. Additionally, our Support Team will provide limited technical support for issues arising from being on an old version.

When a version of Elasticsearch is officially deprecated by Bonsai, clusters running on that version will be upgraded to the next supported version. In some cases, we will allow users to remain on the unsupported version with the caveat that their cluster is running on EOL nodes. Users with clusters on these resources accept all risks and responsibilities.

Some examples:

In the event that a security regression is disclosed, we may choose to promptly terminate EOL nodes without advance notice.
In the event of irrecoverable failure of the underlying EC2 instances, we may choose to terminate the EOL nodes without advance notice.

Users running on EOL nodes should actively work to get their application on a supported version as soon as possible to avoid this kind of outcome.

‍

How Bonsai Handles Elasticsearch Releases

Bonsai clusters support most Elasticsearch APIs out of the box, with a few exceptions. This article details those exceptions, along with a brief explanation of why they’re in place. Here is what we will cover :

_all and wildcard destructive actions
Tasks API
Node Hot Threads
Node Shutdown & Restart
Snapshots
Reindex
Cluster Shard Reroute
Cluster Settings
Index Search Warmers
Static Scripts
Update By Query API

While having many of the following endpoints can be helpful for power users, the majority of applications don’t directly need them. If, however, you find yourself stuck without one of these available, please email us and we’ll be happy to help.

_all and wildcard destructive actions

Wildcard delete actions are usually for clusters with a large number of indices, and can be useful for completely wiping out a cluster and starting over. Wildcard and _all destructive actions were initially available on shared tier clusters. However, we received an increasingly large number of threads from distressed developers that accidentally deleted their entire production clusters.

After some internal discussion, we decided to disable wildcard actions. Removing the ability to sweepingly delete everything forces users to slow down and identify exactly what they’re deleting, reducing the risk of accidental and permanent data loss.

Examples of _all and wildcard destruction:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">DELETE /*
DELETE /_all</code>
</pre></div>

Tasks API

Elasticsearch provides an API for viewing information about current running tasks and the ability to cancel them on versions 5.x and up. This API is disabled for multi-tenant plans. It can be enabled by request on Business and Enterprise plans through an email to support.

Node hot threads

A Java thread that uses lots of CPU and runs for an unusually long period of time is known as a hot thread. Elasticsearch provides an API to get the current hot threads on each node in the cluster. This information can be useful in forming a holistic picture of potential problems within the cluster.

Bonsai doesn’t support these endpoints on our shared tier to ensure user activity isn’t exposed to others. For a detailed explanation of why this is a concern, please see the article on Architecture Classes. Additionally, Bonsai is a managed service, so it’s really the responsibility of our Ops Team to investigate node-level issues.

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

Node Shutdown & Restart

Elasticsearch provides an API for shutting down and restarting nodes. This functionality is unsupported across the platform. On our multitenant architecture, this prevents a user from shutting down a node or set of nodes that may be in use by another user. That action would have an adverse affect on other users which is why it is unsupported. This also prevents users from exacerbating whatever problem they’re trying to resolve.

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

Snapshots

The Snapshot API allows users to create and restore snapshots of indices and cluster data. It’s useful as a backup tool and for recovering from problems or data loss. On Bonsai, we’re already taking regular snapshots and monitoring cluster states for problems,. This API is blocked to avoid problems associated with multiple users trying to snapshot/restore at once.

If you feel that you need a snapshot taken/restored, please reach out to our support team.

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">GET /_snapshot/_status
DELETE /_snapshot/{repository}
POST /_snapshot/{repository}
PUT /_snapshot/{repository}
GET /_snapshot/{repository}/_status
DELETE /_snapshot/{repository}/{snapshot}
GET /_snapshot/{repository}/{snapshot}
POST /_snapshot/{repository}/{snapshot}
PUT /_snapshot/{repository}/{snapshot}
POST /_snapshot/{repository}/{snapshot}/_create
PUT /_snapshot/{repository}/{snapshot}/_create
POST /_snapshot/{repository}/{snapshot}/_restore
GET /_snapshot/{repository}/{snapshot}/_status</code></pre>
</div>
</div>

Reindex

The Reindex API copies documents from one index to another. For example, it will copy documents from an index called <span class="inline-code"><pre><code>books</code></pre></span> into another index, like <span class="inline-code"><pre><code>new_books</code></pre></span>. It does this by using what is basically a scan and scroll search, reading the contents of one index into another. The API call looks something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_reindex</code>
</pre></div>

At this time, the Reindex API is supported on Bonsai for clusters running on Elasticsearch 5.6 and up. Because it can be extremely demanding on IO, which can impact performance and availability for other users on multitenant architectures, it is only available to pre-5.6 clusters on Business or Enterprise subscriptions.

One workaround is to set up an indexing script that uses the scan and scroll search to read documents from your old index and push them into the new index. For example, read the contents of <span class="inline-code"><pre><code>GET /my_index/_search?search_type=scan&scroll=1m</code></pre></span>, then <span class="inline-code"><pre><code>POST</code></pre></span> those retrieved docs into a new index. That would provide basically the same functionality.

Another option is to simply populate the new index directly from your database.

Cluster Shard Reroute

Elasticsearch provides an API to move shards around between nodes within a cluster. We don’t support this functionality on our shared plans for a few reasons. For one it interferes with our cluster management tooling, and there is a possibility for one or more users to allocate shards in a way that overloads a node.

If you need fine-grain control over the shard allocation within a cluster, please reach out to us and we can discuss your use case and look at whether single tenancy would be a good fit for you.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_cluster/reroute</code>
</pre></div>

Cluster settings

Elasticsearch provides an API to apply cluster-wide settings. We don’t support this in our shared environment for safety reasons. In an environment where system resources are shared, this API would affect all users simultaneously. So one user could affect the behavior of everyone’s cluster in ways that those users may not want. Instead, we block this API and remain opinionated about cluster settings.

If you need to change the system settings for your cluster, you’ll need to be in a single tenant environment. Reach out to us and let’s talk through it.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT /_cluster/settings</code>
</pre></div>

Index Search Warmers

Elasticsearch provides a mechanism to speed up searches prior to being run. It does this by basically pre-populating caches via automatically running search requests. This is called “warming” the data, and it’s typically done against searches that require heavy system resources.

We don’t support this on shared clusters for stability reasons. Essentially there isn’t a great way to throttle the impact of an arbitrary number of warmers. There is a possibility that a user could overwhelm the system resources by creating a large number of “heavy” warmers (aggregations, sorts on large lists, etc). It’s also somewhat of an anti-pattern in a multitenant environment.

If this is something critical to your app, you would need to be on a dedicated cluster. Please reach out to us if you have any further questions on this.

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">POST /_warmer/{name}
PUT /_warmer/{name}
POST /_warmers/{name}
PUT /_warmers/{name}
POST /{index}/_warmer/{name}
PUT /{index}/_warmer/{name}
POST /{index}/_warmers/{name}
PUT /{index}/_warmers/{name}
POST /{index}/{type}/_warmer/{name}
PUT /{index}/{type}/_warmer/{name}
POST /{index}/{type}/_warmers/{name}
PUT /{index}/{type}/_warmers/{name}</code></pre>
</div>
</div>

Static Scripts

Elasticsearch provides an API for adding and modifying static scripts within a cluster, which can be referenced by name. We can enable these endpoints on a case-by-case basis for Business and Enterprise clusters. Inline scripts with Painless are supported on all clusters.

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">DELETE /_scripts/{lang}/{id}
GET /_scripts/{lang}/{id}
POST /_scripts/{lang}/{id}
PUT /_scripts/{lang}/{id}
POST /_scripts/{lang}/{id}/_create
PUT /_scripts/{lang}/{id}/_create</code></pre>
</div>
</div>

Update By Query API

Elasticsearch provides an API for updating documents that match a given query. This API is disabled on multitenant clusters due to the large amount of overhead and risk it introduces. Namely, it's possible to craft a query that runs, uncontrolled, for a long period of time and consumes significant amounts of CPU in the process. On multitenant systems, this translates to a severe performance degradation for multiple users.

Customers wanting to make use of this API can do so on a Business or Enterprise subscription.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /{index}/_update_by_query</code>
</pre></div>

Unsupported API Endpoints

Bonsai supports a large number of Elasticsearch versions in regions all over the world. Multitenant class clusters usually have more limited options in terms of available regions and versions, while single tenant class clusters have far more options.

If you need a version or geographic region that is not listed here, reach out to our support and let us know what you need. We can get you a quote and timeline for getting up and running.

Multitenant Class

Bonsai operates a fleet of shared resource nodes in Oregon, Virginia, Ireland, Frankfurt, and Sydney. The Elasticsearch versions available on these nodes do not change often.

Sandbox clusters are limited to the most recent version of Elasticsearch and may be subject to automatic upgrades when a new version is released.

For all other multitenant plans, when Bonsai adds support for a new version, we will create a new server group rather than upgrading an existing group. The exception to this is a potential patch upgrade in response to some critical vulnerability. In other words, users will not be upgraded in place unless there is a vulnerability to address or if they are on a Sandbox plan. Read How Bonsai Handles Elasticsearch Releases for more information.

This table shows which versions of Elasticsearch are available for multitenant plans. Last updated: 2025-06-12

<table>
<thead>
<tr><th>Plan</th><th>Elasticsearch Versions </th><th>OpenSearch Versions</th></tr>
</thead>
<tbody>
<tr><td>Sandbox</td><td>7.10.2</td><td>2.6.0</td></tr>
<tr><td>All other multitenant </td><td>5.6.16 / 6.8.21 / 7.10.2 </td><td>2.6.0</td></tr>
</tbody></table>

Multitenant Version Support by Region

Multitenant plans are supported in a handful of popular regions, although the versions available to free plans is limited. Additional regions are available to Business and Enterprise subscriptions.

<table>
<thead>
<tr><th>Cloud</th><th>Region</th><th>Location</th><th>Elasticsearch 5.6.16</th><th>Elasticsearch 6.8.21</th><th>Elasticsearch 7.10.2</th><th>OpenSearch 2.6.0</th></tr>
</thead>
<tbody>
<tr><td>AWS</td><td>us-east-1</td><td>Virginia, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard Free Plans</td></tr>
<tr><td>AWS</td><td>us-west-2</td><td>Oregon, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, FreePlans</td></tr>
<tr><td>AWS</td><td>eu-west-1</td><td>Ireland, EU</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>GCP</td><td>us-east1</td><td>Virginia, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
</tbody>
</table>

Single Tenant Class

Single tenant clusters can be deployed in Oregon, Virginia, Ireland, Frankfurt, Sydney, and Tokyo. We support a variety of Elasticsearch versions for these kinds of clusters and will default to the most recent minor version unless something else is specified.

This table shows which versions of Elasticsearch are available for single tenant plans. Last updated: 2025-06-12.

<table>
<thead>
<tr><th>Plan</th><th>Elasticsearch Versions </th><th>OpenSearch Versions</th></tr>
</thead>
<tbody>
<tr><td>All single tenant </td><td>2.4.0 to 8.13.4</td><td>1.2.4 to 2.19.0</td></tr>
</tbody></table>

Enterprise Deployments

Bonsai can deploy and manage whichever version of Elasticsearch or OpenSearch your use case needs. Please reach out to us at [email protected].

Older Search Engine Version Pricing

For major search engine versions behind Bonsai’s current primary supported version (OpenSearch 2.x and Elasticsearch 7.x), clusters will be charged an operational and maintenance fee to ensure that we can continue to support these older versions, and that we can give your organization the time it needs to decide when to upgrade to a new major version.

The fee will be assessed based on the following table:

Search Engine Version	Operational Fee
OpenSearch 1.x or Elasticsearch 6.x	20%
OpenSearch 1.x or Elasticsearch 6.x	20%
Elasticsearch 5.x	15% additional
Elasticsearch 2.x	10% additional
Elasticsearch 1.x	5% additional

Which Versions Bonsai Supports

Bonsai offers two main architecture classes: multitenant and single tenant. The multitenant class – sometimes called “shared” – is designed to allow clusters to share hardware resources while still being securely sandboxed from one another. This allows us to provide unparalleled performance per dollar at smaller scales. All Hobby and Standard plans use multitenant architecture.

Multi Tenant Class

Bonsai’s multitenant class utilizes some sandboxing features built into Elasticsearch. This allows the service to run multiple clusters on a single instance of Elasticsearch per node. This approach saves substantial hardware and network resources, and allows for radical cost savings especially for students, hobbyists, startups, projects in development, small businesses, and so on.

A great benefit of this approach is that Bonsai is able to provide some really nice features out of the box, for no additional cost: all multitenant clusters – even the free ones – are running on 3 nodes. They also get industry standard SSL/TLS and HTTP Basic authentication (see Security for more information), which keeps your data safe. Plus, the Bonsai dashboard offers plenty of tools for monitoring, managing, and engaging with your cluster.

Because these clusters are running on a shared Elasticsearch instance, there are also a few limitations. For one, certain API endpoints and actions are unavailable for security and performance reasons. Snapshots and plugins are not manageable by users to avoid collisions and regressions. And usage is metered about how you’d expect for a free/low-cost SaaS.

Clusters on the multitenant class can often be identified by their plan name. “Hobby”, “Staging”, “Production”, and “Shared” are all terms used by plans running on a multitenant architecture.

Single Tenant Class

Bonsai’s single tenant class has a fairly standard configuration. The cluster is simply one or more nodes (three by default), each running Elasticsearch. These nodes are physically isolated and on a different network than those running multitenant clusters. This means that all available IO on the nodes is always 100% allocated to your cluster.

This approach offers all of the same benefits as the multitenant class: you get the same industry standard SSL/TLS and HTTP Basic authentication (see Security for details), and the Bonsai dashboard. In addition, the isolated environment is suitable for encryption at rest and VPC Peering (for applications with stringent security requirements).

Furthermore, this class offer extremely flexible deployments and scaling. Need it in a region we don’t support on the multitenant class? No problem! Have a plugin or script that is vital to your operation? We can package it into our deployment! Let us know what you need, and we can provide a quote.

Also, because this class is highly secure and customizable, we are also able to support amendments to our terms of service and privacy policy, custom SLAs and more. Contact us for questions and quotes.

Trade Offs

This article details the differences between the two architectures we offer. There are a number of trade offs between these classes, summarized below:

<table>
<thead>
<tr><th>Class</th><th>Pros</th><th>Cons</th></tr>
</thead>
<tbody>
<tr><td>Multi-tenant</td><td>- Extremely cost effective
- Great performance for the money
- Can scale up or down on demand</td><td>- Limits on usage (disk, memory, connections, etc)
- Can not install and run arbitrary plugins
- Noisy neighbors*
- VPC pairing and at-rest encryption not available
- Subject to general terms of service and privacy policy</td></tr>
<tr><td>Single Tenant</td><td>- Extremely powerful
- No metering on usage
- Can deploy arbitrary plugins
- No noisy neighbors*
- Can have at-rest encryption
- VPC Pairing
- Custom terms, SLAs, etc</td><td>- More expensive to operate
- Scaling can be more difficult</td></tr>
</tbody>
</table>

* The Noisy Neighbor Problem is a well-known issue that occurs in multitenant architectures. In this context, one or more clusters may inadvertently monopolize IO resources (CPU, network, memory, etc), which can adversely affect other users on the same nodes. Bonsai actively monitors and addresses these situations when they come up, although the issue is frequently transient and resolves itself within a few minutes. Single tenant architectures do not suffer from this issue.

Why Not Containers / VMs?

A frequent question that comes up when talking about our various service architectures is why we don’t use container or virtualization technologies. A service that incorporates these technologies would offer some nice benefits, like allowing users to install their own plugins and manage their own snapshots.

The simple answer is “overhead.” Running containers – and especially VMs – requires system resources via some orchestration daemon or hypervisor. And simply running multiple instances of Elasticsearch on a node would require multiple JREs, which wastes resources through duplication.

Any resources that are allocated towards management of environments are therefore unavailable for Elasticsearch to use. In comparison to an architecture that doesn’t have this overhead, the provider must either offer less performance for the money, or charge more money for the same performance.

There is also a practical aspect to Bonsai eschewing containers and virtual machines. It is impossible to provide both great support and absolute customization. For example, users who install their own plugins can introduce a variety of regressions into Elasticsearch’s performance and behavior. When they open a support ticket, the agent must either spend time bug squashing or decline assistance altogether. Being opinionated allows our team to focus on depth of knowledge rather than breadth, which leads to faster, higher quality resolutions.

Finally, there is a philosophical motivation for how we built the service. We want to make Elasticsearch accessible to people at all stages of development; from the hobbyists and students, all the way up to the billion dollar unicorns. And we want to make sure that it’s the best possible experience. This means being opinionated about certain features, and taking a more active role in managing the infrastructure.

‍

Bonsai Architecture

Elasticsearch protects itself from data loss with the disk-based shard allocator. This mechanism attempts to strike a balance between minimizing disk usage across all nodes while also minimizing the number of shard reallocation processes. This ensures that all nodes have as much disk headroom as possible, with minimal impact to cluster performance.

Elasticsearch is constantly checking on each node’s disk availability to make decisions about where to allocate and move shards. There are 4 important“stages” that help dictate this decision making process:

Alert watermark. When a node in a single tenant cluster reaches 70% disk or higher, Bonsai sends out an alert to the user.
Low watermark. When a node reaches its low watermark stage(Bonsai defaults to 70% disk used), the cluster will no longer allocate new shards to this node.
High watermark. When a node reaches its high watermark stage(Bonsai defaults to 75% disk used), the cluster will actively try to move shards off the node.
Flood stage. When a node reaches its flood stage watermark(Bonsai uses 95% disk used), the cluster will put all open indices into a state that only allows for reads and deletes.

The low and high watermark stages are when Business and Enterprise tiers receive an emailed notification from the Bonsai Support team with suggestions for scaling disk capacity. Clusters that are over the 75% threshold can start to experience performance issues that may include but are not limited to: increased bulk queue times, higher CPU and/or load usage, or slower than normal searches.

‍

How Bonsai Manages Watermarks

At Bonsai, security and privacy are always a top concern. We are constantly evaluating the service and our vendors for vulnerabilities and flaws, and we will immediately address anything that could put our customers at risk. This is how we keep your data secure:

Access Controls All Bonsai clusters are provisioned with a unique, randomized URL and have HTTP Basic Authentication enabled by default, using a randomly generated set of credentials. Under this scheme, it would take the world’s fastest super computer around 23.5 quadrillion years to guess.
Encrypted communications. All Bonsai clusters support SSL/TLS for encryption in transit. We use industry standard strength encryption to ensure your data is safe over the wire.
Encrypted at rest. Bonsai clusters are provisioned on hardware that is encrypted at rest by default. In addition to Amazon’s physical security controls, this means your data is safe from physical theft.
Regular Snapshots. All paid Bonsai clusters receive regular snapshots, which are stored in an offsite, encrypted S3 bucket in the same region as the cluster.
Firewalled. All Bonsai clusters are accessed via a custom-built, high-performance layer 7 routing proxy, and sit behind a tightly controlled firewall. This helps to ensure that the cluster and data are protected from port scans and unauthorized persons.
Advanced Networking. Bonsai can support IP whitelisting, and VPC Peering to users on single tenant clusters.

‍

How Does Bonsai Secure Data?

Jekyll is a static site generator written in Ruby. Jekyll supports a plugin model that Searchyll uses to read your site’s content and then index it into an Elasticsearch cluster.

In this guide, we are going to use this feature to tell Jekyll to index all of the content into a configured instance of Elasticsearch.

First Steps

In order to make use of this documentation, you will need Jekyll installed and configured on your system.

Make sure you have Jekyll installed. This guide assumes you already have Jekyll installed and configured on your system. Visit the Jekyll Documentation to get started.
Spin up a Bonsai Elasticsearch Cluster. This guide will use a Bonsai cluster as the Elasticsearch backend. Guides are available for setting up a cluster through bonsai.io or Heroku.

Configure Jekyll to output to Bonsai Elasticsearch

Jekyll’s configuration systems live in a file called `_config` by default. Add the following snippet to the file:

We also need to add the gem `searchyll`

Push the Data Into Elasticsearch

To get the site's data into Elasticsearch, you can load it by simply running the Jekyll build command:

You should now be able to see your data in the Elasticsearch cluster:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ curl -XGET "https://:@my-awesome-cluster-1234.us-east-1.bonsai.io/_search"

{"took":1,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"hugo","_type":"doc","_id":...</code></pre>
</div>
</div>

Jekyll

Getting started with Ruby on Rails and Bonsai Elasticsearch is fast and easy with Searchkick. In this guide, we will start with a very basic Ruby on Rails application and add the bare minimum amount of code needed to support basic search with Elasticsearch. Users looking for more details and advanced usage should consult the resources at the end of this page.

Throughout this guide, you will see some code examples. These code examples are drawn from a very simple Ruby on Rails application, and are designed to offer some real-world, working code that new users will find useful. The complete demo app can be found in this GitHub repo.

<div class="callout-warning">
<h3>Warning</h3>
<p>SearchKick uses the official Elasticsearch Ruby client, which is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:</p>
<table>
<thead>
<tr><th>Engine</th><th>Version Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14+ (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>
<p>If you are receiving a <span class="inline-code warning">Elasticsearch::UnsupportedProductError</span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.</p>
</div>

<div class="callout-note">
<h3>Note</h3>
<p>In this example, we are going to connect to Elasticsearch using the Searchkick gem. There are also the official Elasticsearch gems for Rails, which are covered in another set of documentation.</p>
</div>

Step 1: Spin up a Bonsai Cluster

Make sure that there is a Bonsai Elasticsearch cluster ready for your app to interact with. This needs to be set up first so you know which version of the gems you need to install; Bonsai supports a large number of Elasticsearch versions, and the gems need to correspond to the version of Elasticsearch you’re running.

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Overview, in the Credentials tab:

Heroku users will also have a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable created when Bonsai is added to the application. This variable will contain the fully-qualified URL to the cluster.

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in Bonsai UI, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

You can also use the Interactive Console. In the Cluster Dashboard, click on the Console tab. It will load a default view, which includes the version of Elasticsearch. The version of Elasticsearch is called “number” in the JSON response:

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Install the Gem

To install Searchkick, you will need the searchkick gem. Add the following to your Gemfile outside of any blocks:

This will install the gem for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

<table>
<thead>
<tr><th>Elasticsearch Version</th><th>Searchkick Version</th></tr>
</thead>
<tbody>
<tr><td>1.x</td><td>-> 1.5.1</td></tr>
<tr><td>2.x</td><td>-> 2.5.0</td></tr>
<tr><td>5.x</td><td>-> 3.1.3 (additional notes)</td></tr>
<tr><td>6.x and up</td><td>-> 4.0 and up</td></tr>
</tbody>
</table>

If you need a specific version of Searchkick to accommodate your Elasticsearch cluster, you can specify it in your Gemfile like so:

Once the gem has been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span>.

Step 4: Add Searchkick to Your Models

Any model that you will want to be searchable with Elasticsearch will need to be configured to do so by adding the <span class="inline-code"><pre><code>searchkick</code></pre></span> keyword to it.

For example, our demo app has a User model that looks something like this:

Adding the searchkick keyword makes our User model searchable with Searchkick.

Searchkick provides a number of reasonable settings out of the box, but you can also pass in a hash of settings if you want to override the defaults. The hash keys generally correspond to the official Create Indices API. For example, this will allow you to create an index with 0 replicas:

If you have questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

You will need to set up a route to handle searching. The easiest way to do this with Searchkick is to have a search route per model. This involves updating your models’ corresponding controller, and defining routes in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>. You’ll also need to have some views that handle rendering the results, and a form that posts data to the controller(s). Take a look at how we implemented this in our demo app for some examples of how this is done:

Our Example

In our example Rails app, we have one model, <span class="inline-code"><pre><code>User</code></pre></span>, with <span class="inline-code"><pre><code>searchkick</code></pre></span>. It looks something like this:

To implement search, we updated the file <span class="inline-code"><pre><code>app/controllers/users_controllers.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class UsersController< ApplicationController
#... a bunch of controller actions, removed for brevity
def search
@results = User.search(params[:q])
end
#... more controller actions removed for brevity
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">Rails.application.routes.draw do
resources :users do
collection do
post :search # creates a route called users_search
end
end
end</code></pre>
</div>
</div>

Next, we need to have some views to render the data we get back from Elasticsearch. The <span class="inline-code"><pre><code>search</code></pre></span> controller action will be rendered by creating a file called <span class="inline-code"><pre><code>app/views/users/search.html.erb</code></pre></span> and adding:

<% if @results.present? %>
<%= render partial: 'search_result', collection: @results, as: :result %>
<% else %>
Nothing here, chief!
<% end %>

</code></pre>
</div>
</div>

This way if there are no results to show, we simply put a banner indicating as such. If there are results to display, we will iterate over the collection (assigning each one to a local variable called <span class="inline-code"><pre><code>result</code></pre></span>), and passing it off to a partial. We also created a file for a partial called <span class="inline-code"><pre><code>app/views/users/_search_result.html.erb</code></pre></span> and added:

<div class="code-snippet-container">
<a fs-copyclip-element="click-11" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
<%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

<%= result.company %>

<%= result.company_description %>

</code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

At this point, the <span class="inline-code"><pre><code>User</code></pre></span> model is configured for searching in Elasticsearch, and has routes for sending a query to Elasticsearch. The next step is to render a form so that a user can actually use this feature. This is possible with a basic <span class="inline-code"><pre><code>form_with</code></pre></span> helper.

In this demo app, we added this to the navigation bar:

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use with your application. The ERB scaffold should be easily adapted to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Searchkick Where Your Cluster is Located

Searchkick looks for an environment variable called <span class="inline-code"><pre><code>ELASTICSEARCH_URL</code></pre></span>, and if it doesn’t find it, it uses <span class="inline-code"><pre><code>localhost:9200</code></pre></span>. This is a problem because your Bonsai cluster is not running on a localhost. We need to make sure Searchkick is pointed to the correct URL.

Bonsai does offer a gem, bonsai-searchkick, which populates the necessary environment variable automatically. If you're using this gem, then all you need to do is ensure that there is an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> set in your application environment that points at your Bonsai cluster URL.

Heroku users will have this already, and can skip to the next step. Other users will need to make sure this environment variable is manually set in their application environment. If you have access to the host, you can run this command in your command line:

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:[email protected]:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

You are not using the bonsai-searchkick gem for some reason, OR
You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

If you need to do this, then you can create a file called <span class="inline-code"><pre><code>config/initializers/elasticsearch.rb</code></pre></span>. Inside this file, you will want to put something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-14" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-14" class="hljs language-javascript"># Assuming you can set the BONSAI_URL variable:
ENV["ELASTICSEARCH_URL"] = ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-15" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-15" class="hljs language-javascript"># Use your personal URL, not this made-up one:
ENV["ELASTICSEARCH_URL"] = "https://abcd123:[email protected]:443"</code></pre>
</div>
</div>

If you’re wondering why we prefer to use an environment variable instead of the URL, it’s simply a best practice. The cluster URL is considered sensitive information, in that anyone with the fully-qualified URL is going to have full read/write access to the cluster.

So if you have it in an initializer and check it into source control, that creates an attack vector. Many people have been burned by committing sensitive URLs, keys, passwords, etc to git, and it’s best to avoid it.

Additionally, if you ever need to change your cluster URL, updating the initializer will require another pass through CI and a deployment. Whereas you could otherwise just change the environment variable and restart Rails. Environment variables are simply the better way to go.

Step 7: Push Data into Elasticsearch

Now that the app has everything it needs to query the cluster and render the results, we need to push data into the cluster. There are a few ways to do this.

One method is to open up a Rails console and run <span class="inline-code"><pre><code>.reindex</code></pre></span>. So if you want to reindex a model called <span class="inline-code"><pre><code>User</code></pre></span>, you would run <span class="inline-code"><pre><code>User.index</code></pre></span>.

Another method is to use Rake tasks from the command line. If you wanted to reindex that same <span class="inline-code"><pre><code>User</code></pre></span> model, you could run: <span class="inline-code"><pre><code>bundle exec rake searchkick:reindex CLASS=User</code></pre></span>. Alternatively, if you have multiple Searchkick-enabled models, you could run <span class="inline-code"><pre><code>rake searchkick:reindex:all</code></pre></span>.

Regardless of how you do it, Searchkick will create an index named after the ActiveRecord table of the model, the environment, and a timestamp. So reindexing the <span class="inline-code"><pre><code>User</code></pre></span> model in a development environment might result in an index called <span class="inline-code"><pre><code>users_development_20191029111649033</code></pre></span>. This allows Searchkick to provide zero-downtime updates to settings and mappings.

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Searchkick. In our demo app, we have this simple list of users:

This search box is rendered by a form that will pass the query to the <span class="inline-code"><pre><code>UsersController#search</code></pre></span> action, via the route set up in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>:

This query will reach the <span class="inline-code"><pre><code>UsersController#search</code></pre></span> action, where it will be passed to Searchkick, which queries Elasticsearch. Elasticsearch will search the <span class="inline-code"><pre><code>users_development_20191029111649033</code></pre></span> index, and return any hits to a class variable called <span class="inline-code"><pre><code>@results</code></pre></span>.

The UsersController will then ensure the appropriate views are rendered. Each result will be rendered by the partial <span class="inline-code"><pre><code>app/views/users/_search_result.html.erb</code></pre></span>. It looks something like this:

Congratulations! You have implemented Searchkick in Rails!

Final Thoughts

This documentation demonstrated how to quickly get Elasticsearch added to a basic Rails application. We installed the Searchkick gem, added it to a model, set up the search route, and created the views and partials needed to render the results. Then we set up the connection to Elasticsearch and pushed the data into the cluster. Finally, we were able to search that data through our app.

Hopefully this was enough to get you up and running with Searchkick. This documentation is not exhaustive, and there are a lot of really cool features that Searchkick offers. There are other additional changes and customizations that can be implemented to make search more accurate and resilient.

You can find information on additional subjects in the section below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

‍

Searchkick

Here’s how to get started with Bonsai Elasticsearch and Ruby on Rails using Chewy.

Warning

Chewy is built on top of the official Elasticsearch Ruby client, which is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:

<table>
<thead>
<tr><th>Engine</th><th>Version Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14 (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>

EngineVersionHighest Compatible Gem VersionElasticsearch5.x7.13Elasticsearch6.x7.14+ ( sic)Elasticsearch7.x7.13OpenSearch1.x7.13

If you are receiving a <span class="inline-code"><pre><code>Elasticsearch::UnsupportedProductError</code></pre></span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.

Note

As of January 2021, Chewy supports up to Elasticsearch 5.x. Users wanting to use Chewy will need to ensure they are not running anything later than 5.x. Support for later versions is planned.

Add Chewy to the Gemfile

In order to use the Chewy gem, add the gem to the Gemfile like so:

Then run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install it.

Write an Initializer

A Chewy initializer will need to be written so that the Rails app can connect to your Bonsai cluster.

You can include your Bonsai cluster URL (located in Credentials of Bonsai dashboard) in the initializer, but hard-coding your credentials is not recommended. Instead, export them manually in the application environment to an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>.

We recommend using something like dotenv for this, but you can also set it manually like so:

Heroku users will not need to do so, as their Bonsai cluster URL will already be in a Config Var of the same name.

Create a file called <span class="inline-code"><pre><code>config/initializers/chewy.rb</code></pre></span>. With the environment variable in place, save the initializer with the following:

The official Chewy documentation has more details on ways to modify and refine Chewy’s behavior.

Configure Your Models

Any Rails model that you will want to be searchable with Elasticsearch will need to be configured to do so by creating a Chewy index definition and adding model-observing code.

For example, here’s how we created an index definition for our demo app in <span class="inline-code"><pre><code>app/chewy/users_index.rb</code></pre></span>:

Then, the User model is written like so:

You can read more about configuring how a model will be ingested by Elasticsearch in the official Chewy documentation.

Indexing Your Documents

The Chewy gem comes with some Rake tasks for getting your documents into the cluster. Once you have set up your models as desired, run the following in your application environment:

That will push your data into Elasticsearch and synchronize the data.

Next Steps

Chewy is a mature ODM with plenty of great features. Toptal (the maintainers of Chewy) have some great documentation exploring some of what Chewy can do: Elasticsearch for Ruby on Rails: A Tutorial to the Chewy Gem.

‍

Chewy

Getting started with Ruby on Rails and Bonsai Elasticsearch is fast and easy. In this guide, we will cover the steps and the bare minimum amount of code needed to support basic search with Elasticsearch. Users looking for more details and advanced usage should consult the resources at the end of this page.

Warning

The official Elasticsearch Ruby client is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:

Note

In this example, we are going to connect to Elasticsearch using the official Elasticsearch gems. There are other gems for this, namely SearchKick, which is covered in another set of documentation.

Step 1: Spin up a Bonsai Cluster

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Dashboard, in the Credentials tab:

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in Bonsai, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Add the Gems

There are a few gems you will need in order to make all of this work. Add the following to your Gemfile outside of any blocks:

This will install the gems for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

<table>
<thead>
<tr><th>Branch</th><th>Elasticsearch Version</th></tr>
</thead>
<tbody>
<tr><td>0.1</td><td>-> 1.x</td></tr>
<tr><td>2.x</td><td>-> 2.x</td></tr>
<tr><td>5.x</td><td>-> 5.x</td></tr>
<tr><td>6.x</td><td>-> 6.x</td></tr>
<tr><td>master</td><td>-> master</td></tr>
</tbody>
</table>

For example, if your version of Elasticsearch is 6.x, then you would use something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">gem 'elasticsearch-model', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'elasticsearch-rails', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'bonsai-elasticsearch-rails', github: 'omc/bonsai-elasticsearch-rails', branch: '6.x'</code></pre>
</div>
</div>

Make sure the branch you choose corresponds to the version of Elasticsearch that your Bonsai cluster is running.

What Do These Gems Do, Anyway?

elasticsearch-model. This gem is the only one that is actually required. It does the actual search integration for Ruby on Rails applications.
elasticsearch-rails. This optional gem adds some nice tools, such as rake tasks and ActiveSupport instrumentation.
bonsai-elasticsearch-rails. This optional gem saves you the step of setting up an initializer. By default, the Elasticsearch client will attempt to create a connection to <span class="inline-code"><pre><code>localhost:9200</code></pre></span>, which is not where your Bonsai cluster is located. This gem simply reads the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable which contains the cluster URL, and uses it to override the defaults.

Once the gems have been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install them.

Step 4: Add Elasticsearch to Your Models

Any model that you will want to be searchable with Elasticsearch will need to be configured. You will need to require the `elasticsearch/model` library and include the necessary modules in the model.

For example, this demo app has a <span class="inline-code"><pre><code>User</code></pre></span> model that looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
include Elasticsearch::Model
include Elasticsearch::Model::Callbacks
settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

Your model will need to have these lines included, at a minimum.

What Do These Lines Do?

<span class="inline-code"><pre><code>require 'elasticsearch/model'</code></pre></span>require 'elasticsearch/model' loads the Elasticsearch model library, if it has not already been loaded. This isn’t strictly necessary, provided the file is loaded somewhere at runtime.
<span class="inline-code"><pre><code>include Elasticsearch::Model</code></pre></span> tells the app that this model will be searchable by Elasticsearch. This is mandatory for the model to be searchable with Elasticsearch.
<span class="inline-code"><pre><code>include Elasticsearch::Model::Callbacks</code></pre></span> is an optional line, but injects Elasticsearch into the ActiveRecord lifecycle. So when an ActiveRecord object is created, updated or destroyed, it will also be created/updated/destroyed in Elasticsearch.
<span class="inline-code"><pre><code>settings index: { number_of_shards: 1 }</code></pre></span> This is also optional, but strongly recommended. By default, Elasticsearch will create an index for the model, with 5 primary shards and 1 replica. This will actually create 10 shards, and is ludicrously over-provisioned for most apps. This line simply overrides the default and specifies 1 primary shard(a replica will also be created by default).

You can optionally avoid (or increase) replicas by amending the settings like this:

If you have more questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

You will need to set up a route to handle searching. There are a few different strategies for this, but we generally recommend using a dedicated controller for it, unrelated to the model(s) being indexed and searched. This is a more flexible approach, and keeps concerns separated. You can always render object-specific partials if your results involve multiple models.

Our Example

In our example Rails app, we have one model, <span class="inline-code"><pre><code>User</code></pre></span>, with a handful of attributes. It looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
include Elasticsearch::Model
include Elasticsearch::Model::Callbacks
settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

To implement search, we created a file called <span class="inline-code"><pre><code>app/controllers/search_controller.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class SearchController< ApplicationController
def run
# This will search all models that have `include Elasticsearch::Model`
@results = Elasticsearch::Model.search(params[:q]).records
end
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

Next, we need to have some Views to render the data we get back from Elasticsearch. The run controller action will be rendered by creating a file called <span class="inline-code"><pre><code>app/views/search/run.html.erb</code></pre></span> and adding:

<div class="code-snippet-container">
<a fs-copyclip-element="click-10" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-10" class="hljs language-javascript"> Search Results
<% if @results.present? %>
<%= render partial: 'search_result', collection: @results, as: :result %>
<% else %>
Nothing here, chief!
<% end %></code></pre>
</div>
</div>

This way if there are no results to show, we simply put a banner indicating as such. If there are results to display, we will iterate over the collection (assigning each one to a local variable called <span class="inline-code"><pre><code>result</code></pre></span>), and passing it off to a partial. Create a file called <span class="inline-code"><pre><code>app/views/search/_search_result.html.erb</code></pre></span> and add:

<div class="code-snippet-container">
<a fs-copyclip-element="click-11" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
<%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

<%= result.company %>

<%= result.company_description %></code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

In this demo app, we added this to the navigation bar:

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use with your application. The ERB scaffold should be easily adapted to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Elasticsearch Where Your Cluster is Located

By default, the gems will try to connect to a cluster running on <span class="inline-code"><pre><code>localhost:9200</code></pre></span>. This is a problem because your Bonsai cluster is not running on a localhost. We need to make sure Elasticsearch is pointed to the correct URL.

If you are using the bonsai-elasticsearch-rails gem, then all you need to do is ensure that there is an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> set in your application environment that points at your Bonsai cluster URL.

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:[email protected]:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

You are not using the bonsai-elasticsearch-rails gem for some reason, OR
You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

<div class="code-snippet-container">
<a fs-copyclip-element="click-14" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-14" class="hljs language-javascript"># Assuming you can set the BONSAI_URL variable:
Elasticsearch::Model.client = Elasticsearch::Client.new url: ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-15" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-15" class="hljs language-javascript"># Use your personal URL, not this made-up one:
Elasticsearch::Model.client = Elasticsearch::Client.new url: "https://abcd123:[email protected]:443"</code></pre>
</div>
</div>

Step 7: Push Data into Elasticsearch

Assuming you have a database populated with data, the next step is to get that data into Elasticsearch.

Something to keep in mind here: Bonsai does not support lazy index creation, and even Elastic does not recommend using this feature. You’ll need to create the indices manually for each model that you want to search before you try and populate the cluster with data.

If you have the <span class="inline-code"><pre><code>elasticsearch-rails</code></pre></span> gem, you can use one of the built-in Rake tasks. To do that, create a file called <span class="inline-code"><pre><code>lib/tasks/elasticsearch.rake</code></pre></span> and simply put this inside:

<div class="code-snippet-container">
<a fs-copyclip-element="click-16" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-16" class="hljs language-javascript">require 'elasticsearch/rails/tasks/import'</code></pre>
</div>
</div>

Now you will be able to use a Rake task that will automatically create(or recreate) the index:

<div class="code-snippet-container">
<a fs-copyclip-element="click-17" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-17" class="hljs language-javascript">bundle exec rake environment elasticsearch:import:model CLASS='User'</code></pre>
</div>
</div>

You’ll need to run that Rake task for each model you have designated as destined for Elasticsearch. Note that if you’re relying on the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable to configure the Rails client, this variable will need to be present in the environment running the Rake task. Otherwise it will populate your local Elasticsearch instance(or raise some exceptions).

If you don’t use the Rake task, then you can also create and populate the index from within a Rails console with:

<div class="code-snippet-container">
<a fs-copyclip-element="click-18" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-18" class="hljs language-javascript"># Will return nil if the index already exists:
User.__elasticsearch__.create_index!

# Will delete the index if it already exists, then recreate:
User.__elasticsearch__.create_index! force: true

# Import the data into the Elasticsearch index:
User.import</code></pre>
</div>
</div>

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Elasticsearch. In our demo app, we have this simple list of users:

This search box is rendered by a form that will pass the query to the <span class="inline-code"><pre><code>SearchController#run</code></pre></span> action, via the route set up in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>:

This query will reach the <span class="inline-code"><pre><code>SearchController#run</code></pre></span> action, where it will be passed to Elasticsearch. Elasticsearch will search all of the indices it has (just <span class="inline-code"><pre><code>users</code></pre></span> at this point), and return any hits to a class variable called <span class="inline-code"><pre><code>@results</code></pre></span>.

The SearchController will then render the appropriate views. Each result will be rendered by the partial <span class="inline-code"><pre><code>app/views/search/_search_result.html.erb</code></pre></span>. It looks something like this:

Congratulations! You have implemented Elasticsearch in Rails!

Final Thoughts

This documentation demonstrated how to quickly get Elasticsearch added to a basic Rails application. We added the gems, configured the model, set up the search route, and created the views and partials needed to render the results. Then we set up the connection to Elasticsearch and pushed the data into the cluster. Finally, we were able to search that data through our app.

Hopefully this was enough to get you up and running with Elasticsearch on Rails. This documentation is not exhaustive. However, there are a number of other use cases not discussed here. There are other additional changes and customizations that can be implemented to make search more accurate and resilient.

You can find information on these additional subjects below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

‍

Ruby on Rails

Check Your Version

Haystack does not yet support Elasticsearch 6.x and up. Please ensure your Bonsai cluster version is 5.x or less.

Users of Django/django-elasticsearch-dsl can easily integrate with Bonsai Elasticsearch! This library is built on top of elasticsearch-dsl which is built on top of the low level elasticsearch-py maintained by Elastic.

Let’s get started:

Adding the libraries

You’ll need to add the django-elasticsearch-dsl library to your <span class="inline-code"><pre><code>requirements.txt</code></pre></span> file:

Full Instructions can be found here

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. We recommend adding the cluster URL to an environment variable, <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>, to avoid hard-coding your authentication credentials.

The following code is a good starter for integrating Bonsai Elasticsearch into your app:

Note about ports

The sample code above uses port 443, which is the default for the https:// protocol. If you’re not using SSL/TLS and want to use http:// instead, change this value to 80.

‍

Django with django-elasticsearch-dsl

Users of Django/Haystack can easily integrate with Bonsai Elasticsearch! We recommend using the official Python client, as it is being actively developed alongside Elasticsearch.

Note

Haystack does not yet support Elasticsearch 6.x or 7.x, which is the current default on Bonsai. Heroku users can provision a 5.x cluster by adding Bonsai via the command line and passing in a <span class="inline-code"><pre><code>--version</code></pre></span> flag like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">heroku addons:create bonsai --version=5</code></pre>
</div>
</div>

Note that some versions require a paid plan. Free clusters are always provisioned on the latest version of Elasticsearch, regardless of which version was requested. See more details here.

Let’s get started:

Adding the Elasticsearch library

You’ll need to add the elasticsearch-py and django-haystack libraries to your requirements.txt file:

Connecting to Bonsai

The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">ES_URL = urlparse(os.environ.get('BONSAI_URL') or 'http://127.0.0.1:9200/')

HAYSTACK_CONNECTIONS = {
'default': {
'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
'URL': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
'INDEX_NAME': 'haystack',
},
}

if ES_URL.username:
HAYSTACK_CONNECTIONS['default']['KWARGS'] = {"http_auth": ES_URL.username + ':' + ES_URL.password}</code></pre>
</div>
</div>

Note about ports

The sample code above uses port 443, which is the default for the https:// protocol. If you’re not using SSL/TLS and want to use http:// instead, change this value to 80.

Common Issues

One of the most common issues users see relates to SSL certs. Bonsai URLs are using TLS to secure communications with the cluster, and our certificates are signed by an authority (CA) that has verified our identity. Python needs access to the proper root certificate in order to verify that the app is actually communicating with Bonsai; if it can’t find the certificate it needs, you’ll start seeing exception messages like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Root certificates are missing for certificate</code></pre>
</div>

The fix is straightforward enough. Simply install the certifi package in the environment where your app is hosted. You can do this locally by running <span class="inline-code"><pre><code>pip install certifi</code></pre></span>. Heroku users will also need to modify their <span class="inline-code"><pre><code>requirements.txt</code></pre></span>, per the documentation:

I can’t install root certificates or certifi

If certifi and root cert management isn’t possible, you can simply bypass verification by modifying your <span class="inline-code"><pre><code>HAYSTACK_CONNECTIONS</code></pre></span> like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">HAYSTACK_CONNECTIONS = {
'default': {
'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
'URL': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
'INDEX_NAME': 'documents',
'KWARGS': {
'use_ssl': True,
'verify_certs': False,
}
},
}</code></pre>
</div>
</div>

This instructs Haystack to use TLS without verifying the host. This does allow for the possibility of MITM attacks, but the probably of that happening is pretty low. You’ll need to weigh the expediency of this approach against the unlikely event of someone eavesdropping, and decide whether there is an acceptable security risk of leaking data in that way.

‍

Django with Haystack

Warning

The official Elasticsearch Python client is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the client. This change prevents the Python client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch.

If you are receiving an error stating "The client noticed that the server is not a supported distribution of Elasticsearch," then you'll need to downgrade the client to 7.13 or lower.

Setting up your Python app to use Bonsai Elasticsearch is quick and easy. Just follow the steps below:

Adding the Elasticsearch library

You’ll need to add the elasticsearch library to your <span class="inline-code"><pre><code>Pipfile</code></pre></span> file like so:

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. The following code is a good starter for integrating Bonsai Elasticsearch into your app:

# Log transport details (optional):

logging.basicConfig(level=logging.INFO)

# Parse the auth and host from env:

bonsai = os.environ['BONSAI_URL']
auth = re.search('https\:\/\/(.*)\@', bonsai).group(1).split(':')
host = bonsai.replace('https://%s:%s@' % (auth[0], auth[1]), '')

# Optional port

match = re.search('(:\d+)', host)
if match:
p = match.group(0)
host = host.replace(p, '')
port = int(p.split(':')[1])
else:
port=443

# Connect to cluster over SSL using auth for best security:

es_header = [{
'host': host,
'port': port,
'use_ssl': True,
'http_auth': (auth[0],auth[1])
}]

# Instantiate the new Elasticsearch connection:

es = Elasticsearch(es_header)

# Verify that Python can talk to Bonsai (optional):

es.ping()

</code></pre></div></div>

Feel free to adapt that code to your particular app. Keep in mind that the core elements here can be moved around, but really shouldn’t be changed or further simplified.

For example, the snippet above parses out the authentication credentials from the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable. This is so the username and password don’t need to be hard coded into your app, where they could pose a security risk. Also, the <span class="inline-code"><pre><code>host</code></pre></span>, <span class="inline-code"><pre><code>port</code></pre></span> and <span class="inline-code"><pre><code>use_ssl</code></pre></span> parameters are important for SSL encryption to work properly. Simply using your <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> as the host will not work because of limitations in urllib. It needs to be a plain URL, with the credentials passed to the client as a tuple.

‍

Python

Getting Elasticsearch up and running with a PHP app is fairly straightforward. We recommend using the official PHP client, as it is being actively developed alongside Elasticsearch.

Adding the Elasticsearch library

Like Heroku, we recommend Composer for dependency management in PHP projects, so you’ll need to add the Elasticsearch library to your <span class="inline-code"><pre><code>composer.json file</code></pre></span>:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">{
"require": {
"elasticsearch/elasticsearch": "1.0"
}
}</code></pre></div></div>

Heroku will automatically add the library when you deploy your code.

Using the library

The elasticsearch-php library is configured almost entirely by associative arrays. To initialize your client, use the following code block:

Note that the host is pulled from the environment. When you add Bonsai to your app, the cluster will be automatically created and a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable will be added to your environment. The initialization process above allows you to access your cluster without needing to hardcode your authentication credentials.

Indexing your documents

Bonsai does not support lazy index creation, so you will need to create your index before you can send over your documents. You can specify a number of parameters to configure your new index; the code below is the minimum needed to get started:

Once your index has been created, you can add a document by specifying a body (an associative array of fields and data), target index, type and (optionally) a document ID. If you don’t specify an ID, Elasticsearch will create one for you:

‍

PHP

Bonsai Elasticsearch integrates with your node.js app quickly and easily, whether you’re running on Heroku or self hosting.

First, make sure to add the Elasticsearch.js client to your dependencies list:

package.json:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">"dependencies": {
"elasticsearch": "10.1.2"
....</code></pre></div></div>

You’ll also want to make sure the client is installed locally for testing purposes:

Self-hosted users: when you sign up with Bonsai and create a cluster, it will be provisioned with a custom URL that looks like https://user:[email protected]. Make note of this URL because it will be needed momentarily.

Heroku users: you’ll want to make sure that you’ve added Bonsai Elasticsearch to your app. Visit our addons page to see available plans. You can add Bonsai to your app through the dashboard, or by running:

Update your index.js file with the following:

index.js:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">var bonsai_url = process.env.BONSAI_URL;
var elasticsearch = require('elasticsearch');
var client = new elasticsearch.Client({
host: bonsai_url,
log: 'trace'
});

// Test the connection:
// Send a HEAD request to "/" and allow
// up to 30 seconds for it to complete.
client.ping({
requestTimeout: 30000,
}, function (error) {
if (error) {
console.error('elasticsearch cluster is down!');
} else {
console.log('All is well');
}
});</code></pre></div></div>

The code above does several things:

Pulls your Bonsai URL from the environment (you never want to hard-code this value – it’s a bad practice)
Adds the Elasticsearch.js library
Instantiates the client using your private Bonsai URL
Pings the cluster to test the connection

Self-hosted users: You will need to take an additional step and add your Bonsai URL to your server/dev environment. Something like this should work:

Heroku users: You don’t need to worry about this step. The environment variable is automatically populated for you when you add Bonsai Elasticsearch to your app. You can verify that it exists by running:

Go ahead and spin up your node app with <span class="inline-code"><pre><code>npm start</code></pre></span>. Heroku users can also push the changes to Heroku, and monitor the logs with <span class="inline-code"><pre><code>heroku logs --tail</code></pre></span>. Either way, you should see something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">Elasticsearch INFO: 2016-02-03T23:44:41Z
Adding connection to https://nodejs-test-012345.us-east-1.bonsai.io/

Elasticsearch DEBUG: 2016-02-03T23:44:41Z
Request complete

Elasticsearch TRACE: 2016-02-03T23:44:41Z
-> HEAD https://nodejs-test-012345.us-east-1.bonsai.io:443/?hello=elasticsearch

<- 200</code></pre></div>

The above output indicates that the client was successfully initiated and was able to contact the cluster.

Questions? Problems? Feel free to ping our support if something isn’t working right and we’ll do what we can to help out.

‍

Node.js

Bonsai Elasticsearch integrates with your .Net app quickly and easily, whether you’re running on Heroku or self hosting.

First, make sure to add the Elasticsearch.Net and NEST client to your dependencies list:

You’ll also want to make sure the client is installed locally for testing purposes:

Update your Main.cs file with the following:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">using System;
using Nest;

namespace search.con
{
class Program
{
static void Main(string[] args)
{
var server = new Uri(Environment.GetEnvironmentVariable("BONSAI_URL"));
var conn = new ConnectionSettings(server);
// use gzip to reduce network bandwidth
conn.EnableHttpCompression();
// lets the HttpConnection control this for optimal access (default: 80)
conn.ConnectionLimit(-1);
var client = new ElasticClient(conn);

var resp = client.Ping();
if (resp.IsValid)
{
Console.WriteLine("All is well");
}
else
{
Console.WriteLine("Elasticsearch cluster is down");
}
}
}
}
</code></pre></div></div>

The code above does several things:

Pulls your Bonsai URL from the environment (you never want to hard-code this value – it’s a bad practice)
References the Elasticsearch.Net and NEST libraries
Instantiates the client using your private Bonsai URL
Pings the cluster to test the connection

Self-hosted users: You will need to take an additional step and add your Bonsai URL to your server/dev environment variables. Something like this should work:

Go ahead and spin up your .Net app with <span class="inline-code"><pre><code>dotnet run</code></pre></span>. Heroku users can also push the changes to Heroku, and monitor the logs with <span class="inline-code"><pre><code>heroku logs --tail</code></pre></span>. Either way, you should see something like this:

Elasticsearch INFO: 2016-02-03T23:44:41Z
Adding connection to https://nodejs-test-012345.us-east-1.bonsai.io/

Elasticsearch DEBUG: 2016-02-03T23:44:41Z
Request complete

Elasticsearch TRACE: 2016-02-03T23:44:41Z
-> HEAD https://nodejs-test-012345.us-east-1.bonsai.io:443/?hello=elasticsearch

<- 200

The above output indicates that the client was successfully initiated and was able to contact the cluster.

Questions? Problems? Feel free to ping our support if something isn’t working right and we’ll do what we can to help out.

‍

.Net

No results were found for ' '.

Recommended docs

Compliance & Security

GDPR

Elasticsearch & OpenSearch Version Support

Which Versions Bonsai Supports

Getting Started

Upgrading and Changing a Cluster’s Plan

Thank you! Your submission has been received!

Oops! Something went wrong while submitting the form.

Upgrading Your Bonsai Cluster

Overview

Upgrade Process

Once this final restore has completed, the cluster will be running entirely on the new hardware, with no data loss.

Downtime and Effort

Zero Downtime for Searches:
- Your search operations will not experience any downtime during the upgrade process. This ensures continuous availability for read operations.
Short Window for Writes:
- There will be a brief period where write operations are paused. This typically lasts less than a minute, depending on the amount of data being transferred.
- An error message will be returned during this pause: 403 Forbidden along with a JSON response indicating that Cluster is currently read-only for maintenance. Please try your request again in a few minutes. See [status.bonsai.io](<http://status.bonsai.io/>) or contact [[email protected]](<mailto:[email protected]>) for updates.. It's essential to handle this gracefully in your application.
Data Volume Considerations:
- The time required for the upgrade depends on the size of your data and the rate of updates. Larger datasets may take slightly longer, but the write pause is generally kept under a minute.
Using a Queue and Buffering:
- As a general rule, we highly recommend using a queue to buffer write operations wherever possible. This could be something as simple as a Redis queue, kafka or similar that implements retries and exponential backoff for failed writes to the cluster. This ensures that transient network issues or other connection problems don’t cause lost writes.
- During the upgrade process, this kind of queue can be paused to ensure that any writes attempted during the brief readonly phase are retried and successfully processed once the cluster is upgraded.

‍

Let’s discuss how to change a cluster’s plan the Bonsai’s Cluster Dashboard.

‍
To begin, navigate to the cluster’s Plan under Settings on the cluster dashboard.

If you haven’t added billing information yet, please do so at Account Billing. Detailed steps can be found in Add a Credit Card.

A successfully changed plan will notify you at the top with Plan scheduled for update. The Plan has been upgraded to the Standard Micro plan in this example.

Can't Downgrade?

‍

Upgrading and Changing a Cluster’s Plan

Bonsai has created and supports a Terraform Provider to enable creation and management of Bonsai Elasticsearch and OpenSearch Clusters from within your infrastructure as code.

‍

Create an Elasticsearch or OpenSearch cluster with the Bonsai.io Terraform Provider

1. Create API Key and Secret

‍

To generate an API Key and Secret Token, head over to your account's API Tokens overview page.

‍

and click the "Generate Token" button.

‍

‍

More in-depth instructions are available at the Bonsai Documentation site!

2. Configure the Terraform Provider

Next, you'll need to configure the Bonsai Terraform Provider with the API Key and Token pair generated in the previous step.

‍

There are two options here:

‍

Implicitly configure the provider by setting the BONSAI_API_KEY and BONSAI_API_TOKEN environment variables in the environment that will execute your Terraform code, or,
Explicitly configure the provider by setting the api_key and api_token provider variables in terraform, as detailed below.

‍

Choose an option that best fits your use case, and you're ready to start creating Elasticsearch and OpenSearch clusters!

‍

# provider.tf

# Configure the Bonsai Provider using the required_providers stanza.
terraform {
  required_providers {
    bonsai = {
      source  = "omc/bonsai"
      version = "~> 1.0"
    }
  }
}

provider "bonsai" {
  # Default: The Bonsai Terraform Provider will fetch the api_key configuration
  # value from the BONSAI_API_KEY environment variable.
  #
  # Uncomment the following line to set directly via variable.
  # api_key = var.bonsai_api_key

  # Default: The Bonsai Terraform Provider will fetch the api_token configuration
  # value from the BONSAI_API_TOKEN environment variable.
  #
  # Uncomment the following line to set directly via variable.
  # api_token = var.bonsai_api_token
}

3. Configure the Terraform Resources

Configuring a Bonsai Elasticsearch or OpenSearch cluster is easy, we only need to make 4 decisions, reflected as Terraform Resource Attributes for our cluster:

‍

name: The human-readable name of the cluster.
plan: The subscription plan.
space: The space describing the server group and geographic location of the cluster.some text
- For AWS, these follow the format of omc/bonsai/$REGION/common, where $REGION is an AWS Region.
- For GCP, these follow the format of omc/bonsai-gcp/$REGION/common, where $REGION is a GCP Region.
release: The version of Elasticsearch or OpenSearch to be deployed.

‍

Here's an example of listing all available Plans, Releases, and Spaces.

‍

# data.tf

// Fetch all Available Bonsai Plans
data "bonsai_plans" "list" {}

// Fetch all Available Bonsai Spaces
data "bonsai_spaces" "list" {}

// Fetch all Available Bonsai Releases
data "bonsai_releases" "list" {}

// Output collected data
output "bonsai_spaces" {
  value = data.bonsai_spaces.list
}

output "bonsai_release" {
  value = data.bonsai_releases.list
}

output "bonsai_plans" {
    value = data.bonsai_plans.list
}

With that output, we can create an optimized OpenSearch cluster that matches our needs.

‍

# main.tf

resource "bonsai_cluster" "test" {    
    name = "Terraform Created Cluster"  
    plan = {  
        slug = "sandbox"  
    }  
      
    space = {  
        path = "omc/bonsai/us-east-1/common"  
    }  
      
    release = {  
       slug = "opensearch-2.6.0-mt"  
    }  
}

4. Configure the Terraform Outputs

And finally, we'll configure our infrastructure data output, so that we can reference our newly created cluster directly in the rest of our infrastructure.

Related note! If you're wondering what the sensitive = true configuration does below, it's required because the cluster's user and password are treated as sensitive outputs by the provider.

If you don't mark their output as sensitive, you'll receive an error to reduce the risk of accidentally exporting sensitive information as plain text!

‍

We also recommend encrypting your Terraform state at rest, and if possible, before storage.

output "bonsai_cluster_id" {  
    value = bonsai_cluster.test.id  
}  
  
output "bonsai_cluster_name" {  
    value = bonsai_cluster.test.name  
}  
  
output "bonsai_cluster_host" {  
    value = bonsai_cluster.test.access.host  
}  
  
output "bonsai_cluster_port" {  
    value = bonsai_cluster.test.access.port  
}  
  
output "bonsai_cluster_scheme" {  
    value = bonsai_cluster.test.access.scheme  
}  
  
output "bonsai_cluster_url" {  
    value = bonsai_cluster.test.access.url  
}  
  
output "bonsai_cluster_slug" {  
    value = bonsai_cluster.test.slug  
}  
  
output "bonsai_cluster_user" {  
    value = bonsai_cluster.test.access.user  
    sensitive = true  
}  
  
output "bonsai_cluster_password" {  
    value = bonsai_cluster.test.access.password  
    sensitive = true  
}  
  
output "bonsai_cluster_state" {  
    value = bonsai_cluster.test.state  
}  
  
output "bonsai_cluster_stats" {  
    value = bonsai_cluster.test.stats  
}  
  
output "bonsai_cluster_plan" {  
    value = bonsai_cluster.test.plan  
}  
  
output "bonsai_cluster_release" {  
    value = bonsai_cluster.test.release  
}  
  
output "bonsai_cluster_space" {  
    value = bonsai_cluster.test.space  
}

Next Steps & Additional Resources

‍

For additional details on managing your Bonsai resources with Terraform, check out our blog post, "Introducing Bonsai's Terraform Provider for Elasticsearch and OpenSearch".

‍

Check out the Provider on GitHub and the official HashiCorp Terraform Registry!

‍

Using Terraform with Bonsai

The Problem

Possible Solutions

Download your mappings.
Download a copy of your old cluster data, or design and implement indexing scripts to do it on your own.
Re-create your indices and the mappings on your new cluster.
Index your data on the new cluster.

elasticsearch-dump

The process for getting started is simple:

Download the library:

npm install elasticdump

Copy over mappings and data into a new cluster, either through a download and reindex or via urls.

Here’s an example of what a migration might look like:

# Backup index data to a file:
elasticdump \
  --input=https://key:[email protected]:443/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping

# Index the data into your cluster with the file:
elasticdump \
  --input=/data/my_index.json \
  --output=https://key:[email protected]:443/my_index \
  --type=data

You’ll need to use your cluster credentials to access your index from a terminal session. See our docs on Cluster Credentials here: https://bonsai.io/docs/credential-management.

Managing your own reindex

Much of what elasticdump does can be manually written if necessary, using curl or whatever language you prefer. For example, downloading mappings can be done using curl:

curl -XGET "https://key:[email protected]:443/_mapping?pretty=true" > mappings.json

And later, with a new cluster, you can PUT your new mappings to its corresponding index:

curl -XPUT "https://key:[email protected]:443/index_name/_mapping" \
  -H 'Content-Type: application/json' \
  -d @mappings.json

Further Resources

We’ve seen it all and are here to help. Please reach our to [email protected] and we’ll point you in the right direction. Cheers!

How to Download Elasticsearch and OpenSearch Data Without Snapshots

Bonsai Elasticsearch can be removed via the command line or Heroku’s app Dashboard.

Danger!

This will destroy all associated data and cannot be undone!

Using the Command Line

The Bonsai Add-on can be removed from the application via Heroku’s command line tool:

This will destroy the cluster and all of its data instantly.

Using the Heroku Dashboard

Next, click on the Resources tab:

Find the Bonsai Add-on in the list of application Add-ons, then click the carot menu:

Select “Delete Add-on” from the list of options. You will need to type in the app’s name to confirm the removal of the Bonsai Add-on:

Click on “Remove Add-on” to complete the process.

‍

How to remove Bonsai from your Heroku app

When it’s time to scale, you can easily upgrade your shared plan, or migrate to a dedicated cluster.

How to Update Your Cluster Plan

Note on plan changes

Most plan changes take effect instantly. However, if you’re trying to upgrade or downgrade across an architecture class, there will be a delay in processing.

Updating your cluster plan with Heroku is fairly simple. You can do this in either of two ways:

Through the Heroku CLI tool
Through the Heroku UI - your app dashboard

Using the Command Line

You can view all of our available plans for Heroku users on the Bonsai Heroku Add-on page:

If you have several applications with Bonsai Add-ons, you can specify which one you would like to upgrade or downgrade with the <span class="inline-code"><pre><code>-a</code></pre></span> flag:

Using the Heroku app Dashboard

Log into your Heroku account and open your app dashboard. In this example, the app is called <span class="inline-code"><pre><code>mycoolelasticsearchapp</code></pre></span>:

Click on the Resources tab to view your Add-ons:

You’ll see a list of your Add-ons, with a carot menu on the far right side.

Clicking on the carot icon will open up a dropdown menu. Select <span class="inline-code"><pre><code>Modify Plan</code></pre></span> to open a modal, and you can choose a new plan for your cluster:

How Heroku Handles Billing

Heroku handles all of the billing, and prorates by the second.

630s on a $50/mo plan = $0.012
2970s on a $150/mo plan = $0.17
6300s on a $50/mo plan = $0.12
Total: $0.30

This amount would be added to the customer’s next invoice from Heroku.

Migrating Across Architectures

Alternately, if a customer downgrades from a “Dedicated” plan to a “Shared” plan, a data migration will need to be performed, and the old servers torn down.

The time required for these migrations can vary. If you have questions or concerns, please send us a note at [email protected]

‍

Scale your cluster in Heroku

Our Heroku Add-on

We offer a free Hobby plan for development and testing on Heroku. A list of all available plans and capacities can be found at https://elements.heroku.com/addons/bonsai.

There are two ways to add Bonsai to your Heroku app:

1. Through the Heroku CLI tool

2. Through the Heroku app dashboard

Using the Heroku CLI tool

You can add Bonsai to your app with this command:

You can verify that the operation was successful by running <span class="inline-code"><pre><code>heroku config:get BONSAI_URL</code></pre></span>:

If this value is null, then Bonsai was not properly added to your account. Try again and look for error messages. If you still have problems, give us a shout.

Once the Add-on has been successfully added, you can check on the status of your cluster by running <span class="inline-code"><pre><code>$ heroku addons:open bonsai -a</code></pre></span>

Specifying a Search Engine

Allowed values are: "elasticsearch" and "opensearch".

Adding a specific version of a Search Engine

There is a list of available versions documented here: Supported Elasticsearch Versions. In short, the options available will be determined by three factors:

Region. Your cluster will be provisioned in the same AWS region where your Heroku app is located. However, some regions support more versions than other regions, based on demand.
Plan. Versions are rolled out to Hobby and Staging tier plans in advance of Production grade plans. Additionally, older versions of Elasticsearch may only be supported on Business tier plans.
Age. Bonsai will occasionally deprecate older versions of Elasticsearch. If you need a version of Elasticsearch that is no longer supported, the only way to get it will be to upgrade to a Business plan.

To view a list of our current supported regions, see our documentation here:

Bonsai Supported Search Engine Versions

For further help with your app.json and addons, see Heroku’s documentation here:

Heroku App JSON Schema

Using the Heroku dashboard

To add Bonsai through the Heroku UI, open up your application in the Heroku dashboard

When you are ready, click on “Provision.” Your new cluster will be instantly created. Your dashboard will now show this:

There will be a section called “Config Vars”:

Click on “Reveal Config Vars” to see your environment variables. This will reveal the value of each variable like so:

You can copy the contents of the BONSAI_URL to your clipboard and paste it into a browser or curl command to see the response from Elasticsearch:

HTTP 401: Authentication required

Keep Your URL Secret

Bad: “My cluster is https://username:[email protected]”

Good: “My cluster is somehost-1234567”

If your URL is leaked somehow, you can regenerate the credentials.

Add Bonsai to your Heroku app

An event like this is handled as a Severity 1 incident.

How much time will it take to restore?

When an entire Bonsai cluster fails to come up (for example with 5 TB of disk usage), how does Bonsai restore the cluster from a backup and how much time will it take?

An event like this is handled as a Severity 1 incident.

When multiple nodes across different Availability Zones for a Bonsai cluster falters for some reason, what is the likely impact to availability of the system and what are the possible recovery steps?

A Bonsai cluster could experience a complete loss of one AWS data center, and the cluster will still continue to operate. This makes Bonsai clusters extremely fault-tolerant.

An event like this is handled as a Severity 1 incident.

When a single node on a Bonsai cluster falters for some reason, what is the likely impact to availability of the system and what are the possible recovery steps?

When increasing disk capacity by adding new nodes or changing an instance type on a Bonsai cluster, what is the likely impact to availability of the system and what are the possible recovery steps?

There are 2 options for a minor version upgrade.

‍Option 1 - Enterprise Plans

‍Option 2 - Non-Enterprise Plans

Version upgrades happen instantaneously
Can be performed with zero downtime
Offers recovery steps depending on your use case’s approach

When upgrading a Bonsai cluster’s Elasticsearch or OpenSearch minor version, what is the likely impact to availability of the system and what are the possible recovery steps?

There are 2 options for a major version upgrade.

‍Option 1 - Enterprise Plans

Option 2 - Non-Enterprise Plans

Version upgrades happen instantaneously
Can be performed with zero downtime
Offers recovery steps depending on your use case’s approach

When upgrading a Bonsai cluster’s Elasticsearch or OpenSearch major version, what is the likely impact to availability of the system and what are the possible recovery steps?

Yes.

Does Bonsai Support HTTP/2?

No. HTTP Pipelining is not supported by Bonsai's platform, and instead Bonsai supports HTTP 2 instead.

Does Bonsai Support HTTP Pipelining?

‍Groovy Errors

Does Bonsai Support Groovy Scripting?

Yes! CORS is enabled for all clusters by default. Users will not need to perform any additional configuration on their clusters in order to set it up.

Does Bonsai Provide CORS Support?

Windows

Elasticsearch can also be run as a service in Windows.

Linux

Arch Linux

Some distributions have preconfigured Elasticsearch binaries available through repositories. Arch Linux, for example, offers it through the community repo, and can be easily installed via pacman:

This package also comes with a systemd service file for starting/stopping Elasticsearch with <span class="inline-code"><pre><code>sudo systemctl elasticsearch.service</code></pre></span>.

Ubuntu and Debian-flavors

Red Hat / Suse / Fedora / RPM

Elasticsearch does provide an RPM file for installing Elasticsearch on distros using rpm:

<span class="inline-code"><pre><code>rpm</code></pre></span> should handle all of the dependency checks as well, so it will tell you if there is something missing.

Testing the Install

The <span class="inline-code"><pre><code>curl</code></pre></span> request and Elasticsearch response should look something like this:

If you see this response, then your local Elasticsearch cluster is up and running! If not, review the documentation for getting Elasticsearch up and running on your operating system and try again.

‍

Testing Elasticsearch Locally

Private Networks

This URL has the following parts:

Protocol (ex: https). The protocol to use for communicating with Elasticsearch. This can be either HTTP or HTTPS. The primary difference between the two is that HTTPS is encrypted, whereas HTTP is not. Bonsai defaults to HTTPS to ensure that your data can not be read in transit.
Authentication Credentials (ex: a1b2c3d4e:5f6g7h8i9). This is a randomly-generated username/password pair. By default, you will need to include these credentials with every request to your cluster, otherwise Bonsai will return an <span class="inline-code"><pre><code>HTTP 401: Authorization Required</code></pre></span> message.
Hostname (ex: something-1234567). The hostname will be a concatenated string generated by your cluster's name, and a random number. This provides some security through obscurity by making guesses and enumeration extraordinarily difficult, as well as avoiding name collisions.
Region (ex: us-east-1). This is the region in which the cluster is provisioned. There are a number of possible regions for this, and we use the AWS naming conventions to indicate where the cluster is located.
Domain (ex: bonsaisearch.net). Bonsai clusters can be accessed via one of two domains: bonsai.io and bonsaisearch.net. This is to avoid the rare case of a TLD outage; if the .io or .net TLD is down for some reason, the cluster can still be accessed.

Let's examine these in greater detail.

Protocols and Ports

All Bonsai clusters default to secure HTTPS using recent versions of TLS for encrypted communication. These connections should use port 443, the standard for SSL/TLS.

We also support unencrypted HTTP access over ports 80 and 9200.

You can see a table describing the protocols and ports available here:

Authentication Credentials

Credentials are Random

The cluster credentials are not the same as what you would use to log into your Bonsai/Heroku account. These are randomly generated when the cluster is created.

HTTP 401: Authentication required

Keep Your URL Secret

Bad: "My cluster is https://username:[email protected]"

Good: "My cluster is somehost-1234567"

If your URL is leaked somehow, you can regenerate the credentials. See the dashboard documentation for more information.

Hostname

Hostnames and Support

Region

All Bonsai clusters have a region slug included in the URL. Some example slugs might be:

us-east-1
us-west-2
eu-west-1
... and so on.

Domain

All Bonsai clusters can be accessed via either of two domains: bonsai.io or bonsaisearch.net. In other words, both of these URLs will work:

Connecting to Elasticsearch via Curl

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">curl https://a1b2c3d4e:[email protected]
{
"name" : "PvRcoFq",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "DNlbVYS0TIGYwbQ6CUNwTw",
"version" : {
"number" : "5.4.3",
"build_hash" : "eed30a8",
"build_date" : "2017-06-22T00:34:03.743Z",
"build_snapshot" : false,
"lucene_version" : "6.5.1"
},
"tagline" : "You Know, for Search"
}
</code></pre>
</div>
</div>

Curl can be used to create and remove indices, add data, check on cluster state and more:

Connecting to Elasticsearch via Browser

HTTP 401: Authorization required When Using a Browser

If you're using a browser to interact with Elasticsearch and you are seeing HTTP 401 errors, try to copy/paste the credentials back into the address bar.

This method is good for reading the cluster state, but not so great for creating indices, updating data, etc.

Connecting to Elasticsearch via Interactive Console

It should be noted that Bonsai offers Private Spaces and when a space is private the Interactive Console is not accessible.

‍

Connecting to Bonsai

If you’re already familiar with the basics, we have a blog post / white paper on The Ideal Elasticsearch Index, which has a ton of information and things to think about when creating an index.

Important Note on Index Auto-Creation

Manually Creating an Index

The second is through the Interactive Console. The Interactive Console is a feature provided by Bonsai and found in your cluster dashboard.

Let’s create an example index called <span class="inline-code"><pre><code>acme-production</code></pre></span> from the command line with curl.

Authentication required

Updating the Index Settings

We can inspect the index settings with a GET call to <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> like so:

Let’s add a replica to the index:

Now, when we re-query the <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> endpoint, we can see that there are now two replicas, where before there was only one:

Similarly, if we wanted to remove all the replicas, we could simply modify the JSON payload like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">$ curl -XPUT http://user:[email protected]/acme-production/_settings -d '{"index":{"number_of_replicas":0}}'
{"acknowledged":true}

$ curl -XGET http://user:[email protected]/_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open acme-production 1 0 0 0 318b 159b</code></pre>
</div>
</div>

Adding Data to Your Index

Let’s insert a “Hello, world” test document to verify that your new index is available, and to highlight some basic Elasticsearch concepts.

Dynamic Mapping Can Be Dangerous

We can see the mapping that Elasticsearch generated by using the <span class="inline-code"><pre><code>_mapping</code></pre></span> API:

Alternatively, you can see it in the search results with the <span class="inline-code"><pre><code>_search</code></pre></span> endpoint:

Check the _source

The _source field does also add some overhead. It can be disabled in the mappings. See the Elasticsearch documentation for more details.

Destroying Your Index

When you have decided you no longer need the “acme-production” index, you can destroy it with a one liner:

There’s No ‘Undo’ Button

‍

Creating Your First Index

General Solutions

From 1.x

‍

Upgrading to ES 2

General Solutions

<span class="inline-code"><pre><code>text</code></pre></span>: you want full text search analysis on

<span class="inline-code"><pre><code>keyword</code></pre></span>: you want to search for the value exactly as is, a PO Number, or url for example

From 2.x

From 1.x

‍

Upgrading to ES 5

General Solutions

Rename all remaining mapping types to <span class="inline-code"><pre><code>_doc</code></pre></span> to align with the future default

Use 1 index per document type.

Custom Type Field

From 5.x

After working through the General Solutions above, we recommend you do the following:

upgrading to at least v5.6 and then turning on <span class="inline-code"><pre><code>index.mapping.single_type: true</code></pre></span> to verify you haven't missed anything.

From 2.x

From 1.x

‍

Upgrading to ES 6

General Solutions

Rename all remaining mapping types to <span class="inline-code"><pre><code>_doc</code></pre></span> to align with the future default

Use 1 index per document type.

Custom Type Field

From 6.x

After working through the General Solutions above, we recommend you do the following:

When creating indices in v6 start setting <span class="inline-code"><pre><code>include_type_name=false</code></pre></span> to get ready for the ES7 behavior

From 5.x

After working through the General Solutions above, we recommend you do the following:

upgrading to at least v5.6 and then turning on <span class="inline-code"><pre><code>index.mapping.single_type: true</code></pre></span> to verify you haven't missed anything.

From 2.x

From 1.x

‍

Upgrading to ES 7

Do you have a Business Continuity Plan implemented?

How does your platform prevent service interruption due to catastrophic events?

How does your team handle catastrophic events, such as a natural disaster or global pandemic ?

If you have any questions about any of this, please don’t hesitate to reach out to [email protected].

‍

Bonsai's Business Continuity Plans

Last updated: 20-Jan-2021

What's the Bottom Line for Me as a Bonsai Customer?

What is the SSPL?

What Does This Mean to Me as a Bonsai Customer?

Is Bonsai Going to Shut Down / Drop Elasticsearch?

Will Bonsai Slowly Become Obsolete?

No. We are working out the details for our service’s continued growth and development alongside Elasticsearch.

Why Is Elastic Doing This?

I Have Another Question / Concern

‍

Bonsai SSPL FAQ

Because of this ruling, Bonsai will now be charging sales tax to customers in a variety of states. Here’s what you should know:

Is my state going to charge me sales tax?

Does this apply to all Bonsai plans sold in these states?

Our Hobby Tier will remain free. Standard, Business, and Enterprise plans will all be subject to sales tax.

Plans sold through Heroku, or other third party channels will be taxed through those providers. Please follow up with your reseller for more information.

When does Bonsai plan to charge these taxes?

These taxes will be collected starting in January 2020.

Will these taxes be charged retroactively?

There is no indication that this will be necessary.

How do I opt-in to get charged for my state's sales tax?

Follow our step by step guide for Account Settings Billing.

If I have questions, who should I ask?

‍

Bonsai's Policy on Collecting Sales Tax

Credits

Customers may downgrade a cluster's plan or destroy it at any time to receive a prorated service credit. This credit will be automatically applied to the next billing cycle.

Refunds

Extenuating Circumstances?

Customers with extenuating circumstances should reach out to [email protected] and describe their situation for consideration.

‍

Refund & Credit Policy

If you would like to request a SOC 2 Type 2 Report for SOC 2 compliance, please email us at [email protected].

Once we recieve your request, a PDF of the report will be emailed.

‍

How to get a SOC 2 Type 2 Report

Any Bonsai employee who suspects that a data breach has occurred must immediately notify the CTO and CEO. The CTO will form a response team to handle the incident.

The Bonsai response team will investigate all reported data breaches to confirm if a data exfiltration has occurred. Once confirmed, the response team will perform the following steps:

Notify impacted customers within an hour of confirmation by email
Coordinate with customers to identify the correct immediate course of action
Identify the root cause of the incident, and then work to remediate the issue

To report a data breach or suspicious activity, email us at [email protected] with a description of events and observations.

‍

Bonsai's Policy on Handling Data Breaches

How is Bonsai GDPR Compliant?

Performed a software audit focused on collection and use of customer data, and purged some metrics deemed unnecessary for business.
Removed Facebook tracking pixels and replaced with GDPR-compliant frontend components.
Removed third party integrations that do not comply with GDPR.
Created a process for our European customers to sign a Data Processing Agreement (DPA) with Bonsai. To sign a DPA, please reach out to [email protected].

Bonsai's Privacy Policy

Any changes to our policies will be alerted to you via email.
Personal Information Section. This outlines how we use PII (personal identifiable information).
Purposes of Personal Information Processing Activities Section. This section identifies where we use your information to denote GDPR compliance.
How We Share Your Personal Information with Third Parties. We do not share your personal information unless there is a legitimate business interest to do so (Article 6.1). Historically this has always been true at Bonsai, but now it is explicitly stated in our policy.
How You Can Access or Change Your Personal Information. This section explains how to update or remove your information.
Cookies. This section outlines what cookies are, what kind we use, and why Bonsai uses them. It also provides resources for removing cookies from your browsers and opting out of advertisements.
Data Security, Integrity and Retention. This is an updated and renamed section from the previously named "Protection of PII."
Retention of Personal Information. Specifies the length of time Bonsai retains your information.
Third-Party Links on Our Websites. Covers any case in which Bonsai might link out to another site that may not have the same level of protection of personal information as Bonsai.
Children’s Personal Information. Clarifies that Bonsai does not collect information on minors, and only allow the use of our Services for individuals over 18.
International Transfer of Information. Bonsai's adherence to privacy laws in the countries where our users and their data is located.
International Users & Data. This is a big section that accounts for a lot of the changes for GDPR. It outlines the rights of EEA citizens, how Bonsai will adhere to them, and covers some abnormal cases that would prevent fulfilling those rights - such as legal proceedings or investigations, which are highly unlikely.

More Information

As always, we’re here to address any questions you may have, so please email us at [email protected] if you need any clarification.

‍

GDPR

Once we receive your request, a DPA will be emailed via HelloSign for review and signature.

‍

How to get a DPA

If you would like to request a Business Associate's Agreement (BAA) for HIPAA compliance, please email us at [email protected].

‍

How to get a BAA

An HTTP 404: Not Found error is returned by the API when a requested resource can not be found. This can happen for one of several reasons:

You are attempting to hit an endpoint that does not exist (like https://api.bonsai.io/covfefe).
You are attempting to access some information that is not available to your account.

Example

An HTTP 404: Not Found error may look something like this:

The <span class="inline-code"><pre><code>"status": 404</code></pre></span> key confirms the error is an HTTP 404: Not Found error.

Troubleshooting

If you're still unsure why you are receiving the error, then please shoot us an email.

‍

API Error 404: Not Found

Some examples of situations where a user might see an HTTP 403: Forbidden response from the API:

We have detected activity on your account that appears fraudulent or highly suspicious, and have blocked access to the API until further notice.
We have identified a Terms of Service violation and are blocking access until it is resolved.
The API may be down for maintenance.

Example

A call to the API which results in an HTTP 403: Forbidden response may look something like this:

The <span class="inline-code"><pre><code>"status": 403</code></pre></span> key indicates that the error is indeed an HTTP 403: Forbidden error.

Troubleshooting

If the error indicates a temporary interruption, such as maintenance mode, then check out our Twitter account for updates, or shoot us an email.

‍

API Error 403: Forbidden

An HTTP 402: Payment Required error occurs when your account is past due and you try to make a request to the API. Bonsai only provides API access to accounts which are up to date on payments.

Example

A call to the API which results in an HTTP 402: Payment Required error may look something like this:

‍

The <span class="inline-code"><pre><code>"status": 402</code></pre></span> key indicates the HTTP 402: Payment Required error.

Troubleshooting

If everything seems correct with your account, you can always contact support and we will be glad to assist.

‍

API Error 402: Payment Required

An HTTP 422: Unprocessable Entity error occurs when a request to the API can not be processed. This is a client-side error, meaning the problem is with the request itself, and not the API.

If you are receiving an HTTP 422: Unprocessable Entity error, there are several possibilities for why it might be occurring:

You are sending in a payload that is not valid JSON
You are sending HTTP headers, such as <span class="inline-code"><pre><code>Content-Type</code></pre></span> or <span class="inline-code"><pre><code>Accept</code></pre></span>, which specify a value other than <span class="inline-code"><pre><code>application/json</code></pre></span>
The request may be valid JSON, but there is something wrong with the content. For example, instructing the API to upgrade a cluster to a plan which does not exist.

Example

A call to the API that results in an HTTP 422: Unprocessable Entity error may look something like this:

The <span class="inline-code"><pre><code>"status": 422</code></pre></span> key indicates the HTTP 422: Unprocessable Entity error.

Troubleshooting

If all else fails, you can always contact support and we will be glad to assist.

‍

API Error 422: Unprocessable Entity

If you are receiving an HTTP 429: Too Many Requests error, then you are hitting the API too frequently. The API documentation introduction describes which limits are in place.

Example

A call to the API that results in an HTTP 429: Too Many Requests error may look something like this:

Troubleshooting

The first step is to look at how often you are polling the API. If you're checking the API while waiting for a cluster to provision or update, then every 3-5 seconds should be adequate.

If all else fails, you can always contact support and we will be glad to assist.

‍

API Error 429: Too Many Requests

An HTTP 401: Unauthorized error occurs when a request to the API could not be authenticated. All requests to API resources must use some authentication scheme to prove access rights to the resource.

If you are receiving an HTTP 401: Unauthorized error, there are several possibilities for why it might be occurring:

The authentication credentials are missing
The authentication credentials are incorrect
The authentication credentials belong to a token that has been revoked

Check that the authentication credentials you are passing along in the request are correct and belong to an active token.

Example

A call to the API that results in an HTTP 401: Unauthorized error may look something like this:

The <span class="inline-code"><pre><code>"status": 401</code></pre></span> key indicates the HTTP 401: Unauthorized error.

Troubleshooting

The first thing to do is to carefully read the list of errors returned by the API. This will often include some hints about what is happening:

If you're sure that the credentials are correct, then you may want to try isolating the problem. Try making a curl call to the API and see what happens. For example, using Basic Auth:

If the request succeeds, then you have eliminated the API Token as the source of the problem, and it's likely an issue with how the application is making the call to the API.

If all else fails, you can always contact support and we will be glad to assist.

‍

API Error 401: Unauthorized

Alpha Stage

The Bonsai API supports HTTP Basic Authentication over TLS >= 1.2 as one means for authenticating requests. The authentication protocol utilizes an Authorization header, with the contents: Basic .

For Basic Auth, the token key corresponds to the“user” parameter and the token secret corresponds to the“password” parameter.

‍

Using Basic Authentication

Alpha Stage

Which Scheme Should I Use?

Failed Authentication

Requests that do not have the proper authentication will receive an HTTP 401: Not Authorized response. This can happen for a variety of reasons, including(but not limited too):

The token itself has been revoked
The token key can not be found(perhaps due to a typo)
The token secret does not match the secret provided
One or more HTTP headers have been miscalculated
(When using HMAC) the X-BonsaiApi-Time timestamp deviates more than 60 seconds from the server time
Some other filtering rule has been violated

If you are having trouble authenticating your requests to the API, please reach out to [email protected].

‍

Authentication

Alpha Stage

HMAC

This authentication protocol requires that all API requests include three HTTP headers:

X-BonsaiApi-Time. The current Unix time – seconds since epoch. This value helps to guarantee uniqueness over time, and must be within one minute of our server time to prevent replay attacks.

X-BonsaiApi-Key. The API token’s key, as generated within the Bonsai application. This key will also have a corresponding secret.

X-BonsaiApi-Hmac. The hexadecimal HMAC-SHA1 digest of your shared secret and the concatenation of the above time and public key.

For example, in Ruby, the X-BonsaiApi-Auth header can be computed as: OpenSSL::HMAC.hexdigest('sha1', token_secret, "#{time}#{token_key}"), where the token_secret is the API key’s secret.

‍

Using HMAC Authentication

Alpha Stage

The Releases API provides users a method to explore the different versions of Elasticsearch available to their account. This API supports the following actions:

View all available releases for your account
View a single release for your account

All calls to the Releases API must be authenticated with an active API token.

The Bonsai Release Object

The Bonsai API provides a standard format for Release objects. A Release object includes:

View all available releases

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/releases</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the releases available to your account:

View a single release

The Bonsai API provides a method to get information about a single release available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/releases/[:slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Release object:

Releases API Introduction

Alpha Stage

The Spaces API provides users a method to explore the server groups and geographic regions available to their account, where clusters may be provisioned. This API supports the following actions:

View all available spaces for your account
View a single space for your account

All calls to the Spaces API must be authenticated with an active API token.

The Bonsai Space Object

The Bonsai API provides a standard format for Space objects. A Space object includes:

provider. A String representing a machine-readable name for the cloud provider in which this space is deployed.
region. A String representing a machine-readable name for the geographic region of the server group.

</td>
</tr>
</tbody>
</table>

View all available spaces

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/spaces</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the spaces available to your account:

View a single space

The Bonsai API provides a method to get information about a single space available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/spaces/[:path]</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Space object:

Spaces API Introduction

Alpha Stage

The Plans API gives users the ability to explore the different cluster subscription plans available to their account. This API supports the following actions:

View all plans available for your account.
View a single plan available for your account.

All calls to the Plans API must be authenticated with an active API token.

The Bonsai Plan Object

The Bonsai API provides a standard format for Plan objects. A Plan object includes:

View all plans

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/plans</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the Plans available to your account:

View a single plan

The Bonsai API provides a method to retrieve information about a single Plan available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/plans/[:plan-slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Plan object:

Plans API Introduction

Alpha Stage

The Clusters API provides a means of managing clusters on your account. This API supports the following actions:

View all clusters in your account
View a single cluster in your account
Create a new cluster
Update a cluster in your account
Destroy a cluster in your account

All calls to the Clusters API must be authenticated with an active API token.

The Bonsai Cluster Object

The Bonsai API provides a standard format for Cluster objects. A Cluster object includes:

slug. The unique, machine-readable name of the plan.
uri. A URI to retrieve more information about this plan.

version. The version of the release this cluster is running on.
slug. The unique slug of the release.
package_name. The package name of the release.
service_type. The name of the search service.
uri. A URI to retrieve more information about this plan.

path. The path to the space. This string maps to a geographic region or data center.
region. The geographic region in which the cluster is running.
uri. A URI with more information about the space

docs. The number of documents in the index.
shards_used. The number of shards the cluster is using.
data_bytes_used. Integer representing the number of bytes the cluster is using on disk.

host. The host name of the cluster.
port. The HTTP port the cluster is running on.
scheme. The HTTP scheme needed to access the cluster (defaults to "https")

</td>
</tr>
<tr>
<td>state</td><td>A String representing the current state of the cluster. This indicates what the cluster is doing at any given moment. There are 8 defined states:

DEPROVISIONED. The cluster has been destroyed.
DEPROVISIONING. The cluster is in the process of being destroyed.
DISABLED. The cluster has been disabled.
MAINTENANCE. The cluster is in maintenance mode.
PROVISIONED. The cluster has been created and is ready for use.
PROVISIONING. The cluster is in the process of being created.
READONLY. The cluster is in read only mode.
UPDATING PLAN. The cluster's plan is being updated.

</td>
</tr>
</tbody>
</table>

View all clusters

Supported Parameters (Query String Parameters)

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/clusters</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an HTTP 200: OK code, along with a JSON list representing the clusters on your account:

View a single cluster

The Bonsai API provides a method to retrieve information about a single cluster on your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Cluster object:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
"cluster": {
"slug": "second-testing-clust-1234567890",
"name": "second_testing_cluster",
"uri": "https://api.bonsai.io/clusters/second-testing-clust-1234567890",
"plan": {
"slug": "sandbox-aws-us-east-1",
"uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"
},
"release": {
"version": "7.2.0",
"slug": "elasticsearch-7.2.0",
"package_name": "7.2.0",
"service_type": "elasticsearch",
"uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"
},
"space": {
"path": "omc/bonsai/us-east-1/common",
"region": "aws-us-east-1",
"uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"
},
"stats": {
"docs": 0,
"shards_used": 0,
"data_bytes_used": 0
},
"access": {
"host": "second-testing-clust-1234567890.us-east-1.bonsaisearch.net",
"port": 443,
"scheme": "https"
},
"state": "PROVISIONED"
}
}</code></pre>
</div>
</div>

Create a new cluster

Supported Parameters

HTTP Request

An HTTP POST call is made to /clusters along with a JSON payload of the supported parameters.

HTTP Response

Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 202: Accepted</code></pre></span> code, along with a short message and details about the cluster that was created:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">{
"message": "Your cluster is being provisioned.",
"monitor": "https://api.bonsai.io/clusters/test-5-x-3968320296",
"access": {
"user": "utji08pwu6",
"pass": "18v1fbey2y",
"host": "test-5-x-3968320296",
"port": 443,
"scheme": "https",
"url": "https://utji08pwu6:[email protected]:443"
},
"status": 202
}</code></pre>
</div>
</div>

Error

An <span class="inline-code"><pre><code>HTTP 422: Unprocessable Entity</code></pre></span> error may arise if you are trying to create one too many Sandbox clusters on your account:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">{
"errors": [
"The requested plan is not available for provisioning. Solution: Please use the plans endpoint for a list of available plans.",
"Your request could not be processed. "
],
"status":422
}</code></pre>
</div>
</div>

If you are not creating a Sandbox cluster, please refer to the API Error 422: Unprocessable Entity documentation.

Update a cluster

Supported Parameters

HTTP Request

To make a change to an existing cluster, make an HTTP PUT call to <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> with a JSON body for one or more of the supported params.

HTTP Response

Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 202: Accepted</code></pre></span> code, along with short message:

Destroy a cluster

The Bonsai API provides a method to delete a cluster from your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP DELETE call is made to the <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> endpoint.

HTTP Response

Bonsai will respond with an HTTP 202: Accepted code, along with a short message:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">{
"message": "Your cluster is being deprovisioned.",
"monitor": "https://api.bonsai.io/clusters/[:slug]",
"status": 202
}</code></pre>
</div>
</div>

Cluster API Introduction

Alpha Stage

This introduction to the Bonsai API includes the following sections:

Introduction: What is the API and what is required to make it work?
Request Data. Sending data to the API.
Response Data. Receiving data from the API.
Error Handling. What to expect if something fails.
API Access Limitations. How often can you hit the API?
Breaking Changes. How Bonsai handles changes to the API.
Questions, Problems, Feature Requests. Providing feedback about the API.

Introduction

Clusters API. Create, view, manage and destroy clusters on your account.
Plans API. View subscription plans available to your account.
Spaces API. View locations where your account may provision new clusters.
Releases API. View versions of Elasticsearch available to your account.

Request Data

Response Data

Error Handling

In the event that one or more errors are raised, the API will return a JSON response detailing the problem. The response will have a status code, and an array of error messages. For example:

Error codes for the API are documented in the API Error Codes section.

API Access Limitations

Additionally, access to the API may be blocked if:

Your Bonsai account is cancelled (HTTP 404).
Your Bonsai account is suspended due to non-payment (HTTP 402).
Your shared secret is revoked (HTTP 401).
If fraudulent or bot activity is suspected (HTTP 403).
Any violation of our Terms of Service (HTTP 403).

Breaking Changes

The Bonsai team considers the following changes to be backwards compatible, and can be made without advance notice:

Adding new API endpoints
Non-breaking changes to existing API endpoints:
Adding new optional parameters
Adding additional attributes to response bodies
Adding new, optional features
Adding support for new formats(such as XML)
Adding support for additional authentication options

Questions, Problems, Feature Requests

‍

Introduction to the API

Alpha Stage

An API Token is required in order to access the API. Tokens are associated to a user within a given account. To create a credential, navigate to your account and click on the API Tokens tab:

Click on the Generate Token button to create a token:

You can see even more details about a request by clicking on its Details button:

If there is some security concern with the request, there will be a flash message indicating that it was not made safely.

‍

Creating an API token

Pagination Query Parameters

All APIs supporting recognizes the following request parameters for pagination:

Pagination Response Fragment

All API responses which support pagination will include a top level pagination fragment in the JSON response body, which looks like:

With the above information, you can infer how many pages there are and iterate until the list is exhausted.

API Result Pagination

The easiest solution is to simply catch and retry HTTP 503's. If you've seen this several times in a short period of time, please send us an email and we will investigate.

‍

HTTP 503: Service Unavailable

If you are having trouble with this resolving this issue, please let us know.

‍

HTTP 502: Bad Gateway

For example, if you have a mapping that expects a number in a particular field, and then index a document with some other data type in that field, Elasticsearch will reject it with an HTTP 400:

HTTP 400: Bad Request

HTTP 501: Not Implemented

A typo in the URL. If you're seeing this in the command line or terminal, then it's possible the hostname is wrong due to a typo or incomplete copy/paste.
The cluster has been destroyed. If you deprovision a cluster, it will be destroyed instantly. Further requests to the old URL will result in an HTTP 404 Cluster Not Found response.
The cluster has not yet been provisioned. There are a couple cases in which clusters take a few minutes to come online. Attempting to access the cluster before it's operational can lead to an HTTP 404 response.

HTTP 404: Cluster Not Found

‍

HTTP 403: Cluster Read-only

This error is raised when a request is sent to a cluster that has been disabled. Clusters can be disabled for one of several reasons, but the most common reason is due to an overage.

‍

HTTP 403: Cluster Disabled

To elaborate on this, all Bonsai cluster URLs follow this format:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://username:[email protected]
</code></pre>
</div>

The username and password in this URL are not the credentials used for logging in to Bonsai, but are randomly generated alphanumeric strings. So your URL might look something like:

Not All APIs are Available

I'm including credentials and still getting a 401!

If you're sure that the credentials are correct and being supplied, send us an email and we will investigate.

HTTP 401: Authorization Required

In some rare cases, the Bonsai Ops Team will put a cluster into maintenance mode. There are a lot of reasons this may happen:

Load shedding
Data migrations
Rolling restarts
Version upgrades
... and more.

HTTP 403: Maintenance

The easiest solution is to simply catch and retry HTTP 500's. If you've seen this several times in a short period of time, please send us an email and we will investigate.

‍

HTTP 500: Internal Server Error

This error indicates that the request body to Elasticsearch exceeded the limits of the Bonsai proxy layer. This can be caused by a few things:

A request larger than 40MB. Elasticsearch's Query DSL can be fairly verbose JSON, particularly when queries are complex. The 40MB cap is meant to be a safety mechanism to prevent runaway queries from overwhelming the routing layer, while still being an order of magnitude higher than 99.9% of request bodies.
Indexing too many documents at once. The Elasticsearch Bulk API allows applications to index groups of documents in a single request. Sending a single batch of millions of documents could easily trigger the HTTP 413 message.
Lots of request headers. Metadata about a request can be passed to Elasticsearch in the form of request headers. Bonsai allows up to 16KB for request headers; this should be enough for whatever CORS and content-type specification needs to occur. Note that the TLS and authentication headers in the request are not counted towards this limit.
Indexing large files. When Elasticsearch indexes a rich text file like a PDF or Word document, it converts the file into a Base64 string to compress it for transit. Still, it's possible that this string is longer than 40MB, which could trigger the HTTP 413 error.

If you're seeing this error, check that your queries are sane and not 40MB of flattened JSON. Ensure you're not explicitly sending lots of headers to your cluster.

If you're seeing this message during bulk indexing, then decrease your batch sizes by half and try again. Repeat until you can reindex without receiving an HTTP 413.

‍

HTTP 413: Request Too Large

HTTP 429: Too Many Requests

This response is distinct from the "Cluster not found" message. This message indicates that you're trying to access an index that is not registered with Elasticsearch. For example:

There are a couple reasons you might see this:

Race condition. You or your app may be trying to access an index before it was created.
Typo. You may have misspelled or only partially copy/pasted the name of the index you're trying to access.
The index has been deleted. Trying to access an index that has been deleted will return an HTTP 404 from Elasticsearch.

Important Note on Index Auto-Creation

The solution to this error message is to confirm that the index name is correct. If so, make sure it is properly created (with all the mappings it needs), and try again.

HTTP 404: Index Not Found

For more information on timeouts, please see our recommendations on Connection Management.

‍

HTTP 504: Gateway Timeout

A HTTP 402 response indicates a cluster’s account is behind on payments. All requests to the cluster will return the following message:

‍

HTTP 402: Payment Required

First Steps

The next step is to review the documentation on Connecting to Bonsai, which contains plenty of information about creating connections and testing the availability of your Bonsai cluster.

This message tells us that the client tried to reach Elasticsearch over localhost, which is a red flag (discussed below). It means the client is not pointed at your Bonsai cluster.

Common Connection Issues

Connection Refused
No handler found for URI
Name or service not known
HTTP 401 / Missing Authentication Headers
Missing SSL Root Certificate

Connection Refused

This error indicates that a request was made to a server that is not accepting connections.

What is Localhost?

Bonsai clusters run in a private network that does not include any of your infrastructure, which is why trying to reach it via localhost will always fail.

Why Port 9200?

By default, Elasticsearch runs on port 9200. While you can access your Bonsai cluster over port 9200, this is not recommended due to lack of encryption in transit.

All users will all need to ensure their Elasticsearch client is pointed at the correct URL and ensure that the Elasticsearch client is properly configured.

Ruby / Rails

If you’re using the elastisearch-rails client, simply add the following gem to your Gemfile:

If you'd prefer to keep your Gemfile sparse, you can initialize the client yourself like so:

Other Languages / Frameworks

No handler found for URI

Here are some examples:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">
# Request to /_cat/indices with GET method:
$ curl -XGET https://user:[email protected]/_cat/indices
green open test1 1 1 0 0 318b 159b
green open test2 1 1 0 0 318b 159b

# Request to /_cat/indices with PUT method:
$ curl -XPUT https://user:[email protected]/_cat/indices
No handler found for uri [/_cat/indices] and method [PUT]</code></pre>
</div>
</div>

The solution for this issue is simple: use the correct HTTP method. The Elasticsearch documentation will offer guidance on which methods pertain to the endpoint you're trying to use.

Name or service not known

The "Name or service not known" error indicates that there has been a failure in determining an IP address for the URL's domain. This typically implicates one of several root causes:

There is a typo in the URL.
There is a TLD outage.
The Internet Service Provider (ISP) handling your application's traffic may have lost a DNS server, or is having another outage.

TLD outage

A TLD is something like ".com," ".net," ".org," etc. A TLD outage affects all domains under the TLD. So an outage of the ".net" TLD would affect all domains with the ".net" suffix.

Fortunately, all Bonsai clusters can be accessed via either of two domains:

bonsai.io
bonsaisearch.net

ISP Outage

If this happens, there is basically nothing you can do about it as a user, aside from complaining to your application sysadmin.

Missing Authentication Headers

What's less clear from this documentation is that including the complete URL in your Elasticsearch client may not be enough to create a secure connection.

Here is a basic example in Ruby using <span class="inline-code"><pre><code>Net::HTTP</code></pre></span>, demonstrating how a URL with auth can still receive an HTTP 401 response:

Here are some examples of calculating the base64 string and adding the request header in several different languages:

Android

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript"> public static Map getHeaders() {
Map headers = new HashMap<>();
String credentials = "randomuser:randompass";
String auth = "Basic " + Base64.encodeToString(credentials.getBytes(), Base64.NO_WRAP);
headers.put("Authorization", auth);
headers.put("Content-type", "application/json");
return headers;
}</code></pre>
</div>
</div>

Java

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-java">public static Map getHeaders() {
Map headers = new HashMap<>();
String credentials = "randomuser:randompass";
String auth = "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes());
headers.put("Authorization", auth);
headers.put("Content-type", "application/json");
return headers;
}</code></pre>
</div>
</div>

‍

Ruby

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-ruby">require 'base64'
require 'net/http'

uri = URI("https://something-12345.us-east-1.bonsaisearch.net")
req = Net::HTTP::Get.new(uri)
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
http.request(req)
}</code></pre>
</div>
</div>

Missing SSL Root Certificate

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">SSL: CERTIFICATE_VERIFY_FAILED</code></pre>
</div>

This is almost certainly due to a server level misconfiguration.

These are just a sample of the errors you might see, but it all points to the application's inability to verify Bonsai's certificate.

1. Make sure the CA root certificate for Bonsai is available

2. Make sure the certificate(s) for Bonsai are up to date

Connection Issues

This document outlines several additional strategies that users can take to maximize the performance and throughput for their cluster.

Implement HTTP Keep-alive

Normally when an application makes a request to the cluster, it needs to perform a lot of steps. Very generally, it looks like this:

Set up the connection locally
Reach the target server
Establish a secure communication protocol via SSL/TLS (this step is actually a dozen smaller steps, see The SSL/TLS Handshake for an infographic)
Send the authentication credentials and the request
Wait for a response
Receive the response
Tear down the connection
Return the response to the application

Consult the documentation for your Elasticsearch client for more details on implementing this feature.

Utilize a Durable Message Queue

Prefer Bulk Updates Over Single-Document Updates

‍

Improving Throughput and Performance

In this guide we cover a few different ways of reducing shard usage:

What is a Shard Overage?
Delete Unneeded Indices
Use a Different Sharding Scheme
Reduce replication
Data Collocation
Upgrade the Subscription

What is a Shard Overage?

A shard overage occurs when your cluster has more shards than the subscription allows. In practice, this usually means one of three things:

Extraneous indices.
Sharding scheme is not optimal.
Replication too high

Delete Unneeded Indices

To determine if you have extraneous indices, use the <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> endpoint to get a list of all the indices you have in your cluster:

If you see indices that you don’t need, you can simply delete them:

Use a Different Sharding Scheme

Reduce replication

Multitenant subscription plans (Hobby and Standard tiers) are required to have at least 1 replica shard in use per index. Indices on these plans cannot have a replica count of 0.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"><script> console.log('hello'); </script>GET /_cat/indices?v
health status index pri rep docs.count docs.deleted store.size pri.store.size
green open test1 5 2 0 0 318b 159b
green open test2 5 2 0 0 318b 159b</code></pre>
</div>

Fortunately, reducing the replica count for the index is a small JSON body to the <span class="inline-code"><pre><code>_settings</code></pre></span> endpoint:

That simple request shaved 10 shards off of this cluster’s usage.

Replication == Availability and Redundancy

Having replication of at least 1 mitigates against all these problems.

Data Collocation

Another solution for reducing shard usage involves using aliases and custom routing rules to collocate different data models onto the same group of shards.

The latter configuration with the same 1x1 scheme would only require two shards (one primary, one replica) instead of six:

There are others. Data collocation is mentioned here as a possibility, and one that only works for certain users. It is by no means a recommendation.

Upgrade the Subscription

If you find that you’re unable to reduce shards through the options discussed here, then you will need to upgrade to the next plan.

‍

Reducing Shard Usage

Node

Cluster

Index

Index (noun)

“How Many Indices Can I Put on a Shard?”

In other words, an index can have many shards, but each shard can only belong to one index. This relationship precludes collocating multiple indices on a single shard.

Index (verb)

Elasticsearch is not a Crawler

Shards

Primary Shards

So Why Not a Kagillion Shards?

Primary Shards Can Not Be Added/Removed Later

Replica Shards

The Elasticsearch definition for replica shards sums it up nicely:

A replica is a copy of the primary shard, and has two purposes:

Increase failover: a replica shard can be promoted to a primary shard if the primary fails
Increase performance: get and search requests can be handled by primary or replica shards. By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard will never be started on the same node as its primary shard.

Replication is a Best Practice

If you’re running a production application, all of your indices should have a replication factor of at least 1. Otherwise, you’re exposed to data loss if anything unexpected happens.

Replicas are Easy

In contrast to primary shards, which can not be added/removed after the index is created, replicas can be added and removed at any time.

Document(s)

From the Elasticsearch documentation: “A document is a JSON document which is stored in Elasticsearch. It is like a row in a table in a relational database.”

As referenced, an analogue for an Elasticsearch document would be a database record. It is a collection of related fields and values. For example, it might look something like this:

What about Word/Excel/PDF…?

Elasticsearch Client

Most clients will also handle a lot of other Elasticsearch-related tasks, such as:

Automatically creating indices with the correct mappings/settings
Handling search queries and responses
Setting up and managing aliases

Ruby on Rails (elasticsearch-rails and searchkick)
Django
Drupal
Wordpress
… and many more.

In short, there is probably already an open sourced, well-documented client available.

‍

Elasticsearch Core Concepts

This article explains what shards are, how they work, and how they can best be used.

Where do shards come from?

Primary Shards Can Not Be Added/Removed Later

In contrast, replica shards are simply extra copies of the data. They are useful for redundancy or to handle extra search traffic, and can be added and removed on demand.

You can specify how many primary shards and replicas are used when creating a new index.

Replicas and High-Availability

Measuring your cluster’s index and shard usage

Frequently Asked Questions About Shards

Q. How Many Indices Can I Put On A Shard?

Short Answer: An index can have many shards, but any given shard can only belong to one index.

Q. How Many Shards Do I Need?

Generally, we recommend that if you don’t expect data to grow significantly, then:

One primary shard is fine if you have less than 100K documents
One primary shard per node is good if you have over 100K documents
One primary shard per CPU core is good if you have at least a couple million documents

Q. How Do I Reduce The Number of Shards I’m Using?

Answer: There is an article dealing with this very subject: Reducing Shard Usage.

‍

Shard Primer

In this document, we’ll cover some general best practices and offer some Bonsai-specific guidelines for migrating your app to a new version of Elasticsearch.

Protip: Use one Elasticsearch cluster per environment

Step 1: Read the Release Notes Carefully

Make sure to perform your due diligence by reading the release notes and breaking changes that accompany the version you’re targeting for the upgrade.

This is one of many reasons we recommend upgrading non-production environments first.

Step 2: Validate In Development

The process looks like this:

3. The Elasticsearch client is upgraded, and the new version is updated to point to the new cluster. The application then performs a reindex, which populates the new cluster with data.

Step 3: Upgrade the Production Cluster

The exact steps will vary considerably by application and use case. A typical strategy for this is outlined below:

The fork is configured to read from the production DB and the client is configured to point to the new cluster:

You will need to adapt this to your specific use case when planning out your blue-green strategy.

Ask Support

‍

Upgrading Major Versions

This document discusses how to estimate the resources you will need to get up and running on Elasticsearch. The process is roughly as follows:

Estimating Shard Requirements
Estimating Disk Requirements
Planning for Memory
Planning for Traffic

We have included a section with some sample calculations to help tie all of the information together.

Estimating Shard Requirements

If you’re unfamiliar with the concept of shards, then you should read the brief, illustrated Shard Primer section first.

Why Is This Important?

Beware of High Shard Counts

Full Text Search

GET /_cat/indices
green open products 1 1 0 0 100b 100b
green open users 1 2 0 0 100b 100b
green open posts 3 2 0 0 100b 100b

shards = p * (1 + r)

Time Series Applications

GET /_cat/indices
green open events-20180101 1 1 0 0 100b 100b
green open events-20180102 1 1 0 0 100b 100b
green open events-20180103 1 1 0 0 100b 100b

1x(1+1) shards/index * 1 index/day * 30 days = 60 shards

Estimating Disk Requirements

Allocation

Benchmarking for Baselines

How many indices are needed?
How many shards are needed?
What is the average document size per index?

Database Size is a Bad Heuristic

Attachments Are Also a Bad Heuristic

In this example, there are 3 indices. Each index has one primary shard and one replica shard, for a total of 2 shards per index.

Medium-term Projections

users-production: +5%/mo
posts-production: +12%/mo
notes-production: +7%/mo

Based on this, the cluster should need at least 1.44GB. Add the 20% margin of safety for an estimate of ~1.75GB.

Time Series Data

Consider this sample cluster:

Planning for Memory

“I want enough RAM to hold all of my data!”

Highlighting
Aggregations (a.k.a faceting)
Dynamic scripting
Geospatial search
Deep pagination. This is when you have hundreds or thousands of pages of results; accessing results 9900 to 10000 requires substantially more overhead than results 1-100.
Sorting large numbers of documents. This is similar to, but distinct from, deep pagination. See Efficient sorting of geo distances in Elasticsearch for a real world example.
Frequent cache evictions (ex: caching timestamps by second or minute, instead of hourly or daily)

Planning for Traffic

Estimating Traffic Capacity

Consider the following simplified examples of weekly traffic patterns for two applications. The plots show the instantaneous throughput over the course of 7 days:

Traffic Volume and Request Latencies

Under this scheme, applications with low-IO requests can service a much higher throughput than applications with high-IO requests, thereby ensuring fair, stable performance for all users.

Estimating Your Concurrency Needs

Beware the Local Benchmarking

It’s better to set up a remote cluster, perform a small scale load test, and use the statistics to estimate the upper latency bounds at scale.

Sample Calculations

Suppose the developer of online store decides to switch the application’s search function from ActiveRecord to Elasticsearch. She spends an afternoon collecting information:

She wants to index three tables: users, orders and products
There are 12,123 user records, which are growing by 500 a month
There are 8,040 order records, which are growing by 1,100 a month
There are 101,500 product records, which are growing by 2% month over month
According to the application logs, users are averaging 10K searches per day, with a peak of 20 rps

Estimating Shard Needs

Estimating Disk Needs

She sets up a local Elasticsearch cluster and indexes 5000 random records from each table. Her cluster looks like this:

GET /_cat/indices
green open users 1 1 5000 0 28m 14m
green open orders 1 1 5000 0 24m 12m
green open products 3 2 5000 0 540m 60m

Based on this, she determines:

The users index occupies 14MB / 5000 docs = 2.8KB per document.
The orders index occupies 12MB / 5000 docs = 2.4KB per document.
The products index occupies 60MB / 5000 docs = 12KB per document.

She uses this information to calculate a baseline:

She will need 2.8KB/doc x 12,123 docs x 2 copies = 68MB of disk for the users data
She will need 2.4KB/doc x 8,040 docs x 2 copies = 39MB of disk for the orders data
She will need 12KB/doc x 101,500 docs x 3 copies = 3654MB of disk for the products data
The total disk needed for the existing data is 68MB + 39MB + 3654MB = ~3.8GB.

She then uses the growth measurements to estimate how much space will be needed within 6 months:

The users index will have ~15,123 records. At 2.8KB/doc and a 1x1 shard scheme, this is 85MB.
The orders index will have ~14,640 records. At 2.4KB/doc and a 1x1 shard scheme, this is 70MB.
The products index will have ~114,300 records. At 12KB/doc and a 3x2 shard scheme, this is 4115MB.
The total disk needed in 6 months will be around 4.27GB.

Adding some overhead to account for unexpected changes in growth and mappings, she estimates that 5GB of disk should suffice for current needs and foreseeable future.

Estimating Traffic Needs

Conclusion

Based on her tests and analysis, she decides that she will need a cluster with:

Capacity for at least 13-15 shards
A search concurrency of at least 2
1 GB allocated for memory
5 GB of disk to support the growth in data over the next 3-6 months

‍

Capacity Planning

Bonsai has a variety of cost controls that allow us to provide a quality free tier of service. Among these is the periodic deletion of unused free clusters.

If you are an infrequent user of your Bonsai cluster, but wish to avoid it from being terminated, simply upgrade to any paid plan.

‍

Periodic Cluster Cleanup

So what are your options?

Kibana-maker

Kibana maker is a simple script that we wrote that helps bootstrap a Heroku Application by configuring a small git repository and pushing that up to a heroku git repository.

You can pull down a copy of kibana-maker using the following command:

Steps

Identify the slug of your private space in the Heroku App
Identify the URL of your cluster in the Bonsai App
Note the version of Elasticsearch you are using
Using kibana-maker, run the following command: <span class="inline-code"><pre><code>cd /tmp && kibana-maker --name= --url= --space= --version=</code></pre></span>
You should now have a functioning Kibana instance running in your Heroku Private space

Kibana and Heroku Private Spaces

Be aware of your security model

This may or may not be acceptable for the data you plan to index. The rest of this guide assumes you are running on a single tenant architecture.

VPC Peering can be set up between a Heroku Private Space and a Bonsai cluster on an Enterprise plan. This configuration will ensure maximum isolation and protection of your data.

Common Issues

Running in a Private Space has some additional implications that may not be immediately obvious. The main points are:

The web Console will not be available, due to the fact that the Javascript is running outside of the private space. If you want to interface with your cluster directly, you'll need to connect to the VPC first.
The Kibana dashboard will also be unavailable because the proxy is outside of the private space. Kibana can be set up within a private space, but it requires additional client-side configuration. Contact [email protected] for more details.
Connecting to the cluster directly will need to happen within the VPC itself, so users with a Private Space-based Bonsai cluster will need to first access their VPC before calls to Elasticsearch will succeed. See Connecting to Bonsai for more information.

If you run into any issues with your Private Space-based Elasticsearch cluster, please reach out to us at [email protected].

Gather your Heroku Peering Network Settings

Finding Your Private Space URL

Your Private Space URL will look something like:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://dashboard.heroku.com/teams//spaces//network</code></pre>
</div>

Enter your details into the Bonsai dashboard

Log into the Bonsai cluster dashboard by running <span class="inline-code"><pre><code>$ heroku addons:open bonsai -a</code></pre></span> and enter your details in the form provided:

Accept our Peering Request

Once Bonsai has the above data, we will initiate a peering request to your space which will show up in the Network tab and under the Peering subsection. It should look like this:

Network changes are not instantaneous

The lead time for this to show up ranges from 30 minutes to a few hours.

When you accept the invitation the UI should change to look like:

Once the request has been accepted, you will be able to use the cluster URL provided in the Bonsai dashboard.

Private Means Private!

‍

Heroku Private Spaces and VPC Peering

How Does Bonsai Manage Snapshots?

HTTP 400: Cannot delete indices that are being snapshotted

In some rare cases, you may see an error like this when attempting to alter an index:

The solution is to wait a minute or two and try again.

Can I Take My Own Snapshots?

We’re Always Improving

I Need More Frequent Snapshots / Longer Retention!

I Have My Own S3 Bucket – Can You Just Use That?

‍

Snapshots on Bonsai

This guide will cover concurrent connection limits and some strategies for avoiding them:

What are concurrent connection limits?
Upgrade to a plan with a higher concurrency allowance
Reduce the request overhead
Reduce the traffic rate

If reading over these suggestions doesn’t provide enough ideas for resolving the issue, you can always email our support team at [email protected] to discuss options.

What Are Concurrent Connection Limits?

Bonsai distinguishes between types of connections for concurrency metering purposes:

Searches. Search requests. The concurrency allowance for search traffic is generally much higher than for updates.
Updates. Adding, modifying or deleting a given document. Generally has the lowest concurrency allowance due to the high overhead of single document updates.This includes _bulk requests.

Bonsai also uses connection pools to account for short-lived bursts of traffic.

The effective number of connections the cluster can handle is the pool size, plus the number of "active" connections. Active connections are being served by the cluster and not sitting in the queue.

Concurrency is NOT Throughput!

Queued Connections Have a 60s TTL

Upgrade

Concurrency allowances on Bonsai scale with the plan level. Often the fastest way to resolve concurrency issues is to simply upgrade the plan. Upgrades take effect instantly.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

Reduce Overhead

Some examples of requests which tend to have a lot of overhead and take longer to process:

Aggregations on large numbers of results
Geospatial sorting on large numbers of results
Highlighting
Custom scripting
Wildcard matching

If you have requests that perform a lot of processing, then finding ways to optimize them with filter caches and better scoping can improve throughput quite a bit.

Reduce Traffic Volume

‍

Reducing Concurrency

Resolving disk overages can be resolved in a couple different ways that we will cover in this documentation:

Remove Stale Data / Indices
Purge Deleted Documents
Reindex with Smaller Mappings
Upgrade the Subscription

Remove Stale Data / Indices

Removing the old and unneeded indices in the example above would free up 356MB. A single command could do it:

Purge Deleted Documents

Protip: Queue Writes and Reindex in the Background for Minimal Impact

Reindex with Smaller Mappings

Mappings define how data is stored and indexed in an Elasticsearch cluster. There are some settings which can cause the disk footprint to grow exponentially.

This relationship is expressed mathematically as:

Nested documents are another example of a feature can also increase your data footprint without necessarily improving relevancy.

Upgrade the Subscription

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

‍

Reducing Data Usage

In this guide we cover a few different ways of reducing document usage:

How Document Overages Occur
Remove Old Data
Remove an Index
Compact Your Mappings
Upgrade the Subscription

How Document Overages Occur

Elasticsearch has several different articles on how nested documents work, but the simplest answer is that it is creating the illusion of complex object by quietly creating multiple hidden documents.

Remove Old Data

If your index has a lot of old data, or“stale” data (documents which rarely show up in searches), then you could simply delete those documents. Deleted documents do not count against your limits.

Remove an Index

You may also want to check out the Index Trimmer.

Compact Your Mappings

Changing your mappings to nest less information can greatly reduce your document usage. Consider this sample document:

If you’re using nested objects, review whether any of the nested information could stand to be left out, and then reindex with a smaller mapping.

Upgrade the Subscription

If you find that you’re unable to reduce documents through the options discussed here, then you will need to upgrade to the next plan.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

‍

Reducing Document Usage

How Bonsai Handles Overages
Concurrent Connections
Shards
Disk
Documents

How Bonsai Handles Overages

Overages are indicated on the Cluster dashboard. If your cluster is over its subscription limits, the overage will be indicated in red like so:

When an overages is detected, it triggers a state machine that follows a specially-designed process that takes increasingly firm actions. This process has several steps:

1. Initial notification (immediately). The owner and members of their team is notified that there is an overage, and provided information about how to address it.

3. Read-only mode (10 days). The cluster is put into read-only mode. Updates will fail with a 403 error.

4. Disabled (15 days). All access to the cluster is disabled. Both searches and updates will fail with a HTTP 403: Cluster Disabled error.

Extreme Overages Skip the Process

Stale Data May Be Lost

Free clusters that have been disabled for a period of time may be purged to free up resources for other users.

The fastest way to deal with an overage is to simply upgrade your cluster’s plan. Upgrades take effect instantly and will unlock any cluster set to read-only or disabled.

If upgrading is not possible for some reason, then the next best option is to address the issue directly. This document contains information about how to address an overage in each metered resource.

Give it a minute!

Concurrent Connections

More on concurrency and how to solve concurrency overages in our Reducing Concurrent Connections documentation.

Shards

If you have a shard overage you can look at out Reducing Shard Usage documentation to resolve it.

Disk

Disk overages can be resolved in a couple different ways that are explained in our Reducing Data Usage documentation.

Documents

Bonsai meters on the number of documents in an index. There are several types of “documents” in Elasticsearch, but Bonsai only counts the live Lucene documents in primary shards towards the document limit. Documents which have been marked as deleted, but have not yet been merged out of the segment files do not count towards the limit.

Document overages can be resolved in a couple different ways that are explained in our Reducing Document Usage documentation.

‍

Billing: How Bonsai handles overages

Getting Started
Adjust Dashboard
Reference: Metrics Available on Standard Subscriptions
Reference: Metrics Available on Business / Enterprise Subscriptions

Getting Started

Log into Datadog, navigate to <span class="inline-code"><pre><code>Integrations > APIs</code></pre></span> to get your current API Key

In the table of API keys, grab your current API key:

Navigate to the Integrations on your cluster dashboard under Data. There is a section marked Datadog Integration.

Enter your Datadog API key, select the API site, then press Activate Datadog. You should start to see request metrics loaded into Datadog within a few minutes.

Adjust How Metrics Are Shown In The Datadog Dashboard

Note to Enterprise tier customers who has Grafana dashboards set up with our team:

Datadog's configuration may change on how units are reported in the graphs. The request durations are ideally shown in milliseconds, and the following steps will get you back on track:

1. In the Datadog dashboard, select Metrics info from the "cog" dropdown menu on a graph like so:

2. Select <span class="inline-code"><pre><code>*.p50</code></pre></span> and it'll take you to the Metrics Summary search for p50. Click on the Metric Name then Edit button:

3. Under Metadata, change the Unit from minute to millisecond and leave per as <span class="inline-code"><pre><code>(None)</code></pre></span>. Click save:

4. Search for the other request duration times and edit their units to reflect millisecond. Once all three are edited, the graph should reflect the smallest time unit in milliseconds:

If you have any questions or issues, or are not seeing metrics loading into Datadog after a long time, please reach out and let us know at [email protected].

Reference: Metrics Available on Standard Subscriptions

Clusters on Standard plans have access to a variety of metrics:

Reference: Metrics Available on Business / Enterprise Subscriptions

Users with Business and Enterprise subscriptions have access to additional metrics:

Using Datadog with Bonsai

Let’s start by looking at the two plan types.

Choosing a plan type

Planning for disk capacity

Nobody likes using a search engine that doesn’t work. Failing to account for any of these factors will result in performance degradation, a.k.a. the infamous yellow or red cluster. 😫

How much primary data you can load into Elasticsearch, while still maintaining High Availability? This simple formula will help you calculate:

((number of nodes - 1) * the capacity of a single node) * 0.8 = the amount of data that can be loaded in your cluster

Let’s put this in a concrete example. A Business Capacity Large plan has a raw capacity of 150GB, with each of the three nodes contributing 50GB of disk. So the concrete numbers would be:

number of nodes = 3 per node capacity = 50GB total raw capacity = 3 * 50GB = 150GB Usable data = ((3-1) * 50GB) * 0.8 = 80GB

This formula, explained

Planning for computational requests and traffic

1.) How many search requests will you be doing in any given minute? 2.) How many aggregations will you be doing in any given minute? 3.) Lastly, how many bulk updates will you perform each minute?

To ensure optimal performance, all three numbers should fit under the values in the table below.

For those of you that want to dig even deeper, you can read our very thorough version of capacity planning as well.

Questions?

Our team has provisioned search engines that handle billions of requests each month. If you are still unsure about which plan is right for you, please contact us for a personalized consultation.

‍

Sizing Your Business Cluster

Bonsai’s support policy categorizes incident severity into three tiers. You can use these severity levels to help communicate the level of support needed.

24/7 Operational Coverage is Provided for All Clusters

Red Indexes
Failed EC2 Instances
Stuck GC
Failed backups
Network partitions / Split brain

This operational coverage is provided to all clusters, regardless of plan.

The severity classifications are defined as:

Coverage Hours

Incidents not related to a problem Bonsai’s infrastructure are treated as support issues. Coverage for these events is the same as our support hours:

Response Time

We will respond to inquiries about active incidents within these time windows.

<table>
<thead>
<tr><th>Severity Level</th><th>Standard</th><th>Business</th><th>Enterprise</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>8 hrs</td><td>1 hr</td><td>1 hr</td></tr>
<tr><td>Severity 2</td><td>24 hrs</td><td>4 hrs</td><td>4 hrs</td></tr>
<tr><td>Severity 3</td><td>48 hrs</td><td>8 hrs</td><td>8 hrs</td></tr>
</tbody></table>

Incident Response Times

Do you support plugin X?

These are the plugins we currently deploy by default for all clusters:

Mapper Attachments (version 2.x and earlier)
Ingest Attachment (version 5.x and up)
ICU Analyzer
Kuromoji Analyzer
Phonetic analyzer
Smart Chinese Analyzer
Stempel Analyzer
Delete-by-Query (version 2.x and up)

Can I install my own plugins on Bonsai?

Disaster recovery plan for all clusters on our platform

Whether you’re hosting Elasticsearch on your own or choosing a hosted provider like Bonsai, every search cluster should have a disaster recovery plan.

‍

Elasticsearch Disaster Recovery

The preferred method for migrating a cluster is to use a blue-green deployment strategy. There are several general steps to this approach:

Create a new Bonsai addon using the plan of your choice. This will create a new Bonsai cluster. If you are trying to migrate a cluster into a private space, make sure this addon is created within the private space.
Reindex your content into the new cluster.
Update your application to read and write to the new cluster.
Verify that the application is working as expected.
Tear down the old cluster.

This strategy will accommodate migrations across the boundary, regardless of whether you're going from public to private, or private to public.

If a blue-green deployment strategy is not possible for some reason, please contact us to set up a call. We will consider alternatives based on your specific needs.

‍

Migrating Between Public and Private Spaces in Heroku

Dedicated Clusters Exempt!

A quick summary of Bonsai’s version support policy is as follows:

Bonsai supports the two most recent major versions on Hobby and Standard tier clusters.
Bonsai does not automatically upgrade existing clusters when new Elasticsearch versions are released, or when old versions are retired.
Bonsai offers long-term version support for older major versions of Elasticsearch for Business and Enterprise tier clusters.
Hobby tier clusters always default to the cutting edge when available.
Hobby tier clusters are the first to be upgraded.
Clusters on EOL nodes (see below) will not be serviced and will receive limited technical support.
Bonsai does not force upgrades, except in cases involving a major security issue. These are rare, and typically patch-level upgrades.

How Bonsai Handles New Version Rollouts

Clusters running on our oldest supported version receive several notifications about next steps.

The general process is:

Deploy the new version and make it the default for Hobby clusters. Users with paid clusters can also opt in to the new version where available
Validate that the latest version is safe for production, and ensure our operational tooling is adequate to respond to any incident that may arise
Default all new clusters to the latest version
Deprecate the oldest version:
Send a notice several weeks in advance to all impacted users, alerting them of the deprecation and providing instructions and guidance for upgrading
Send a final notice several days in advance to any impacted user who still has not upgraded
Migrate remaining clusters to the next supported version
Terminate nodes running the deprecated version

There are some caveats and exceptions here, based on the cluster’s plan level:

Hobby Clusters

Standard tier Clusters

One other possibility for users who can’t upgrade either their cluster’s plan or Elasticsearch version is to remain on EOL nodes. This approach comes with some risks, which are described below.

Business and Enterprise tier Clusters

Otherwise, Business and Enterprise tier clusters are not affected by version deprecations on Bonsai.

What are EOL Nodes?

Some examples:

In the event that a security regression is disclosed, we may choose to promptly terminate EOL nodes without advance notice.
In the event of irrecoverable failure of the underlying EC2 instances, we may choose to terminate the EOL nodes without advance notice.

Users running on EOL nodes should actively work to get their application on a supported version as soon as possible to avoid this kind of outcome.

‍

How Bonsai Handles Elasticsearch Releases

_all and wildcard destructive actions
Tasks API
Node Hot Threads
Node Shutdown & Restart
Snapshots
Reindex
Cluster Shard Reroute
Cluster Settings
Index Search Warmers
Static Scripts
Update By Query API

_all and wildcard destructive actions

Examples of _all and wildcard destruction:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">DELETE /*
DELETE /_all</code>
</pre></div>

Tasks API

Node hot threads

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

Node Shutdown & Restart

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

Snapshots

If you feel that you need a snapshot taken/restored, please reach out to our support team.

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">GET /_snapshot/_status
DELETE /_snapshot/{repository}
POST /_snapshot/{repository}
PUT /_snapshot/{repository}
GET /_snapshot/{repository}/_status
DELETE /_snapshot/{repository}/{snapshot}
GET /_snapshot/{repository}/{snapshot}
POST /_snapshot/{repository}/{snapshot}
PUT /_snapshot/{repository}/{snapshot}
POST /_snapshot/{repository}/{snapshot}/_create
PUT /_snapshot/{repository}/{snapshot}/_create
POST /_snapshot/{repository}/{snapshot}/_restore
GET /_snapshot/{repository}/{snapshot}/_status</code></pre>
</div>
</div>

Reindex

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_reindex</code>
</pre></div>

Another option is to simply populate the new index directly from your database.

Cluster Shard Reroute

If you need fine-grain control over the shard allocation within a cluster, please reach out to us and we can discuss your use case and look at whether single tenancy would be a good fit for you.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_cluster/reroute</code>
</pre></div>

Cluster settings

If you need to change the system settings for your cluster, you’ll need to be in a single tenant environment. Reach out to us and let’s talk through it.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT /_cluster/settings</code>
</pre></div>

Index Search Warmers

If this is something critical to your app, you would need to be on a dedicated cluster. Please reach out to us if you have any further questions on this.

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">POST /_warmer/{name}
PUT /_warmer/{name}
POST /_warmers/{name}
PUT /_warmers/{name}
POST /{index}/_warmer/{name}
PUT /{index}/_warmer/{name}
POST /{index}/_warmers/{name}
PUT /{index}/_warmers/{name}
POST /{index}/{type}/_warmer/{name}
PUT /{index}/{type}/_warmer/{name}
POST /{index}/{type}/_warmers/{name}
PUT /{index}/{type}/_warmers/{name}</code></pre>
</div>
</div>

Static Scripts

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">DELETE /_scripts/{lang}/{id}
GET /_scripts/{lang}/{id}
POST /_scripts/{lang}/{id}
PUT /_scripts/{lang}/{id}
POST /_scripts/{lang}/{id}/_create
PUT /_scripts/{lang}/{id}/_create</code></pre>
</div>
</div>

Update By Query API

Customers wanting to make use of this API can do so on a Business or Enterprise subscription.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /{index}/_update_by_query</code>
</pre></div>

Unsupported API Endpoints

If you need a version or geographic region that is not listed here, reach out to our support and let us know what you need. We can get you a quote and timeline for getting up and running.

Multitenant Class

Bonsai operates a fleet of shared resource nodes in Oregon, Virginia, Ireland, Frankfurt, and Sydney. The Elasticsearch versions available on these nodes do not change often.

Sandbox clusters are limited to the most recent version of Elasticsearch and may be subject to automatic upgrades when a new version is released.

This table shows which versions of Elasticsearch are available for multitenant plans. Last updated: 2025-06-12

Multitenant Version Support by Region

Multitenant plans are supported in a handful of popular regions, although the versions available to free plans is limited. Additional regions are available to Business and Enterprise subscriptions.

Single Tenant Class

This table shows which versions of Elasticsearch are available for single tenant plans. Last updated: 2025-06-12.

Enterprise Deployments

Bonsai can deploy and manage whichever version of Elasticsearch or OpenSearch your use case needs. Please reach out to us at [email protected].

Older Search Engine Version Pricing

The fee will be assessed based on the following table:

Search Engine Version	Operational Fee
OpenSearch 1.x or Elasticsearch 6.x	20%
OpenSearch 1.x or Elasticsearch 6.x	20%
Elasticsearch 5.x	15% additional
Elasticsearch 2.x	10% additional
Elasticsearch 1.x	5% additional

Which Versions Bonsai Supports

Multi Tenant Class

Single Tenant Class

Trade Offs

This article details the differences between the two architectures we offer. There are a number of trade offs between these classes, summarized below:

Why Not Containers / VMs?

‍

Bonsai Architecture

Alert watermark. When a node in a single tenant cluster reaches 70% disk or higher, Bonsai sends out an alert to the user.
Low watermark. When a node reaches its low watermark stage(Bonsai defaults to 70% disk used), the cluster will no longer allocate new shards to this node.
High watermark. When a node reaches its high watermark stage(Bonsai defaults to 75% disk used), the cluster will actively try to move shards off the node.
Flood stage. When a node reaches its flood stage watermark(Bonsai uses 95% disk used), the cluster will put all open indices into a state that only allows for reads and deletes.

‍

How Bonsai Manages Watermarks

Access Controls All Bonsai clusters are provisioned with a unique, randomized URL and have HTTP Basic Authentication enabled by default, using a randomly generated set of credentials. Under this scheme, it would take the world’s fastest super computer around 23.5 quadrillion years to guess.
Encrypted communications. All Bonsai clusters support SSL/TLS for encryption in transit. We use industry standard strength encryption to ensure your data is safe over the wire.
Encrypted at rest. Bonsai clusters are provisioned on hardware that is encrypted at rest by default. In addition to Amazon’s physical security controls, this means your data is safe from physical theft.
Regular Snapshots. All paid Bonsai clusters receive regular snapshots, which are stored in an offsite, encrypted S3 bucket in the same region as the cluster.
Firewalled. All Bonsai clusters are accessed via a custom-built, high-performance layer 7 routing proxy, and sit behind a tightly controlled firewall. This helps to ensure that the cluster and data are protected from port scans and unauthorized persons.
Advanced Networking. Bonsai can support IP whitelisting, and VPC Peering to users on single tenant clusters.

‍

How Does Bonsai Secure Data?

Jekyll is a static site generator written in Ruby. Jekyll supports a plugin model that Searchyll uses to read your site’s content and then index it into an Elasticsearch cluster.

In this guide, we are going to use this feature to tell Jekyll to index all of the content into a configured instance of Elasticsearch.

First Steps

In order to make use of this documentation, you will need Jekyll installed and configured on your system.

Make sure you have Jekyll installed. This guide assumes you already have Jekyll installed and configured on your system. Visit the Jekyll Documentation to get started.
Spin up a Bonsai Elasticsearch Cluster. This guide will use a Bonsai cluster as the Elasticsearch backend. Guides are available for setting up a cluster through bonsai.io or Heroku.

Configure Jekyll to output to Bonsai Elasticsearch

Jekyll’s configuration systems live in a file called `_config` by default. Add the following snippet to the file:

We also need to add the gem `searchyll`

Push the Data Into Elasticsearch

To get the site's data into Elasticsearch, you can load it by simply running the Jekyll build command:

You should now be able to see your data in the Elasticsearch cluster:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ curl -XGET "https://:@my-awesome-cluster-1234.us-east-1.bonsai.io/_search"

{"took":1,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"hugo","_type":"doc","_id":...</code></pre>
</div>
</div>

Jekyll

Step 1: Spin up a Bonsai Cluster

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Overview, in the Credentials tab:

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

Option 2: Interactive Console

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Install the Gem

To install Searchkick, you will need the searchkick gem. Add the following to your Gemfile outside of any blocks:

This will install the gem for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

If you need a specific version of Searchkick to accommodate your Elasticsearch cluster, you can specify it in your Gemfile like so:

Once the gem has been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span>.

Step 4: Add Searchkick to Your Models

Any model that you will want to be searchable with Elasticsearch will need to be configured to do so by adding the <span class="inline-code"><pre><code>searchkick</code></pre></span> keyword to it.

For example, our demo app has a User model that looks something like this:

Adding the searchkick keyword makes our User model searchable with Searchkick.

If you have questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

Our Example

To implement search, we updated the file <span class="inline-code"><pre><code>app/controllers/users_controllers.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class UsersController< ApplicationController
#... a bunch of controller actions, removed for brevity
def search
@results = User.search(params[:q])
end
#... more controller actions removed for brevity
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">Rails.application.routes.draw do
resources :users do
collection do
post :search # creates a route called users_search
end
end
end</code></pre>
</div>
</div>

<% if @results.present? %>
<%= render partial: 'search_result', collection: @results, as: :result %>
<% else %>
Nothing here, chief!
<% end %>

</code></pre>
</div>
</div>

<div class="code-snippet-container">
<a fs-copyclip-element="click-11" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
<%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

<%= result.company %>

<%= result.company_description %>

</code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

In this demo app, we added this to the navigation bar:

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use with your application. The ERB scaffold should be easily adapted to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Searchkick Where Your Cluster is Located

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:[email protected]:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

You are not using the bonsai-searchkick gem for some reason, OR
You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

Step 7: Push Data into Elasticsearch

Now that the app has everything it needs to query the cluster and render the results, we need to push data into the cluster. There are a few ways to do this.

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Searchkick. In our demo app, we have this simple list of users:

Congratulations! You have implemented Searchkick in Rails!

Final Thoughts

You can find information on additional subjects in the section below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

‍

Searchkick

Here’s how to get started with Bonsai Elasticsearch and Ruby on Rails using Chewy.

Warning

EngineVersionHighest Compatible Gem VersionElasticsearch5.x7.13Elasticsearch6.x7.14+ ( sic)Elasticsearch7.x7.13OpenSearch1.x7.13

Note

As of January 2021, Chewy supports up to Elasticsearch 5.x. Users wanting to use Chewy will need to ensure they are not running anything later than 5.x. Support for later versions is planned.

Add Chewy to the Gemfile

In order to use the Chewy gem, add the gem to the Gemfile like so:

Then run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install it.

Write an Initializer

A Chewy initializer will need to be written so that the Rails app can connect to your Bonsai cluster.

We recommend using something like dotenv for this, but you can also set it manually like so:

Heroku users will not need to do so, as their Bonsai cluster URL will already be in a Config Var of the same name.

Create a file called <span class="inline-code"><pre><code>config/initializers/chewy.rb</code></pre></span>. With the environment variable in place, save the initializer with the following:

The official Chewy documentation has more details on ways to modify and refine Chewy’s behavior.

Configure Your Models

Any Rails model that you will want to be searchable with Elasticsearch will need to be configured to do so by creating a Chewy index definition and adding model-observing code.

For example, here’s how we created an index definition for our demo app in <span class="inline-code"><pre><code>app/chewy/users_index.rb</code></pre></span>:

Then, the User model is written like so:

You can read more about configuring how a model will be ingested by Elasticsearch in the official Chewy documentation.

Indexing Your Documents

The Chewy gem comes with some Rake tasks for getting your documents into the cluster. Once you have set up your models as desired, run the following in your application environment:

That will push your data into Elasticsearch and synchronize the data.

Next Steps

‍

Chewy

Warning

Note

In this example, we are going to connect to Elasticsearch using the official Elasticsearch gems. There are other gems for this, namely SearchKick, which is covered in another set of documentation.

Step 1: Spin up a Bonsai Cluster

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Dashboard, in the Credentials tab:

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in Bonsai, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Add the Gems

There are a few gems you will need in order to make all of this work. Add the following to your Gemfile outside of any blocks:

This will install the gems for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

For example, if your version of Elasticsearch is 6.x, then you would use something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">gem 'elasticsearch-model', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'elasticsearch-rails', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'bonsai-elasticsearch-rails', github: 'omc/bonsai-elasticsearch-rails', branch: '6.x'</code></pre>
</div>
</div>

Make sure the branch you choose corresponds to the version of Elasticsearch that your Bonsai cluster is running.

What Do These Gems Do, Anyway?

elasticsearch-model. This gem is the only one that is actually required. It does the actual search integration for Ruby on Rails applications.
elasticsearch-rails. This optional gem adds some nice tools, such as rake tasks and ActiveSupport instrumentation.
bonsai-elasticsearch-rails. This optional gem saves you the step of setting up an initializer. By default, the Elasticsearch client will attempt to create a connection to <span class="inline-code"><pre><code>localhost:9200</code></pre></span>, which is not where your Bonsai cluster is located. This gem simply reads the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable which contains the cluster URL, and uses it to override the defaults.

Once the gems have been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install them.

Step 4: Add Elasticsearch to Your Models

Any model that you will want to be searchable with Elasticsearch will need to be configured. You will need to require the `elasticsearch/model` library and include the necessary modules in the model.

For example, this demo app has a <span class="inline-code"><pre><code>User</code></pre></span> model that looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
include Elasticsearch::Model
include Elasticsearch::Model::Callbacks
settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

Your model will need to have these lines included, at a minimum.

What Do These Lines Do?

<span class="inline-code"><pre><code>require 'elasticsearch/model'</code></pre></span>require 'elasticsearch/model' loads the Elasticsearch model library, if it has not already been loaded. This isn’t strictly necessary, provided the file is loaded somewhere at runtime.
<span class="inline-code"><pre><code>include Elasticsearch::Model</code></pre></span> tells the app that this model will be searchable by Elasticsearch. This is mandatory for the model to be searchable with Elasticsearch.
<span class="inline-code"><pre><code>include Elasticsearch::Model::Callbacks</code></pre></span> is an optional line, but injects Elasticsearch into the ActiveRecord lifecycle. So when an ActiveRecord object is created, updated or destroyed, it will also be created/updated/destroyed in Elasticsearch.
<span class="inline-code"><pre><code>settings index: { number_of_shards: 1 }</code></pre></span> This is also optional, but strongly recommended. By default, Elasticsearch will create an index for the model, with 5 primary shards and 1 replica. This will actually create 10 shards, and is ludicrously over-provisioned for most apps. This line simply overrides the default and specifies 1 primary shard(a replica will also be created by default).