
Documentation

Welcome to the Bonsai Documentation. This is the place to learn how to integrate, set up, and manage your Bonsai account.

Upgrading Your Bonsai Cluster

Overview

Upgrading your Bonsai cluster to a more powerful one is a straightforward process, designed to minimize downtime and maintain the integrity of your data. This document outlines the steps involved, the expected downtime, and the effort required for a successful upgrade.

Upgrade Process

Bonsai utilizes a two-phase snapshot and restore process to migrate your data from one hardware cluster to another. A snapshot of your data is taken from the current hardware and restored to the new hardware. There is no downtime or production impact to your cluster during this phase. This phase of the upgrade can take time, depending on how much data you have on your existing hardware.

Once this is completed, a second snapshot-restore operation is performed. This operation is simply a delta, covering only the data that has changed since the phase-1 snapshot was taken. This second phase usually lasts a few seconds to a minute. During this phase, your cluster is placed into a read-only mode. Search traffic is not impacted, but writes are blocked until the restore completes.

Once this final restore has completed, the cluster will be running entirely on the new hardware, with no data loss.

Downtime and Effort

  1. Zero Downtime for Searches:
    • Your search operations will not experience any downtime during the upgrade process. This ensures continuous availability for read operations.
  2. Short Window for Writes:
    • There will be a brief period where write operations are paused. This typically lasts less than a minute, depending on the amount of data being transferred.
    • During this pause, write requests will receive a 403 Forbidden error along with a JSON response indicating: "Cluster is currently read-only for maintenance. Please try your request again in a few minutes. See [status.bonsai.io](http://status.bonsai.io/) or contact [support@bonsai.io](mailto:support@bonsai.io) for updates." It's essential to handle this gracefully in your application.
  3. Data Volume Considerations:
    • The time required for the upgrade depends on the size of your data and the rate of updates. Larger datasets may take slightly longer, but the write pause is generally kept under a minute.
  4. Using a Queue and Buffering:
    • As a general rule, we highly recommend using a queue to buffer write operations wherever possible. This could be something as simple as a Redis queue, Kafka, or a similar system that implements retries and exponential backoff for failed writes to the cluster. This ensures that transient network issues or other connection problems don't cause lost writes.
    • During the upgrade process, this kind of queue can be paused to ensure that any writes attempted during the brief read-only phase are retried and successfully processed once the cluster is upgraded. A minimal retry sketch appears after this list.
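
Below is a minimal sketch of that retry pattern in shell. It assumes the environment variable BONSAI_URL holds your full cluster URL (with credentials) and uses a placeholder index and document; a real application would typically put this logic in its queue worker instead.

```bash
# Sketch: retry a write with exponential backoff while the cluster is
# briefly read-only during an upgrade. BONSAI_URL and the index name
# "myindex" are placeholders for your own values.
for attempt in 1 2 3 4 5; do
  status=$(curl -s -o /dev/null -w "%{http_code}" \
    -H "Content-Type: application/json" \
    -XPOST "$BONSAI_URL/myindex/_doc" \
    -d '{"title": "example document"}')
  if [ "$status" != "403" ]; then
    echo "Write finished with HTTP $status"
    break
  fi
  echo "Cluster is read-only (HTTP 403); retrying in $((2 ** attempt))s..."
  sleep $((2 ** attempt))
done
```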

Let’s discuss how to change a cluster’s plan in the Bonsai Cluster Dashboard.


To begin, navigate to the cluster’s Plan under Settings on the cluster dashboard.

If you haven’t added billing information yet, please do so at Account Billing. Detailed steps can be found in Add a Credit Card.

Once there is billing information listed on the account, you will see the different options for changing the cluster’s plan. Select the plan you would like to change to and click the green Change to … Plan button. In this example, the `documentation_cluster` cluster is on a Sandbox plan and will be upgraded to a Standard Micro plan.

After a successful plan change, a notification will appear at the top: Plan scheduled for update. In this example, the plan has been upgraded to the Standard Micro plan.

Can't Downgrade?

Please note that downgrading a plan may fail if the change would put the cluster in an Overage State for the new plan. For example, downgrading from a Standard Micro plan to a Sandbox plan will fail if there are 11 shards on the cluster (the Sandbox plan has a shard limit of 10). A quick way to check your current shard count is shown below.
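
Assuming the environment variable BONSAI_URL holds your full cluster URL (with credentials), counting the lines returned by the _cat/shards API gives the total number of primary and replica shards:

```bash
# Each line is one shard; the line count is the cluster's total shard count.
curl -s "$BONSAI_URL/_cat/shards?h=index,shard,prirep,state" | wc -l
```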

Upgrading and Changing a Cluster’s Plan

Hold Up!

Bonsai clusters can be provisioned in a few different ways. These instructions are for users who want to be billed directly by Bonsai. If you are a Heroku customer and are adding Bonsai to your app, then you will not need these instructions. Instead, check out:

Creating an account on Bonsai is the first step towards provisioning and scaling an Elasticsearch cluster on demand. The following guide has everything you need to know about signing up.

  1. Fill out account details
  2. Describe your project
  3. Set up your cluster details
  4. Confirm your email
  5. FAQs

1. Fill out account details

To start the sign up process, head to our Sign up page, where you’ll begin by providing account information. Click Create Account to continue.

2. Describe your project

In this section, you can provide some extra project details to allow our project team to best support your cluster. We have a few questions for you to answer to tailor your experience when using Bonsai.

Click Next to move on.

3. Set up your cluster details

Fill out your cluster name, then select a release and a region. All hobby and testing clusters are provisioned with the latest version. More information about our version and region support can be found here.

You don’t have to sign up with payment details - your first testing cluster is on us! If you wish to provision or upgrade to a larger cluster with more resources, you’ll need to add a credit card after confirming your email.

Click Provision Cluster to complete your signup.

4. Confirm your email

If everything checks out, you should receive an email from us shortly with a confirmation button.

Navigate to your email client and click on Confirm email.

You’re all set! Thank you for signing up.

If you don't receive the sign up email, please check your Spam folder. You can also reach out to us for help.

5. FAQs

Q: How does Bonsai validate emails?

A: We validate emails using RFC 2822 and RFC 3696. If you have a non-conforming email address, let us know at support@bonsai.io.

Q: Do you have tips for creating a strong password?

A: Yes! We’re a security-conscious bunch, but we don’t have any arcane rules about what kinds of characters you must use for your password. Why? We’ll let xkcd explain it, but the tl;dr is our password policy simply enforces a minimum length of 10 characters. We also reject common passwords that have been pwned. Sadly, correct horse battery staple appears in our blacklist.

We also highly recommend using a password manager like 1Password to generate unique, secure passwords.

Q: What happens if I try to sign up with an email address that I used before or that is taken?

A: If you have signed up for Bonsai in the past using this email address, you will receive an email directing you to log in using it.

Q: Do you have tips on picking a cluster region?

A: You will want to select a region that’s as close to where your application is hosted as possible to minimize latency. Doing so will ensure the fastest search and best user experience for your application.

Bonsai is built on Amazon Web Services (AWS) and Google Cloud Platform (GCP), and we run clusters in several of their regions. Users are able to provision clusters in the following regions:

Amazon Web Services
  • us-east-1 (Virginia)
  • us-west-2 (Oregon)
  • eu-west-1 (Ireland)
  • ap-southeast-2 (Sydney)
  • eu-central-1 (Frankfurt)
Google Cloud Platform
  • gcp-us-east4 (Virginia)
  • gcp-us-east1 (South Carolina)
  • gcp-us-central1 (Iowa)
  • gcp-us-west1 (Oregon)

These regions are supported due to broad demand. We can support other regions as well, but pricing will vary. Shoot us an email if you’d like to learn more about getting Bonsai running in a region not listed above.

Q: How should I approach naming clusters?

A: Users are able to manage multiple clusters through their Bonsai dashboard once they have confirmed their email. Not only is the cluster name a label for you to distinguish between different applications and environments, but it’s also used as part of the cluster’s unique hostname.

For example, if you name your cluster “Erin’s Exciting Elasticsearch Experiment,” that will be the name of your cluster. The host name of your cluster URL will be automatically generated into something like erins-exciting-elasticsea-123456.

Cluster names can be changed later, but the host name that is generated when the cluster is first created is immutable.

Anything not listed here?

As always, feel free to email us at support@bonsai.io if you have further questions.

Signing Up

The Bonsai cluster dashboard’s Overview provides a series of useful metrics about your cluster.

This article will cover the following:

  • Cluster Information
  • Performance metric
  • Traffic Summary
  • Usage
  • Data Allocation
  • Tenants
  • How to find help

Cluster Information

Overview provides general information about your cluster at the top:

The account name, the cluster’s name, and a health status dot (which will be green, yellow, or red) are found here. Below that, the region the cluster is provisioned in, the version of Elasticsearch it’s running, and the subscription plan tier are displayed.

Performance

The first component is the Performance heatmap:

This heatmap reveals how fast requests are. Each column represents a “slice” of time. Each row, or “bucket”, in the slice represents a range of request durations. The "hotter" a bucket is colored, the more requests there are in that bucket. To further help visualize the difference in the quantity of requests for each bucket, every slice of time can be viewed as a histogram on hover.

You can check out our Metrics documentation for a more detailed dive into cluster metrics.

Traffic Summary

Traffic Summary highlights several statistics in the last 24 hours:

  • Request Count: This is the total number of requests your cluster has served in the past 24 hours. If there is a ‘-’ (hyphen character), then there is no data available to report. Also indicated are request counts for the previous 24 hours under Yesterday, and the percentage change between the two days.
  • Duration Median: This indicates your median request latency. The first number is the median response time for all requests over the past 24 hours. Also indicated are the median request latency for the previous 24 hours under Yesterday, and the percentage change between the two days. The median is an important metric because it’s more resistant to long tail effects and gives a better picture of overall performance than averages.
  • Duration 95th: This shows the 95th percentile in response times. The previous 24 hour period is found under Yesterday, and the percentage change between the two days. A percentile indicates how much of the data set falls below a particular value. For example, if the p95 response time for a cluster is 100ms, that means 95% of all requests are occurring in 100ms or less. This is an important metric for benchmarking, especially with high traffic volumes.

Usage

This component shows the cluster’s current usage versus the limits of your plan for 3 items:

  • Docs: This is the total number of documents you have in your cluster. We count all documents, which can sometimes lead to confusion when nested documents are involved: a parent document with three child documents counts as four documents, not one. Elasticsearch itself may report different counts depending on the endpoint queried (see the example after this list).
  • Data: This is the disk footprint of your cluster, or the amount of data your cluster is occupying on the server.
  • Shards: This is the number of shards in your cluster across all indices. We’re counting both primary and replica shards here.
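
As a rough illustration, assuming an index named myindex and BONSAI_URL holding your full cluster URL, the two endpoints below can report different document counts when nested documents are present, because nested children are stored as separate Lucene-level documents:

```bash
# _count reports top-level documents only, while the stats behind the
# dashboard and _cat/indices can also include hidden nested child documents.
curl -s "$BONSAI_URL/myindex/_count"
curl -s "$BONSAI_URL/_cat/indices/myindex?h=index,docs.count"
```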

Data Allocation

This component indicates how your data is allocated across your cluster. If the allocation seems radically unbalanced, that can be an indication that you should reindex your data with an updated shard scheme. Documentation on this can be found in our Capacity Planning documentation.

Business / Enterprise Plans

If you upgrade to a Business / Enterprise cluster, you may see some extra nodes appear here, and may further observe that these nodes have few or 0 shards allocated. This is expected. These extra nodes will be cleaned up and removed later.

Tenants

Enterprise subscriptions support private multitenancy. For clusters running on these subscriptions, there will also be a Tenants table that lists tenants on the cluster:

Cluster Overview

The Bonsai cluster dashboard’s Metrics page is the place to troubleshoot cluster traffic issues and view performance metrics. This article will cover:

  • Navigating to Metrics
  • Metrics Utilities
  • Metrics Overview

Navigating to Metrics

Metrics is located in each cluster’s dashboard. Log into Bonsai, click on your cluster, and click on Metrics within the left sidebar:

Metrics Utilities

Time Window Selector

Use this selector to choose between four window sizes for metrics:

  • (1h) last 1 hour
  • (1d) last 24 hours
  • (7d) last 7 days
  • (28d) last 28 days

Time Scrubber

Click the left and right arrows to go back or forth in time within the selected window size.

UTC and Local Timezone Toggle

Click on the timezone to toggle between displaying the graph timestamps in UTC time or your local browser timezone.

Highlighting

You can drill down to smaller windows of time on any graph by clicking and dragging to select a time range.

Metrics Overview

More information doesn’t necessarily mean more clarity. When something happens to your traffic and cluster responses, it’s important to know how to see your metrics and draw conclusions.

We’ll cover what each graph displays and some examples of what they will look like given certain use cases (such as periods of high-traffic, or clusters in a normal state compared to ones that are experiencing downtime). We’ll start with the most information-dense graph: the Requests heat map.

Requests (Counts & Duration) heat map

This graph reveals how fast requests are. Each column in the graph represents a “slice” of time. Each row, or “bucket”, in the slice represents a range of request durations. The "hotter" a bucket is colored, the more requests there are in that bucket. To further help visualize the difference in the quantity of requests for each bucket, every slice of time can be viewed as a histogram on hover.

Example 1

This heat map displays a cluster with consistent traffic, with most requests completing in the 200-300ms range.

Example 2

This cluster has light, sporadic traffic. It’s important to note that the "heat" color of every bucket is determined relative to the other data in the graph - so a side-by-side comparison of two request heat maps using color won’t be accurate.

Request Counts

This graph shows the number of requests handled by the cluster at a given time.

Request Duration Percentiles

This graph, similar to the Requests heat map, shows a distribution of request speed based on 3 percentiles of the requests in that time slice: p50 (50%), p95 (95%), and p99 (99%). This is helpful in determining where the bulk of your requests sit in terms of speed and how slow the outliers are.

Proxy Queue time

Proxy Queue time is the total amount of time requests were queued (or paused) at our load balancing layer. Queue time is ideally 0; however, in the event that you send many requests in parallel, our load balancer will queue up requests while waiting for executing requests to finish. This is part of our Quality of Service layer.

Concurrency

Concurrency shows the number of requests that are happening at the same time. Since clusters are limited on concurrency, this can be an important one to keep an eye on. When you reach your plan’s max concurrency, you will notice queue time start to consistently increase.

Bandwidth

This graph shows the amount of data crossing the network - going into the cluster (shown in green) and coming from the cluster (in blue).

We expect most bandwidth graphs to look something like the graph below — a relatively higher count of "From Client" data (read requests) compared to "To Client" data (write or indexing requests).

The relationship between green and blue bars in this graph really depends on your use case. A staging cluster, for example, might see a larger ratio of Write:Read data. It’s important to note that this graph deals exclusively in data volume: a high-traffic cluster will probably see a lot of data coming “From” the cluster, but a low-traffic cluster with very complicated queries and large request bodies will also show more “From Client” data than would otherwise be expected. Therefore, it’s helpful to look at request counts to get a feeling for the average "size" of a request.

Response Codes

This graph can do two things:

  • It will confirm that responses are successful: 2xx responses. This means that everything is moving along well and requests are formed correctly.
  • In the less positive case, it can be a debugging tool that helps figure out where any buggy behavior is coming from. In general, 4xx responses are the result of a malformed query from your app or some client, while 5xx responses indicate a problem on our end.

It’s important to note while reading this graph that 5xx responses don’t necessarily mean that your cluster is down. A common situation on multitenant plans is a cluster that’s getting throttled by a noisy neighbor taking up a lot of resources on the server. This can interrupt some (but not all) normal behavior on your cluster, resulting in a mix of 2xx and 5xx responses.

Tolerance for a few 5xx responses every now and then should be expected with any cloud service. We’re committed to giving all production clusters 99.99% uptime (i.e., no more than 0.01% of requests returning 5xx), and our track record is often four 9’s or higher.

We have a lot of users who are very sensitive to 5xx responses. In these cases, it’s usually best to be on a higher plan or a single-tenant plan. Reach out to us at support@bonsai.io if this is something your team needs.

Business and Enterprise Plans - Additional Metrics

Clusters running on Business and Enterprise plans have access to some additional metrics. If you would like to get access to these metrics for your cluster, please reach out and our team will walk you through the process.

System Load

System load is the average number of processes waiting on a resource within a given period of time. This is often reported in 1, 5 and 15 minute windows. The Bonsai dashboard shows the system load average over the past minute.

It is helpful to think of system load as how saturated a node is with tasks; as long as the node's load average is lower than the number of its available CPUs, the node is able to handle all of its work without getting backed up. If the load average is larger than the number of available CPUs, some tasks have to wait for CPU time. When tasks are delayed like this, performance suffers, and the performance impact is correlated with how high the load average gets.

Elasticsearch Queue

Elasticsearch utilizes a number of thread pools for handling various tasks. These pools are also backed by queues, so that if a task is created and a thread is not available to execute it, the task is queued until a thread becomes available. This metric shows the total number of tasks sitting in an Elasticsearch queue.

It's important to note that this is not the same as a request queue. A single request can result in multiple tasks being created within Elasticsearch. It is also important to note that these queues have a finite length. If the queue is full and an additional task is created, Elasticsearch will reject it with a message like "rejected execution (queue capacity 50)" and an HTTP 429 response.
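
If you suspect queueing or rejections, one way to inspect the thread pools directly (assuming BONSAI_URL holds your full cluster URL) is the _cat API:

```bash
# Shows active threads, queued tasks, and rejection counts per thread pool.
curl -s "$BONSAI_URL/_cat/thread_pool?v&h=name,active,queue,rejected"
```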

Elasticsearch Bulk

This metric shows the number of _bulk requests that have been processed over time. Bulk requests are an efficient way to insert or update data in your cluster. Naturally, your payload sizes need to be more than 1-2 documents in order to get the benefits of bulk updates. Usually batches of 50-500 are ideal.
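
A minimal _bulk payload looks like the sketch below; the index name and documents are placeholders, each action line is paired with the document on the following line, and the body must end with a newline:

```bash
# Index two documents in one request ("myindex" is a placeholder index name).
curl -s -H "Content-Type: application/x-ndjson" -XPOST "$BONSAI_URL/_bulk" \
  --data-binary $'{"index":{"_index":"myindex"}}\n{"title":"doc one"}\n{"index":{"_index":"myindex"}}\n{"title":"doc two"}\n'
```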

Elasticsearch Search

This metric shows the number of search requests that have been processed by Elasticsearch. It's important to distinguish between user searches and shard searches. A user search is performed by your application (often in response to some user action or search in the app), and may translate into multiple shard searches. This is true if your indices have multiple primary shards, or if you're searching across multiple indices. A query for the top X results will be passed to all relevant shards; each shard will perform the search and return the top X results. The coordinating node is then responsible for collating those results, sorting, and returning the top X.
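
For example, a single user search across two hypothetical indices fans out to a shard search on every primary shard involved, and the coordinating node merges the per-shard results:

```bash
# One user search, many shard searches ("products" and "reviews" are
# placeholder index names; BONSAI_URL holds your full cluster URL).
curl -s "$BONSAI_URL/products,reviews/_search?q=coffee&size=10"
```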

Elasticsearch does utilize a thread pool specifically for searches, so depending on the types of searches you're running, and the volume of search traffic, it's possible to get a message like "rejected execution (queue capacity 50)" and an HTTP 429 response.

JVM Heap

Elasticsearch is a Java-based search engine that runs in the Java Virtual Machine (JVM). The JVM has a special area of memory where objects are stored, called "heap space." This space is periodically garbage-collected, meaning objects that are no longer in use are destroyed to free up space in memory.

This metric shows the percentage of heap space that is currently occupied by Elasticsearch's objects and arrays. Lower is ideal, because high heap usage generally means more frequent, and longer-lasting garbage collection pauses, which manifests as slower performance and higher latency.

JVM Young GC Time

This metric shows the amount of time in milliseconds that the JVM spent reclaiming memory from new or short-lived objects. This type of garbage collection is expected and shouldn't lead to HTTP 503 or 504 errors. However, if it is chronic and frequent, it may lead to slow performance.

JVM Old GC Time

This metric shows the amount of time in milliseconds the JVM spent reclaiming memory from long-surviving objects. The JVM periodically pauses the application so it can free up heap space, which means some operations are stopped for a period of time. This can result in perceived slow response times, and in some extreme cases can lead to system restarts and HTTP 503 and 504 responses.

CPU IO Wait

This metric shows the percentage of the time that the CPU(s) waited on IO operations. This means that an operation requested IO (like reading or writing to disk) and then had to wait for the system to complete the request. A certain amount of IOWait is expected for any IO operation, and usually it's on the order of nanoseconds. This isn't indicative of a problem on its own.

Excessive wait times are problematic though. It is usually correlated to high system load and a high volume of updates. Sometimes that can be addressed through hardware scaling or better throttling of updates. In rare cases it can indicate a problem with the hardware itself, like an SSD drive failing.

CPU User

This metric shows the percentage of the time that the CPU(s) were executing code in user space. The user space is where all code runs, outside of the operating system's kernel. On Bonsai, this space is primarily dedicated to Elasticsearch, so the metric is roughly the amount of time the CPU spent processing instructions by the Elasticsearch code.

The metric can vary widely between clusters, based on application, hardware and use case. There is not necessarily an ideal value or range for this metric. However, large spikes or long periods of high processing times can manifest as poor performance. It often indicates that the hardware is not able to keep up with the demands of the application, although it can sometimes indicate a problem with the hardware itself.

Dashboard Metrics

Bonsai offers support for automatically removing old indices from your cluster to save space. If you’re using Bonsai to host logs or some application that creates regular indices following a time-series naming convention, then you can specify one or more prefix patterns and Bonsai will automatically purge the oldest indices that match each pattern. The pattern will only match from the start of the index name.

This feature is found in the Cluster dashboard, under the Trimmer section:

For example, assume you’re indexing time-series data, say, number of support tickets in a day. You’re putting these in time-series indices like "support_tickets-201801010000", "support_tickets-201801020000", and so on.

With this feature, you could specify a pattern like "support_tickets-", and we’ll auto-prune the oldest indices first when you reach the size limit specified for the pattern. Indices scheduled for removal will be highlighted in red.

Please note we will not purge the last remaining index that matches the pattern, even if the size is above the limit.

Note on Trimmer and deleting documents

The trimmer feature only allows you to delete whole indices, and only if more than one index matches the same trimmer pattern. The trimmer does not delete individual documents within an index. To remove a number of documents from an index, your best option is to use delete_by_query.

Here is an example for deleting the 50 oldest documents, according to a hypothetical "date" field:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST //_delete_by_query?conflicts=proceed

{
 "query": {
   "match_all": {}
 },
 "sort": {
   "_script": {
     "type": "date",
     "order": "asc"    }
 },
 "size": 50
}</code></pre></div>

The Elasticsearch Search API can be used to identify the documents to delete. Please use your Console (or an equivalent tool) to build your query with the Search API first, and verify that you will only be deleting the documents you want before running the delete_by_query call with the same query.
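
For instance, a dry run of the query above (assuming an index named myindex with a real "date" field, and BONSAI_URL holding your full cluster URL) could look like this:

```bash
# Fetch the 50 oldest documents without deleting anything, to verify the
# query before sending the same body to _delete_by_query.
curl -s -H "Content-Type: application/json" -XPOST "$BONSAI_URL/myindex/_search" -d '
{
  "query": { "match_all": {} },
  "sort": [ { "date": "asc" } ],
  "size": 50
}'
```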

Trimmer

To navigate to your personal profile, click on your initials in the upper right corner and select Profile Settings from the dropdown menu. Then navigate to the Security tab.

  1. Single Sign-On
  2. Password Management
  3. Browser Session Management

1. Single Sign-On

Single Sign-On (SSO) is the ability to have a third party service validate your identity. You can enable Google SSO which offers additional security like multi-factor authentication (MFA). Bonsai also supports Okta.

To use this feature, the email address at your identity provider must match your Bonsai.io account email address. For example, if your Google email address is "bob.smith@gmail.com," then your Bonsai.io account must use this same email address in order to verify your identity.

Once you have SSO set up, you will no longer be able to log in with your username/password. Logging in will need to be done through the identity provider.

To revert back to username/password authentication, you will need to disable SSO. To do so, simply click on Disable SSO.

If you see this section greyed out then your account admin has required that you use SSO.

2. Password Management

To update your password, enter your old password and a new password. Bonsai strongly recommends using a password manager like 1Password or LastPass to keep your passwords secure, and to help randomly generate new passwords.

Protip: Use a strong password

We’re a security-conscious bunch, and we don’t have any arcane rules about what kinds of characters you must use for your password. Why? We’ll let xkcd explain it. Tl;dr: our password policy simply enforces a minimum length of 10 characters. We also reject common passwords that have been pwned. Sadly, correct horse battery staple appears in our blacklist.

Note: updating your password will revoke all of your active sessions and force you to log in again.

3. Browser Session Management

View and revoke your active sessions by scrolling down to Active Sessions. If you have a session on another device, you can see its IP address and information about the device.

You can revoke sessions individually, or revoke all. Revoking all sessions will also revoke your current session that you are using to view your profile, and doing so will require you to log in again.

Account Security (SSO, Passwords, Sessions)

You can add or delete a credit card, update your payment profile, or add coupon codes in the Billing tab of Account Settings.

  • Add a credit card
  • Check Card Details
  • Set default credit card
  • Delete credit cards on file
  • Add an Address
  • Coupon Codes
  • Statements

Add a credit card

To begin adding a credit card to the account, click on Add credit card to bring up a new form.

Fill in the credit card information and click Add Card to proceed. Click Cancel to go back.

Your Financial Data, Secured

When you add a credit card in Bonsai, the information is encrypted and passed to our payment processor. We have verified that this service is Level 1 PCI Compliant. This level of compliance is the highest level of security a business can offer. We do not host or store any financial data, and your credit card details can not be accessed by anyone on our team or within the payment processor.

A successfully added credit card will be listed.

Check Card Details

Once there is at least one card listed in Credit Cards, click on the overflow menu icon to reveal a drop-down menu to View Details on that card.

Card Details lists the credit card information.

Set default credit card

You can associate multiple credit cards to your account. If you only have one credit card on file, it will automatically be the default. If you have multiple credit cards, you have the option to update the default credit card for payment.

To set another credit card as the default, click on the overflow menu icon to reveal a drop-down menu and select Set Default on that card.

Delete credit cards on file

To remove a credit card from an account, click on the overflow menu icon to reveal a drop-down menu to Delete a credit card.

Please note, you can delete a default credit card only if all three of the following statements are true:

  • it’s the only credit card on file
  • there are no active paid clusters on the account
  • the account balance is $0

If you are having trouble deleting a default card, let us know at support@bonsai.io.

Add an Address

Once a credit card has been successfully added, the Address section will appear under Credit Cards. Adding an address here helps us apply taxes based on your location. Whether tax is charged, and at what rate, depends on this address. See Bonsai's Policy on Collecting Sales Tax for further information.

Clicking on the Add Address button will take you to a form where you can use an existing address populated from your credit card or fill out a new address.

A successfully added address will take you back to the Billing tab.

Coupon Codes

If you have a coupon code, you can add it in the Coupon Code section. Simply enter the code and click on Apply Coupon.

If the code is accepted, it will be listed under Coupon Codes.

If the code is not valid, there will be an error message.

If you have a code that isn’t working but should, shoot us an email at support@bonsai.io and we’ll check it out.

Statements

Download and view the account statements.

Billing: Settings

You and the people you refer can earn credit that will be applied to your account on paid Bonsai plans. We'll cover how this process works in the following guide.

  1. Overview
  2. Refer new users with an email invite
  3. Refer new users with a shareable referral URL
  4. The new user referral sign up process
  5. FAQs

1. Overview

How it works:

  1. Send your unique referral URL to a new user. You can do either or both of the following:
    • Enter the email of the person you are referring under your Referrals tab, and we’ll automatically send them an invite email with everything they need to know.
    • Copy your shareable link, and share your URL via email, social media, or whatever method works for you.
  2. The new user signs up using your referral URL.
  3. The new user pays for their first month on a paid cluster plan.
  4. You earn $50 AND they receive $50 toward a paid Bonsai plan.
  5. You can send your referral URL to as many people as you want, but you will stop earning credit once you have reached the limit of $50.

To begin, navigate to Account Settings then to the Referrals tab.

Under the Referrals tab you will be able to access your unique referral URL and see how many referrals have been sent and accepted:

2. Refer new users with an email invite

Entering a new user's email address and clicking Invite will automatically send them an email with a detailed explanation of how to sign up for Bonsai using your referral URL. A success message on the dashboard will notify you that your invite has been successfully sent.

The email we send to new users looks like this:

3. Refer new users with a shareable referral URL

Clicking  Copy Link will automatically copy your referral URL to your clipboard.

Paste your unique referral URL to share it with new users through your personal email, social media, or wherever!

4. The new user referral sign up process

  1. When a new user is directed to our Sign Up page using your referral URL, they will see the referral sign up page.
  2. By completing the sign up process through your referral URL, their account will be marked as having been created with your referral code.
  3. Once the new user has upgraded their cluster to a Staging plan or higher, you and the new user will both receive a credit of $50 towards the next monthly bill. Hooray!

5. FAQs

Q: I don't have a Referrals tab in my Account Settings.

A: Only accounts with Standard and up clusters have access to Referrals.

The Referrals tab will show up in your Account Settings once you have at least one paid cluster in your account.

Q: I invited a friend with my referral URL, but I didn't receive any credit.

A: In order to receive the credit, your friend has to both sign up for a new account with your referral URL and upgrade their cluster to Standard, Business, or Enterprise. If you don't see your credit after that, please let us know at support@bonsai.io. We're happy to get this sorted out for you.

Q: Can I accumulate credit and use it in the future?

A: No, your credit will automatically be applied to the following month’s payment.

Q: My referral credit is more than my plan’s cost. Does the remainder carry over to the next billing cycle?

A: No. Since this is a one-time credit, it can only be applied to the following month.

Referral Credit Program

Note on Logs

Logs are real-time; they only appear on this page while requests are happening. Logs do not persist.

Bonsai provides a real-time stream of all the requests hitting your cluster. There is also a subtab for the Top 20 Slow Requests. This will begin to populate if requests slow down measurably.

The streaming logs show a timestamp, HTTP verb and endpoint, along with how long it took Elasticsearch to respond and what HTTP response code was returned. Request/response payloads are not captured.

Bonsai does not expose server logs at this time.

Logs

Credential management is imperative for tracking who and what has access to your cluster. At Bonsai, regardless of plan level, every request made to your cluster requires a username and password. Security is a default, not an upgrade.

With the Credential Management section of the cluster dashboard, you can add and remove access credentials. In this guide, we will cover:

  1. Introduction to Credentials Management
  2. Regenerating your master credentials
  3. Creating new credentials
  4. Deciding which credential type to use
  5. Which credential types does your subscription support?

1. Introduction to Credentials Management

You can see your current credentials and generate new ones by logging into your cluster dashboard and navigating to the Credentials section.

Every cluster has a Master credential that is created when the cluster is provisioned. The Master credential can never be revoked, only rotated.

There are three types of credentials you can generate:

  1. Full-access
  2. Read-only
  3. Custom

With custom auth controls you can specify things like:

  • Which indices (or index patterns) are accessible.
  • What Elasticsearch actions may be performed.
  • Where requests are allowed to originate.

2. Regenerating your master credentials

You would want to regenerate your default credentials if your fully-qualified URL has been exposed (say, if it was somehow copy-pasted into an email, GitHub issue, or, perish the thought, StackOverflow). To do that, simply click the yellow Regenerate button. This will instantly generate a new, randomized authentication pair.

The old credentials will remain active for two minutes or so. After that time, the old keys are revoked. The purpose of the two-minute window is to give administrators the opportunity to update their application with the new credentials before the old ones expire.
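
A quick way to confirm the new credentials work before the old pair expires is a simple authenticated request; the access key, secret, and hostname below are placeholders for the values shown in your dashboard:

```bash
# Returns basic cluster information if the new access key and secret are valid.
curl -s "https://newkey:newsecret@my-awesome-cluster-1234.us-east-1.bonsai.io/"
```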

3. Creating new credentials

To create a new credential, click on the Create Credential button.

Choose one of the three types (full-access, read-only, or custom), give it a name, and then click Generate.

Note

We advise giving your credential a human-friendly name, like ACME_frontend, python_indexer, or docs_search_component. It’s an easy way to help you and your teammates remember how each credential is used. When you generate a new credential, Bonsai shows your credential details.

Credential Details

You can view who created the credential (in this example, Leo), the access key and secret (username and password), the allowed settings, and some quick links.

Settings

This displays which indices are accessible, which Elasticsearch actions are allowed, and whether there are whitelisted IPs or CIDR blocks. If your cluster is on the Business tier or above, these fields are customizable.

URL Quick Links

The Elasticsearch access URL is excellent for pasting into a terminal and executing curl commands. Use Kibana access to launch your Bonsai-hosted Kibana instance, included with every Bonsai cluster. Read more about Bonsai-hosted Kibana here.

4. Deciding which credential type to use

Full Access

Full-access tokens are best used for back-end applications that handle indexing or act as a proxy for users’ query input.

Read Only

When choosing this type, the form will pre-populate with the allowed action (or ‘privilege’) `indices:data/read/*`. This allows read-only actions: count, explain, get, exists, mget, get indexed scripts, more like this, multi percolate/search/termvector, percolate, scroll, clear_scroll, search, suggest, tv.

Custom

If you have specific security needs, generate a custom credential. Strengthen your team’s security by using custom credentials for things like limiting index actions to certain IP addresses, or making certain indices search-only. There are three fields for custom credentials: Indices, Actions, and IP/CIDRs.

Indices

This section allows you to list a set of indices that are permitted, or to create a pattern such as "logs_2019-12-*":

Leaving this blank will make all indices present on the cluster accessible.

Allowed Actions

Specify access privileges from the searchable dropdown:

If you ever need help figuring out exactly which actions map to your needs, please email support and we’ll point you in the right direction. Leaving this blank will allow all access privileges.

IP Address or CIDR block

Use this section to control where you allow requests to be made from. Whitelist individual IP addresses for monitoring privileges, or write a CIDR block that only allows your company to access an internal-only index or cluster. Leaving this blank will allow any IP address by default.

Further reading:

5. Which credential types does your subscription support?

If you receive a notice to upgrade for access to read-only or custom credential management, you’ll need to navigate to the Manage tab and upgrade your cluster to Standard, Business, or Enterprise:

Clicking on the Upgrade this cluster link will take you to your management dashboard, where you can upgrade to a Business or Enterprise subscription.

Credential Management

Changing a cluster's name is quick and easy! From the cluster dashboard, simply navigate to the Settings link under the Manage header:

Under the "Edit Cluster Name" form, enter the new name for the cluster, and click on "Save."

Changing your cluster's name will not result in downtime or loss of data. It will also not change the cluster's URL. If the name is already taken by an active cluster in your account, you will receive an error.

Change a Cluster's Name

Destroying your cluster is simple. In this doc we cover:

  1. How to deprovision a cluster
  2. FAQ

How to deprovision a cluster

If you need to deprovision (destroy) a cluster, navigate to your cluster, and head to the Settings section of the dashboard.

Once you verify your account password in the form, you’ll be able to click the Deprovision button.

Once a cluster has been deprovisioned, it is destroyed immediately. The data is deleted, it will instantly stop responding, and all requests will return an HTTP 404.

When you deprovision a paid cluster on Bonsai, you will automatically be credited with a prorated amount. This credit will apply to future invoices. For example, if you had a $50/mo cluster which you destroyed after 3 weeks, you would get roughly $12.50 in credit applied to your account (about one unused week out of the month).

Note

This section does not apply to Heroku users. If you're a Heroku user, check out:

FAQ

I don’t see the Settings section on my dashboard.

If you can’t find the Settings section on a cluster dashboard, that means your role on the team is either Member or Billing, neither of which can deprovision a cluster. Contact your team Admin to change your role or to handle the deprovision. Read more about how teams on Bonsai work.

What if I accidentally deprovisioned my cluster?

The data in your cluster will be automatically deleted. Bonsai retains hourly snapshots for the past 24 hours, and daily snapshots for the past 7 days. So if you accidentally deleted data you needed, our team might be able to provide a partial restore, depending on when we are alerted. Contact us for more help.

Deprovision A Cluster

Bonsai offers support for integrating with other services. This menu can be found by clicking on the Integrations tab of your cluster dashboard.

Bonsai currently offers support for the following integrations:

We are always looking for new services to integrate with, so if you would like to see Bonsai add support for another service, please reach out and let us know.

Integrations

The Console is a web-based UI for interacting with your cluster. Think of it like a user-friendly version of curl. If you want to try out some queries or experiment with the Elasticsearch API, this is a great place to start:

The UI allows the user to select an HTTP verb, enter an endpoint and run the command. The results are shown in the navy console on the right. The box below the verb and endpoint boxes can be used to create a request body.

Beware

Some places in the Elasticsearch documentation suggest using a GET request even when passing a request body. An example would be something like:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /_search
{
 "query" : {
   "term" : {
     "user" : "george"
   }
 }
}</code></pre></div>

Without getting bogged down in pedantry, GET is a questionable verb to use in this case; a POST is more appropriate, and it’s what the UI assumes anyway. If you paste in a request body and use the GET verb, the body will be ignored.
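
The same query sent from a terminal, using POST so the body is honored (the cluster URL and credentials are placeholders):

```bash
# POST carries the request body; a GET with a body may be silently ignored.
curl -s -H "Content-Type: application/json" -XPOST \
  "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_search" -d '
{
  "query": {
    "term": { "user": "george" }
  }
}'
```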

Console

Need to update a credit card or provide access to your clusters to a team member? Use the sections below to manage all things account-related.

To access your Account Settings, click on the upper right-hand dropdown and select Account Settings.

From your account settings page, you can…

Account Settings

Control security across your whole team from your Account Security tab by enabling single sign-on (SSO) for your account.

Single sign-on (SSO) is the ability to have a third party service, like Google or GitHub, validate your identity. Bonsai currently supports Google SSO, which offers additional security like multi-factor authentication (MFA).

Individual team members can finalize this setup in the Security tab of their Profile Settings.

Once Google SSO is enabled on your account, you will be able to require your team members to log in with Google.

Require SSO For All Team Members

Search Clips allow you to query any of your clusters and view readable, real-time responses. The Query Builder helps you navigate the Elasticsearch DSL with auto-completion and syntax error highlighting, and the results are exportable as JSON or CSV.

With Search Clips, you can build and share queries with your team members and collaborate on the query structure while looking at the same results. Using Search Clips in this way can create a loop of building queries, learning from the output, and refining, ultimately leading to a well designed query that then gets integrated into an application.

To manage team members on an account, navigate to Managing Team Members.

In this document we will cover the following actions:

  1. View all Clips
  2. Create a Clip
  3. View and edit a Clip
  4. Viewing results
  5. Export a Clip
  6. Delete a Clip

1. View all Clips

Click on the icon to the left of the account drop down on the upper right hand corner of the screen, and click on Search Clips from that menu. You will be able to view and edit Clips related to clusters in any of your accounts created by you or your team members.

Accessing Search Clips for the first time? Click Get started with Search Clips

2. Create a Clip

Click the New Clip button in the upper right hand corner of the Search Clips page.

This will open a modal to enter information for your new Clip. Enter a name and select the cluster and indices you wish to query. All indices can be queried at once by selecting '_all', or select individual indices. Clicking on Create Clip will create and redirect you to the Search Clip.

3. View and edit a Clip

The main sections of the Search Clip view are the Query Editor on the left, and the Response View on the right. Updating the query in the editor will automatically refresh results, or you can click the refresh button on the upper right hand corner of the Response View. The name of the Clip and the name of the cluster that is currently being queried are also shown on this page.

Update the cluster, indices, and name of the Clip by clicking on Settings in the upper right hand corner.

4. Viewing Results

The Response View has tabs for Hits, Aggregations (Aggs), Metadata, and the Raw response. The Hits table shows the returned hits in a readable format, and allows you to toggle which columns you would like to see. If your query has any aggregations, they will be summarized in the Aggs tab.

5. Export a Clip

Click on the Export button to export your Clip results to JSON or CSV.

6. Delete a Clip

Clips can be deleted from the Search Clips page by clicking on the toggle to the right of the Clip you would like to delete, and selecting the Delete option.

Search Clips

Bonsai occasionally sends out updates about new features, and can also send out weekly reports about your cluster's usage and performance. You can manage whether or not you receive these emails in the Notification Preferences tab.

Change Notification Preferences

You can edit your name and company size in the account profile.

Update Your Organization's Profile

You can create a new account from the upper right-hand dropdown by selecting Add Account within the Switch Account drop-down menu.

Give your new account a name and click Create Account.

You will then be prompted to create your free sandbox cluster in the new account. You can switch between your accounts from the drop-down menu:

Creating a New Account

You can invite multiple team members to your Account, each with specialized roles. Account admins can invite new users to join the Account. At least one Billing and one Admin role are needed for each account. If you only have one team member, that person assumes the Admin and Billing role by default.

Click here to go directly to Team Members in the dashboard

Add or update team members

You can edit your name and contact information in the Profile tab.

Edit your Profile details

Need to change your email, password, or change the notifications you receive about clusters? Use the sections below to manage your personal profile.

To navigate to your personal profile, click on your initials in the upper right corner and select Profile Settings from the dropdown menu.

Profile Settings

Navigate to Cancel Account to begin the cancellation process.

If you want to cancel your account, you’ll need to make sure that any active clusters you have are deprovisioned first. If you have any active clusters on your account, you’ll see a notice like this:

For instructions on how to deprovision your clusters, please visit Cluster Dashboard.

Once you have deprovisioned all of the clusters on your account, you will be able to cancel your account completely. You will see a screen requesting your account password to confirm the cancellation.

So long, farewell, auf Wiedersehen, good night!

We’re sorry that you no longer want to have an account with us, but wish you the very best with your application and search. If there’s anything you feel we could do better, please don’t hesitate to send us an email with comments.

Cancel the account

Migrating from a Heroku Account to a Direct Account requires minor configuration changes in the Heroku application. This will require redeployment of the application.

How It Works:

  1. Sign up for an account at Bonsai.io. Please make sure to add your billing information.
  2. Change your application to use the ELASTICSEARCH_URL environment variable rather than BONSAI_URL.
  3. Then configure your application in Heroku with the new ELASTICSEARCH_URL shown below:
    <span class="inline-code"><pre><code>heroku config:set ELASTICSEARCH_URL=$(heroku config:get BONSAI_URL)</code></pre></span>
    When you uninstall the Bonsai add-on, Heroku will remove the BONSAI_URL configuration setting. By redeploying your application to use this different environment variable now, you can avoid downtime for your application in later steps.
  4. Email support@bonsai.io and include: A) the email address associated with your new Bonsai account and B) the cluster(s) that you want migrated over.
  5. We’ll perform the migration. Your cluster URLs and all your data will remain intact. You will be billed at the monthly rate once that migration is complete. We’ll let you know once this step is done.
  6. Once we have confirmed that the migration is complete, remove the Bonsai add-on(s) from your Heroku app so you’re not being billed twice! Uninstalling the Bonsai add-on at this step will remove the BONSAI_URL config variable.
  7. You can migrate the rest of your application at your convenience. Any cluster we have migrated will now belong to your Bonsai.io account and can be managed there. Please let us know if your application is not functioning as expected.

That’s it! Migrations are zero-downtime and take only a few minutes once our support team takes the ticket.

Migrating from Heroku to a Direct Account

Hugo is a static site generator written in Go. It is conceptually similar to Jekyll, albeit with far more speed and flexibility. Hugo also supports generating output formats other than HTML, which allows users to pipe content directly into an Elasticsearch cluster.

In this guide, we are going to use this feature to tell Hugo to generate the exact format needed to submit the file to the _bulk endpoint of Elasticsearch.

First Steps

In order to make use of this documentation, you will need Hugo installed and configured on your system.

  1. Make Sure You Have Hugo Installed. This guide assumes you already have Hugo installed and configured on your system. Visit the Hugo Documentation to get started.
  2. Spin Up a Bonsai Elasticsearch Cluster. This guide will use a Bonsai cluster as the Elasticsearch backend.
  3. Create an Index on the Cluster. In this example, we’re going to push data into an index called `hugo`. This index needs to be created before any data can be stored on it. The index can be created either through the Interactive Console, or with a tool like `curl`:

Use the URL for your cluster. A Bonsai URL looks something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -XPUT https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/hugo</code></pre></div>

Configure Hugo to Output to Bonsai Elasticsearch

Hugo’s configuration settings live in a file called `config.toml` by default. This file may also have a `.json`, `.yaml`, or `.yml` extension. Add the following snippet based on your config file format:

TOML:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">[outputs]
home = ["HTML", "RSS", "Bonsai"]

[outputFormats.Bonsai]
baseName = "bonsai"
isPlainText = true
mediaType = "application/json"
notAlternative = true

[params.bonsai]
vars = ["title", "summary", "date", "publishdate", "expirydate", "permalink"]
params = ["categories", "tags"]</code></pre></div></div>

JSON:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">

{
 "outputs": {
   "home": [
     "HTML",
     "RSS",
     "Bonsai"
   ]  
},
 "outputFormats": {
   "Bonsai": {
     "baseName": "bonsai",
     "isPlainText": true,
     "mediaType": "application/json",
     "notAlternative": true
   }
 },
 "params": {
   "bonsai": {
     "vars": [
       "title",
       "summary",
       "date",
       "publishdate",
       "expirydate",
       "permalink"
     ],
     "params": [
       "categories",
       "tags"
     ]
   }
 }
}</code></pre></div></div>


YAML:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"> outputs:
 home:
   - HTML
   - RSS
   - Bonsai
outputFormats:
 Bonsai:
   baseName: bonsai
   isPlainText: true
   mediaType: application/json
   notAlternative: true
params:
 bonsai:
   vars:
     - title
     - summary
     - date
     - publishdate
     - expirydate
     - permalink
   params:
     - categories
     - tags</code></pre></div></div>

This snippet defines a new output called “Bonsai”, and specifies some associated variables.

Creating the JSON template

Hugo needs to have a template for rendering data in a way that Elasticsearch will understand. To do this, we will define a JSON template that conforms to the Elasticsearch Bulk API.

Create a template called <span class="inline-code"><pre><code>layouts/_default/list.bonsai.json</code></pre></span> and give it the following content:

<div class="code-snippet-container"><a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">{{/* Generates a valid Elasticsearch _bulk index payload */}}
{{- $section := $.Site.GetPage "section" .Section }}
{{- range .Site.AllPages -}}
 {{- if or (and (.IsDescendant $section) (and (not .Draft) (not .Params.private))) $section.IsHome -}}
   {{/* action / metadata */}}
   {{ (dict "index" (dict "_index" "hugo" "_type" "doc"  "_id" .UniqueID)) | jsonify }}
   {{ (dict "objectID" .UniqueID "date" .Date.UTC.Unix "description" .Description "dir" .Dir "expirydate" .ExpiryDate.UTC.Unix "fuzzywordcount" .FuzzyWordCount "keywords" .Keywords "kind" .Kind "lang" .Lang "lastmod" .Lastmod.UTC.Unix "permalink" .Permalink "publishdate" .PublishDate "readingtime" .ReadingTime "relpermalink" .RelPermalink "summary" .Summary "title" .Title "type" .Type "url" .URL "weight" .Weight "wordcount" .WordCount "section" .Section "tags" .Params.Tags "categories" .Params.Categories "authors" .Params.Authors) | jsonify }}
 {{- end -}}
{{- end }}</code></pre></div></div>

When the site is generated, Hugo will create a file called <span class="inline-code"><pre><code>public/bonsai.json</code></pre></span>, with the site content formatted so that it can be pushed directly into Elasticsearch using the Bulk API.
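For reference, the generated file alternates action/metadata lines and document lines, one JSON object per line. A trimmed, illustrative example (the ID and field values here are made up, and the real file includes every field listed in the template) looks something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">{"index":{"_index":"hugo","_type":"doc","_id":"3a1b2c4d"}}
{"objectID":"3a1b2c4d","title":"My First Post","summary":"A short introduction to the site.","permalink":"https://example.com/posts/my-first-post/","tags":["hugo","search"]}</code></pre></div>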

Push the Data Into Elasticsearch

To get the site’s data into Elasticsearch, render it by running <span class="inline-code"><pre><code>hugo</code></pre></span> on the command line. Then send it to your Bonsai cluster with <span class="inline-code"><pre><code>curl</code></pre></span>:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">curl -H "Content-Type: application/x-ndjson" -XPOST "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_bulk" --data-binary @public/bonsai.json</code></pre></div></div>

You should now be able to see your data in the Elasticsearch cluster:

<div class="code-snippet-container"><a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ curl -XGET "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_search"{"took":1,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"hugo","_type":"doc","_id":...</code></pre></div></div>

Hugo

Bonsai supports the ElasticHQ monitoring and management interface. This open source software gives you insight into the state of your cluster. If you’re looking to see details about performance, check out Cluster Metrics.

The GitHub repo has tons of documentation and how-to guides. This article lays out some common methods of running ElasticHQ locally.

Using OS X with Pow

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">mkdir -p ~/.pow/elastichq
git clone git@github.com:royrusso/elasticsearch-HQ.git ~/.pow/elastichq/public</code></pre></div></div>

Navigate your browser to http://elastichq.dev/. You should see the ElasticHQ dashboard. Enter your Bonsai cluster URL in the field for the cluster location and click on “Connect.” The dashboard will bring up information about your cluster.

Using Grunt

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">git clone git@github.com:royrusso/elasticsearch-HQ.git
cd elasticsearch-HQ
git checkout master
npm install
grunt server</code></pre></div></div>

You should see the ElasticHQ dashboard at <span class="inline-code"><pre><code>http://localhost:9000/</code></pre></span>. Enter your bonsai.io cluster URL in the field for the cluster location and click on “Connect.” The dashboard will bring up information about your cluster.

Using Apache

First, make sure Apache is running. Typically this means you can access a directory via your browser at <span class="inline-code"><pre><code>http://localhost:80/</code></pre></span>. Your install may be different, so use whatever URL/folder is appropriate for you.

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">cd   # The root directory of your HTTP server
git@github.com:royrusso/elasticsearch-HQ.git
cd elasticsearch-HQ/
git checkout master</code></pre></div></div>

Open up your browser to <span class="inline-code"><pre><code>http://localhost:80/elasticsearch-HQ/</code></pre></span> and you should see the ElasticHQ dashboard. Enter your bonsai.io cluster URL in the field for the cluster location and click on “Connect.” The dashboard will bring up information about your cluster.

Using ElasticHQ with Bonsai

Logstash is a data processing tool for ingesting logs into Elasticsearch. It plays a prominent role in the Elastic suite, and a common question is whether Bonsai offers support for it.

The answer is a qualified “yes.” Logstash is a server-side tool, meaning it runs outside of Bonsai’s infrastructure and Bonsai is not involved in its configuration or management. But as a host, Bonsai is not opinionated about where your cluster’s data comes from. So if you have Logstash running on your servers, you can configure an output to your Bonsai cluster, and it will work.

Connecting your Logstash instance to a Bonsai cluster is as easy as adding an output to the Logstash configuration file like so:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">output {
   elasticsearch {
       # https://randomuser:randompass@something-12345.us-east-1.bonsai.io
       # would be entered as:
       hosts     => ["something-12345.us-east-1.bonsai.io:443"]
       user      => "randomuser"
       password  => "randompass"
       ssl       => true
       index     => ""
   }
}</code></pre></div>

Autocreation and Bonsai

If an application sends data to an index which does not exist, Elasticsearch will create that index and assume its mappings from the data in the payload. This feature is called autocreation, and it is supported in a limited capacity on Bonsai. Certain base names can be used for autocreation. Those base names are:

  • .kibana
  • events
  • filebeat
  • kibana-int
  • logstash
  • requests

This means your Logstash index must start with one of these index names, or it will not be automatically created.
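For example, leaving Logstash’s <span class="inline-code"><pre><code>index</code></pre></span> option at its default daily pattern, <span class="inline-code"><pre><code>logstash-%{+YYYY.MM.dd}</code></pre></span>, satisfies this rule because the name starts with the <span class="inline-code"><pre><code>logstash</code></pre></span> base name.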

We have run a number of tests and verified full compatibility with Logstash version 1.5 and later. Older versions of Logstash don’t support SSL/TLS or HTTP Basic Auth; these older versions can work with Bonsai, but without the benefits of encryption or authentication.

If you have any issues getting Logstash to pass data along to Bonsai, check the documentation to make sure it’s set up correctly. If that doesn’t help, feel free to reach out to us at support@bonsai.io and we’ll do our best to get you pointed in the right direction.

Using Logstash with Bonsai

Kibana is an open-source data visualization and dashboard tool built for rich analytics. It takes advantage of Elasticsearch’s full-text querying and aggregation capabilities to build highly flexible and interactive dashboards.

All Bonsai clusters support Kibana out of the box. There are several ways to use Kibana: via your Bonsai cluster dashboard, as a free Heroku app, or locally / on a private server.

Cluster Dashboard

Bonsai provides Kibana instances to clusters running on Elasticsearch versions 5.x and up. You can launch your Kibana instance right from your dashboard:

Clicking on the Kibana link will open up your Kibana instance:

Please be patient, as it may take a few seconds for Kibana to load.

As a Free Heroku App

If you have a Heroku account, there is a GitHub project that offers a click to deploy button. Clicking on the button will walk you through the process of deploying a free Heroku app running Kibana, which can be configured with a URL to an Elasticsearch cluster.

If you don’t have a cluster yet, a free Bonsai cluster will be created and attached to the Kibana app. If you already have a Bonsai cluster, you can link to it during the build process.

Locally / Private Server

You may also download Kibana and run it locally or on a private server. Not all versions of Kibana are compatible with all versions of Elasticsearch, so make sure to check the compatibility matrix and download a version that will work with your Bonsai cluster.

(Note: you can also install Kibana using a repository and package manager, but this will likely install the latest version, which may not be compatible with your cluster.)

Once you have Kibana downloaded, you’ll need to configure it to point at your Bonsai cluster. Open up the <span class="inline-code"><pre><code>config/kibana.yml</code></pre></span> file and set the value for <span class="inline-code"><pre><code>elasticsearch_url</code></pre></span>. For example:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">elasticsearch_url: "https://:@something-12345.us-east-1.bonsai.io"</code></pre></div>

In some later versions of Kibana, you may need to separately specify your Bonsai cluster’s username/password as configuration options:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">elasticsearch_url: "https://:@something-12345.us-east-1.bonsai.io:443"
elasticsearch.username: ""
elasticsearch.password: ""</code></pre></div>

Once Kibana has been configured, you can run it with bin/kibana (or bin\kibana.bat on Windows). This will start up the Kibana server with the settings pointing to your Bonsai cluster.

Last, open up a browser to http://localhost:5601 to finish setting up Kibana and get started. Note that if you’re running Kibana on a remote server, you’ll need to replace localhost with the IP address or domain of the remote server.

Using Kibana with Bonsai

You must have an Enterprise account to enable Okta Single Sign-On. Once enabled for your account, all users must log in via Okta. This can be done from the login form with any password, or from the Okta End-User Dashboard.

To enable Okta you must:

  1. Reach out to support@bonsai.io and request Okta for your Bonsai account.
  2. Navigate to your account security page to find a form to enter your metadata from Okta. You will see a screen similar to this:
  3. In Okta, select the Sign On tab for the Bonsai app:
  4. Click on View Setup Instructions, which will open a new page. Scroll to the bottom of the page to find and copy your IDP metadata.
  5. Paste the metadata into the form on the Bonsai account security page and submit by clicking Add Metadata.
  6. Now log into Bonsai via the Okta End-User Dashboard, and Bonsai will activate and require Okta for all users associated with your account.

Okta Single Sign-On

Filebeat is a lightweight shipper for forwarding and centralizing log data. It monitors the log files or locations that you specify, collects log events, and forwards them to Elasticsearch for indexing. A common question is whether Bonsai offers support for it.

The answer is a qualified “yes.” Filebeat is a server-side tool, meaning it runs outside of Bonsai’s infrastructure and Bonsai is not involved in its configuration or management. But as a host, Bonsai is not opinionated about where your cluster’s data comes from. So if you have Filebeat running on your servers, you can configure an output to your Bonsai cluster, and it will work.

To connect Filebeat to a Bonsai cluster, you just need to add your Bonsai URL to the <span class="inline-code"><pre><code>filebeat.yml</code></pre></span> file like this:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">output.elasticsearch:
 hosts: ["wp-play-8646224217.us-east-1.bonsaisearch.net:443"]
 protocol: "https"
 username: "aaa" # The randomly-generated username for your cluster
 password: "xxx" # The randomly-generated password for your cluster</code></pre></div></div>

Autocreation and Bonsai

If an application sends data to an index which does not exist, Elasticsearch will create that index and assume its mappings from the data in the payload. This feature is called autocreation, and it is supported in a limited capacity on Bonsai. Certain base names can be used for autocreation. Those base names are:

  • .kibana
  • events
  • filebeat
  • kibana-int
  • logstash
  • requests

This means your Filebeat index must start with one of these index names, or it will not be automatically created.
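For example, if you override the default index name in <span class="inline-code"><pre><code>filebeat.yml</code></pre></span>, keep the <span class="inline-code"><pre><code>filebeat</code></pre></span> prefix (something like <span class="inline-code"><pre><code>index: "filebeat-%{+yyyy.MM.dd}"</code></pre></span>) so it matches one of the base names above; note that overriding the index in Filebeat typically also requires setting <span class="inline-code"><pre><code>setup.template.name</code></pre></span> and <span class="inline-code"><pre><code>setup.template.pattern</code></pre></span>.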

It is important to note that the standard distribution of Filebeat does not work with the OSS distribution of Elasticsearch that Bonsai runs, so the OSS version of Filebeat needs to be used for this process to work.

Using Filebeat with Bonsai

WordNet is a huge lexical database that collects and orders English words into groups of synonyms. It can offer major improvements in relevancy, but it is not at all necessary for many use cases. Make sure you understand the tradeoffs (discussed below) well before setting it up.

There are two ways to use WordNet with Bonsai. Users can add a subset of the list using the Elasticsearch API, or use the WordNet file that comes standard with all Bonsai clusters.

First, a brief background on synonyms and WordNet. If you want to jump around, the main sections of this document are:

  • What Are Synonyms in Elasticsearch
  • How Does WordNet Improve Synonyms?
  • Why Wouldn’t Everyone Want WordNet?
  • Using WordNet via the Elasticsearch API
  • Using the WordNet List File, wn_s.pl
  • Resources

What Are Synonyms in Elasticsearch?

Let’s say that you have an online store with a lot of products. You want users to be able to search for those products, but you want that search to be smart. For example, say that your user searches for “bowtie pasta.” You may have a product called “Funky Farfalle” which is related to their search term but which would not be returned in the results because the title has “farfalle” instead of “bowtie pasta”. How do you address this issue?

Elasticsearch has a mechanism for defining custom synonyms, through the Synonym Token Filter. This lets search administrators define groups of related terms and even corrections to commonly misspelled terms. A solution to this use case might look like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
   "settings": {
       "index" : {
           "analysis" : {
               "filter" : {
                   "synonym" : {
                       "type" : "synonym",
                       "synonyms" : [
                           "bowtie pasta, farfalle"
                       ]
                   }
               }
           }
       }
   }
}</code></pre></div>
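This defines the filter, but it still needs to be wired into an analyzer (and applied to a field or query) before it takes effect. A minimal sketch, assuming an illustrative index name of <span class="inline-code"><pre><code>products</code></pre></span> and analyzer name of <span class="inline-code"><pre><code>product_synonyms</code></pre></span>, might look like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -XPUT "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/products" -H 'Content-Type: application/json' -d '{
  "settings": {
    "index": {
      "analysis": {
        "filter": {
          "synonym": {
            "type": "synonym",
            "synonyms": ["bowtie pasta, farfalle"]
          }
        },
        "analyzer": {
          "product_synonyms": {
            "tokenizer": "standard",
            "filter": ["lowercase", "synonym"]
          }
        }
      }
    }
  }
}'</code></pre></div>

The <span class="inline-code"><pre><code>product_synonyms</code></pre></span> analyzer can then be referenced from any text field mapping that should match across the synonym group.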

This is great for solving the immediate issue, but it can get extremely tedious to define every group of related words in your index by hand.

How Does WordNet Improve Synonyms?

WordNet is essentially a text database which places English words into synsets (groups of synonyms), and can be considered something of a cross between a dictionary and a thesaurus. An entry in WordNet looks something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"></code></pre></div>

Let’s break it down:

This line expresses that the word ‘kitty’ is a noun, and the first word in synset 102122298 (which includes other terms like “kitty-cat,” “pussycat,” and so on). The line also indicates ‘kitty’ is the fourth most commonly used term according to semantic concordance texts. You can read more about the structure and precise definitions of WordNet entries in the documentation.

WordNet has become extremely useful in text processing applications, including data storage and retrieval. Some use cases require features like synonym processing, for which a lexical grouping of tokens is invaluable.

Why Wouldn’t Everyone Want WordNet?

Relevancy tuning can be a deeply complex subject, and WordNet – especially when the complete file is used – has tradeoffs, just like any other strategy. Synonym expansion can be really tricky and can result in unexpected sorting, lower performance and more disk use. WordNet can introduce all of these issues with varying severity.

When synonyms are expanded at index time, Elasticsearch uses WordNet to generate all tokens related to a given token, and writes everything out to disk. This has several consequences: slower indexing speed, higher load during indexing, and significantly more disk use. Larger index sizes often correspond to memory issues as well.

There is also the problem of updating. If you ever want to change your synonym list, you’ll need to reindex everything from scratch. And WordNet includes multi-term synonyms in its database, which can break phrase queries.

Expanding synonyms at query time resolves some of those issues, but introduces others. Namely, performing expansion and matching at query time adds overhead to your queries in terms of server load and latency. And it still doesn’t really address the problem of multi word synonyms.
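As a rough sketch of query-time expansion (field and analyzer names here are illustrative, and older Elasticsearch versions nest mappings under a type name), the synonym analyzer is attached only as a search analyzer:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">"mappings": {
  "properties": {
    "title": {
      "type": "text",
      "analyzer": "standard",
      "search_analyzer": "my_synonyms"
    }
  }
}</code></pre></div>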

The Elasticsearch documentation has some really great examples of what this means. The takeaway is that WordNet is not a panacea for relevancy tuning, and it may introduce unexpected results unless you’re doing a lot of preprocessing or additional configuration.

tl;dr: Do not simply assume that chucking a massive synset collection at your cluster will make it faster with more relevant results.

Using WordNet via the Elasticsearch API

Elasticsearch supports several different list formats, including the WordNet format. WordNet synonyms are maintained in a Prolog file called <span class="inline-code"><pre><code>wn_s.pl</code></pre></span>. To use these in your cluster, you’ll need to download the WordNet archive and extract the <span class="inline-code"><pre><code>wn_s.pl</code></pre></span> file. You’ll then need to create your synonyms list by reading this file into a request to your cluster.

The target index could be created with settings like so:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT https://randomuser:randompass@something-12345.us-east-1.bonsai.io/some_index

   {
     "settings": {
       "analysis": {
         "filter": {
           "wn_synonym_filter": {
             "type": "synonym",
             "format" : "wordnet",
             "synonyms" : [
                 "s(100000001,1,"abstain",v,1,0).",
                 "s(100000001,2,"refrain",v,1,0).",
                 "s(100000001,3,"desist",v,1,0).",
                 #... more synonyms, read from wn_s.pl file
             ]
           }
         },
         "analyzer": {
           "my_synonyms": {
             "tokenizer": "standard",
             "filter": [
               "lowercase",
               "wn_synonym_filter"
             ]
           }
         }
       }
     }
   }</code></pre></div>

There are a number of ways to generate this request. You could do it programmatically with a language like Python, or Bash scripts with <span class="inline-code"><pre><code>curl</code></pre></span>, or any language with which you feel comfortable.
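As one rough sketch (the cluster URL, index name, and the number of lines taken from <span class="inline-code"><pre><code>wn_s.pl</code></pre></span> are all placeholders), a Bash script could read a subset of the file and splice it into the request body:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">#!/bin/bash
# Wrap each wn_s.pl entry in double quotes and join them with commas
# to form a JSON array (here, just the first 100 entries):
SYNONYMS=$(head -n 100 wn_s.pl | sed 's/.*/"&"/' | paste -s -d, -)

# Create the index with a WordNet-format synonym filter:
curl -XPUT "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/some_index" \
  -H 'Content-Type: application/json' \
  -d "{
    \"settings\": {
      \"analysis\": {
        \"filter\": {
          \"wn_synonym_filter\": {
            \"type\": \"synonym\",
            \"format\": \"wordnet\",
            \"synonyms\": [${SYNONYMS}]
          }
        }
      }
    }
  }"</code></pre></div>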

A benefit of using a subset of the list would be more control over your mappings and data footprint. Depending on when your analyzer is running, you could save IO by not computing unnecessary expansions for terms not in your corpus or search parameters. Reducing the overhead will improve performance overall.

Using the WordNet List File, wn_s.pl

If you would rather use the official WordNet list, it is part of our Elasticsearch deployment. You can follow the official Elasticsearch documentation for WordNet synonyms, and link to the file with <span class="inline-code"><pre><code>analysis/wn_s.pl</code></pre></span>. For example:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT https://username:password@my-awesome-cluster.us-east-1.bonsai.io/some_index
{
   "settings": {
       "index" : {
           "analysis" : {
               "analyzer" : {
                   "synonym" : {
                       "tokenizer" : "whitespace",
                       "format" : "wordnet",
                       "filter" : ["synonym"]
                   }
               },
               "filter" : {
                   "synonym" : {
                       "type" : "synonym",
                       "format" : "wordnet",
                       "synonyms_path" : "analysis/wn_s.pl"
                  }
               }
           }
       }
   }
}</code></pre></div>
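Once the index is created, you can sanity-check that the synonyms are being applied with the <span class="inline-code"><pre><code>_analyze</code></pre></span> API (on older Elasticsearch versions the analyzer and text are passed as query parameters rather than a JSON body):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -XGET "https://username:password@my-awesome-cluster.us-east-1.bonsai.io/some_index/_analyze" \
  -H 'Content-Type: application/json' \
  -d '{"analyzer": "synonym", "text": "kitty"}'</code></pre></div>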

Resources

WordNet is a large subject and a great topic to delve deeper into. Here are some links for further reading:

Using Wordnet with Bonsai

Setting up your Java app to use Bonsai Elasticsearch is quick and easy. Just follow these simple steps:

Adding the Elasticsearch library

You’ll need to add the <span class="inline-code"><pre><code>elasticsearch-rest-high-level-client</code></pre></span> library to your pom.xml file like so:


<dependency>
   <groupId>org.elasticsearch.client</groupId>
   <artifactId>elasticsearch-rest-high-level-client</artifactId>
   <version>6.2.3</version>
</dependency>

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-java">package io.omc;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.net.URI;

public class Main {

public static void main(String[] args) {
 String connString = System.getenv("BONSAI_URL");
 URI connUri = URI.create(connString);
 String[] auth = connUri.getUserInfo().split(":");

 CredentialsProvider cp = new BasicCredentialsProvider();
 cp.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(auth[0], auth[1]));

 RestHighLevelClient rhlc = new RestHighLevelClient(
   RestClient.builder(new HttpHost(connUri.getHost(), connUri.getPort(), connUri.getScheme()))
     .setHttpClientConfigCallback(
       httpAsyncClientBuilder -> httpAsyncClientBuilder.setDefaultCredentialsProvider(cp)
         .setKeepAliveStrategy(new DefaultConnectionKeepAliveStrategy())));

 // https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-search.html
 SearchRequest searchRequest = new SearchRequest();
 SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
 searchSourceBuilder.query(QueryBuilders.matchAllQuery());
 searchRequest.source(searchSourceBuilder);

 try {
  SearchResponse resp  = rhlc.search(searchRequest);
  // Show that the query worked
  System.out.println(resp.toString());
 } catch (Exception ex) {
  // Log the exception
  System.out.println(ex.toString());
 }

 // Need to close the client so the thread will exit
 try {
  rhlc.close();
 } catch (Exception ex) {

 }
}
}</code></pre></div></div>

Feel free to adapt that code to your particular app. Keep in mind that the core elements here can be moved around, but really shouldn’t be changed or further simplified.

For example, the snippet above parses out the authentication credentials from the  <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable. This is so the username and password don’t need to be hard coded into your app, where they could pose a security risk.
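You can set that variable in your shell or deployment environment; something like this should work:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ export BONSAI_URL="https://username:password@my-awesome-cluster-123.us-east-1.bonsai.io"</code></pre></div>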

Java

Bonsai Elasticsearch integrates with your .Net app quickly and easily, whether you’re running on Heroku or self hosting.

First, make sure to add the Elasticsearch.Net and NEST client to your dependencies list:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">...


....</code></pre></div></div>
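If you prefer, the packages can also be added from the command line; installing NEST will pull in Elasticsearch.Net as a dependency:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ dotnet add package NEST</code></pre></div>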

You’ll also want to make sure the client is installed locally for testing purposes:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">$ dotnet restore</code></pre></div></div>

Self-hosted users: when you sign up with Bonsai and create a cluster, it will be provisioned with a custom URL that looks like https://user:pass@my-awesome-cluster.us-east-1.bonsai.io. Make note of this URL because it will be needed momentarily.

Heroku users: you’ll want to make sure that you’ve added Bonsai Elasticsearch to your app. Visit our addons page to see available plans. You can add Bonsai to your app through the dashboard, or by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">$ heroku addons:create bonsai: -a</code></pre></div></div>

Update your Main.cs file with the following:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">using System;
using Nest;

namespace search.con
{
   class Program
   {
       static void Main(string[] args)
       {
           var server = new Uri(Environment.GetEnvironmentVariable("BONSAI_URL"));
           var conn = new ConnectionSettings(server);
           // use gzip to reduce network bandwidth
           conn.EnableHttpCompression();            
           // lets the HttpConnection control this for optimal access (default: 80)
           conn.ConnectionLimit(-1);
           var client = new ElasticClient(conn);

           var resp = client.Ping();
           if (resp.IsValid)
           {
               Console.WriteLine("All is well");
           }
           else
           {
               Console.WriteLine("Elasticsearch cluster is down");
           }
       }
   }
}
</code></pre></div></div>

The code above does several things:

  1. Pulls your Bonsai URL from the environment (you never want to hard-code this value – it’s a bad practice)
  2. References the Elasticsearch.Net and NEST libraries
  3. Instantiates the client using your private Bonsai URL
  4. Pings the cluster to test the connection

Self-hosted users: You will need to take an additional step and add your Bonsai URL to your server/dev environment variables. Something like this should work:

<div class="code-snippet-container"><a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ export BONSAI_URL="https://username:password@my-awesome-cluster-123.us-east-1.bonsai.io"</code></pre></div></div>

Heroku users: You don’t need to worry about this step. The environment variable is automatically populated for you when you add Bonsai Elasticsearch to your app. You can verify that it exists by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">$ heroku config:get BONSAI_URL</code></pre></div></div>

Go ahead and spin up your .Net app with <span class="inline-code"><pre><code>dotnet run</code></pre></span>. Heroku users can also push the changes to Heroku, and monitor the logs with <span class="inline-code"><pre><code>heroku logs --tail</code></pre></span>. Either way, you should see something like this:

All is well
The above output indicates that the client was successfully initiated and was able to contact the cluster.

Questions? Problems? Feel free to ping our support if something isn’t working right and we’ll do what we can to help out.

.Net

Bonsai Elasticsearch integrates with your node.js app quickly and easily, whether you’re running on Heroku or self hosting.

First, make sure to add the Elasticsearch.js client to your dependencies list:

package.json:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">"dependencies": {
 "elasticsearch": "10.1.2"
 ....</code></pre></div></div>

You’ll also want to make sure the client is installed locally for testing purposes:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">$ npm install --save elasticsearch
$ npm install</code></pre></div></div>

Self-hosted users: when you sign up with Bonsai and create a cluster, it will be provisioned with a custom URL that looks like https://user:pass@my-awesome-cluster.us-east-1.bonsai.io. Make note of this URL because it will be needed momentarily.

Heroku users: you’ll want to make sure that you’ve added Bonsai Elasticsearch to your app. Visit our addons page to see available plans. You can add Bonsai to your app through the dashboard, or by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">$ heroku addons:create bonsai: -a</code></pre></div></div>

Update your index.js file with the following:

index.js:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">var bonsai_url    = process.env.BONSAI_URL;
var elasticsearch = require('elasticsearch');
var client        = new elasticsearch.Client({
                           host: bonsai_url,
                           log: 'trace'
                       });

// Test the connection:
// Send a HEAD request to "/" and allow
// up to 30 seconds for it to complete.
client.ping({
 requestTimeout: 30000,
}, function (error) {
 if (error) {
   console.error('elasticsearch cluster is down!');
 } else {
   console.log('All is well');
 }
});</code></pre></div></div>

The code above does several things:

  1. Pulls your Bonsai URL from the environment (you never want to hard-code this value – it’s a bad practice)
  2. Adds the Elasticsearch.js library
  3. Instantiates the client using your private Bonsai URL
  4. Pings the cluster to test the connection

Self-hosted users: You will need to take an additional step and add your Bonsai URL to your server/dev environment. Something like this should work:

<div class="code-snippet-container"><a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ export BONSAI_URL="https://username:password@my-awesome-cluster-123.us-east-1.bonsai.io"</code></pre></div></div>

Heroku users: You don’t need to worry about this step. The environment variable is automatically populated for you when you add Bonsai Elasticsearch to your app. You can verify that it exists by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">$ heroku config:get BONSAI_URL</code></pre></div></div>

Go ahead and spin up your node app with <span class="inline-code"><pre><code>npm start</code></pre></span>. Heroku users can also push the changes to Heroku, and monitor the logs with <span class="inline-code"><pre><code>heroku logs --tail</code></pre></span>. Either way, you should see something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">Elasticsearch INFO: 2016-02-03T23:44:41Z
 Adding connection to https://nodejs-test-012345.us-east-1.bonsai.io/

Elasticsearch DEBUG: 2016-02-03T23:44:41Z
 Request complete

Elasticsearch TRACE: 2016-02-03T23:44:41Z
 -> HEAD https://nodejs-test-012345.us-east-1.bonsai.io:443/?hello=elasticsearch

 <- 200</code></pre></div>

The above output indicates that the client was successfully initiated and was able to contact the cluster.

Questions? Problems? Feel free to ping our support if something isn’t working right and we’ll do what we can to help out.

Node.js

Getting Elasticsearch up and running with a PHP app is fairly straightforward. We recommend using the official PHP client, as it is being actively developed alongside Elasticsearch.

Adding the Elasticsearch library

Like Heroku, we recommend Composer for dependency management in PHP projects, so you’ll need to add the Elasticsearch library to your <span class="inline-code"><pre><code>composer.json</code></pre></span> file:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">{
   "require": {
       "elasticsearch/elasticsearch": "1.0"
   }
}</code></pre></div></div>

Heroku will automatically add the library when you deploy your code.
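Outside of Heroku, you can add the same dependency from the command line with Composer (the version constraint here mirrors the example above):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ composer require "elasticsearch/elasticsearch:~1.0"</code></pre></div>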

Using the library

The elasticsearch-php library is configured almost entirely by associative arrays. To initialize your client, use the following code block:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-php">// Initialize the parameters for the cluster
$params = array();
$params['hosts'] = array (
   getenv("BONSAI_URL"),
);

$client = new Elasticsearch\Client($params);</code></pre></div></div>

Note that the host is pulled from the environment. When you add Bonsai to your app, the cluster will be automatically created and a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable will be added to your environment. The initialization process above allows you to access your cluster without needing to hardcode your authentication credentials.

Indexing your documents

Bonsai does not support lazy index creation, so you will need to create your index before you can send over your documents. You can specify a number of parameters to configure your new index; the code below is the minimum needed to get started:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-php">$indexParams = array();
$indexParams['index']  = 'my_index';    //index

$client->indices()->create($indexParams);</code></pre></div></div>

Once your index has been created, you can add a document by specifying a body (an associative array of fields and data), target index, type and (optionally) a document ID. If you don’t specify an ID, Elasticsearch will create one for you:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-php">$params['body']  = array('testField' => 'abc');

$params['index'] = 'my_index';
$params['type']  = 'my_type';
$params['id']    = 'my_id';

// Document will be indexed to my_index/my_type/my_id
$ret = $client->index($params);</code></pre></div></div>
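You can verify that the document landed in the cluster with a quick curl request (the cluster URL here is a placeholder):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -XGET "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/my_index/my_type/my_id"</code></pre></div>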

PHP

Warning

The official Elasticsearch Python client is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the client. This change prevents the Python client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch.

If you are receiving an error stating "The client noticed that the server is not a supported distribution of Elasticsearch," then you'll need to downgrade the client to 7.13 or lower.

Setting up your Python app to use Bonsai Elasticsearch is quick and easy. Just follow the steps below:

Adding the Elasticsearch library

You’ll need to add the elasticsearch library to your <span class="inline-code"><pre><code>Pipfile</code></pre></span>, which you can do with pipenv like so:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-python">pipenv install elasticsearch elasticsearch-dsl</code></pre></div></div>

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">import os, base64, re, logging
from elasticsearch import Elasticsearch

# Log transport details (optional):

logging.basicConfig(level=logging.INFO)

# Parse the auth and host from env:

bonsai = os.environ['BONSAI_URL']
auth = re.search('https\:\/\/(.*)\@', bonsai).group(1).split(':')
host = bonsai.replace('https://%s:%s@' % (auth[0], auth[1]), '')

# Optional port

match = re.search('(:\d+)', host)
if match:
 p = match.group(0)
 host = host.replace(p, '')
 port = int(p.split(':')[1])
else:
 port=443

# Connect to cluster over SSL using auth for best security:

es_header = [{
'host': host,
'port': port,
'use_ssl': True,
'http_auth': (auth[0],auth[1])
}]

# Instantiate the new Elasticsearch connection:

es = Elasticsearch(es_header)

# Verify that Python can talk to Bonsai (optional):

es.ping()

</code></pre></div></div>

Feel free to adapt that code to your particular app. Keep in mind that the core elements here can be moved around, but really shouldn’t be changed or further simplified.

For example, the snippet above parses out the authentication credentials from the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable. This is so the username and password don’t need to be hard coded into your app, where they could pose a security risk. Also, the <span class="inline-code"><pre><code>host</code></pre></span>, <span class="inline-code"><pre><code>port</code></pre></span> and <span class="inline-code"><pre><code>use_ssl</code></pre></span> parameters are important for SSL encryption to work properly. Simply using your <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> as the host will not work because of limitations in urllib. It needs to be a plain URL, with the credentials passed to the client as a tuple.

Python

Users of Django/Haystack can easily integrate with Bonsai Elasticsearch! We recommend using the official Python client, as it is being actively developed alongside Elasticsearch.

Note

Haystack does not yet support Elasticsearch 6.x or 7.x, which is the current default on Bonsai. Heroku users can provision a 5.x cluster by adding Bonsai via the command line and passing in a <span class="inline-code"><pre><code>--version</code></pre></span> flag like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">heroku addons:create bonsai --version=5</code></pre>
</div>
</div>

Note that some versions require a paid plan. Free clusters are always provisioned on the latest version of Elasticsearch, regardless of which version was requested. See more details here.

Let’s get started:

Adding the Elasticsearch library

You’ll need to add the elasticsearch-py and django-haystack libraries to your requirements.txt file:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">elasticsearch>=1.0.0,<2.0.0
django-haystack>=1.0.0,<2.0.0</code></pre>
</div>
</div>

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. We recommend adding the cluster URL to an environment variable, <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>, to avoid hard-coding your authentication credentials.

The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">ES_URL = urlparse(os.environ.get('BONSAI_URL') or 'http://127.0.0.1:9200/')

HAYSTACK_CONNECTIONS = {
   'default': {
       'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
       'URL': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
       'INDEX_NAME': 'haystack',
   },
}

if ES_URL.username:
   HAYSTACK_CONNECTIONS['default']['KWARGS'] = {"http_auth": ES_URL.username + ':' + ES_URL.password}</code></pre>
</div>
</div>

Note about ports

The sample code above uses port 443, which is the default for the https:// protocol. If you’re not using SSL/TLS and want to use http:// instead, change this value to 80.

Common Issues

One of the most common issues users see relates to SSL certs. Bonsai URLs are using TLS to secure communications with the cluster, and our certificates are signed by an authority (CA) that has verified our identity. Python needs access to the proper root certificate in order to verify that the app is actually communicating with Bonsai; if it can’t find the certificate it needs, you’ll start seeing exception messages like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Root certificates are missing for certificate</code></pre>
</div>

The fix is straightforward enough. Simply install the certifi package in the environment where your app is hosted. You can do this locally by running <span class="inline-code"><pre><code>pip install certifi</code></pre></span>. Heroku users will also need to modify their <span class="inline-code"><pre><code>requirements.txt</code></pre></span>, per the documentation:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">certifi==0.0.8</code></pre>
</div>
</div>

I can’t install root certificates or certifi

If certifi and root cert management isn’t possible, you can simply bypass verification by modifying your <span class="inline-code"><pre><code>HAYSTACK_CONNECTIONS</code></pre></span> like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">HAYSTACK_CONNECTIONS = {
    'default': {
        'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
        'URL': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
        'INDEX_NAME': 'documents',
        'KWARGS': {
            'use_ssl': True,
            'verify_certs': False,
        }
    },
}</code></pre>
</div>
</div>

This instructs Haystack to use TLS without verifying the host. This does allow for the possibility of man-in-the-middle (MITM) attacks, but the probability of that happening is pretty low. You’ll need to weigh the expediency of this approach against the unlikely event of someone eavesdropping, and decide whether the risk of leaking data that way is acceptable.

Django with Haystack

Check Your Version

Haystack does not yet support Elasticsearch 6.x and up. Please ensure your Bonsai cluster version is 5.x or less.

Users of Django/django-elasticsearch-dsl can easily integrate with Bonsai Elasticsearch! This library is built on top of elasticsearch-dsl, which is itself built on top of the low-level elasticsearch-py client maintained by Elastic.

Let’s get started:

Adding the libraries

You’ll need to add the django-elasticsearch-dsl library to your  <span class="inline-code"><pre><code>requirements.txt</code></pre></span> file:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">django-elasticsearch-dsl>=0.5.1
elasticsearch-dsl>=6.0,<6.2</code></pre>
</div>
</div>

Full instructions can be found here

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. We recommend adding the cluster URL to an environment variable,  <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>, to avoid hard-coding your authentication credentials.

The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">import os
from urllib.parse import urlparse  # Python 2: from urlparse import urlparse

ES_URL = urlparse(os.environ.get('BONSAI_URL') or 'http://127.0.0.1:9200/')

ELASTICSEARCH_DSL = {
    'default': {
        'hosts': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
    },
}

# Bonsai requires basic auth; pass the credentials parsed from the URL.
if ES_URL.username:
    ELASTICSEARCH_DSL['default']['http_auth'] = ES_URL.username + ':' + ES_URL.password</code></pre>
</div>
</div>

Note about ports

The sample code above uses port 443, which is the default for the https:// protocol. If you’re not using SSL/TLS and want to use http:// instead, change this value to 80.

Django with django-elasticsearch-dsl

Getting started with Ruby on Rails and Bonsai Elasticsearch is fast and easy. In this guide, we will cover the steps and the bare minimum amount of code needed to support basic search with Elasticsearch. Users looking for more details and advanced usage should consult the resources at the end of this page.

Throughout this guide, you will see some code examples. These code examples are drawn from a very simple Ruby on Rails application, and are designed to offer some real-world, working code that new users will find useful. The complete demo app can be found in this GitHub repo.

Warning

The official Elasticsearch Ruby client is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:

<table>
<thead>
<tr><th>Engine Version</th><th>Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14 (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>

If you are receiving an <span class="inline-code"><pre><code>Elasticsearch::UnsupportedProductError</code></pre></span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.
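A straightforward way to stay on a supported client is to pin the gem in your Gemfile. A minimal sketch (adjust the constraint to whatever the table above allows for your engine):

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby"># Keep the Elasticsearch Ruby client below the 7.14 release:
gem 'elasticsearch', '< 7.14'</code></pre>
</div>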

Note

In this example, we are going to connect to Elasticsearch using the official Elasticsearch gems. There are other gems for this, namely SearchKick, which is covered in another set of documentation.

Step 1: Spin up a Bonsai Cluster

Make sure that there is a Bonsai Elasticsearch cluster ready for your app to interact with. This needs to be set up first so you know which version of the gems you need to install; Bonsai supports a large number of Elasticsearch versions, and the gems need to correspond to the version of Elasticsearch you’re running.

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

  • If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
  • Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Dashboard, in the Credentials tab:

Heroku users will also have a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable created when Bonsai is added to the application. This variable will contain the fully-qualified URL to the cluster.

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in Bonsai, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

You can also use the Interactive Console. In the Cluster Dashboard, click on the Console tab. It will load a default view, which includes the version of Elasticsearch, shown as “number” in the JSON response:

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">curl https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443
{
 "name" : "ip-172-31-14-16",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "jVJrINr5R5GVVXHGcRhMdA",
 "version" : {
   "number" : "7.2.0",
   "build_flavor" : "oss",
   "build_type" : "tar",
   "build_hash" : "508c38a",
   "build_date" : "2019-06-20T15:54:18.811730Z",
   "build_snapshot" : false,
   "lucene_version" : "8.0.0",
   "minimum_wire_compatibility_version" : "6.8.0",
   "minimum_index_compatibility_version" : "6.0.0-beta1"
 },
 "tagline" : "You Know, for Search"
}</code></pre>
</div>
</div>

The version of Elasticsearch is called “number” in the JSON response.
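If you would rather check the version from code, the low-level Ruby client can report it too. A quick sketch, assuming the <span class="inline-code"><pre><code>elasticsearch</code></pre></span> gem is installed and <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> is set:

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby">require 'elasticsearch'

# Hitting the root endpoint returns the same JSON shown above:
client = Elasticsearch::Client.new(url: ENV['BONSAI_URL'])
puts client.info['version']['number']   # => "7.2.0", for example</code></pre>
</div>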

Step 3: Add the Gems

There are a few gems you will need in order to make all of this work. Add the following to your Gemfile outside of any blocks:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">gem 'elasticsearch-model', github: 'elastic/elasticsearch-rails', branch: 'master'
gem 'elasticsearch-rails', github: 'elastic/elasticsearch-rails', branch: 'master'
gem 'bonsai-elasticsearch-rails', github: 'omc/bonsai-elasticsearch-rails', branch: 'master'</code></pre>
</div>
</div>

This will install the gems for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

<table>
<thead>
<tr><th>Branch</th><th>Elasticsearch Version</th></tr>
</thead>
<tbody>
<tr><td>0.1</td><td>-> 1.x</td></tr>
<tr><td>2.x</td><td>-> 2.x</td></tr>
<tr><td>5.x</td><td>-> 5.x</td></tr>
<tr><td>6.x</td><td>-> 6.x</td></tr>
<tr><td>master</td><td>-> master</td></tr>
</tbody>
</table>

For example, if your version of Elasticsearch is 6.x, then you would use something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">gem 'elasticsearch-model', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'elasticsearch-rails', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'bonsai-elasticsearch-rails', github: 'omc/bonsai-elasticsearch-rails', branch: '6.x'</code></pre>
</div>
</div>

Make sure the branch you choose corresponds to the version of Elasticsearch that your Bonsai cluster is running.

What Do These Gems Do, Anyway?

  • elasticsearch-model. This gem is the only one that is actually required. It does the actual search integration for Ruby on Rails applications.
  • elasticsearch-rails. This optional gem adds some nice tools, such as rake tasks and ActiveSupport instrumentation.
  • bonsai-elasticsearch-rails. This optional gem saves you the step of setting up an initializer. By default, the Elasticsearch client will attempt to create a connection to <span class="inline-code"><pre><code>localhost:9200</code></pre></span>, which is not where your Bonsai cluster is located. This gem simply reads the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable which contains the cluster URL, and uses it to override the defaults.

Once the gems have been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install them.

Step 4: Add Elasticsearch to Your Models

Any model that you will want to be searchable with Elasticsearch will need to be configured. You will need to require the `elasticsearch/model` library and include the necessary modules in the model.

For example, this demo app has a <span class="inline-code"><pre><code>User</code></pre></span> model that looks something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
 include Elasticsearch::Model
 include Elasticsearch::Model::Callbacks
 settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

Your model will need to have these lines included, at a minimum.

What Do These Lines Do?

  • <span class="inline-code"><pre><code>require 'elasticsearch/model'</code></pre></span> loads the Elasticsearch model library, if it has not already been loaded. This isn’t strictly necessary, provided the file is loaded somewhere at runtime.
  • <span class="inline-code"><pre><code>include Elasticsearch::Model</code></pre></span> tells the app that this model will be searchable by Elasticsearch. This is mandatory for the model to be searchable with Elasticsearch.
  • <span class="inline-code"><pre><code>include Elasticsearch::Model::Callbacks</code></pre></span> is an optional line, but it injects Elasticsearch into the ActiveRecord lifecycle. So when an ActiveRecord object is created, updated or destroyed, it will also be created/updated/destroyed in Elasticsearch (see the sketch after this list).
  • <span class="inline-code"><pre><code>settings index: { number_of_shards: 1 }</code></pre></span> This is also optional, but strongly recommended. By default, Elasticsearch will create an index for the model with 5 primary shards and 1 replica each, which actually creates 10 shards and is ludicrously over-provisioned for most apps. This line simply overrides the default and specifies 1 primary shard (a replica will also be created by default).
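To make the <span class="inline-code"><pre><code>Callbacks</code></pre></span> behavior concrete, here is a minimal sketch of what it buys you. The attribute names are borrowed from the demo app; the values are made up:

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby"># Creating a record also indexes it in Elasticsearch:
user = User.create(first_name: 'Ada', last_name: 'Lovelace', email: 'ada@example.com')

# Updating it re-indexes the corresponding Elasticsearch document:
user.update(company: 'Analytical Engines Ltd.')

# Destroying it removes the document from the index:
user.destroy</code></pre>
</div>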

You can optionally avoid (or increase) replicas by amending the settings like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
 include Elasticsearch::Model
 include Elasticsearch::Model::Callbacks
 settings index: {
   number_of_shards:   1, # 1 primary shard; probably fine for most users
   number_of_replicas: 0  # 0 replicas; fine for dev, not for production.
 }
end</code></pre>
</div>
</div>

If you have more questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

You will need to set up a route to handle searching. There are a few different strategies for this, but we generally recommend using a dedicated controller for it, unrelated to the model(s) being indexed and searched. This is a more flexible approach, and keeps concerns separated. You can always render object-specific partials if your results involve multiple models.
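For example, a dedicated controller can search several models at once by passing them to <span class="inline-code"><pre><code>Elasticsearch::Model.search</code></pre></span>. This is only a sketch; the <span class="inline-code"><pre><code>Post</code></pre></span> model is hypothetical and not part of the demo app:

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby">class SearchController < ApplicationController
  def run
    # Passing an array of models limits the search to their indices:
    @results = Elasticsearch::Model.search(params[:q], [User, Post]).records
  end
end</code></pre>
</div>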

Our Example

In our example Rails app, we have one model, <span class="inline-code"><pre><code>User</code></pre></span>, with a handful of attributes. It looks something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
 include Elasticsearch::Model
 include Elasticsearch::Model::Callbacks
 settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

To implement search, we created a file called <span class="inline-code"><pre><code>app/controllers/search_controller.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class SearchController < ApplicationController
 def run
   # This will search all models that have `include Elasticsearch::Model`
   @results = Elasticsearch::Model.search(params[:q]).records
 end
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">post '/search', to: 'search#run'</code></pre>
</div>
</div>

Next, we need to have some views to render the data we get back from Elasticsearch. The <span class="inline-code"><pre><code>run</code></pre></span> controller action renders a view, which we create in a file called <span class="inline-code"><pre><code>app/views/search/run.html.erb</code></pre></span> with the following:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-10" class="hljs language-javascript">  Search Results
 <% if @results.present? %>
   <%= render partial: 'search_result', collection: @results, as: :result %>
 <% else %>
   Nothing here, chief!
 <% end %></code></pre>
</div>
</div>

This way if there are no results to show, we simply put a banner indicating as such. If there are results to display, we will iterate over the collection (assigning each one to a local variable called <span class="inline-code"><pre><code>result</code></pre></span>) and pass it off to a partial. Create a file called <span class="inline-code"><pre><code>app/views/search/_search_result.html.erb</code></pre></span> and add:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
   <%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

   <%= result.company %>

   <%= result.company_description %></code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

At this point, the <span class="inline-code"><pre><code>User</code></pre></span> model is configured for searching in Elasticsearch, and has routes for sending a query to Elasticsearch. The next step is to render a form so that a user can actually use this feature. This is possible with a basic <span class="inline-code"><pre><code>form_with</code></pre></span> helper in a Rails view.

In this demo app, we added this to the navigation bar:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-12" class="hljs language-javascript"><%= form_with(url: "/search", method: "post", class: 'form-inline my-2 my-lg-0', local: true) do %>
   <%= text_field_tag(:q, nil, class: "form-control mr-sm-2", placeholder: "Search") %>
   <%= button_tag("Search", class: "btn btn-outline-info my-2 my-sm-0", name: nil) %>
<% end %></code></pre>
</div>
</div>

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use with your application. The ERB scaffold should be easily adapted to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Elasticsearch Where Your Cluster is Located

By default, the gems will try to connect to a cluster running on <span class="inline-code"><pre><code>localhost:9200</code></pre></span>. This is a problem because your Bonsai cluster is not running on localhost. We need to make sure Elasticsearch is pointed to the correct URL.

If you are using the bonsai-elasticsearch-rails gem, then all you need to do is ensure that there is an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> set in your application environment that points at your Bonsai cluster URL.

Heroku users will have this already, and can skip to the next step. Other users will need to make sure this environment variable is manually set in their application environment. If you have access to the host, you can run this command in your command line:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

  • You are not using the bonsai-elasticsearch-rails gem for some reason, OR
  • You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

If you need to do this, then you can create a file called <span class="inline-code"><pre><code>config/initializers/elasticsearch.rb</code></pre></span>. Inside this file, you will want to put something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-14" class="hljs language-javascript"># Assuming you can set the BONSAI_URL variable:
Elasticsearch::Model.client = Elasticsearch::Client.new url: ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-15" class="hljs language-javascript"># Use your personal URL, not this made-up one:
Elasticsearch::Model.client = Elasticsearch::Client.new url: "https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

If you’re wondering why we prefer to use an environment variable instead of the URL, it’s simply a best practice. The cluster URL is considered sensitive information, in that anyone with the fully-qualified URL is going to have full read/write access to the cluster.

So if you have it in an initializer and check it into source control, that creates an attack vector. Many people have been burned by committing sensitive URLs, keys, passwords, etc to git, and it’s best to avoid it.

Additionally, if you ever need to change your cluster URL, updating the initializer will require another pass through CI and a deployment, whereas an environment variable can simply be updated and Rails restarted. Environment variables are just the better way to go.

Step 7: Push Data into Elasticsearch

Assuming you have a database populated with data, the next step is to get that data into Elasticsearch.

Something to keep in mind here: Bonsai does not support lazy index creation, and even Elastic does not recommend using this feature. You’ll need to create the indices manually for each model that you want to search before you try to populate the cluster with data.

If you have the <span class="inline-code"><pre><code>elasticsearch-rails</code></pre></span> gem, you can use one of the built-in Rake tasks. To do that, create a file called <span class="inline-code"><pre><code>lib/tasks/elasticsearch.rake</code></pre></span> and simply put this inside:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-16" class="hljs language-javascript">require 'elasticsearch/rails/tasks/import'</code></pre>
</div>
</div>

Now you will be able to use a Rake task that will automatically create (or recreate) the index:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-17" class="hljs language-javascript">bundle exec rake environment elasticsearch:import:model CLASS='User'</code></pre>
</div>
</div>

You’ll need to run that Rake task for each model you have designated as destined for Elasticsearch. Note that if you’re relying on the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable to configure the Rails client, this variable will need to be present in the environment running the Rake task. Otherwise it will populate your local Elasticsearch instance (or raise some exceptions).

If you don’t use the Rake task, then you can also create and populate the index from within a Rails console with:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-18" class="hljs language-javascript"># Will return nil if the index already exists:
User.__elasticsearch__.create_index!

# Will delete the index if it already exists, then recreate:
User.__elasticsearch__.create_index! force: true

# Import the data into the Elasticsearch index:
User.import</code></pre>
</div>
</div>

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Elasticsearch. In our demo app, we have this simple list of users:

This search box is rendered by a form that will pass the query to the <span class="inline-code"><pre><code>SearchController#run</code></pre></span> action, via the route set up in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>:

This query will reach the <span class="inline-code"><pre><code>SearchController#run</code></pre></span> action, where it will be passed to Elasticsearch. Elasticsearch will search all of the indices it has (just <span class="inline-code"><pre><code>users</code></pre></span> at this point), and return any hits to an instance variable called <span class="inline-code"><pre><code>@results</code></pre></span>.

The SearchController will then render the appropriate views. Each result will be rendered by the partial <span class="inline-code"><pre><code>app/views/search/_search_result.html.erb</code></pre></span>. It looks something like this:

Congratulations! You have implemented Elasticsearch in Rails!

Final Thoughts

This documentation demonstrated how to quickly get Elasticsearch added to a basic Rails application. We added the gems, configured the model, set up the search route, and created the views and partials needed to render the results. Then we set up the connection to Elasticsearch and pushed the data into the cluster. Finally, we were able to search that data through our app.

Hopefully this was enough to get you up and running with Elasticsearch on Rails. This documentation is not exhaustive; there are a number of other use cases not discussed here, as well as additional changes and customizations that can be implemented to make search more accurate and resilient.

You can find information on these additional subjects below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

Ruby on Rails

Here’s how to get started with Bonsai Elasticsearch and Ruby on Rails using Chewy.

Warning

Chewy is built on top of the official Elasticsearch Ruby client, which is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:

<table>
<thead>
<tr><th>Engine Version</th><th>Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14 (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>


If you are receiving an <span class="inline-code"><pre><code>Elasticsearch::UnsupportedProductError</code></pre></span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.

Note

As of January 2021, Chewy supports up to Elasticsearch 5.x. Users wanting to use Chewy will need to ensure they are not running anything later than 5.x. Support for later versions is planned.

Add Chewy to the Gemfile

In order to use the Chewy gem, add the gem to the Gemfile like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">gem 'chewy'
</code></pre>
</div>
</div>

Then run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install it.

Write an Initializer

A Chewy initializer will need to be written so that the Rails app can connect to your Bonsai cluster.

You can include your Bonsai cluster URL (found under the Credentials tab of the Bonsai dashboard) in the initializer, but hard-coding your credentials is not recommended. Instead, export the URL in your application environment as an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>.

We recommend using something like dotenv for this, but you can also set it manually like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"># Substitute with your own Bonsai cluster URL:
export BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>
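If you go the dotenv route mentioned above, a minimal sketch looks like this; the gem groups and <span class="inline-code"><pre><code>.env</code></pre></span> layout follow the usual dotenv conventions, not anything Bonsai-specific:

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby"># Gemfile
gem 'dotenv-rails', groups: [:development, :test]

# .env (keep this file out of source control):
#   BONSAI_URL=https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443</code></pre>
</div>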

Heroku users will not need to do so, as their Bonsai cluster URL will already be in a Config Var of the same name.

Create a file called <span class="inline-code"><pre><code>config/initializers/chewy.rb</code></pre></span>. With the environment variable in place, save the initializer with the following:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">Chewy.settings = {
  host: ENV['BONSAI_URL']
}</code></pre>
</div>
</div>

The official Chewy documentation has more details on ways to modify and refine Chewy’s behavior.
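Beyond the host, a couple of other settings are often useful. A sketch (the <span class="inline-code"><pre><code>prefix</code></pre></span> option namespaces index names per environment; double-check the exact keys against the Chewy documentation for your version):

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby">Chewy.settings = {
  host: ENV['BONSAI_URL'],
  prefix: Rails.env   # index names become e.g. "production_users"
}</code></pre>
</div>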

Configure Your Models

Any Rails model that you will want to be searchable with Elasticsearch will need to be configured to do so by creating a Chewy index definition and adding model-observing code.

For example, here’s how we created an index definition for our demo app in <span class="inline-code"><pre><code>app/chewy/users_index.rb</code></pre></span>:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">class UsersIndex < Chewy::Index
  define_type User
end</code></pre>
</div>
</div>

Then, the User model is written like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">class User < ApplicationRecord
  update_index('users#user') { self }
end</code></pre>
</div>
</div>

You can read more about configuring how a model will be ingested by Elasticsearch in the official Chewy documentation.
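As a brief illustration, the index definition is also where you control which fields are pushed to Elasticsearch. A sketch using the demo app’s attribute names (the DSL shown matches the pre-6 Chewy versions this guide targets):

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby">class UsersIndex < Chewy::Index
  define_type User do
    # Only these attributes are sent to Elasticsearch:
    field :first_name, :last_name, :email
    field :company, type: 'text'
  end
end</code></pre>
</div>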

Indexing Your Documents

The Chewy gem comes with some Rake tasks for getting your documents into the cluster. Once you have set up your models as desired, run the following in your application environment:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">bundle exec rake chewy:deploy
</code></pre>
</div>
</div>

That will push your data into Elasticsearch and synchronize the data.
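Once the data is in, a quick sanity check from a Rails console might look like the following; the query body is illustrative and the field names come from the demo app:

<div class="code-snippet w-richtext">
<pre><code class="hljs language-ruby"># Returns index wrappers; chain .load to get the ActiveRecord objects back:
UsersIndex.query(match: { first_name: 'Ada' }).load.to_a</code></pre>
</div>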

Next Steps

Chewy is a mature ODM with plenty of great features. Toptal (the maintainers of Chewy) have some great documentation exploring some of what Chewy can do: Elasticsearch for Ruby on Rails: A Tutorial to the Chewy Gem.

Chewy

Getting started with Ruby on Rails and Bonsai Elasticsearch is fast and easy with Searchkick. In this guide, we will start with a very basic Ruby on Rails application and add the bare minimum amount of code needed to support basic search with Elasticsearch. Users looking for more details and advanced usage should consult the resources at the end of this page.

Throughout this guide, you will see some code examples. These code examples are drawn from a very simple Ruby on Rails application, and are designed to offer some real-world, working code that new users will find useful. The complete demo app can be found in this GitHub repo.

<div class="callout-warning">
<h3>Warning</h3>
<p>SearchKick uses the official Elasticsearch Ruby client, which is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:</p>
<table>
<thead>
<tr><th>Engine Version</th><th>Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14+ (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>
<p>If you are receiving an <span class="inline-code warning">Elasticsearch::UnsupportedProductError</span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.</p>
</div>

<div class="callout-note">
<h3>Note</h3>
<p>In this example, we are going to connect to Elasticsearch using the Searchkick gem. There are also the official Elasticsearch gems for Rails, which are covered in another set of documentation.</p>
</div>

Step 1: Spin up a Bonsai Cluster

Make sure that there is a Bonsai Elasticsearch cluster ready for your app to interact with. This needs to be set up first so you know which version of the gems you need to install; Bonsai supports a large number of Elasticsearch versions, and the gems need to correspond to the version of Elasticsearch you’re running.

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

  • If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
  • Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Overview, in the Credentials tab:

Heroku users will also have a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable created when Bonsai is added to the application. This variable will contain the fully-qualified URL to the cluster.

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in the Bonsai UI, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

You can also use the Interactive Console. In the Cluster Dashboard, click on the Console tab. It will load a default view, which includes the version of Elasticsearch, shown as “number” in the JSON response:

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">curl https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443
{
 "name" : "ip-172-31-14-16",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "jVJrINr5R5GVVXHGcRhMdA",
 "version" : {
   "number" : "7.2.0",
   "build_flavor" : "oss",
   "build_type" : "tar",
   "build_hash" : "508c38a",
   "build_date" : "2019-06-20T15:54:18.811730Z",
   "build_snapshot" : false,
   "lucene_version" : "8.0.0",
   "minimum_wire_compatibility_version" : "6.8.0",
   "minimum_index_compatibility_version" : "6.0.0-beta1"
 },
 "tagline" : "You Know, for Search"
}</code></pre>
</div>
</div>

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Install the Gem

To install Searchkick, you will need the searchkick gem. Add the following to your Gemfile outside of any blocks:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">gem 'searchkick'
</code></pre>
</div>
</div>

This will install the gem for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

<table>
<thead>
<tr><th>Elasticsearch Version</th><th>Searchkick Version</th></tr>
</thead>
<tbody>
<tr><td>1.x</td><td>-> 1.5.1</td></tr>
<tr><td>2.x</td><td>-> 2.5.0</td></tr>
<tr><td>5.x</td><td>-> 3.1.3 (additional notes)</td></tr>
<tr><td>6.x and up</td><td>-> 4.0 and up</td></tr>
</tbody>
</table>

If you need a specific version of Searchkick to accommodate your Elasticsearch cluster, you can specify it in your Gemfile like so:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">gem 'searchkick', '3.1.3'  # For Elasticsearch 5.x
</code></pre>
</div>
</div>

Once the gem has been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span>.

Step 4: Add Searchkick to Your Models

Any model that you will want to be searchable with Elasticsearch will need to be configured to do so by adding the <span class="inline-code"><pre><code>searchkick</code></pre></span> keyword to it.

For example, our demo app has a User model that looks something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">class User < ApplicationRecord
 searchkick
end</code></pre>
</div>
</div>

Adding the searchkick keyword makes our User model searchable with Searchkick.

Searchkick provides a number of reasonable settings out of the box, but you can also pass in a hash of settings if you want to override the defaults. The hash keys generally correspond to the official Create Indices API. For example, this will allow you to create an index with 0 replicas:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">class User < ApplicationRecord
 searchkick settings: { number_of_replicas: 0 }
end</code></pre>
</div>
</div>

If you have questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

You will need to set up a route to handle searching. The easiest way to do this with Searchkick is to have a search route per model. This involves updating your models’ corresponding controller, and defining routes in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>. You’ll also need to have some views that handle rendering the results, and a form that posts data to the controller(s). Take a look at how we implemented this in our demo app for some examples of how this is done:

Our Example

In our example Rails app, we have one model, <span class="inline-code"><pre><code>User</code></pre></span>, with <span class="inline-code"><pre><code>searchkick</code></pre></span>. It looks something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">class User < ApplicationRecord
 searchkick
end</code></pre>
</div>
</div>

To implement search, we updated the file <span class="inline-code"><pre><code>app/controllers/users_controller.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class UsersController < ApplicationController
 #... a bunch of controller actions, removed for brevity
 def search
   @results = User.search(params[:q])
 end
 #... more controller actions removed for brevity
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">Rails.application.routes.draw do
 resources :users do
   collection do
     post :search  # creates the search_users route (and the search_users_path helper)
   end
 end
end</code></pre>
</div>
</div>

Next, we need to have some views to render the data we get back from Elasticsearch. The <span class="inline-code"><pre><code>search</code></pre></span> controller action will be rendered by creating a file called <span class="inline-code"><pre><code>app/views/users/search.html.erb</code></pre></span> and adding:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-10" class="hljs language-javascript">Search Results

 <% if @results.present? %>
   <%= render partial: 'search_result', collection: @results, as: :result %>
 <% else %>
   Nothing here, chief!
 <% end %>

</code></pre>
</div>
</div>

This way if there are no results to show, we simply put a banner indicating as such. If there are results to display, we will iterate over the collection (assigning each one to a local variable called <span class="inline-code"><pre><code>result</code></pre></span>) and pass it off to a partial. We also created a file for a partial called <span class="inline-code"><pre><code>app/views/users/_search_result.html.erb</code></pre></span> and added:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
   <%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

   <%= result.company %>

   <%= result.company_description %>

   </code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

At this point, the <span class="inline-code"><pre><code>User</code></pre></span> model is configured for searching in Elasticsearch, and has routes for sending a query to Elasticsearch. The next step is to render a form so that a user can actually use this feature. This is possible with a basic <span class="inline-code"><pre><code>form_with</code></pre></span> helper.

In this demo app, we added this to the navigation bar:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-12" class="hljs language-javascript"><%= form_with(url: "/users/search", method: "post", class: 'form-inline my-2 my-lg-0', local: true) do %>
   <%= text_field_tag(:q, nil, class: "form-control mr-sm-2", placeholder: "Search") %>
   <%= button_tag("Search", class: "btn btn-outline-info my-2 my-sm-0", name: nil) %>
<% end %></code></pre>
</div>
</div>

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use in your application. The ERB scaffold should be easy to adapt to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Searchkick Where Your Cluster is Located

Searchkick looks for an environment variable called <span class="inline-code"><pre><code>ELASTICSEARCH_URL</code></pre></span>, and if it doesn’t find it, it uses <span class="inline-code"><pre><code>localhost:9200</code></pre></span>. This is a problem because your Bonsai cluster is not running on localhost. We need to make sure Searchkick is pointed at the correct URL.

Bonsai does offer a gem, bonsai-searchkick, which populates the necessary environment variable automatically. If you're using this gem, then all you need to do is ensure that there is an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> set in your application environment that points at your Bonsai cluster URL.

Heroku users will have this already, and can skip to the next step. Other users will need to make sure this environment variable is manually set in their application environment. If you have access to the host, you can run this command in your command line:

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

  • You are not using the bonsai-searchkick gem for some reason, OR
  • You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

If you need to do this, then you can create a file called <span class="inline-code"><pre><code>config/initializers/elasticsearch.rb</code></pre></span>. Inside this file, you will want to put something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-14" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-14" class="hljs language-javascript"># Assuming you can set the BONSAI_URL variable:
ENV["ELASTICSEARCH_URL"] = ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-15" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-15" class="hljs language-javascript"># Use your personal URL, not this made-up one:
ENV["ELASTICSEARCH_URL"] = "https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

If you’re wondering why we prefer to use an environment variable instead of hard-coding the URL, it’s simply a best practice. The cluster URL is considered sensitive information, in that anyone with the fully-qualified URL has full read/write access to the cluster.

So if you have it in an initializer and check it into source control, that creates an attack vector. Many people have been burned by committing sensitive URLs, keys, passwords, etc to git, and it’s best to avoid it.

Additionally, if you ever need to change your cluster URL, updating an initializer requires another pass through CI and a deployment, whereas you could otherwise just change the environment variable and restart Rails. Environment variables are simply the better way to go.

Step 7: Push Data into Elasticsearch

Now that the app has everything it needs to query the cluster and render the results, we need to push data into the cluster. There are a few ways to do this.

One method is to open up a Rails console and run <span class="inline-code"><pre><code>.reindex</code></pre></span> on the model. So if you want to reindex a model called <span class="inline-code"><pre><code>User</code></pre></span>, you would run <span class="inline-code"><pre><code>User.reindex</code></pre></span>.

Another method is to use Rake tasks from the command line. If you wanted to reindex that same <span class="inline-code"><pre><code>User</code></pre></span> model, you could run: <span class="inline-code"><pre><code>bundle exec rake searchkick:reindex CLASS=User</code></pre></span>. Alternatively, if you have multiple Searchkick-enabled models, you could run <span class="inline-code"><pre><code>rake searchkick:reindex:all</code></pre></span>.

Regardless of how you do it, Searchkick will create an index named after the ActiveRecord table of the model, the environment, and a timestamp. So reindexing the <span class="inline-code"><pre><code>User</code></pre></span> model in a development environment might result in an index called <span class="inline-code"><pre><code>users_development_20191029111649033</code></pre></span>. This allows Searchkick to provide zero-downtime updates to settings and mappings.
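
For reference, a minimal Rails console session might look something like this (a sketch, assuming the Searchkick-enabled <span class="inline-code"><pre><code>User</code></pre></span> model from this guide):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"># In a Rails console (bin/rails console):
User.reindex                      # builds a new timestamped index and points the alias at it
User.search("jane").map(&:email)  # quick sanity check that documents are searchable</code>
</pre></div>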

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Searchkick. In our demo app, we have this simple list of users:

This search box is rendered by a form that will pass the query to the <span class="inline-code"><pre><code>UsersController#search</code></pre></span> action, via the route set up in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>:

This query will reach the <span class="inline-code"><pre><code>UsersController#search</code></pre></span> action, where it will be passed to Searchkick, which queries Elasticsearch. Elasticsearch will search the <span class="inline-code"><pre><code>users_development_20191029111649033</code></pre></span> index, and return any hits to an instance variable called <span class="inline-code"><pre><code>@results</code></pre></span>.

The UsersController will then ensure the appropriate views are rendered. Each result will be rendered by the partial <span class="inline-code"><pre><code>app/views/users/_search_result.html.erb</code></pre></span>. It looks something like this:

Congratulations! You have implemented Searchkick in Rails!

Final Thoughts

This documentation demonstrated how to quickly get Elasticsearch added to a basic Rails application. We installed the Searchkick gem, added it to a model, set up the search route, and created the views and partials needed to render the results. Then we set up the connection to Elasticsearch and pushed the data into the cluster. Finally, we were able to search that data through our app.

Hopefully this was enough to get you up and running with Searchkick. This documentation is not exhaustive, and there are a lot of really cool features that Searchkick offers. There are other additional changes and customizations that can be implemented to make search more accurate and resilient.

You can find information on additional subjects in the section below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

Searchkick

Jekyll is a static site generator written in Ruby. Jekyll supports a plugin model that Searchyll uses to read your site’s content and then index it into an Elasticsearch cluster.

In this guide, we are going to use this feature to tell Jekyll to index all of the content into a configured instance of Elasticsearch.

First Steps

In order to make use of this documentation, you will need Jekyll installed and configured on your system.

  1. Make sure you have Jekyll installed. This guide assumes you already have Jekyll installed and configured on your system. Visit the Jekyll Documentation to get started.
  2. Spin up a Bonsai Elasticsearch Cluster. This guide will use a Bonsai cluster as the Elasticsearch backend. Guides are available for setting up a cluster through bonsai.io or Heroku.

Configure Jekyll to output to Bonsai Elasticsearch

Jekyll’s configuration lives in a file called `_config.yml` by default. Add the following snippet to the file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">elasticsearch:
 url: "https://:@my-awesome-cluster-1234.us-east-1.bonsai.io"
 index_name: jekyll-posts

plugins:
 - searchyll</code></pre>
</div>
</div>

We also need to add the `searchyll` gem to the Gemfile and install it:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"># Add the Searchyll gem to your Gemfile:
#   gem "searchyll"
# Then install it from the command line:
bundle install</code></pre>
</div>
</div>

Push the Data Into Elasticsearch

To get the site's data into Elasticsearch, you can load it by simply running the Jekyll build command:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">jekyll build
</code></pre>
</div>
</div>

You should now be able to see your data in the Elasticsearch cluster:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ curl -XGET "https://:@my-awesome-cluster-1234.us-east-1.bonsai.io/_search"  

{"took":1,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"hugo","_type":"doc","_id":...</code></pre>
</div>
</div>

Jekyll

At Bonsai, security and privacy are always a top concern. We are constantly evaluating the service and our vendors for vulnerabilities and flaws, and we will immediately address anything that could put our customers at risk. This is how we keep your data secure:

  • Access Controls. All Bonsai clusters are provisioned with a unique, randomized URL and have HTTP Basic Authentication enabled by default, using a randomly generated set of credentials. Under this scheme, it would take the world’s fastest supercomputer around 23.5 quadrillion years to guess the credentials.
  • Encrypted communications. All Bonsai clusters support SSL/TLS for encryption in transit. We use industry standard strength encryption to ensure your data is safe over the wire.
  • Encrypted at rest. Bonsai clusters are provisioned on hardware that is encrypted at rest by default. In addition to Amazon’s physical security controls, this means your data is safe from physical theft.
  • Regular Snapshots. All paid Bonsai clusters receive regular snapshots, which are stored in an offsite, encrypted S3 bucket in the same region as the cluster.
  • Firewalled. All Bonsai clusters are accessed via a custom-built, high-performance layer 7 routing proxy, and sit behind a tightly controlled firewall. This helps to ensure that the cluster and data are protected from port scans and unauthorized persons.
  • Advanced Networking. Bonsai can support IP whitelisting, and VPC Peering to users on single tenant clusters.

How Does Bonsai Secure Data?
Note: The following article applies to Business and Enterprise tiers with dedicated single tenant clusters. These watermark alerts do not apply to Sandbox, Staging, or Standard tiers on shared multitenant clusters; instead, these tiers will be notified once an overage is detected on at least one metered metric.

Elasticsearch protects itself from data loss with the disk-based shard allocator. This mechanism attempts to strike a balance between minimizing disk usage across all nodes while also minimizing the number of shard reallocation processes. This ensures that all nodes have as much disk headroom as possible, with minimal impact to cluster performance.

Elasticsearch is constantly checking on each node’s disk availability to make decisions about where to allocate and move shards. There are four important "stages" that help dictate this decision-making process:

  1. Alert watermark. When a node in a single tenant cluster reaches 70% disk or higher, Bonsai sends out an alert to the user.
  2. Low watermark. When a node reaches its low watermark stage (Bonsai defaults to 70% disk used), the cluster will no longer allocate new shards to this node.
  3. High watermark. When a node reaches its high watermark stage (Bonsai defaults to 75% disk used), the cluster will actively try to move shards off the node.
  4. Flood stage. When a node reaches its flood stage watermark (Bonsai uses 95% disk used), the cluster will put all open indices into a state that only allows for reads and deletes.

The low and high watermark stages are when Business and Enterprise tiers receive an emailed notification from the Bonsai Support team with suggestions for scaling disk capacity. Clusters that are over the 75% threshold can start to experience performance issues that may include but are not limited to: increased bulk queue times, higher CPU and/or load usage, or slower than normal searches.
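
If you want to keep an eye on how close your nodes are to these watermarks yourself, the <span class="inline-code"><pre><code>_cat/allocation</code></pre></span> API reports per-node disk usage. Here is a minimal Ruby sketch, assuming your cluster URL (with credentials) is in a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable and that the cat APIs are available on your plan:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">require "net/http"
require "uri"

# Print per-node disk usage so it can be compared against the watermark stages above.
uri  = URI(ENV.fetch("BONSAI_URL"))
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == "https")

request = Net::HTTP::Get.new("/_cat/allocation?v&h=node,disk.percent,disk.used,disk.total")
request.basic_auth(uri.user, uri.password) if uri.user

puts http.request(request).body</code>
</pre></div>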

How Bonsai Manages Watermarks

Bonsai offers two main architecture classes: multitenant and single tenant. The multitenant class – sometimes called “shared” – is designed to allow clusters to share hardware resources while still being securely sandboxed from one another. This allows us to provide unparalleled performance per dollar at smaller scales. All Hobby and Standard plans use multitenant architecture.

The single tenant class – sometimes called “dedicated” – maps one cluster to a private set of hardware resources. Because these resources are not shared with any other cluster, single tenant configurations provide maximum performance, security and customization. All Business and Enterprise plans use dedicated architecture.

Multi Tenant Class

Bonsai’s multitenant class utilizes some sandboxing features built into Elasticsearch. This allows the service to run multiple clusters on a single instance of Elasticsearch per node. This approach saves substantial hardware and network resources, and allows for radical cost savings especially for students, hobbyists, startups, projects in development, small businesses, and so on.

A great benefit of this approach is that Bonsai is able to provide some really nice features out of the box, for no additional cost: all multitenant clusters – even the free ones – are running on 3 nodes. They also get industry standard SSL/TLS and HTTP Basic authentication (see Security for more information), which keeps your data safe. Plus, the Bonsai dashboard offers plenty of tools for monitoring, managing, and engaging with your cluster.

Because these clusters are running on a shared Elasticsearch instance, there are also a few limitations. For one, certain API endpoints and actions are unavailable for security and performance reasons. Snapshots and plugins are not manageable by users, to avoid collisions and regressions. And usage is metered, roughly as you’d expect for a free/low-cost SaaS.

Clusters on the multitenant class can often be identified by their plan name. “Hobby”, “Staging”, “Production”, and “Shared” are all terms used by plans running on a multitenant architecture.

Single Tenant Class

Bonsai’s single tenant class has a fairly standard configuration. The cluster is simply one or more nodes (three by default), each running Elasticsearch. These nodes are physically isolated and on a different network than those running multitenant clusters. This means that all available IO on the nodes is always 100% allocated to your cluster.

This approach offers all of the same benefits as the multitenant class: you get the same industry standard SSL/TLS and HTTP Basic authentication (see Security for details), and the Bonsai dashboard. In addition, the isolated environment is suitable for encryption at rest and VPC Peering (for applications with stringent security requirements).

Furthermore, this class offers extremely flexible deployments and scaling. Need it in a region we don’t support on the multitenant class? No problem! Have a plugin or script that is vital to your operation? We can package it into our deployment! Let us know what you need, and we can provide a quote.

Also, because this class is highly secure and customizable, we are able to support amendments to our terms of service and privacy policy, custom SLAs, and more. Contact us for questions and quotes.

Trade Offs

This article details the differences between the two architectures we offer. There are a number of trade offs between these classes, summarized below:

<table>
<thead>
<tr><th>Class</th><th>Pros</th><th>Cons</th></tr>
</thead>
<tbody>
<tr><td>Multi-tenant</td><td>- Extremely cost effective
- Great performance for the money
- Can scale up or down on demand</td><td>- Limits on usage (disk, memory, connections, etc)
- Can not install and run arbitrary plugins
- Noisy neighbors*
- VPC pairing and at-rest encryption not available
- Subject to general terms of service and privacy policy</td></tr>
<tr><td>Single Tenant</td><td>- Extremely powerful
- No metering on usage
- Can deploy arbitrary plugins
- No noisy neighbors*
- Can have at-rest encryption
- VPC Pairing
- Custom terms, SLAs, etc</td><td>- More expensive to operate
- Scaling can be more difficult</td></tr>
</tbody>
</table>

* The Noisy Neighbor Problem is a well-known issue that occurs in multitenant architectures. In this context, one or more clusters may inadvertently monopolize shared system resources (CPU, network IO, memory, etc), which can adversely affect other users on the same nodes. Bonsai actively monitors and addresses these situations when they come up, although the issue is frequently transient and resolves itself within a few minutes. Single tenant architectures do not suffer from this issue.

Why Not Containers / VMs?

A frequent question that comes up when talking about our various service architectures is why we don’t use container or virtualization technologies. A service that incorporates these technologies would offer some nice benefits, like allowing users to install their own plugins and manage their own snapshots.

The simple answer is “overhead.” Running containers – and especially VMs – requires system resources via some orchestration daemon or hypervisor. And simply running multiple instances of Elasticsearch on a node would require multiple JREs, which wastes resources through duplication.

Any resources that are allocated towards management of environments are therefore unavailable for Elasticsearch to use. In comparison to an architecture that doesn’t have this overhead, the provider must either offer less performance for the money, or charge more money for the same performance.

There is also a practical aspect to Bonsai eschewing containers and virtual machines. It is impossible to provide both great support and absolute customization. For example, users who install their own plugins can introduce a variety of regressions into Elasticsearch’s performance and behavior. When they open a support ticket, the agent must either spend time bug squashing or decline assistance altogether. Being opinionated allows our team to focus on depth of knowledge rather than breadth, which leads to faster, higher quality resolutions.

Finally, there is a philosophical motivation for how we built the service. We want to make Elasticsearch accessible to people at all stages of development; from the hobbyists and students, all the way up to the billion dollar unicorns. And we want to make sure that it’s the best possible experience. This means being opinionated about certain features, and taking a more active role in managing the infrastructure.

Bonsai Architecture

Bonsai supports a large number of Elasticsearch versions in regions all over the world. Multitenant class clusters usually have more limited options in terms of available regions and versions, while single tenant class clusters have far more options.

If you need a version or geographic region that is not listed here, reach out to our support and let us know what you need. We can get you a quote and timeline for getting up and running.

Multitenant Class

Bonsai operates a fleet of shared resource nodes in Oregon, Virginia, Ireland, Frankfurt, and Sydney. The Elasticsearch versions available on these nodes do not change often.

Sandbox clusters are limited to the most recent version of Elasticsearch and may be subject to automatic upgrades when a new version is released.

For all other multitenant plans, when Bonsai adds support for a new version, we will create a new server group rather than upgrading an existing group. The exception to this is a potential patch upgrade in response to some critical vulnerability. In other words, users will not be upgraded in place unless there is a vulnerability to address or if they are on a Sandbox plan. Read How Bonsai Handles Elasticsearch Releases for more information.

This table shows which versions of Elasticsearch are available for multitenant plans. Last updated: 2023-08-10.

<table>
<thead>
<tr><th>Plan</th><th>Elasticsearch Versions </th><th>OpenSearch Versions</th></tr>
</thead>
<tbody>
<tr><td>Sandbox</td><td>7.10.2</td><td>2.6.0</td></tr>
<tr><td>All other multitenant </td><td>5.6.16 / 6.8.21 / 7.10.2 </td><td>2.6.0</td></tr>
</tbody></table>

Multitenant Version Support by Region

Multitenant plans are supported in a handful of popular regions, although the versions available to free plans are limited. Additional regions are available to Business and Enterprise subscriptions.

<table>
<thead>
<tr><th>Cloud</th><th>Region</th><th>Location</th><th>Elasticsearch 5.6.16</th><th>Elasticsearch 6.8.21</th><th>Elasticsearch 7.10.2</th><th>OpenSearch 2.6.0</th></tr>
</thead>
<tbody>
<tr><td>AWS</td><td>us-east-1</td><td>Virginia, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>AWS</td><td>us-west-2</td><td>Oregon, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>AWS</td><td>eu-west-1</td><td>Ireland, EU</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>GCP</td><td>us-east1</td><td>Virginia, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
</tbody>
</table>

Single Tenant Class

Single tenant clusters can be deployed in Oregon, Virginia, Ireland, Frankfurt, Sydney, and Tokyo. We support a variety of Elasticsearch versions for these kinds of clusters and will default to the most recent minor version unless something else is specified.

This table shows which versions of Elasticsearch are available for single tenant plans. Last updated: 2023-08-10.

<table>
<thead>
<tr><th>Plan</th><th>Elasticsearch Versions </th><th>OpenSearch Versions</th></tr>
</thead>
<tbody>
<tr><td>All single tenant </td><td>2.4.0 to 7.10.2</td><td>1.2.4 to 2.6.0</td></tr>
</tbody></table>

Enterprise Deployments

Bonsai can deploy and manage whichever version of Elasticsearch or OpenSearch your use case needs. Please reach out to us at support@bonsai.io.

Older Search Engine Version Pricing

Clusters running major search engine versions behind Bonsai’s current primary supported versions (OpenSearch 2.x and Elasticsearch 7.x) will be charged an operational and maintenance fee. This ensures that we can continue to support these older versions and that your organization has the time it needs to decide when to upgrade to a new major version.

The fee will be assessed based on the following table:

<table>
<thead>
<tr><th>Search Engine Version</th><th>Operational Fee</th></tr>
</thead>
<tbody>
<tr><td>OpenSearch 1.x or Elasticsearch 6.x</td><td>20%</td></tr>
<tr><td>Elasticsearch 5.x</td><td>15% additional</td></tr>
<tr><td>Elasticsearch 2.x</td><td>10% additional</td></tr>
<tr><td>Elasticsearch 1.x</td><td>5% additional</td></tr>
</tbody></table>
Which Versions Bonsai Supports

Bonsai clusters support most Elasticsearch APIs out of the box, with a few exceptions. This article details those exceptions, along with a brief explanation of why they’re in place. Here is what we will cover:

  • _all and wildcard destructive actions
  • Tasks API
  • Node Hot Threads
  • Node Shutdown & Restart
  • Snapshots
  • Reindex
  • Cluster Shard Reroute
  • Cluster Settings
  • Index Search Warmers
  • Static Scripts
  • Update By Query API

While many of the following endpoints can be helpful for power users, the majority of applications don’t directly need them. If, however, you find yourself stuck without one of these available, please email us and we’ll be happy to help.

_all and wildcard destructive actions

Wildcard delete actions are usually used on clusters with a large number of indices, and can be useful for completely wiping out a cluster and starting over. Wildcard and _all destructive actions were initially available on shared tier clusters. However, we received an increasing number of support requests from distressed developers who had accidentally deleted their entire production clusters.

After some internal discussion, we decided to disable wildcard actions. Removing the ability to sweepingly delete everything forces users to slow down and identify exactly what they’re deleting, reducing the risk of accidental and permanent data loss.

Examples of _all and wildcard destruction:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">DELETE /*
DELETE /_all</code>
</pre></div>
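
Deleting indices by their explicit names remains fully supported, so the same cleanup can be done by listing exactly what you want to remove (hypothetical index names below):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">DELETE /my_old_index
DELETE /logs_2019,logs_2020</code>
</pre></div>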

Tasks API

Elasticsearch provides an API for viewing information about current running tasks and the ability to cancel them on versions 5.x and up. This API is disabled for multi-tenant plans. It can be enabled by request on Business and Enterprise plans through an email to support.

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">GET /_tasks
GET /_tasks/{task_id}
POST /_tasks/{task_id}/_cancel</code></pre>
</div>
</div>

Node hot threads

A Java thread that uses lots of CPU and runs for an unusually long period of time is known as a hot thread. Elasticsearch provides an API to get the current hot threads on each node in the cluster. This information can be useful in forming a holistic picture of potential problems within the cluster.

Bonsai doesn’t support these endpoints on our shared tier to ensure user activity isn’t exposed to others. For a detailed explanation of why this is a concern, please see the article on Architecture Classes. Additionally, Bonsai is a managed service, so it’s really the responsibility of our Ops Team to investigate node-level issues.

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">GET /_cluster/nodes/hotthreads
GET /_cluster/nodes/hot_threads
GET /_cluster/nodes/{node_id}/hotthreads
GET /_cluster/nodes/{node_id}/hot_threads
GET /_nodes/hotthreads
GET /_nodes/hot_threads
GET /_nodes/{node_id}/hotthreads
GET /_nodes/{node_id}/hot_threads</code></pre>
</div>
</div>

Node Shutdown & Restart

Elasticsearch provides an API for shutting down and restarting nodes. This functionality is unsupported across the platform. On our multitenant architecture, this prevents a user from shutting down a node or set of nodes that may be in use by another user; that action would have an adverse effect on other users, which is why it is unsupported. It also prevents users from exacerbating whatever problem they’re trying to resolve.

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">POST /_cluster/nodes/_restart
POST /_cluster/nodes/_shutdown
POST /_cluster/nodes/{node_id}/_restart
POST /_cluster/nodes/{node_id}/_shutdown
POST /_shutdown</code></pre>
</div>
</div>

Snapshots

The Snapshot API allows users to create and restore snapshots of indices and cluster data. It’s useful as a backup tool and for recovering from problems or data loss. On Bonsai, we’re already taking regular snapshots and monitoring cluster states for problems. This API is blocked to avoid problems associated with multiple users trying to snapshot/restore at once.

If you feel that you need a snapshot taken/restored, please reach out to our support team.

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">GET    /_snapshot/_status
DELETE /_snapshot/{repository}
POST   /_snapshot/{repository}
PUT    /_snapshot/{repository}
GET    /_snapshot/{repository}/_status
DELETE /_snapshot/{repository}/{snapshot}
GET    /_snapshot/{repository}/{snapshot}
POST   /_snapshot/{repository}/{snapshot}
PUT    /_snapshot/{repository}/{snapshot}
POST   /_snapshot/{repository}/{snapshot}/_create
PUT    /_snapshot/{repository}/{snapshot}/_create
POST   /_snapshot/{repository}/{snapshot}/_restore
GET    /_snapshot/{repository}/{snapshot}/_status</code></pre>
</div>
</div>

Reindex

The Reindex API copies documents from one index to another. For example, it will copy documents from an index called <span class="inline-code"><pre><code>books</code></pre></span> into another index, like <span class="inline-code"><pre><code>new_books</code></pre></span>. It does this by using what is basically a scan and scroll search, reading the contents of one index into another. The API call looks something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_reindex</code>
</pre></div>
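
For reference, a minimal reindex request body using the example index names above looks like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_reindex
{
  "source": { "index": "books" },
  "dest":   { "index": "new_books" }
}</code>
</pre></div>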

At this time, the Reindex API is supported on Bonsai for clusters running Elasticsearch 5.6 and up. Because it can be extremely demanding on IO, which can impact performance and availability for other users on multitenant architectures, clusters on earlier versions can only use it on Business or Enterprise subscriptions.

One workaround is to set up an indexing script that uses the scan and scroll search to read documents from your old index and push them into the new index. For example, read the contents of <span class="inline-code"><pre><code>GET /my_index/_search?search_type=scan&scroll=1m</code></pre></span>, then <span class="inline-code"><pre><code>POST</code></pre></span> those retrieved docs into a new index. That would provide basically the same functionality.
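
Here is a rough Ruby sketch of that scroll-and-bulk approach. It assumes Elasticsearch 7.x request syntax, a cluster URL (with credentials) in a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable, and hypothetical index names; treat it as a starting point rather than a drop-in script:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">require "net/http"
require "uri"
require "json"

uri  = URI(ENV.fetch("BONSAI_URL"))
http = Net::HTTP.new(uri.host, uri.port)
http.use_ssl = (uri.scheme == "https")

# Helper for authenticated requests; parses the JSON response body.
def es_request(http, uri, klass, path, body = nil, content_type = "application/json")
  req = klass.new(path, "Content-Type" => content_type)
  req.basic_auth(uri.user, uri.password) if uri.user
  req.body = body if body
  JSON.parse(http.request(req).body)
end

# 1. Open a scroll against the old index.
page = es_request(http, uri, Net::HTTP::Post, "/my_index/_search?scroll=1m",
                  { size: 500, query: { match_all: {} } }.to_json)

while (hits = page.dig("hits", "hits")) && !hits.empty?
  # 2. Bulk-index this page of documents into the new index.
  #    (Versions before 7.x also expect a _type in each action line.)
  bulk = hits.flat_map do |hit|
    [{ index: { _index: "my_new_index", _id: hit["_id"] } }.to_json, hit["_source"].to_json]
  end.join("\n") + "\n"
  es_request(http, uri, Net::HTTP::Post, "/_bulk", bulk, "application/x-ndjson")

  # 3. Fetch the next page. A production script would also clear the scroll when finished.
  page = es_request(http, uri, Net::HTTP::Post, "/_search/scroll",
                    { scroll: "1m", scroll_id: page["_scroll_id"] }.to_json)
end</code>
</pre></div>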

Another option is to simply populate the new index directly from your database.

Cluster Shard Reroute

Elasticsearch provides an API to move shards around between nodes within a cluster. We don’t support this functionality on our shared plans for a few reasons: for one, it interferes with our cluster management tooling, and there is a possibility for one or more users to allocate shards in a way that overloads a node.

If you need fine-grained control over shard allocation within a cluster, please reach out to us and we can discuss your use case and look at whether single tenancy would be a good fit for you.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_cluster/reroute</code>
</pre></div>

Cluster settings

Elasticsearch provides an API to apply cluster-wide settings. We don’t support this in our shared environment for safety reasons. In an environment where system resources are shared, this API would affect all users simultaneously. So one user could affect the behavior of everyone’s cluster in ways that those users may not want. Instead, we block this API and remain opinionated about cluster settings.

If you need to change the system settings for your cluster, you’ll need to be in a single tenant environment. Reach out to us and let’s talk through it.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT /_cluster/settings</code>
</pre></div>

Index Search Warmers

Elasticsearch provides a mechanism to speed up searches prior to being run. It does this by basically pre-populating caches via automatically running search requests. This is called “warming” the data, and it’s typically done against searches that require heavy system resources.

We don’t support this on shared clusters for stability reasons. Essentially there isn’t a great way to throttle the impact of an arbitrary number of warmers. There is a possibility that a user could overwhelm the system resources by creating a large number of “heavy” warmers (aggregations, sorts on large lists, etc). It’s also somewhat of an anti-pattern in a multitenant environment.

If this is something critical to your app, you would need to be on a dedicated cluster. Please reach out to us if you have any further questions on this.

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">POST /_warmer/{name}
PUT  /_warmer/{name}
POST /_warmers/{name}
PUT  /_warmers/{name}
POST /{index}/_warmer/{name}
PUT  /{index}/_warmer/{name}
POST /{index}/_warmers/{name}
PUT  /{index}/_warmers/{name}
POST /{index}/{type}/_warmer/{name}
PUT  /{index}/{type}/_warmer/{name}
POST /{index}/{type}/_warmers/{name}
PUT  /{index}/{type}/_warmers/{name}</code></pre>
</div>
</div>

Static Scripts

Elasticsearch provides an API for adding and modifying static scripts within a cluster, which can be referenced by name. We can enable these endpoints on a case-by-case basis for Business and Enterprise clusters. Inline scripts with Painless are supported on all clusters.

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">DELETE /_scripts/{lang}/{id}
GET    /_scripts/{lang}/{id}
POST   /_scripts/{lang}/{id}
PUT    /_scripts/{lang}/{id}
POST   /_scripts/{lang}/{id}/_create
PUT    /_scripts/{lang}/{id}/_create</code></pre>
</div>
</div>
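
Inline Painless scripts, by contrast, are sent as part of a normal request body. As a quick sketch (assuming Elasticsearch 7.x syntax and a hypothetical index, document, and field), an inline script update looks like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /my_index/_update/1
{
  "script": {
    "lang": "painless",
    "source": "ctx._source.view_count += 1"
  }
}</code>
</pre></div>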

Update By Query API

Elasticsearch provides an API for updating documents that match a given query. This API is disabled on multitenant clusters due to the large amount of overhead and risk it introduces. Namely, it's possible to craft a query that runs, uncontrolled, for a long period of time and consumes significant amounts of CPU in the process. On multitenant systems, this translates to a severe performance degradation for multiple users.

Customers wanting to make use of this API can do so on a Business or Enterprise subscription.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /{index}/_update_by_query</code>
</pre></div>

Unsupported API Endpoints

Bonsai has been around for a long time and has supported every major release of Elasticsearch since 0.20 (late 2012). Over the years, Bonsai has rolled out support for a number of versions of Elasticsearch, and in the process developed a time-tested, standardized approach.

When a new version of Elasticsearch is released, we will subsequently deprecate the oldest public version. This is to ensure that we can provide excellent support and reliable operational responses to any incident that may arise.

Dedicated Clusters Exempt!

Clusters in single tenant environments — Business, Enterprise, Dedicated, etc — are not affected by this version upgrade policy. These clusters will remain operational and unaffected by new releases unless/until the owner wishes to upgrade.

A quick summary of Bonsai’s version support policy is as follows:

  • Bonsai supports the two most recent major versions on Hobby and Standard tier clusters.
  • Bonsai does not automatically upgrade existing clusters when new Elasticsearch versions are released, or when old versions are retired.
  • Bonsai offers long-term version support for older major versions of Elasticsearch for Business and Enterprise tier clusters.
  • Hobby tier clusters always default to the cutting edge when available.
  • Hobby tier clusters are the first to be upgraded.
  • Clusters on EOL nodes (see below) will not be serviced and will receive limited technical support.
  • Bonsai does not force upgrades, except in cases involving a major security issue. These are rare, and typically patch-level upgrades.

How Bonsai Handles New Version Rollouts

Bonsai will first deploy and validate the new version. Then, when our operations team is satisfied that it is safe and reliable for production use, we will begin defaulting new clusters to the latest version.

Clusters running on our oldest supported version receive several notifications about next steps.

The general process is:

  • Deploy the new version and make it the default for Hobby clusters. Users with paid clusters can also opt in to the new version where available
  • Validate that the latest version is safe for production, and ensure our operational tooling is adequate to respond to any incident that may arise
  • Default all new clusters to the latest version
  • Deprecate the oldest version:
  • Send a notice several weeks in advance to all impacted users, alerting them of the deprecation and providing instructions and guidance for upgrading
  • Send a final notice several days in advance to any impacted user who still has not upgraded
  • Migrate remaining clusters to the next supported version
  • Terminate nodes running the deprecated version

There are some caveats and exceptions here, based on the cluster’s plan level:

Hobby Clusters

These clusters are generally reserved for non-production purposes, like for hobbyists and students, PR-apps and the like. As a result, they are always the first candidates for version upgrades, as well as the first to be mass migrated off of deprecated versions.

In other words, when Bonsai deploys a new version of Elasticsearch, Hobby clusters will be the first to default to the new version where available. This is one way we can validate performance and safety with minimal risk to production applications.

In a version deprecation, these clusters are also the first to be migrated off of the deprecated version up to the next supported version. Data loss or mapping changes are not common during these events, but are a distinct possibility.

Because these are unpaid resources, we are unable to offer special treatment during this process. For example, if a user on a free cluster requests an extension to prepare, we will not be able to accommodate it.

Standard tier Clusters

These clusters are running in multitenant environments, meaning the system resources are shared among many users. We’re not able to selectively upgrade, downgrade or “pin” clusters to a version in these environments. Changing versions literally involves migrating the cluster to another set of servers.

Users who are not able to upgrade to a new version of Elasticsearch have the option of upgrading to a Business or Enterprise tier plan. With these plans, we will provision a new isolated environment running whatever version of Elasticsearch is required, and migrate the cluster there. This obviates the need to worry about future version deprecations.

One other possibility for users who can’t upgrade either their cluster’s plan or Elasticsearch version is to remain on EOL nodes. This approach comes with some risks, which are described below.

Business and Enterprise tier Clusters

Clusters on these plans are running in isolated environments and generally not affected by version rollouts and deprecations on Bonsai. Major version upgrades for these clusters are by request only. Minor version upgrades may be made in the event of a vulnerability disclosure.

For example, if an Enterprise cluster is running on version X.Y.0, and a critical vulnerability is disclosed, we will work with the customer to establish a service window, deploy the patched version, and then perform a rolling restart. Service impacts are generally trivial and without down time.

Otherwise, Business and Enterprise tier clusters are not affected by version deprecations on Bonsai.

What are EOL Nodes?

An EOL node is a server running a version of Elasticsearch that Bonsai no longer supports. It has reached its End Of Life (EOL) and may be terminated at any time, for a variety of reasons. When that happens, our Ops Team will not replace it or address any resulting service interruption or data loss. Additionally, our Support Team will provide limited technical support for issues arising from being on an old version.

When a version of Elasticsearch is officially deprecated by Bonsai, clusters running on that version will be upgraded to the next supported version. In some cases, we will allow users to remain on the unsupported version with the caveat that their cluster is running on EOL nodes. Users with clusters on these resources accept all risks and responsibilities.

Some examples:

  • In the event that a security regression is disclosed, we may choose to promptly terminate EOL nodes without advance notice.
  • In the event of irrecoverable failure of the underlying EC2 instances, we may choose to terminate the EOL nodes without advance notice.

Users running on EOL nodes should actively work to get their application on a supported version as soon as possible to avoid this kind of outcome.

How Bonsai Handles Elasticsearch Releases

Heroku customers who would like to transition across a public-private network boundary (either from a public space to a private space, or from a private space to a public space) have limited options for doing so. The boundary represents a firewall that is designed to prevent access from the public space to the private space. Within the private space itself, there are security protocols in place to prevent the exfiltration of data. Migrations across the boundary are difficult by design; this difficulty is an artifact of the additional layer of security granted by private spaces.

The preferred method for migrating a cluster is to use a blue-green deployment strategy. There are several general steps to this approach:

  1. Create a new Bonsai addon using the plan of your choice. This will create a new Bonsai cluster. If you are trying to migrate a cluster into a private space, make sure this addon is created within the private space.
  2. Reindex your content into the new cluster.
  3. Update your application to read and write to the new cluster.
  4. Verify that the application is working as expected.
  5. Tear down the old cluster.

This strategy will accommodate migrations across the boundary, regardless of whether you're going from public to private, or private to public.

If a blue-green deployment strategy is not possible for some reason, please contact us to set up a call. We will consider alternatives based on your specific needs.

Migrating Between Public and Private Spaces in Heroku

Bonsai has more safeguards to protect your search cluster’s reliability than any other Elasticsearch provider.

Disaster recovery plan for all clusters on our platform

Whether you’re hosting Elasticsearch on your own or choosing a hosted provider like Bonsai, every search cluster should have a disaster recovery plan.

Bonsai was built from the ground up to be a highly available system. We leverage a variety of best practices from the industry to achieve this. It all starts with the choice of Elasticsearch, which is a highly available search engine with built-in clustering and sharding support. Bonsai deploys all Elasticsearch clusters in a multi-node, multi-data center configuration to guarantee that your data is safe and secure. To further improve the High Availability of your cluster, Bonsai deploys all clusters behind an AWS Application Load Balancer. This allows you to connect to a singular URL, and get access to every node of the cluster through our load balancing algorithm.

All production Bonsai Clusters are deployed to a minimum of three nodes for redundancy, and to prevent stalemates in leader election. Each node in the cluster is deployed to a separate AWS Availability Zone, giving us data center isolation as well. A Bonsai cluster could experience a complete loss of two AWS data centers, and the cluster will still continue to operate. This makes Bonsai clusters extremely fault-tolerant.

When a Bonsai cluster does experience a node loss, Elasticsearch will automatically reroute the primary and replica shards to machines that are up and running. In the background, AWS Auto Scaling Groups will immediately begin spinning up the replacement instance that will auto-bootstrap into your configured Elasticsearch configuration and version. Once the node has successfully provisioned, it will join the cluster, and then Elasticsearch will offload the relocated shards back to the empty machine.

In the off chance that Bonsai (and much of the internet with it) experiences an entire loss of an AWS EC2 region, all of your cluster’s data is maintained in AWS’s S3 system, which has a reliability guarantee of 99.99% uptime and 99.999999999% durability. If such a failure happens, Bonsai’s staff will work with your team to understand where you will be relocating your application, and can then initiate a restore process into a cluster in the same AWS Region while maintaining your existing DNS connections.

Elasticsearch Disaster Recovery

Elasticsearch has a menagerie of available plugins, including support for custom ones. We often get asked through our support channels whether we support a particular plugin and whether the user can install their own. Here you’ll find all the info you could ever want to know about plugins on Bonsai Elasticsearch.

Do you support plugin X?

Bonsai supports all of the plugins that ship with Elasticsearch, with some caveats. Plugins like X-Pack, Shield and Marvel require a license for commercial purposes, which we do not bundle into our subscriptions. We also don’t generally deploy unofficial plugins to the Shared level plans for stability and security reasons.

These are the plugins we currently deploy by default for all clusters:

Can I install my own plugins on Bonsai?

The answer depends on several things. If you’re on a single tenant plan, then we’re likely able to accommodate it. We will need to review the plugin’s functionality, version support and maintenance status, and whether it’s under active development. Supported plugins that are not the property of the customer would require an open source license.

If your cluster is on one of our shared clusters, then it’s not possible to install a non-standard plugin. At Bonsai we’d love to be able to give you everything you want for your cluster, but we need to be opinionated about our shared cluster architecture to ensure security and performance for all other users.

More Questions?

Still have questions or concerns that weren’t addressed here? Shoot us an email and let us know!

Plugins on Bonsai

Bonsai’s support policy categorizes incident severity into three tiers. You can use these severity levels to help communicate the level of support needed.

24/7 Operational Coverage is Provided for All Clusters

Bonsai monitors its infrastructure 24 hours a day, 365 days a year. The Operations Team is automatically alerted to any problems in real time. One or more engineers will then respond to the event. These types of events include, but are not limited to:

  • Red Indexes
  • Failed EC2 Instances
  • Stuck GC
  • Failed backups
  • Network partitions / Split brain

This operational coverage is provided to all clusters, regardless of plan.

The severity classifications are defined as:

<table>
<thead>
<tr><th>Severity Level</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>An incident of downtime or service degradation is causing severe impact to customer production availability. Active risk of data loss, traffic completely or mostly failing.</td></tr>
<tr><td>Severity 2</td><td>Moderate to severe service interruption that can be temporarily worked around. May cause a moderate to major degradation in user experience.</td></tr>
<tr><td>Severity 3</td><td>Low rate of minor service errors or intermittent degradations. Minor regressions to response times. Should be investigated and repaired but is not visibly harming customer’s production systems.</td></tr>
</tbody></table>

Coverage Hours

Incidents not related to a problem with Bonsai’s infrastructure are treated as support issues. Coverage for these events is the same as our support hours:

<table>
<thead>
<tr><th>Severity Level</th><th>Standard</th><th>Business</th><th>Enterprise</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>Business hours</td><td>Business hours</td><td>24/7</td></tr>
<tr><td>Severity 2</td><td>Business hours</td><td>Business hours</td><td>Business hours</td></tr>
<tr><td>Severity 3</td><td>Business hours</td><td>Business hours</td><td>Business hours</td></tr>
</tbody></table>

Response Time

We will respond to inquiries about active incidents within these time windows.

<table>
<thead>
<tr><th>Severity Level</th><th>Standard</th><th>Business</th><th>Enterprise</th></tr>
</thead>
<tbody>
<tr><td>Severity 1</td><td>8 hrs</td><td>1 hr</td><td>1 hr</td></tr>
<tr><td>Severity 2</td><td>24 hrs</td><td>4 hrs</td><td>4 hrs</td></tr>
<tr><td>Severity 3</td><td>48 hrs</td><td>8 hrs</td><td>8 hrs</td></tr>
</tbody></table>

Incident Response Times

The new Bonsai Business Plans come with more options than we have ever provided before. It may seem intimidating to choose, but there are a few simple guidelines that make it easy. Bonsai Business Plans also don’t require annual contracts. If your index or traffic changes periodically, you can change plans whenever you wish.

Let’s start by looking at the two plan types.

Choosing a plan type

Business Plans offer two main types - Compute and Capacity. The difference is inherent in their names: if your use case requires a lot of data written to disk but relatively little traffic (perhaps only a few requests per hour), Capacity gets you more bang for your buck in raw disk. By contrast, Compute is designed for those who need a setup that can withstand high traffic load or query complexity.

Planning for disk capacity

When you deploy an HA Elasticsearch cluster, you must provision enough disk for three things: 1. your primary data, 2. your replica data, and 3. the normal maintenance routines performed by Lucene, the underlying search engine behind Elasticsearch.

Nobody likes using a search engine that doesn’t work. Failing to account for any of these factors will result in performance degradation, a.k.a. the infamous yellow or red cluster. 😫

How much primary data can you load into Elasticsearch while still maintaining High Availability? This simple formula will help you calculate it:

((number of nodes - 1) * the capacity of a single node) * 0.8 = the amount of data that can be loaded in your cluster

Let’s put this in a concrete example. A Business Capacity Large plan has a raw capacity of 150GB, with each of the three nodes contributing 50GB of disk. So the concrete numbers would be:

number of nodes = 3
per node capacity = 50GB
total raw capacity = 3 * 50GB = 150GB
Usable data = ((3 - 1) * 50GB) * 0.8 = 80GB

This means that if you have a total raw capacity of 150GB, you should only plan to use 80GB of it for your search data. At first this seems like a huge gap between resources available and resources usable (you can only use about 53% of what you provision!), but it’s a necessary tradeoff to prevent getting paged at 3AM with red status clusters, poorly performing queries and/or data loss.

This formula, explained

Planning is key with distributed systems like Elasticsearch. When nodes inevitably go offline, it’s important to have replication in place for backup. The formula removes one node from your calculation (number of nodes - 1) so that your cluster will not lose any data. The additional failover nodes will maintain a green index status, even when a node goes offline for maintenance reasons. This ensures enough capacity for your primary data and a replica. Multiplying the total by 0.8 buffers your capacity by 20%, which accounts for Lucene’s maintenance routines.
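If you like to script this kind of check, here is a minimal sketch in plain Python that applies the same formula; the function name and example values are illustrative, not part of any Bonsai tooling.

```python
def usable_capacity_gb(node_count: int, per_node_gb: float, headroom: float = 0.8) -> float:
    """Estimate how much primary data an HA cluster can safely hold.

    One node's worth of disk is set aside so a replica survives a node
    going offline, and the remainder is multiplied by 0.8 to leave ~20%
    headroom for Lucene's segment merges and other maintenance.
    """
    return (node_count - 1) * per_node_gb * headroom

# Business Capacity Large example: 3 nodes x 50GB = 150GB raw capacity
print(usable_capacity_gb(3, 50))  # => 80.0 (GB of primary data)
```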

Planning for computational requests and traffic

Now that we’ve covered raw capacity planning, let’s talk about handling traffic. Traffic volume and query complexity map to the size of the cluster: larger clusters can handle a higher number of requests. You should consider three different numbers here:

1. How many search requests will you be doing in any given minute?
2. How many aggregations will you be doing in any given minute?
3. Lastly, how many bulk updates will you perform each minute?

To ensure optimal performance, all three numbers should fit under the values in the table below.

<table>
<thead>
<tr><th>Aggregation Rate</th><th>Bulk Insert Rate</th><th>Search Rate</th><th>Ideal Plan/Size</th></tr>
</thead>
<tbody>
<tr><td>< 25 / minute</td><td>< 250 / minute</td><td>< 500 / minute</td><td>Large</td></tr>
<tr><td>< 50 / minute</td><td>< 500 / minute</td><td>< 1000 / minute</td><td>XLarge</td></tr>
<tr><td>< 100 / minute</td><td>< 1000 / minute</td><td>< 2000 / minute</td><td>2XLarge</td></tr>
</tbody>
</table>
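If you already measure these rates, a rough sketch like the one below can turn them into a starting recommendation. This is plain Python, the thresholds simply mirror the table above, and the helper is illustrative only; real sizing decisions should come from the metrics panel described below.

```python
# Thresholds mirror the sizing table above; plan names are Bonsai Business sizes.
PLAN_LIMITS = [
    # (plan, aggregations/min, bulk inserts/min, searches/min)
    ("Large", 25, 250, 500),
    ("XLarge", 50, 500, 1000),
    ("2XLarge", 100, 1000, 2000),
]

def suggest_plan(agg_per_min: float, bulk_per_min: float, search_per_min: float):
    """Return the smallest plan whose limits cover all three measured rates."""
    for plan, agg_limit, bulk_limit, search_limit in PLAN_LIMITS:
        if agg_per_min < agg_limit and bulk_per_min < bulk_limit and search_per_min < search_limit:
            return plan
    return None  # exceeds 2XLarge -- talk to us about a custom size

print(suggest_plan(agg_per_min=30, bulk_per_min=400, search_per_min=800))  # => XLarge
```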


As a note: these numbers are conservative by design. Once you are up and running, you’ll be able to use the metrics panel in the Bonsai application to see how your searches are performing in real time and make sizing decisions, up or down, based on real data.

For those of you that want to dig even deeper, you can read our very thorough version of capacity planning as well.

Questions?

Our team has provisioned search engines that handle billions of requests each month. If you are still unsure about which plan is right for you, please contact us for a personalized consultation.

Sizing Your Business Cluster

Datadog is a monitoring service that allows customers to see real time metrics related to their application and infrastructure, as well as receive alerts for predefined events. Datadog offers an Elasticsearch integration for monitoring clusters, and Bonsai supports this integration.

  1. Getting Started
  2. Adjust Dashboard
  3. Reference: Metrics Available on Standard Subscriptions
  4. Reference: Metrics Available on Business / Enterprise Subscriptions

Getting Started

Log into Datadog and navigate to <span class="inline-code"><pre><code>Integrations > APIs</code></pre></span> to get your current API key.

In the table of API keys, grab your current API key:

Navigate to Integrations on your cluster dashboard, under Data. There is a section marked Datadog Integration.

Enter your Datadog API key, select the API site, then press Activate Datadog. You should start to see request metrics loaded into Datadog within a few minutes.
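If you prefer to confirm this from a terminal rather than watching the Datadog UI, a sketch along the following lines, using Datadog's standard v1 timeseries query endpoint, can check that datapoints are arriving. The environment variable names and the cluster slug are placeholders; the cluster tag filter only applies on plans where metrics are tagged per cluster (see the Business / Enterprise section below).

```python
# Sketch: query Datadog for recent bonsai.req.total datapoints to confirm
# that the Bonsai integration is shipping metrics. Assumes the standard
# Datadog v1 query endpoint; keys and the cluster slug are placeholders.
import os
import time
import requests

DD_SITE = "https://api.datadoghq.com"  # change if your account uses another Datadog site
now = int(time.time())
params = {
    "from": now - 900,  # last 15 minutes
    "to": now,
    "query": "sum:bonsai.req.total{cluster:my-cluster-slug}",
}
headers = {
    "DD-API-KEY": os.environ["DD_API_KEY"],
    "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
}

resp = requests.get(f"{DD_SITE}/api/v1/query", params=params, headers=headers)
resp.raise_for_status()
series = resp.json().get("series", [])
print("datapoints received:", sum(len(s.get("pointlist", [])) for s in series))
```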

Adjust How Metrics Are Shown In The Datadog Dashboard

Note to Enterprise tier customers who have Grafana dashboards set up with our team:

The metrics Bonsai sends to Datadog are identical to the metrics sent to Grafana. Metrics data can be processed and rendered in a variety of ways, so it’s possible that the metrics shown in Datadog differ from metrics displayed in the Bonsai dashboard or Grafana. This is because Datadog features (such as smoothing) can mask performance patterns like short-lived load spikes. Bonsai considers the metrics displayed in Grafana to be authoritative.

Datadog's configuration can affect how units are reported in the graphs. Request durations should ideally be shown in milliseconds; if they are not, the following steps will get you back on track:

1. In the Datadog dashboard, select Metrics info from the "cog" dropdown menu on a graph.



2. Select <span class="inline-code"><pre><code>*.p50</code></pre></span> and it will take you to the Metrics Summary search for p50. Click on the Metric Name, then the Edit button.

3. Under Metadata, change the Unit from minute to millisecond and leave per as <span class="inline-code"><pre><code>(None)</code></pre></span>. Click Save.

4. Search for the other request duration metrics and edit their units to millisecond as well. Once all three are edited, the graphs should display request durations in milliseconds.

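If you would rather script this change than click through the UI, Datadog's metric metadata API can set the unit directly. The snippet below is a sketch only: it assumes the standard v1 `/api/v1/metrics/{metric_name}` endpoint and placeholder environment variables for your keys.

```python
# Sketch: set the unit of the Bonsai request-duration metrics to milliseconds
# via Datadog's metric metadata API. Keys are read from the environment.
import os
import requests

headers = {
    "DD-API-KEY": os.environ["DD_API_KEY"],
    "DD-APPLICATION-KEY": os.environ["DD_APP_KEY"],
}

for metric in ("bonsai.req.p50", "bonsai.req.p95", "bonsai.req.p99"):
    resp = requests.put(
        f"https://api.datadoghq.com/api/v1/metrics/{metric}",
        headers=headers,
        json={"unit": "millisecond"},
    )
    resp.raise_for_status()
    print(metric, "unit is now", resp.json().get("unit"))
```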

If you have any questions or issues, or are still not seeing metrics in Datadog after several minutes, please reach out and let us know at support@bonsai.io.

Reference: Metrics Available on Standard Subscriptions

Clusters on Standard plans have access to a variety of metrics:

<table>
<thead>
<tr><th>Metric</th><th>Description</th></tr>
</thead>
<tbody>
<tr><td>bonsai.req.2xx
(gauge)</td><td>Number of requests with a 2xx (successful) response code
Shown as request
</td></tr>
<tr><td>bonsai.req.4xx
(gauge)</td><td>Number of requests with a 4xx (client error) response code
Shown as request
</td></tr>
<tr><td>bonsai.req.5xx
(gauge)</td><td>Number of requests with a 5xx (server error) response code
Shown as request
</td></tr>
<tr><td>bonsai.req.max_concurrency
(gauge)</td><td>Peak concurrent requests
Shown as connection
</td></tr>
<tr><td>bonsai.req.p50
(gauge)</td><td>The median request duration
Shown as minute
</td></tr>
<tr><td>bonsai.req.p95
(gauge)</td><td>The 95th percentile request duration
Shown as minute
</td></tr>
<tr><td>bonsai.req.p99
(gauge)</td><td>The 99th percentile request duration
Shown as minute
</td></tr>
<tr><td>bonsai.req.queue_depth
(gauge)</td><td>Peak queue depth (how many requests are waiting due to concurrency limits)
Shown as connection
</td></tr>
<tr><td>bonsai.req.reads
(gauge)</td><td>The number of requests which read data
Shown as request
</td></tr>
<tr><td>bonsai.req.rx_bytes
(gauge)</td><td>The number of bytes sent to Elasticsearch
Shown as byte
</td></tr>
<tr><td>bonsai.req.total
(gauge)</td><td>The total number of requests
Shown as request
</td></tr>
<tr><td>bonsai.req.tx_bytes
(gauge)</td><td>The number of bytes sent to client
Shown as byte
</td></tr>
<tr><td>bonsai.req.writes
(gauge)</td><td>The total number of writes
Shown as request
</td></tr>
</tbody>
</table>

Reference: Metrics Available on Business / Enterprise Subscriptions

Metrics are tagged on a per-cluster basis, so you can easily segment between your Elasticsearch instances. The tags look like: <span class="inline-code"><pre><code>cluster:my-cluster-slug</code></pre></span>

Users with Business and Enterprise subscriptions have access to additional metrics:

<table cellpadding="5" cellspacing="1" border="1">
<tr><th>Metric</th><th>Type</th></tr>
<tr><td>cpu.usage_guest</td><td>float</td></tr>
<tr><td>cpu.usage_guest_nice</td><td>float</td></tr>
<tr><td>cpu.usage_idle</td><td>float</td></tr>
<tr><td>cpu.usage_iowait</td><td>float</td></tr>
<tr><td>cpu.usage_irq</td><td>float</td></tr>
<tr><td>cpu.usage_nice</td><td>float</td></tr>
<tr><td>cpu.usage_softirq</td><td>float</td></tr>
<tr><td>cpu.usage_steal</td><td>float</td></tr>
<tr><td>cpu.usage_system</td><td>float</td></tr>
<tr><td>cpu.usage_user</td><td>float</td></tr>
<tr><td>disk.free</td><td>integer</td></tr>
<tr><td>disk.inodes_free</td><td>integer</td></tr>
<tr><td>disk.inodes_total</td><td>integer</td></tr>
<tr><td>disk.inodes_used</td><td>integer</td></tr>
<tr><td>disk.total</td><td>integer</td></tr>
<tr><td>disk.used</td><td>integer</td></tr>
<tr><td>disk.used_percent</td><td>float</td></tr>
<tr><td>diskio.io_time</td><td>integer</td></tr>
<tr><td>diskio.iops_in_progress</td><td>integer</td></tr>
<tr><td>diskio.read_bytes</td><td>integer</td></tr>
<tr><td>diskio.read_time</td><td>integer</td></tr>
<tr><td>diskio.reads</td><td>integer</td></tr>
<tr><td>diskio.weighted_io_time</td><td>integer</td></tr>
<tr><td>diskio.write_bytes</td><td>integer</td></tr>
<tr><td>diskio.write_time</td><td>integer</td></tr>
<tr><td>diskio.writes</td><td>integer</td></tr>
<tr><td>elasticsearch_breakers.accounting_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.accounting_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.accounting_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.accounting_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.fielddata_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.in_flight_requests_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.parent_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_estimated_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_limit_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_overhead</td><td>float</td></tr>
<tr><td>elasticsearch_breakers.request_tripped</td><td>float</td></tr>
<tr><td>elasticsearch_cluster_health.active_primary_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.active_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.active_shards_percent_as_number</td><td>float</td></tr>
<tr><td>elasticsearch_cluster_health.initializing_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.number_of_data_nodes</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.number_of_nodes</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.number_of_pending_tasks</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.relocating_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.status</td><td>string</td></tr>
<tr><td>elasticsearch_cluster_health.status_code</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.task_max_waiting_in_queue_millis</td><td>integer</td></tr>
<tr><td>elasticsearch_cluster_health.timed_out</td><td>boolean</td></tr>
<tr><td>elasticsearch_cluster_health.unassigned_shards</td><td>integer</td></tr>
<tr><td>elasticsearch_clusterstats_indices.completion_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.docs_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.docs_deleted</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.fielddata_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.fielddata_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.filter_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.filter_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.id_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_current</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_memory_size</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_queries</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.percolate_total</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_cache_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_cache_size</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_hit_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_miss_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.query_cache_total_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_doc_values_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_fixed_bit_set_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_index_writer_max_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_index_writer_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_max_unsafe_auto_id_timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_norms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_points_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_stored_fields_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_term_vectors_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_terms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.segments_version_map_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_primaries_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_primaries_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_primaries_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_replication_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_replication_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_replication_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_shards_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_shards_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_index_shards_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_primaries</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_replication</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.shards_total</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.store_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_indices.store_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_client</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_coordinating_only</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_data</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_data_only</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_ingest</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_master</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_master_data</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_master_only</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.count_total</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.discovery_types_zen</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_io_op</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_io_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_read_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_reads</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_write_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_disk_writes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_spins</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.fs_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_max_uptime_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_mem_heap_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_mem_heap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_threads</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_bundled_jdk</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_using_bundled_jdk</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_vm_vendor</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.jvm_versions_0_vm_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.network_types_http_types_netty4</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.network_types_transport_types_netty4</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_allocated_processors</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_available_processors</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_cache_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_cores_per_socket</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_mhz</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_model</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_total_cores</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_total_sockets</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_cpu_0_vendor</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_free_percent</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.os_mem_used_percent</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.packaging_types_0_count</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.packaging_types_0_flavor</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.packaging_types_0_type</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_0_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_10_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_11_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_12_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_13_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_14_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_15_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_16_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_17_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_18_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_19_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_1_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_2_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_3_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_4_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_5_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_6_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_7_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_isolated</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_jvm</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_requires_keystore</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_site</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_8_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_description</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_elasticsearch_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_has_native_controller</td><td>boolean</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_java_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.plugins_9_version</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_cpu_percent</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_open_file_descriptors_avg</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_open_file_descriptors_max</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.process_open_file_descriptors_min</td><td>float</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.versions_0</td><td>string</td></tr>
<tr><td>elasticsearch_clusterstats_nodes.versions_1</td><td>string</td></tr>
<tr><td>elasticsearch_fs.data_0_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_io_op</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_io_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_read_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_reads</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_write_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_disk_writes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.data_0_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_read_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_read_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_write_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_devices_0_write_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_read_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_read_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_write_kilobytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.io_stats_total_write_operations</td><td>float</td></tr>
<tr><td>elasticsearch_fs.least_usage_estimate_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.least_usage_estimate_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.least_usage_estimate_used_disk_percent</td><td>float</td></tr>
<tr><td>elasticsearch_fs.most_usage_estimate_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.most_usage_estimate_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.most_usage_estimate_used_disk_percent</td><td>float</td></tr>
<tr><td>elasticsearch_fs.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_available_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_io_op</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_io_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_read_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_reads</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_write_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_disk_writes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_fs.total_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_http.current_open</td><td>float</td></tr>
<tr><td>elasticsearch_http.total_opened</td><td>float</td></tr>
<tr><td>elasticsearch_indices.completion_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.docs_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.docs_deleted</td><td>float</td></tr>
<tr><td>elasticsearch_indices.fielddata_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.fielddata_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.filter_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.filter_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.flush_periodic</td><td>float</td></tr>
<tr><td>elasticsearch_indices.flush_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.flush_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_exists_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_exists_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_missing_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_missing_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.get_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.id_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_delete_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_delete_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_delete_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_failed</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_index_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_noop_update_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.indexing_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_current_docs</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_current_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_auto_throttle_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_docs</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_stopped_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_throttled_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.merges_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_queries</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.percolate_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_cache_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_cache_size</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_hit_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_miss_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.query_cache_total_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.recovery_current_as_source</td><td>float</td></tr>
<tr><td>elasticsearch_indices.recovery_current_as_target</td><td>float</td></tr>
<tr><td>elasticsearch_indices.recovery_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_external_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_external_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_listeners</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.refresh_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_evictions</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_hit_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_memory_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.request_cache_miss_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_fetch_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_fetch_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_fetch_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_open_contexts</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_query_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_query_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_query_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_scroll_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_scroll_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_scroll_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_suggest_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_suggest_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.search_suggest_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_count</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_doc_values_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_fixed_bit_set_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_index_writer_max_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_index_writer_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_max_unsafe_auto_id_timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_norms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_points_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_stored_fields_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_term_vectors_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_terms_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.segments_version_map_memory_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.store_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.store_throttle_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.suggest_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.suggest_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_indices.suggest_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_earliest_last_modified_age</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_operations</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_uncommitted_operations</td><td>float</td></tr>
<tr><td>elasticsearch_indices.translog_uncommitted_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_indices.warmer_current</td><td>float</td></tr>
<tr><td>elasticsearch_indices.warmer_total</td><td>float</td></tr>
<tr><td>elasticsearch_indices.warmer_total_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_direct_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_direct_total_capacity_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_direct_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_mapped_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_mapped_total_capacity_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.buffer_pools_mapped_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.classes_current_loaded_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.classes_total_loaded_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.classes_total_unloaded_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_old_collection_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_old_collection_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_young_collection_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.gc_collectors_young_collection_time_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_committed_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_heap_used_percent</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_non_heap_committed_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_non_heap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_peak_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_peak_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_old_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_peak_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_peak_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_survivor_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_peak_max_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_peak_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.mem_pools_young_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.threads_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.threads_peak_count</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_jvm.uptime_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_cfs_period_micros</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_cfs_quota_micros</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_stat_number_of_elapsed_periods</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_stat_number_of_times_throttled</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpu_stat_time_throttled_nanos</td><td>float</td></tr>
<tr><td>elasticsearch_os.cgroup_cpuacct_usage_nanos</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_idle</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_load_average_15m</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_load_average_1m</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_load_average_5m</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_percent</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_stolen</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_sys</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_usage</td><td>float</td></tr>
<tr><td>elasticsearch_os.cpu_user</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average_0</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average_1</td><td>float</td></tr>
<tr><td>elasticsearch_os.load_average_2</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_actual_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_actual_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_free_percent</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.mem_used_percent</td><td>float</td></tr>
<tr><td>elasticsearch_os.swap_free_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.swap_total_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.swap_used_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_os.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_os.uptime_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_percent</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_sys_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_total_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.cpu_user_in_millis</td><td>float</td></tr>
<tr><td>elasticsearch_process.max_file_descriptors</td><td>float</td></tr>
<tr><td>elasticsearch_process.mem_resident_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_process.mem_share_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_process.mem_total_virtual_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_process.open_file_descriptors</td><td>float</td></tr>
<tr><td>elasticsearch_process.timestamp</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.analyze_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bench_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.bulk_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_started_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.fetch_shard_store_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.flush_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.force_merge_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.generic_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.get_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.index_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.listener_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.management_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.merge_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.optimize_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.percolate_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.refresh_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.search_throttled_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.snapshot_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.suggest_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.warmer_threads</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_active</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_completed</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_largest</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_queue</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_rejected</td><td>float</td></tr>
<tr><td>elasticsearch_thread_pool.write_threads</td><td>float</td></tr>
<tr><td>elasticsearch_transport.rx_count</td><td>float</td></tr>
<tr><td>elasticsearch_transport.rx_size_in_bytes</td><td>float</td></tr>
<tr><td>elasticsearch_transport.server_open</td><td>float</td></tr>
<tr><td>elasticsearch_transport.tx_count</td><td>float</td></tr>
<tr><td>elasticsearch_transport.tx_size_in_bytes</td><td>float</td></tr>
<tr><td>haproxy.active_servers</td><td>integer</td></tr>
<tr><td>haproxy.addr</td><td>string</td></tr>
<tr><td>haproxy.backup_servers</td><td>integer</td></tr>
<tr><td>haproxy.bin</td><td>integer</td></tr>
<tr><td>haproxy.bout</td><td>integer</td></tr>
<tr><td>haproxy.cache_hits</td><td>integer</td></tr>
<tr><td>haproxy.cache_lookups</td><td>integer</td></tr>
<tr><td>haproxy.check_code</td><td>integer</td></tr>
<tr><td>haproxy.check_duration</td><td>integer</td></tr>
<tr><td>haproxy.check_fall</td><td>integer</td></tr>
<tr><td>haproxy.check_health</td><td>integer</td></tr>
<tr><td>haproxy.check_rise</td><td>integer</td></tr>
<tr><td>haproxy.check_status</td><td>string</td></tr>
<tr><td>haproxy.chkdown</td><td>integer</td></tr>
<tr><td>haproxy.chkfail</td><td>integer</td></tr>
<tr><td>haproxy.cli_abort</td><td>integer</td></tr>
<tr><td>haproxy.comp_byp</td><td>integer</td></tr>
<tr><td>haproxy.comp_in</td><td>integer</td></tr>
<tr><td>haproxy.comp_out</td><td>integer</td></tr>
<tr><td>haproxy.comp_rsp</td><td>integer</td></tr>
<tr><td>haproxy.conn_rate</td><td>integer</td></tr>
<tr><td>haproxy.conn_rate_max</td><td>integer</td></tr>
<tr><td>haproxy.conn_tot</td><td>integer</td></tr>
<tr><td>haproxy.connect</td><td>integer</td></tr>
<tr><td>haproxy.ctime</td><td>integer</td></tr>
<tr><td>haproxy.dcon</td><td>integer</td></tr>
<tr><td>haproxy.downtime</td><td>integer</td></tr>
<tr><td>haproxy.dreq</td><td>integer</td></tr>
<tr><td>haproxy.dresp</td><td>integer</td></tr>
<tr><td>haproxy.dses</td><td>integer</td></tr>
<tr><td>haproxy.econ</td><td>integer</td></tr>
<tr><td>haproxy.ereq</td><td>integer</td></tr>
<tr><td>haproxy.eresp</td><td>integer</td></tr>
<tr><td>haproxy.hanafail</td><td>integer</td></tr>
<tr><td>haproxy.http_response.1xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.2xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.3xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.4xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.5xx</td><td>integer</td></tr>
<tr><td>haproxy.http_response.other</td><td>integer</td></tr>
<tr><td>haproxy.iid</td><td>integer</td></tr>
<tr><td>haproxy.intercepted</td><td>integer</td></tr>
<tr><td>haproxy.last_chk</td><td>string</td></tr>
<tr><td>haproxy.lastchg</td><td>integer</td></tr>
<tr><td>haproxy.lastsess</td><td>integer</td></tr>
<tr><td>haproxy.lbtot</td><td>integer</td></tr>
<tr><td>haproxy.mode</td><td>string</td></tr>
<tr><td>haproxy.pid</td><td>integer</td></tr>
<tr><td>haproxy.qcur</td><td>integer</td></tr>
<tr><td>haproxy.qmax</td><td>integer</td></tr>
<tr><td>haproxy.qtime</td><td>integer</td></tr>
<tr><td>haproxy.rate</td><td>integer</td></tr>
<tr><td>haproxy.rate_lim</td><td>integer</td></tr>
<tr><td>haproxy.rate_max</td><td>integer</td></tr>
<tr><td>haproxy.req_rate</td><td>integer</td></tr>
<tr><td>haproxy.req_rate_max</td><td>integer</td></tr>
<tr><td>haproxy.req_tot</td><td>integer</td></tr>
<tr><td>haproxy.reuse</td><td>integer</td></tr>
<tr><td>haproxy.rtime</td><td>integer</td></tr>
<tr><td>haproxy.scur</td><td>integer</td></tr>
<tr><td>haproxy.sid</td><td>integer</td></tr>
<tr><td>haproxy.slim</td><td>integer</td></tr>
<tr><td>haproxy.smax</td><td>integer</td></tr>
<tr><td>haproxy.srv_abort</td><td>integer</td></tr>
<tr><td>haproxy.status</td><td>string</td></tr>
<tr><td>haproxy.stot</td><td>integer</td></tr>
<tr><td>haproxy.ttime</td><td>integer</td></tr>
<tr><td>haproxy.weight</td><td>integer</td></tr>
<tr><td>haproxy.wredis</td><td>integer</td></tr>
<tr><td>haproxy.wretr</td><td>integer</td></tr>
<tr><td>haproxy.wrew</td><td>integer</td></tr>
<tr><td>kernel.boot_time</td><td>integer</td></tr>
<tr><td>kernel.context_switches</td><td>integer</td></tr>
<tr><td>kernel.entropy_avail</td><td>integer</td></tr>
<tr><td>kernel.interrupts</td><td>integer</td></tr>
<tr><td>kernel.processes_forked</td><td>integer</td></tr>
<tr><td>mem.active</td><td>integer</td></tr>
<tr><td>mem.available</td><td>integer</td></tr>
<tr><td>mem.available_percent</td><td>float</td></tr>
<tr><td>mem.buffered</td><td>integer</td></tr>
<tr><td>mem.cached</td><td>integer</td></tr>
<tr><td>mem.commit_limit</td><td>integer</td></tr>
<tr><td>mem.committed_as</td><td>integer</td></tr>
<tr><td>mem.dirty</td><td>integer</td></tr>
<tr><td>mem.free</td><td>integer</td></tr>
<tr><td>mem.high_free</td><td>integer</td></tr>
<tr><td>mem.high_total</td><td>integer</td></tr>
<tr><td>mem.huge_page_size</td><td>integer</td></tr>
<tr><td>mem.huge_pages_free</td><td>integer</td></tr>
<tr><td>mem.huge_pages_total</td><td>integer</td></tr>
<tr><td>mem.inactive</td><td>integer</td></tr>
<tr><td>mem.low_free</td><td>integer</td></tr>
<tr><td>mem.low_total</td><td>integer</td></tr>
<tr><td>mem.mapped</td><td>integer</td></tr>
<tr><td>mem.page_tables</td><td>integer</td></tr>
<tr><td>mem.shared</td><td>integer</td></tr>
<tr><td>mem.slab</td><td>integer</td></tr>
<tr><td>mem.swap_cached</td><td>integer</td></tr>
<tr><td>mem.swap_free</td><td>integer</td></tr>
<tr><td>mem.swap_total</td><td>integer</td></tr>
<tr><td>mem.total</td><td>integer</td></tr>
<tr><td>mem.used</td><td>integer</td></tr>
<tr><td>mem.used_percent</td><td>float</td></tr>
<tr><td>mem.vmalloc_chunk</td><td>integer</td></tr>
<tr><td>mem.vmalloc_total</td><td>integer</td></tr>
<tr><td>mem.vmalloc_used</td><td>integer</td></tr>
<tr><td>mem.wired</td><td>integer</td></tr>
<tr><td>mem.write_back</td><td>integer</td></tr>
<tr><td>mem.write_back_tmp</td><td>integer</td></tr>
<tr><td>net.bytes_recv</td><td>integer</td></tr>
<tr><td>net.bytes_sent</td><td>integer</td></tr>
<tr><td>net.drop_in</td><td>integer</td></tr>
<tr><td>net.drop_out</td><td>integer</td></tr>
<tr><td>net.err_in</td><td>integer</td></tr>
<tr><td>net.err_out</td><td>integer</td></tr>
<tr><td>net.icmp_inaddrmaskreps</td><td>integer</td></tr>
<tr><td>net.icmp_inaddrmasks</td><td>integer</td></tr>
<tr><td>net.icmp_incsumerrors</td><td>integer</td></tr>
<tr><td>net.icmp_indestunreachs</td><td>integer</td></tr>
<tr><td>net.icmp_inechoreps</td><td>integer</td></tr>
<tr><td>net.icmp_inechos</td><td>integer</td></tr>
<tr><td>net.icmp_inerrors</td><td>integer</td></tr>
<tr><td>net.icmp_inmsgs</td><td>integer</td></tr>
<tr><td>net.icmp_inparmprobs</td><td>integer</td></tr>
<tr><td>net.icmp_inredirects</td><td>integer</td></tr>
<tr><td>net.icmp_insrcquenchs</td><td>integer</td></tr>
<tr><td>net.icmp_intimeexcds</td><td>integer</td></tr>
<tr><td>net.icmp_intimestampreps</td><td>integer</td></tr>
<tr><td>net.icmp_intimestamps</td><td>integer</td></tr>
<tr><td>net.icmp_outaddrmaskreps</td><td>integer</td></tr>
<tr><td>net.icmp_outaddrmasks</td><td>integer</td></tr>
<tr><td>net.icmp_outdestunreachs</td><td>integer</td></tr>
<tr><td>net.icmp_outechoreps</td><td>integer</td></tr>
<tr><td>net.icmp_outechos</td><td>integer</td></tr>
<tr><td>net.icmp_outerrors</td><td>integer</td></tr>
<tr><td>net.icmp_outmsgs</td><td>integer</td></tr>
<tr><td>net.icmp_outparmprobs</td><td>integer</td></tr>
<tr><td>net.icmp_outredirects</td><td>integer</td></tr>
<tr><td>net.icmp_outsrcquenchs</td><td>integer</td></tr>
<tr><td>net.icmp_outtimeexcds</td><td>integer</td></tr>
<tr><td>net.icmp_outtimestampreps</td><td>integer</td></tr>
<tr><td>net.icmp_outtimestamps</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype0</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype11</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype3</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype4</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype5</td><td>integer</td></tr>
<tr><td>net.icmpmsg_intype8</td><td>integer</td></tr>
<tr><td>net.icmpmsg_outtype0</td><td>integer</td></tr>
<tr><td>net.icmpmsg_outtype3</td><td>integer</td></tr>
<tr><td>net.icmpmsg_outtype8</td><td>integer</td></tr>
<tr><td>net.ip_defaultttl</td><td>integer</td></tr>
<tr><td>net.ip_forwarding</td><td>integer</td></tr>
<tr><td>net.ip_forwdatagrams</td><td>integer</td></tr>
<tr><td>net.ip_fragcreates</td><td>integer</td></tr>
<tr><td>net.ip_fragfails</td><td>integer</td></tr>
<tr><td>net.ip_fragoks</td><td>integer</td></tr>
<tr><td>net.ip_inaddrerrors</td><td>integer</td></tr>
<tr><td>net.ip_indelivers</td><td>integer</td></tr>
<tr><td>net.ip_indiscards</td><td>integer</td></tr>
<tr><td>net.ip_inhdrerrors</td><td>integer</td></tr>
<tr><td>net.ip_inreceives</td><td>integer</td></tr>
<tr><td>net.ip_inunknownprotos</td><td>integer</td></tr>
<tr><td>net.ip_outdiscards</td><td>integer</td></tr>
<tr><td>net.ip_outnoroutes</td><td>integer</td></tr>
<tr><td>net.ip_outrequests</td><td>integer</td></tr>
<tr><td>net.ip_reasmfails</td><td>integer</td></tr>
<tr><td>net.ip_reasmoks</td><td>integer</td></tr>
<tr><td>net.ip_reasmreqds</td><td>integer</td></tr>
<tr><td>net.ip_reasmtimeout</td><td>integer</td></tr>
<tr><td>net.packets_recv</td><td>integer</td></tr>
<tr><td>net.packets_sent</td><td>integer</td></tr>
<tr><td>net.tcp_activeopens</td><td>integer</td></tr>
<tr><td>net.tcp_attemptfails</td><td>integer</td></tr>
<tr><td>net.tcp_currestab</td><td>integer</td></tr>
<tr><td>net.tcp_estabresets</td><td>integer</td></tr>
<tr><td>net.tcp_incsumerrors</td><td>integer</td></tr>
<tr><td>net.tcp_inerrs</td><td>integer</td></tr>
<tr><td>net.tcp_insegs</td><td>integer</td></tr>
<tr><td>net.tcp_maxconn</td><td>integer</td></tr>
<tr><td>net.tcp_outrsts</td><td>integer</td></tr>
<tr><td>net.tcp_outsegs</td><td>integer</td></tr>
<tr><td>net.tcp_passiveopens</td><td>integer</td></tr>
<tr><td>net.tcp_retranssegs</td><td>integer</td></tr>
<tr><td>net.tcp_rtoalgorithm</td><td>integer</td></tr>
<tr><td>net.tcp_rtomax</td><td>integer</td></tr>
<tr><td>net.tcp_rtomin</td><td>integer</td></tr>
<tr><td>net.udp_ignoredmulti</td><td>integer</td></tr>
<tr><td>net.udp_incsumerrors</td><td>integer</td></tr>
<tr><td>net.udp_indatagrams</td><td>integer</td></tr>
<tr><td>net.udp_inerrors</td><td>integer</td></tr>
<tr><td>net.udp_noports</td><td>integer</td></tr>
<tr><td>net.udp_outdatagrams</td><td>integer</td></tr>
<tr><td>net.udp_rcvbuferrors</td><td>integer</td></tr>
<tr><td>net.udp_sndbuferrors</td><td>integer</td></tr>
<tr><td>net.udplite_ignoredmulti</td><td>integer</td></tr>
<tr><td>net.udplite_incsumerrors</td><td>integer</td></tr>
<tr><td>net.udplite_indatagrams</td><td>integer</td></tr>
<tr><td>net.udplite_inerrors</td><td>integer</td></tr>
<tr><td>net.udplite_noports</td><td>integer</td></tr>
<tr><td>net.udplite_outdatagrams</td><td>integer</td></tr>
<tr><td>net.udplite_rcvbuferrors</td><td>integer</td></tr>
<tr><td>net.udplite_sndbuferrors</td><td>integer</td></tr>
<tr><td>system.load1</td><td>float</td></tr>
<tr><td>system.load15</td><td>float</td></tr>
<tr><td>system.load5</td><td>float</td></tr>
<tr><td>system.n_cpus</td><td>integer</td></tr>
<tr><td>system.n_users</td><td>integer</td></tr>
<tr><td>system.uptime</td><td>integer</td></tr>
<tr><td>system.uptime_format</td><td>string</td></tr>
</table>

Using Datadog with Bonsai

The term “metering” in Bonsai refers to the limits imposed on the resources allocated to a cluster. A cluster’s limits are determined by its subscription level; higher-priced plans yield higher limits. There are several metered resources that can result in an overage. These documents explain what those resources are and how to resolve related overages:

  • How Bonsai Handles Overages
  • Concurrent Connections
  • Shards
  • Disk
  • Documents

How Bonsai Handles Overages

Overages are indicated on the Cluster dashboard. If your cluster is over its subscription limits, the overage will be indicated in red like so:

Bonsai uses “soft limits” for metering. This approach does not immediately penalize or disable clusters that exceed the limits of their subscription. It is a gentler way of treating users, who are often not even aware that they’re over their limits or why that might be an issue.

When an overage is detected, it triggers a state machine that takes increasingly firm actions over time. This process has several steps:

1. Initial notification (immediately). The cluster owner and the members of their team are notified that there is an overage and are provided with information about how to address it.

2. Second notification (5 days). A reminder is sent and warns that the cluster is about to be put into read-only mode. Clusters in read-only mode will receive an HTTP 403: Cluster Read Only error message.

3. Read-only mode (10 days). The cluster is put into read-only mode. Updates will fail with a 403 error.

4. Disabled (15 days). All access to the cluster is disabled. Both searches and updates will fail with a HTTP 403: Cluster Disabled error.

Extreme Overages Skip the Process

Clusters with particularly aggressive overages are subject to being disabled immediately. Bonsai uses a fairly generous algorithm to determine whether an overage is severe enough to warrant an immediate cut-off. This step is uncommon, but it is a definite possibility.

Stale Data May Be Lost

Free clusters that have been disabled for a period of time may be purged to free up resources for other users.

The fastest way to deal with an overage is to simply upgrade your cluster’s plan. Upgrades take effect instantly and will unlock any cluster set to read-only or disabled.

If upgrading is not possible for some reason, then the next best option is to address the issue directly. This document contains information about how to address an overage in each metered resource.

Give it a minute!

The resource usage indicators are not real time displays. Cluster stats are calculated every 10 minutes or so. If you address an overage and the change isn’t immediately reflected on the display, don’t worry. The changes will be detected, the dashboard will be updated, and any sanction in place on the cluster will be lifted automatically. Please wait 5-10 minutes before emailing support.

Concurrent Connections

Concurrent connections are the effective number of connections a cluster can handle before it returns an HTTP 429 response. Bonsai meters on concurrency as one level of defense against Denial of Service (DoS) attacks and noisy neighbor situations, which helps ensure users are not taking more than their fair share of resources.

Read more about concurrency and how to resolve concurrency overages in our Reducing Concurrent Connections documentation.

Shards

A shard is the basic unit of work in Elasticsearch. If you haven’t read the Core Concept on Shards, that would be a good place to start. Bonsai meters on the total number of shards in a cluster. That means both primary and replica shards count towards the limit.

The relationship between an index’s sharding scheme and your cluster’s shard usage is not always readily apparent. For example, if you have an index with a 3x2 sharding scheme (3 primaries, 2 replicas), that’s not 5 or 6 shards, it’s 9. If this is confusing, read our Shards and Replicas documentation for some nice illustrations.
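
Expressed as a quick calculation, the number of shards an index uses is the number of primaries multiplied by one plus the number of replicas:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">shards = primaries * (1 + replicas)
       = 3 * (1 + 2)
       = 9</code></pre>
</div>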

If you have a shard overage, you can look at our Reducing Shard Usage documentation to resolve it.

Disk

Bonsai meters on the total amount of disk space a cluster can consume. This is for capacity planning purposes, and to ensure multitenant customers have their fair share of resources. Bonsai calculates a cluster’s disk usage by looking at the total data store size in bytes. This information can be found in the Index Stats API.
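
If you want to check this figure yourself, the Index Stats API exposes the data store size directly. A minimal sketch (the exact response structure may vary slightly by Elasticsearch version):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Total data store size, in bytes, across all indices
GET /_stats/store

# The same information, summarized per index
GET /_cat/indices?v&h=index,store.size,pri.store.size</code></pre>
</div>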

Disk overages can be resolved in a couple different ways that are explained in our Reducing Data Usage documentation.

Documents

Bonsai meters on the number of documents in an index. There are several types of “documents” in Elasticsearch, but Bonsai only counts the live Lucene documents in primary shards towards the document limit. Documents which have been marked as deleted, but have not yet been merged out of the segment files do not count towards the limit.

Document overages can be resolved in a couple different ways that are explained in our Reducing Document Usage documentation.

Billing: How Bonsai handles overages

Bonsai meters on the number of documents in an index. There are several types of “documents” in Elasticsearch, but Bonsai only counts the live Lucene documents in primary shards towards the document limit. Documents which have been marked as deleted, but have not yet been merged out of the segment files, do not count towards the limit.

In this guide we cover a few different ways of reducing document usage:

  1. How Document Overages Occur
  2. Remove Old Data
  3. Remove an Index
  4. Compact Your Mappings
  5. Upgrade the Subscription

How Document Overages Occur

Users frequently report having "only" a few hundred documents, but the dashboard registers several thousand. This is due to how Elasticsearch counts nested documents. The Index Stats API is used to determine how many documents are in a cluster. The Index Stats API counts nested documents by including all associated documents. In other words, if you have a document with 2 nested documents, this is reported as 3 total documents.

Elasticsearch has several different articles on how nested documents work, but the simplest answer is that it creates the illusion of a complex object by quietly creating multiple hidden documents.

A common point of confusion is that the /_cat/indices endpoint will show one set of documents, while the /_stats endpoint shows a much larger count. This is because the Cat API is counting the “visible” documents, while the Index Stats API is counting all documents. The _stats endpoint is a more accurate representation of a cluster’s document usage, and is the most fair to all users for metering purposes.
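
To see the difference for yourself, you can compare the two endpoints. A quick sketch, using a hypothetical index named my_index:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Top-level ("visible") document counts per index
GET /_cat/indices?v

# All Lucene documents, including hidden nested documents
GET /my_index/_stats/docs</code></pre>
</div>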

Remove Old Data

If your index has a lot of old data, or “stale” data (documents which rarely show up in searches), then you could simply delete those documents. Deleted documents do not count against your limits.
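
For example, if your documents carry a timestamp, something like the Delete By Query API can remove everything older than a cutoff. This is only a sketch; it assumes Elasticsearch 5.x or later (where _delete_by_query is built in), and the index name and created_at field are hypothetical:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Delete documents older than one year from a hypothetical index
POST /my_index/_delete_by_query
{
  "query": {
    "range": {
      "created_at": { "lt": "now-1y" }
    }
  }
}</code></pre>
</div>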

Remove an Index

Occasionally users are indexing time series data, or database tables that are not actually being searched by the application. Audit usage by using the Interactive Console to check the /_cat/indices endpoint. If you find that there are old or unnecessary indices with data, then delete those.

You may also want to check out the Index Trimmer.

Compact Your Mappings

Changing your mappings to nest less information can greatly reduce your document usage. Consider this sample document:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
  "title": "Spiderman saves child from well",
  "body":  "Move over, Lassie! New York has a new hero. But is he also a menace?",
  "authors": [
    {
      "name":  "Jonah Jameson",
      "title": "Sr. Editor"
    },
    {
      "name":  "Peter Parker",
      "title": "Photos"
    }
  ],
  "comments": [
    {
      "username": "captain_usa",
      "comment":  "I understood that reference!"
    },
    {
      "username": "man_of_iron",
      "comment":  "Congrats on being slightly more useful than a ladder."
    }
  ],
  "photos": [
    {
      "url":     "https://assets.dailybugle.com/12345",
      "caption": "Spiderman delivering Timmy back to his mother"
    }
  ]
}</code></pre>
</div>

Note that it’s nesting data for authors, comments and photos. Assuming these are mapped as nested objects, the document above would actually result in 6 Lucene documents: the parent plus 5 hidden nested documents (2 authors, 2 comments, 1 photo). Removing the comments and photos (which usually don’t need to be indexed anyway) would reduce the footprint by 50%.

If you’re using nested objects, review whether any of the nested information could stand to be left out, and then reindex with a smaller mapping.
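
As a rough sketch of what a slimmer mapping might look like, the example below keeps authors as nested (so they remain searchable as independent objects) while leaving comments and photos out of the index entirely via "enabled": false, so they are kept in _source but create no hidden documents. The index name is hypothetical, and the syntax assumes Elasticsearch 7+ (older versions also require a mapping type name):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT /articles
{
  "mappings": {
    "properties": {
      "title":    { "type": "text" },
      "body":     { "type": "text" },
      "authors":  { "type": "nested" },
      "comments": { "type": "object", "enabled": false },
      "photos":   { "type": "object", "enabled": false }
    }
  }
}</code></pre>
</div>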

Upgrade the Subscription

If you find that you’re unable to reduce documents through the options discussed here, then you will need to upgrade to the next plan.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

Reducing Document Usage

Bonsai meters on the total amount of disk space a cluster can consume. This is for capacity planning purposes, and to ensure multitenant customers have their fair share of resources. Bonsai calculates a cluster’s disk usage by looking at the total data store size in bytes. This information can be found in the Index Stats API.

Disk overages can be resolved in a couple of different ways that we will cover in this documentation:

  1. Remove Stale Data / Indices
  2. Purge Deleted Documents
  3. Reindex with Smaller Mappings
  4. Upgrade the Subscription

Remove Stale Data / Indices

There are some cases where one or more indices are created on a cluster for testing purposes, and are not actually being used for anything. These will count towards the data limits; if you’re getting overage notifications, then you should delete these indices.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /_cat/indices
green open prod20180101    1 1 1015123 0  32M  64M
green open prod20180201    1 1 1016456 0  35M  70M
green open prod20180301    1 1 1017123 0  39M  78M
green open prod20180401    1 1 1018456 0  45M  90M
green open prod20180501    1 1 1019123 0  47M  94M
green open prod20180601    1 1 1020456 0  51M  102M</code></pre>
</div>

Removing the old and unneeded indices in the example above would free up roughly 396MB (the combined store size of the five indices being deleted). A single command could do it:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript"># Delete a group of indices:
DELETE /prod20180101,prod20180201,prod20180301,prod20180401,prod20180501</code></pre>
</div>
</div>

Purge Deleted Documents

Data in Elasticsearch is spread across lots of files called segments. Segments each contain some number of documents. An index could have dozens, hundreds or even thousands of segment files, and Elasticsearch will periodically merge some segment files into others.

When a document is deleted in Elasticsearch, its segment file is simply updated to mark the document as deleted. The data is not actually removed until that segment file is merged with another. Elasticsearch normally handles segment merging automatically, but forcing a segment merge will reduce the overall disk footprint of the cluster by eliminating deleted documents.

This can be done through the Optimize / Forcemerge API, but the same effect can be accomplished more efficiently by simply reindexing. Reindexing rewrites the data from scratch, so the new index carries no deleted documents. This will reduce disk usage.
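
If reindexing isn’t practical and you do want to force a merge, a minimal sketch of the call looks like this (assuming Elasticsearch 2.1 or later, where the endpoint is _forcemerge; older versions expose the same behavior as _optimize, and the index name is hypothetical):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Merge out deleted documents without fully optimizing the index
POST /my_index/_forcemerge?only_expunge_deletes=true</code></pre>
</div>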

To check whether this will work for you, look at the  <span class="inline-code"><pre><code>/_cat/indices</code></pre></span> data. There is a column called <span class="inline-code"><pre><code>docs.deleted</code></pre></span>, which shows how many documents are sitting on the disk and are marked as deleted. This should give a sense of how much data could be freed up by reindexing. For example:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">health status index     pri rep docs.count docs.deleted store.size pri.store.size
green  open   my_index  3   2   15678948   6895795      47.1G      15.7G</code></pre>
</div>

In this case, the  <span class="inline-code"><pre><code>docs.deleted</code></pre></span> is around 30% of the primary store, or around 4.8G of primary data. With replication, this works out to something like 14.4GB of total disk marked for deletion. Reindexing would reduce the cluster’s disk footprint by this much. The result would look like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">health status index     pri rep docs.count docs.deleted store.size pri.store.size
green  open   my_index  3   2   15678948   0            32.7G      10.9G</code></pre>
</div>

Protip: Queue Writes and Reindex in the Background for Minimal Impact

Your app’s search could be down or degraded during a reindex. If reindexing will take a long time, that may make this option unfeasible. However, you could minimize the impact by using something like Kafka to queue writes while reindexing to a new index.

Search traffic can continue to be served from the old index until its replacement is ready. Flush the queued updates from Kafka into the new index, then destroy the old index and use an alias to promote the new index.
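
The final cutover can be done atomically with the Aliases API. A sketch, with hypothetical index and alias names:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_aliases
{
  "actions": [
    { "remove": { "index": "products_v1", "alias": "products" } },
    { "add":    { "index": "products_v2", "alias": "products" } }
  ]
}</code></pre>
</div>

Because both actions run in a single request, searches against the alias never see a gap between the old and new index.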

The tradeoff of this solution is that you’ll minimize the impact to your traffic/users, but you’ll need to set up and manage the queue software. You’ll also have a lot of duplicated data for a short period of time, so your footprint could be way above the subscription limit for a short time.

To prevent the state machine from disabling your cluster, you might want to consider temporarily upgrading to perform the operation, then downgrading when you’re done. Billing is prorated, so this would not add much to your invoice. You can always email us to discuss options before settling on a decision.

Reindex with Smaller Mappings

Mappings define how data is stored and indexed in an Elasticsearch cluster. There are some settings which can cause the disk footprint to grow dramatically.

For example, synonym expansion can cause lots of extra tokens to be generated per input token (if you’re using WordNet, see our documentation article on it, specifically Why Wouldn’t Everyone Want WordNet?). If you’re using lots of index-time synonym expansion, then you’re essentially inflating the document sizes with lots of extra data, with the tradeoff (hopefully) being improved relevancy.

Another example would be Ngrams. Ngrams are tokens generated from the parts of other tokens. A token like “hello” could be broken into 2-grams like “he”, “el”, “ll”, and “lo”. In 3-grams, it would be “hel”, “ell” and “llo”. And so on. The Elasticsearch Guide has more examples.

It’s possible to generate multiple gram sizes for a single token, starting with values as low as 1. Some developers use this to maximize substring matching. But the number of grams generated for a single token grows quadratically with its length:

This relationship is expressed mathematically as <span class="inline-code"><pre><code>grams = (1/2) * n * (n + 1)</code></pre></span>, where n is the token length (assuming a minimum gram size of 1 and a maximum of n).

In other words, a token with a length of 5 and a minimum gram size of 1 would result in <span class="inline-code"><pre><code>(1/2)*5*(5+1)=15</code></pre></span> grams. A token with a length of 10 would result in 55 grams. The grams are generated per token, which leads to an explosion in terms for a document.

As a sample calculation: if a typical document in your corpus has a field with ~1,000 tokens and a Rayleigh distribution of length with an average of ~5, you could plausibly see something like a 1,100-1,200% inflation in disk footprint using Ngrams of minimum size 1. In other words, if the non-grammed document would need 100KB on disk, the Ngrammed version would need over 1MB. Virtually none of this overhead would improve relevancy, and would probably even hurt it.

Nested documents are another example of a feature that can increase your data footprint without necessarily improving relevancy.

The point is that there are plenty of features available that lead to higher disk usage than one might think at first glance. Check on your mappings carefully: look for large synonym expansions, make sure you’re using Ngrams with a minimum gram size of 3 or more (also look into EdgeNGrams if you’re attempting autocomplete), and see if you can get away with fewer nested objects. Reindex your data with the updated mappings, and you should see a definite improvement.
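
As an illustration of the kind of settings to look for, here is a sketch of an autocomplete analyzer using edge n-grams with a minimum gram size of 3. The index, tokenizer and analyzer names are hypothetical, and the syntax assumes Elasticsearch 5+ (older versions spell the tokenizer type edgeNGram):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT /my_index
{
  "settings": {
    "analysis": {
      "tokenizer": {
        "autocomplete_tokenizer": {
          "type": "edge_ngram",
          "min_gram": 3,
          "max_gram": 10,
          "token_chars": ["letter", "digit"]
        }
      },
      "analyzer": {
        "autocomplete": {
          "type": "custom",
          "tokenizer": "autocomplete_tokenizer",
          "filter": ["lowercase"]
        }
      }
    }
  }
}</code></pre>
</div>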

Upgrade the Subscription

If you find that you’re unable to remove data, reindex, or update your mappings – or that these changes don’t yield a stable resolution – then you will simply need to upgrade to the next subscription level.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

Reducing Data Usage

Concurrent connections are the effective number of connections a cluster can handle before it returns an HTTP 429 response. Bonsai meters on concurrency as one level of defense against Denial of Service (DoS) attacks and noisy neighbor situations, which helps ensure users are not taking more than their fair share of resources.

This guide will cover concurrent connection limits and some strategies for avoiding them:

  1. What are concurrent connection limits?
  2. Upgrade to a plan with a higher concurrency allowance
  3. Reduce the request overhead
  4. Reduce the traffic rate

If reading over these suggestions doesn’t provide enough ideas for resolving the issue, you can always email our support team at support@bonsai.io to discuss options.

What Are Concurrent Connection Limits?

Bonsai distinguishes between types of connections for concurrency metering purposes:

  • Searches. Search requests. The concurrency allowance for search traffic is generally much higher than for updates.
  • Updates. Adding, modifying or deleting a given document. Updates generally have the lowest concurrency allowance due to the high overhead of single document updates. This includes _bulk requests.

Bonsai also uses connection pools to account for short-lived bursts of traffic.

The effective number of connections the cluster can handle is the pool size plus the number of “active” connections. Active connections are those currently being served by the cluster, not sitting in the queue.

If all active connections are in use, subsequent requests are put into a FIFO queue and served when a connection becomes available. If all connections are in use and the request queue is also full, then Bonsai will return an immediate HTTP 429.

Concurrency is NOT Throughput!

Concurrency allowances are a limit on total active connections, not throughput (requests per second). A cluster with a search concurrency allowance of 10 connections can easily serve hundreds of requests per second. A typical p50 query time on Bonsai is ~10ms. At this latency, a cluster could serve a sustained 100 requests per second per connection. That would give the hypothetical cluster a maximum throughput of around 1,000 requests per second before the queue would even be needed. In practice, throughput may be closer to 500 rps (to account for bursts, network effects and longer-running requests).

To put these numbers in perspective, a sustained rate of 500 rps is over 43M requests per day. StackExchange, the parent of sites like StackOverflow, ServerFault and SuperUser (and many more), performs around 34M Elasticsearch queries per day. By the time your application’s demands exceed StackExchange’s by 25%, you will likely already be on a single tenant configuration with no concurrency limits.

Queued Connections Have a 60s TTL

If the connection queue begins to fill up with requests, those requests will only be held for up to 60 seconds. After that time, Bonsai will return an HTTP 504 response. If you are seeing HTTP 429 and 504 errors in your logs, that is an indicator that your cluster has a high volume of long-running queries.

Upgrade

Concurrency allowances on Bonsai scale with the plan level. Often the fastest way to resolve concurrency issues is to simply upgrade the plan. Upgrades take effect instantly.

Upgrading Direct Bonsai cluster

Upgrading a Heroku Bonsai cluster

Reduce Overhead

The more time a request takes to process, the longer the connection remains open and active. Too many of these requests in a given amount of time will exhaust the connection pool and request queue, resulting in HTTP 429 responses. Reducing the amount of time a request takes to process frees up capacity for more traffic.

Some examples of requests which tend to have a lot of overhead and take longer to process:

  • Aggregations on large numbers of results
  • Geospatial sorting on large numbers of results
  • Highlighting
  • Custom scripting
  • Wildcard matching

If you have requests that perform a lot of processing, then finding ways to optimize them with filter caches and better scoping can improve throughput quite a bit.

Reduce Traffic Volume

Sometimes applications are written without considering scalability or their impact on Elasticsearch. This often results in far more queries being made to the cluster than is really necessary. Some minor changes are often all that is needed to reduce the volume and avoid HTTP 429 responses without hurting the application’s usability.

A common example is autocomplete / typeahead scripts. Often the frontend is sending a query to the Elasticsearch cluster each time a user presses a key. The average computer user types around 3-4 characters per second, and (depending on the query and typist speed) a search launched by one keypress may not even be returned before the next keypress is made. This results in a piling up of requests. More users searching the app exacerbate the problem. Initiating a search every ~500 milliseconds instead of every keypress will be much more scalable without impacting the user experience.

Another example might be a site that boosts relevancy scores by page view. Whenever a page is requested, the application updates the corresponding document in Elasticsearch by incrementing a counter for the number of times the page has been viewed.

This strategy will boost a document’s position in the results in real time, but it also means the cluster is updated every time a user visits any page, potentially resulting in a high volume of expensive single document updates. It would be better to write the page counts to the local database and use a queue and worker system (like Resque) to push out updated page counts using the Bulk API every minute or so. This would be a much cheaper and more scalable approach, and would be just as effective.
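
A sketch of what such a periodic flush might send, using the Bulk API with scripted counter updates. It assumes a recent Elasticsearch version with Painless scripting, and the pages index, views field and document IDs are hypothetical:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /_bulk
{ "update": { "_index": "pages", "_id": "42" } }
{ "script": { "source": "ctx._source.views += params.count", "params": { "count": 17 } } }
{ "update": { "_index": "pages", "_id": "97" } }
{ "script": { "source": "ctx._source.views += params.count", "params": { "count": 5 } } }</code></pre>
</div>

One request every minute or so replaces hundreds of individual update calls, which keeps the cluster well under its concurrency allowance.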

Reducing Concurrency

The Snapshot and Restore feature is a really nice set of tools for backing up and restoring your data. Because Bonsai is a managed service that offers a multitenant architecture, there are some limits on how it can be used.

How Does Bonsai Manage Snapshots?

Bonsai takes regular, automatic backups of all paid clusters, and stores them in an encrypted S3 bucket in the same region as the cluster. These snapshots are taken at the start of every hour and are retained for two weeks.

HTTP 400: Cannot delete indices that are being snapshotted

In some rare cases, you may see an error like this when attempting to alter an index:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Cannot delete indices that are being snapshotted: [[my_index/dPzLciT9RlmS8OtGGm61IQ]]. Try again after snapshot finishes or cancel the currently running snapshot.
</code></pre>
</div>

This is happening because the action is taking place during the regular snapshot. Snapshots happen at the start of every hour (00:00, 01:00, 02:00, etc), and can take anywhere from a few seconds to a few minutes. If you attempt to delete your index during this time, the action will be blocked.

The solution is to wait a minute or two and try again.
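
If you’d like to confirm that a snapshot is the culprit before retrying, the snapshot status endpoint lists any snapshots currently in flight. A sketch:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Shows any currently running snapshots; an empty list means it is safe to retry
GET /_snapshot/_status</code></pre>
</div>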

In the unlikely event that your cluster suffers unrecoverable data loss (for example: a node hosting primary data is lost, and there’s no replica), then we will use the most recent successful snapshot to restore the data. Any data updates from the time of the snapshot to the time of its restoration will need to be reindexed.

Can I Take My Own Snapshots?

Not at this time. The technical explanation is that snapshot and restore operations can be extremely demanding on IO, and Elasticsearch will only allow one snapshot/restore operation to occur at a time, with subsequent calls sitting in a queue.

If you’ve read the article on Architecture Classes, specifically the section on Multi Tenant Class clusters, you’ll understand why this is problematic. Users with unrestrained access to that API could inadvertently take down a group of nodes with some ill-timed calls. Or, an impatient user wondering why their snapshot isn’t processing right away may repeat the call multiple times, populating the queue. It’s also plausible that a less experienced user may attempt to take a snapshot every minute.

Something to consider is the purpose of the snapshot. If your desire is simply to have an up to date backup, then Bonsai is already handling that with hourly snapshots. If the desire is to restore the snapshot locally, or to another cluster for testing/dev purposes, we may be able to accommodate that through our support channels.

We’re Always Improving

We’re working on ways of making snapshot and restore features safer and easier to use on the Bonsai platform. If you have an idea or use case you’d like to see supported, shoot us an email and we’ll evaluate adding it to our development pipeline.

I Need More Frequent Snapshots / Longer Retention!

We can provide custom SLAs, including more frequent snapshots and longer retention times, for users on Enterprise plans. Send an email to support with your requirements and we’ll put together a quote.

I Have My Own S3 Bucket – Can You Just Use That?

The encrypted buckets we use are set up and managed automatically. It’s possible to have snapshots added to a different bucket, but only for Enterprise subscriptions. If you’re on an Enterprise tier cluster, please send us a request.

Snapshots on Bonsai

Heroku Private Spaces are network-isolated application containers available to Heroku Enterprise customers. Private Spaces allow organizations to host applications within a secure, HIPAA-compliant environment. They are ideal for apps that handle PII and other legally regulated types of data.

When third party add-ons are included in your build, additional steps need to be taken for your data to maximize the benefit of a Private Space. Most add-ons are operated outside of Heroku’s VPC, which means your Private Space application will be communicating across the public internet. For some use cases this is unacceptable, which is why Bonsai proudly supports joining our networks together, allowing your traffic to travel on the private backbone of AWS. Joining these networks together securely requires some careful networking, called peering.

Fortunately, both Heroku and Bonsai run on AWS infrastructure, which offers a service called VPC Peering. VPC Peering is a network connection between two VPCs that allows appliances within each VPC to communicate as though they were in a single network.

Be aware of your security model

Bonsai clusters come in one of two architectures: multitenant or single tenant. Clusters on the single tenant architecture (Business and Enterprise tiers) run on private, sandboxed nodes. Clusters on the multitenant architecture (the Standard tier) run in an environment where resources are shared among multiple users. As a result, multitenant clusters are not available for VPC Peering.

This may or may not be acceptable for the data you plan to index. The rest of this guide assumes you are running on a single tenant architecture.

VPC Peering can be set up between a Heroku Private Space and a Bonsai cluster on an Enterprise plan. This configuration will ensure maximum isolation and protection of your data.

Common Issues

Running in a Private Space has some additional implications that may not be immediately obvious. The main points are:

  • The web Console will not be available, because the Javascript runs outside of the private space. If you want to interface with your cluster directly, you'll need to connect to the VPC first.
  • The Kibana dashboard will also be unavailable because the proxy is outside of the private space. Kibana can be set up within a private space, but it requires additional client-side configuration. Contact support@bonsai.io for more details.
  • Connecting to the cluster directly will need to happen within the VPC itself, so users with a Private Space-based Bonsai cluster will need to first access their VPC before calls to Elasticsearch will succeed. See Connecting to Bonsai for more information.

If you run into any issues with your Private Space-based Elasticsearch cluster, please reach out to us at support@bonsai.io.

Gather your Heroku Peering Network Settings

In your Heroku Private Space you’ll need to navigate to the Network tab, and make a note of some settings under the Peering sub-section of the page. We need the AWS Account ID, the AWS VPC ID, and the AWS VPC CIDR. We will use this data to initiate a peering connection with your Heroku Space.

Finding Your Private Space URL

Your Private Space URL will look something like:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://dashboard.heroku.com/teams//spaces//network</code></pre>
</div>

Enter your details into the Bonsai dashboard

Log into the Bonsai cluster dashboard by running <span class="inline-code"><pre><code>$ heroku addons:open bonsai -a</code></pre></span> and enter your details in the form provided:

Accept our Peering Request

Once Bonsai has the above data, we will initiate a peering request to your space which will show up in the Network tab and under the Peering subsection. It should look like this:

Network changes are not instantaneous

The lead time for this to show up ranges from 30 minutes to a few hours.

When you accept the invitation the UI should change to look like:

Once the request has been accepted, you will be able to use the cluster URL provided in the Bonsai dashboard.

Private Means Private!

The DNS entry for your cluster will be pointing to private internal IP addresses, which means you will not be able to access this cluster except from within the Heroku Space. Browsers and <span class="inline-code"><pre><code>curl</code></pre></span> commands will not work.

Heroku Private Spaces and VPC Peering

When you have successfully provisioned a Private Space-accessible cluster via Bonsai, you might be interested in looking at it via Kibana. However, there is one problem: our Kibana server doesn't have network access to your private cluster. This is by design, as you don't want just anyone to be able to access your cluster from the public internet.

So what are your options?

We recommend deploying a free dyno in your Heroku Private Space that can then be configured to access your cluster. It can do this securely because the Kibana dyno runs inside the Private Space.

Kibana-maker

Kibana-maker is a simple script we wrote that bootstraps a Heroku application by configuring a small git repository and pushing it up to a Heroku git remote.

You can pull down a copy of kibana-maker using the following command:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">cd /tmp && curl -s https://gist.githubusercontent.com/dansimpson/59dc310bda6490af31e6ea1717bc6b8f/raw/89e11a994e835ce785d155a7ab9d759f03089dec/kibana-maker.rb -o kibana-maker && chmod +x kibana-maker</code></pre>
</div>
</div>

Steps

  1. Identify the slug of your private space in the Heroku App
  2. Identify the URL of your cluster in the Bonsai App
  3. Note the version of Elasticsearch you are using
  4. Using kibana-maker, run the following command: <span class="inline-code"><pre><code>cd /tmp && ./kibana-maker --name= --url= --space= --version=</code></pre></span>
  5. You should now have a functioning Kibana instance running in your Heroku Private Space

Kibana and Heroku Private Spaces

Bonsai has a variety of cost controls that allow us to provide a quality free tier of service. Among these is the periodic deletion of unused free clusters.

Free clusters that have not been used for several weeks are added to a schedule for deletion. Warnings are sent out to users several days in advance of deletion. If the cluster receives traffic before its scheduled termination date, then the cluster is automatically removed from the schedule.

If you are an infrequent user of your Bonsai cluster, but wish to keep it from being terminated, simply upgrade to any paid plan.

Periodic Cluster Cleanup

Capacity planning is the process of estimating the resources you’ll need over short and medium term timeframes. The result is used to size a cluster and avoid the pitfalls of both inadequate resources (which cause performance, stability and reliability problems) and overprovisioning (which wastes money). The goal is to have just enough overhead to support the cluster’s growth at least 3-6 months into the future.

This document discusses how to estimate the resources you will need to get up and running on Elasticsearch. The process is roughly as follows:

  • Estimating Shard Requirements
  • Estimating Disk Requirements
  • Planning for Memory
  • Planning for Traffic

We have included a section with some sample calculations to help tie all of the information together.

Estimating Shard Requirements

Determining the number of shards that your cluster will need is one of the first steps towards estimating how many resources you will need. Every index in Elasticsearch is made up of one or more shards. There are two types of shards, primary and replica, and how many total shards you need will depend not only on how many indices you have, but also on how those indices are sharded.

If you’re unfamiliar with the concept of shards, then you should read the brief, illustrated Shard Primer section first.

Why Is This Important?

Each shard in Elasticsearch is a Lucene instance, which often creates a large number of file descriptors. The number of open file descriptors on a node grows rapidly with shard counts. Left unchecked, this can lead to a number of serious problems.

Beware of High Shard Counts

There are a fixed number of file descriptors that can be opened per process; essentially, a large number of shards can crash Elasticsearch if this limit is reached, leading to data loss and service interruptions. This limit can be increased on a server, but there are implications for memory usage. There are a few strategies for managing these types of clusters, but that discussion is out of scope for preliminary capacity planning.

Some clusters can be designed to specifically accommodate a large number of shards, but that’s something of a specialized case. If you have an application that generates lots of shards, you will want your nodes to have plenty of buffer cache. For most users, simply being conscientious of how and when shards are created will suffice.

Full Text Search

The main question for this part of the planning process is how you plan to organize data. Commonly, users are indexing a database to get access to Elasticsearch’s full text search capabilities. They may have a Postgres or MySQL database with products, user records, blog posts, etc, and this data is meant to be indexed by Elasticsearch. In this use case, there is a fixed number of indices and the cluster will look something like this:

GET /_cat/indices
green  open   products      1   1         0         0      100b 100b
green  open   users         1   2         0         0      100b 100b
green  open   posts         3   2         0         0      100b 100b

In this case, the total number of shards is calculated by adding up the shards used by all indices. The number of shards an index uses is the number of primary shards, p, times one plus the number of replica shards, r. Expressed mathematically, the total number of shards for an index is calculated as:

shards = p * (1 + r)

In the sample cluster above, the products index will need 1x(1+1)=2 shards, the users index will require 1x(1+2)=3 shards, and the posts index will require 3x(1+2)=9 shards. The total shards in the cluster is then 2+3+9=14 shards. If you need help deciding how many shards you will need for an index, check out the blog article The Ideal Elasticsearch Index, specifically the section called “Benchmarking is key” for some general guidelines.

Time Series Applications

Another common use case is to index time-series data. An example might be log entries of some kind. The application will create an index per hour/day/week/etc to hold these records. These use cases require a steadily-growing number of indices, and might look something like this:

GET /_cat/indices
green  open   events-20180101         1   1         0         0      100b 100b
green  open   events-20180102         1   1         0         0      100b 100b
green  open   events-20180103         1   1         0         0      100b 100b

With this use case, each index will likely have the same number of shards, but new indices will be added at regular intervals. With most time series applications, data is not stored indefinitely. A retention policy is used to remove data after a fixed period of time.

In this case, the number of shards needed is equal to the number of shards per index times the number of indices per time unit, times the retention policy. In other words, if an application is creating 1 index, with 1 primary and 1 replica shard, per day, and has a 30 day retention policy, then the cluster would need to support 60 shards:

1x(1+1) shards/index * 1 index/day * 30 days = 60 shards

Estimating Disk Requirements

The next characteristic to estimate is disk space. This is the amount of space on disk that your cluster will need to hold the data. If you’re a customer of Bonsai, this is the only type of disk usage you will be concerned with. If you’re doing capacity planning for another host (or self-hosting), you’ll want to take into account the fact that the operating system will need additional disk space for software, logs, configuration files, etc. You’ll need to factor in a much larger margin of safety if planning to run your own nodes.

Allocation

Benchmarking for Baselines

The best way to establish a baseline estimate for the amount of disk space you’ll need is to perform some benchmarking. This does not need to be a complicated process. It simply involves indexing some sample data into an Elasticsearch cluster. This can even be done locally. The idea is to collect some information on how much disk space your data actually occupies once it has been indexed.

Database Size is a Bad Heuristic

Sometimes users will estimate their disk needs by looking at the size of a database (Postgres, MySQL, etc). This is not an accurate way to estimate Elasticsearch’s data footprint. ACID-compliant data stores are meant to be durable, and come with far more overhead than data in Lucene (which Elasticsearch is based on). There is a ton of overhead that will never make it into your Elasticsearch cluster. A 5GB Postgres database may only require a few MB in Elasticsearch. Benchmarking some sample data is a far better tool for estimation.

Attachments Are Also a Bad Heuristic

Some applications are indexing rich text files, like DOC/DOCX, PPT, and PDFs. Users may look at the average file size (maybe a few MB) and multiply this by the total number of files to be indexed to estimate disk needs. This is also not accurate. Rich text files are packed with metadata, images, formatting rules and so on, bits that will never be indexed into Elasticsearch. A 10MB PDF may only take up a few KB of space once indexed. Again, benchmarking a random sample of your data will be far more accurate in estimating total disk needs.

Suppose you have a development instance of the application running locally, a local instance of Elasticsearch, and a local copy of your production data (or a reasonable corpus of test data). After indexing 10% of the production data into Elasticsearch, a call to /_cat/indices shows the following:

curl localhost:9200/_cat/indices
health status index             pri rep docs.count docs.deleted store.size pri.store.size
green  open   users-production   1   1         500            0      2.4mb 1.2mb
green  open   posts-production   1   1        1500            0     62.4mb 31.2mb
green  open   notes-production   1   1         300            0     11.6mb 5.8mb

In this example, there are 3 indices. Each index has one primary shard and one replica shard, for a total of 2 shards per index.

We can also see that users-production has 1.2MB of primary data occupied by 500 documents. This means one of these documents is 2.4KB on average. Similarly, posts-production documents average 20.8KB and notes-production documents average 19.3KB.

We can also estimate the disk footprint for each index populated with 100% of its data. users-production will require ~12MB, posts-production will require around 312MB and notes-production will require ~58MB. Thus, the baseline estimate is ~382MB for 100% of the production data.

The last piece of information to determine is the impact of replica shards. A replica shard is a copy of the primary data, hosted on another node to ensure high availability. The total footprint of the cluster data is equal to the primary data footprint times (1 + number_of_replicas).

So if you have a replication factor of 1, as in the example above, the baseline disk footprint would be 382MB x (1 + 1) = 764MB. If you wanted an additional replica, to keep a copy of the primary data on all 3 nodes, the footprint requirement would be 382MB x (1 + 2) = 1.1GB. (Note: if this is confusing, check out the Shard Primer page).

Last, it is a good idea to add a margin of safety to these estimates to account for larger documents, tweaks to mappings, and to provide some “cushion” for the operating system. Roughly 20% is a good amount; in the example above, this would give a baseline estimate of about 920MB disk space.
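
To make that arithmetic concrete, here is a minimal sketch in Python of the baseline calculation above, using the figures from the sample _cat/indices output; the sample fraction, replica count, and safety margin are the assumptions stated in this section:

# Baseline disk estimate, extrapolated from a 10% sample of production data.
sample_fraction = 0.10
sample_primary_mb = {"users-production": 1.2, "posts-production": 31.2, "notes-production": 5.8}

# Scale each index up to 100% of the production data (~382MB of primary data).
full_primary_mb = sum(mb / sample_fraction for mb in sample_primary_mb.values())

replicas = 1
safety_margin = 0.20
estimate_mb = full_primary_mb * (1 + replicas) * (1 + safety_margin)
print(f"Baseline disk estimate: {estimate_mb:.0f} MB")  # ~920MB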

Medium-term Projections

The next step is to determine how quickly you’re adding data to each index. If your database creates timestamps when records are created, this information can be used to estimate the monthly growth in the number of records in each table. Suppose this analysis was performed on the sample application, and the following monthly growth rates were found:

users-production: +5%/mo
posts-production: +12%/mo
notes-production: +7%/mo

In a 6 month period, we would expect users-production to grow ~34% from its baseline, posts-production to grow ~97% from its baseline, and notes-production to grow ~50% from its baseline. Based on this, we can guess that in 6 months, the data will look like this:

GET /_cat/indices
health status index             pri rep docs.count docs.deleted store.size pri.store.size
green  open   users-production   1   1        6700            0     32.0mb 16.0mb
green  open   posts-production   1   1       29607            0     1.23gb 615mb
green  open   notes-production   1   1        4502            0      174mb 86.9mb

Based on this, the cluster will need at least 1.44GB of disk for its data. Add the 20% margin of safety for an estimate of ~1.75GB.

These calculations for the sample cluster show that we should plan on having at least 1.75GB of disk space available just for the cluster data. This amount will suffice for the initial indexing of the data, and should comfortably support the cluster as it grows over the next 6 months. At the end of that interval, resource usage can be re-evaluated, and resources added (or removed) if necessary.
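
The growth projection can be scripted the same way. A small sketch, compounding the monthly growth rates against the baseline primary sizes from the earlier example:

# Project the cluster's disk needs six months out.
baseline_primary_mb = {"users-production": 12, "posts-production": 312, "notes-production": 58}
monthly_growth = {"users-production": 0.05, "posts-production": 0.12, "notes-production": 0.07}
months, replicas, safety_margin = 6, 1, 0.20

projected_primary_mb = sum(
    mb * (1 + monthly_growth[name]) ** months
    for name, mb in baseline_primary_mb.items()
)
total_mb = projected_primary_mb * (1 + replicas) * (1 + safety_margin)
print(f"Projected disk need in {months} months: {total_mb / 1000:.2f} GB")  # ~1.7GB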

Time Series Data

Some use cases involve time-series data, in which new indices are created on a regular basis. For example, log data may be partitioned into daily or hourly indices. In this case, the process of estimating disk needs is roughly the same, but instead of looking at document sizes, it’s better to look at the average index footprint.

Consider this sample cluster:

GET /_cat/indices
health status index             pri rep docs.count docs.deleted store.size pri.store.size
green  open   logs_20180101   1   1       27954            0     194mb 97.0mb
green  open   logs_20180102   1   1       29607            0     207mb 103mb
green  open   logs_20180103   1   1       28764            0     201mb 100.7mb

One could estimate that the average daily index requires 200MB of disk space. In six months, that would lead to around 36.7GB of disk usage. With the margin of safety, the cluster would need roughly 45GB of disk.

There are two caveats to add: first, time series data usually does not have a six month retention policy. A more accurate estimate would be to multiply the average daily index size by the number of days in the retention policy. If this application had a 30 day retention policy, the disk need would be closer to 7.2GB.

The second caveat is that too many shards can be a problem (see Estimating Shard Usage for some discussion of why). Creating two shards every day for 6 months would lead to around 360 shards in the cluster, each carrying overhead in terms of open file descriptors. This could lead to crashes, data loss and serious service interruptions if the OS limits are too low, and memory problems if those limits are too high.

In any case, if the retention policy creates a demand for large numbers of open shards, the cluster needs to be sized not just to support the data, but the file descriptors as well. On Bonsai, this is not something users need to worry about, as these details are handled for you automatically.
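
For retention-based sizing, the arithmetic is simple enough to sketch. The average index size and retention window below are taken from the example above; substitute your own measurements:

# Disk sizing for time-series indices under a fixed retention policy.
avg_daily_index_mb = 200   # average store.size (primaries + replicas) of one daily index
retention_days = 30        # indices older than this are deleted
safety_margin = 0.20

disk_gb = avg_daily_index_mb * retention_days * (1 + safety_margin) / 1000
print(f"Disk needed: {disk_gb:.1f} GB")  # ~7.2GB for 30 days; ~44GB if indices were kept 6 months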

Planning for Memory

Memory is an important component of a high-performing cluster. Efficient use of this resource helps to reduce the CPU cycles needed for a given search in several ways. First, matches that have been computed for a query can be cached in memory so that subsequent queries do not need to be computed again. Second, servers that have been sized with enough RAM can avoid the CPU overhead of working in virtual and swap memory. Saving CPU cycles with memory optimizations reduces request latency and increases the throughput of a cluster.

However, memory is a complicated subject. Optimizing system memory, cache invalidation and garbage collection are frequent subjects of Ph.D. theses in computer science. Fortunately, Bonsai handles server sizing and memory management for you. Our plans are sized to accommodate a vast majority of Elasticsearch users.

“I want enough RAM to hold all of my data!”

This is a common request, and it makes sense in principle. RAM and disk (usually SSD on modern servers) are both physical storage media, but RAM is several orders of magnitude faster at reading and writing than high-end SSD units. If all of the Elasticsearch data could fit into RAM, then we would expect an order of magnitude improvement in latency, right?

This tends to be reasonable for smaller clusters, but becomes less practical as a cluster scales. Adding RAM offers diminishing returns on performance. Beyond a certain point, only a very specific set of use cases will benefit, and the costs will necessarily be much higher.

Furthermore, Elasticsearch creates a significant number of in-memory data structures to improve search speeds, some of which can be fairly large (see the documentation on fielddata for an example). So if your plan is to base the memory size on disk footprint, you will need to not only measure that footprint, but also add enough for the OS, JVM, and in-memory data structures.

For all the breadth and depth of the subject, 95% of users can get away with using a simple heuristic: estimate needing 10-30% of the total data size for memory. 50% is enough for >99% of users. Note that because Bonsai manages the deployment and configuration of servers, this heuristic does not include memory available to the OS and JVM. Bonsai customers do not need to worry about these latter details.

So where does that heuristic break down? When do you really need to worry about memory footprint? If your application makes heavy use of any of the following, then memory usage will likely be a factor:

If your application is using one or more of these features, plan on needing more memory. If you would like to see the exact types of memory that Bonsai meters against, check out the Metering on Bonsai article.

Planning for Traffic

Capacity planning for traffic requirements can be tricky. Most applications do not have consistent traffic demands over the course of a day or week. Traffic patterns are often “spiky,” which complicates the estimation process. Generally, the greater the variance in throughput (as measured in requests per second, rps), the more capacity is needed to safely handle load.

Estimating Traffic Capacity

Users frequently base their estimate on some average number of requests: “Well, my application needs to serve 1M requests per day, which averages to 11-12 requests per second, so that’s what I’ll need.” This is a reasonable basis if your traffic is consistent (with a very low variance). But it is considerably inaccurate if your variance is more than ~10% of the average.

Consider the following simplified examples of weekly traffic patterns for two applications. The plots show the instantaneous throughput over the course of 7 days:

In each of these examples, the average throughput is the same, but the variance is markedly different. If they both plan on needing capacity for 5 requests per second, Application 1 will probably be fine because of Bonsai’s connection queueing, while Application 2 will be dramatically underprovisioned. Application 2 will be consistently demanding 1.5-2x more traffic than what it was designed to handle.

You’ll need to estimate your traffic based on the upper bounds of demand rather than the average. Some analysis will be necessary to determine the “spikiness” of your application’s search demands.
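
One way to quantify that spikiness is to compute a high-percentile request rate from your access logs rather than a simple average. A rough sketch, assuming you can extract one Unix timestamp (in seconds) per search request:

from collections import Counter

def throughput_stats(request_timestamps, percentile=0.99):
    """Return (average rps, high-percentile rps) from per-request timestamps."""
    per_second = Counter(int(ts) for ts in request_timestamps)
    span_seconds = max(per_second) - min(per_second) + 1
    rates = sorted(per_second.values())
    average_rps = len(request_timestamps) / span_seconds
    peak_rps = rates[min(len(rates) - 1, int(percentile * len(rates)))]
    return average_rps, peak_rps

# Provision for the high-percentile rate, not the average.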

Traffic Volume and Request Latencies

There is a complex economic relationship between IO (as measured by CPU load, memory usage and network bandwidth) and maximum throughput. A given cluster of nodes has only a finite supply of resources to respond to the application’s demands for search. If requests come in faster than the nodes can respond to them, the requests can pile up and overwhelm the nodes. Everything slows down and eventually the nodes crash.

Simply: complex requests that perform a lot of calculations, require a lot of memory to complete, and consume a lot of bandwidth will lead to a much lower maximum throughput than simpler requests. If the nodes are not sized to handle peak loads, they can crash, restart and perform worse overall.

With multitenant class clusters, resources are shared among users and ensuring a fair allocation of resources is paramount. Bonsai addresses this complexity with the metric of concurrent connections. There is an entire section devoted to this in Metering on Bonsai. But essentially, all clusters have some allowance for the maximum number of simultaneous active connections.

Under this scheme, applications with low-IO requests can service a much higher throughput than applications with high-IO requests, thereby ensuring fair, stable performance for all users.

Estimating Your Concurrency Needs

A reasonable way to estimate your needs is using statistics gleaned from small scale benchmarking. If you are able to determine a p95 or p99 time for production requests during peak expected load, you can calculate the maximum throughput per connection.

For example, if your benchmarking shows that under load, 99% of all requests have a round-trip time of 50ms or less, then your application could reasonably service 20 requests per second, per connection. If you have also determined that you need to be able to service a peak of 120 rps, then you could estimate the number of concurrent connections needed by dividing: 120 rps / 20 rps per connection = 6 connections.

In other words, a Bonsai plan with a search concurrency allowance of at least 6 will be enough to handle traffic at peak load. A few connections over this baseline should be able to account for random fluctuations and offer some headroom for growth.
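
That estimate is easy to capture in a couple of lines. The latency and peak-throughput figures below are the ones from this example; in practice they would come from your own benchmarks:

import math

p99_latency_seconds = 0.050   # 99% of requests complete within 50ms under load
peak_rps = 120                # peak throughput the application must sustain

rps_per_connection = 1 / p99_latency_seconds              # ~20 requests/second per connection
connections_needed = math.ceil(peak_rps / rps_per_connection)
print(connections_needed)     # 6 -- look for a plan with at least this much search concurrency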

Beware of Local Benchmarking

Users will occasionally set up a local instance of Elasticsearch and perform a load test on their application to determine how many requests per second it can sustain. This is a problem because it neglects network effects.

Requests to a local Elasticsearch cluster do not have the overhead of negotiating an SSL connection over the Internet and transmitting payloads over this connection. These steps introduce a non-trivial latency to all requests, increasing round trip times and reducing throughput ceilings.

It’s better to set up a remote cluster, perform a small scale load test, and use the statistics to estimate the upper latency bounds at scale.

Another possibility with Bonsai is to select a plan with far more resources than needed, launch in production, measure performance, and then scale down as necessary. Billing is prorated, so the cost of this approach is sometimes justified by the time savings of not setting up, performing, and validating small-scale benchmarking.

Sample Calculations

Suppose the developer of an online store decides to switch the application’s search function from ActiveRecord to Elasticsearch. She spends an afternoon collecting information:

  • She wants to index three tables: users, orders and products
  • There are 12,123 user records, which are growing by 500 a month
  • There are 8,040 order records, which are growing by 1,100 a month
  • There are 101,500 product records, which are growing by 2% month over month
  • According to the application logs, users are averaging 10K searches per day, with a peak of 20 rps

Estimating Shard Needs

She reads The Ideal Elasticsearch Index and decides that she will be fine with a 1x1 sharding scheme for the users and orders indices, but will want a 3x2 scheme for the products index, based on its size, growth, and importance to the application’s revenue.

This configuration means she will need 1x(1+1)=2 shards for the users index, 1x(1+1)=2 shards for the orders index, and 3x(1+2)=9 shards for the products index. This is a total of 13 shards, although she may eventually want to increase replication on the users and orders indices. She plans for 13-15 shards for her application.

Estimating Disk Needs

She sets up a local Elasticsearch cluster and indexes 5000 random records from each table. Her cluster looks like this:

GET /_cat/indices
health status index    pri rep docs.count docs.deleted store.size pri.store.size
green  open   users      1   1       5000            0        28m            14m
green  open   orders     1   1       5000            0        24m            12m
green  open   products   3   2       5000            0       180m            60m

Based on this, she determines:

  • The users index occupies 14MB / 5000 docs = 2.8KB per document.
  • The orders index occupies 12MB / 5000 docs = 2.4KB per document.
  • The products index occupies 60MB / 5000 docs = 12KB per document.

She uses this information to calculate a baseline:

  • She will need 2.8KB/doc x 12,123 docs x 2 copies = 68MB of disk for the users data
  • She will need 2.4KB/doc x 8,040 docs x 2 copies = 39MB of disk for the orders data
  • She will need 12KB/doc x 101,500 docs x 3 copies = 3654MB of disk for the products data
  • The total disk needed for the existing data is 68MB + 39MB + 3654MB = ~3.8GB.

She then uses the growth measurements to estimate how much space will be needed within 6 months:

  • The users index will have ~15,123 records. At 2.8KB/doc and a 1x1 shard scheme, this is 85MB.
  • The orders index will have ~14,640 records. At 2.4KB/doc and a 1x1 shard scheme, this is 70MB.
  • The products index will have ~114,300 records. At 12KB/doc and a 3x2 shard scheme, this is 4115MB.
  • The total disk needed in 6 months will be around 4.27GB.

Adding some overhead to account for unexpected changes in growth and mappings, she estimates that 5GB of disk should suffice for current needs and the foreseeable future.

She also uses this to estimate her memory needs, budgeting roughly 20% of the total data size, give or take. She estimates that 1.0GB should be sufficient for memory.

Estimating Traffic Needs

She knows from the application logs that her users hit the site with a peak of 20 requests per second. She creates a free Bonsai.io cluster, indexes some sample production data to it, and performs a small scale load test to determine what kinds of request latencies she can expect her application to experience while handling user searches with a cloud service.

She finds that 99.9% of all search traffic completes the round trip in less than 80ms. This gives her a conservative estimate of 12-13 requests per second per connection (1000ms per second / 80ms per request = 12.5 rps). With a search concurrency allowance of 2, she would be able to safely service around 25 requests per second, which is a little more than her current peak of 20 rps.

Conclusion

Based on her tests and analysis, she decides that she will need a cluster with:

  • Capacity for at least 13-15 shards
  • A search concurrency of at least 2
  • 1 GB allocated for memory
  • 5 GB of disk to support the growth in data over the next 3-6 months

She goes to https://app.bonsai.io/pricing and finds a plan. She decides that at this stage, a multitenant class cluster offers the best deal, and finds that the $50/month plan meets all of these criteria (and then some), so that’s what she picks.

Capacity Planning

Bonsai makes upgrading major versions of Elasticsearch as painless as possible. There is no need to manage the operational details of deploying software upgrades to nodes or spinning up new servers. With Bonsai, version upgrades are instant and can be performed with zero downtime.

In this document, we’ll cover some general best practices and offer some Bonsai-specific guidelines for migrating your app to a new version of Elasticsearch.

Protip: Use one Elasticsearch cluster per environment

This means having one cluster for production, another for staging, another for development, and so on. Some users like to put staging indices alongside production indices on the same cluster in order to ensure identical behavior between staging and production applications. First, this is a terrible idea in general; you should never run staging/dev applications on the same resources as production. This is a recipe for disaster.

Second, separating out your environments allows for upgrading them one at a time. While this may sound tedious, it is the most prudent approach and allows you to discover potential problems before they impact production.

Step 1: Read the Release Notes Carefully

Make sure to perform your due diligence by reading the release notes and breaking changes that accompany the version you’re targeting for the upgrade.

Another thing to investigate is whether your application’s Elasticsearch client supports the upgrade candidate. There have been cases where a popular client or framework was several versions behind the official Elasticsearch release. Some of these have resulted in hours of downtime for users who upgraded the production Elasticsearch cluster beyond the version supported by their Elasticsearch client.

This is one of many reasons we recommend upgrading non-production environments first.

Step 2: Validate In Development

Upgrading across major versions sometimes comes with breaking changes, new dependencies, and tweaks in behavior. It is important to validate that the upgrade is safe before pushing it out to production. We advise starting by upgrading the least critical environment first. A variation of the blue-green deployment strategy is useful here.

The process looks like this:

1. The application communicates with the old Elasticsearch cluster via a supported Elasticsearch client. Upgrading the cluster to a newer version of Elasticsearch will also generally require upgrading the client:

2. A new cluster running a later version of Elasticsearch is provisioned for the application. This cluster is not initially connected to the application via the client, because the client needs to be upgraded too:

3. The Elasticsearch client is upgraded, and the new version is updated to point to the new cluster. The application then performs a reindex, which populates the new cluster with data.

4. If the reindex worked as expected, and the application is behaving normally, then the old cluster can be destroyed. If there are any issues, then the client can be downgraded and configured to point back to the old cluster.

Ensure it works as expected, and make any changes as needed. Deploy those changes to the next least critical environment and then upgrade the cluster for that environment. Continue in this fashion until reaching the production environment. By that point, you should be fairly confident that the application and search will work as expected.

Make sure to validate that searches will work as expected in terms of relevancy. Also make sure to test full deletion and reindexing in the least critical environments before upgrading production. Reindexing is something you should be familiar with anyway as a part of normal usage of Elasticsearch, such as changing analyzer settings, backfilling a new field, or - in this case - upgrading to a new major version.

Step 3: Upgrade the Production Cluster

Once you are satisfied that the candidate version will work in production as expected, the final step is to take it live. This last step is usually complicated by the constraint that search must not go down at all, and data loss is unacceptable. Because of this constraint, planning, and possibly additional infrastructure (like message queues), are required to ensure a zero-downtime switch and a fallback path in case something breaks.

Of course, if you're fortunate enough to have a use case where the production app can be put into maintenance mode while the new cluster is repopulated, then you can simply use the same process as outlined in the previous step.

For everyone else, the basic process is to have the old application and cluster serve traffic while the new application is populating the new cluster from the source database. Once the new cluster is populated with the same data as the old cluster, the new application is promoted to a production role and begins serving traffic. This strategy allows developers to quickly roll back to a known working state in the event that there is a serious issue with the new system.

The exact steps will vary considerably by application and use case. A typical strategy for this is outlined below:

1. The application communicates with the old Elasticsearch cluster via a supported Elasticsearch client. Upgrading the cluster to a newer version of Elasticsearch will also generally require upgrading the client:

2. A new cluster running a later version of Elasticsearch is provisioned for the application. This cluster is not initially connected to the application via the client, because the client needs to be upgraded too:

3. A fork is made of the production application. This fork will eventually be promoted to the production role. The fork is identical to the production application, except for the Elasticsearch client and any associated dependencies, which are updated to match the new version of Elasticsearch.

The fork is configured to read from the production DB and the client is configured to point to the new cluster:

4. The new cluster is indexed from the production DB. This allows the new cluster to "catch up" with the old cluster. If the production application handles a high volume of updates, it may be necessary to push updates into a queue of some kind instead of saving to the DB.

At the end of this process, the old and new clusters will have the same data:

5. The forked application is promoted to production. If updates were queued, those changes should be flushed to each cluster. Each cluster now has the same data, and the new Elasticsearch cluster is handling searches. No data was lost in the transition:

6. The old production application is still connected to the old cluster, but is not handling traffic. It exists only as an emergency fallback option. If the new production application and Elasticsearch version introduce a regression, the old production app can be promoted back to the production role.

In that event, updates would again need to be queued, and the old production app would index any updates made since the initial cutover. When that is done, the queue is again flushed, ensuring that no data has been lost:

7. If everything worked as expected, and a rollback is not needed, the old cluster and the old production application can be destroyed. The application is now running the latest version of Elasticsearch without data loss or downtime.

You will need to adapt this to your specific use case when planning out your blue-green strategy.

Ask Support

Upgrading major versions of Elasticsearch while running a production application can be tricky. If you’re unsure of what to do, are concerned about an edge case or special circumstance, or simply want to sanity check a plan, please do not hesitate to reach out to support@bonsai.io. We’re here to help ensure the smoothest upgrade possible.

Upgrading Major Versions

For people new to Elasticsearch, shards can be a bit of a mystery. Why is it possible to add or remove replica shards on demand, but not primary shards? What’s the difference? How many indices can fit on a shard?

This article explains what shards are, how they work, and how they can best be used.

Where do shards come from?

First, a little bit of background: Elasticsearch is built on top of Lucene, which is a data storage and retrieval engine. What are called “shards” in Elasticsearch parlance are technically Lucene instances. Elasticsearch manages these instances, spreading data across them and automatically balancing them across the different nodes in a cluster.

Whenever an Elasticsearch index is created, that index will be composed of one or more shards. This means that an Elasticsearch index can spread data across several Lucene instances. This architecture is useful for both redundancy and parallelization purposes.

Shards play one of two roles: primary or replica. Primary shards are a logical partitioning of the data in the index, and are fixed at the time that the index is created. Primary shards are useful for parallelization; when a large amount of data is split across several primary shards, a node can run a query on several Lucene instances in parallel, reducing the overall time of the job.

Primary Shards Can Not Be Added/Removed Later

From one of our more popular blog entries: “Elasticsearch uses a naive hashing algorithm to route documents to a given primary shard. This design choice allows documents to be randomly distributed in a reproducible way. This avoids “hot spots” that affect performance and overallocation.” However, it has one major downside, which is that the number of primary shards can not be changed after an index has been created. Replicas can be added and removed at will, but the number of primary shards is basically written in stone.

In contrast, replica shards are simply extra copies of the data. They are useful for redundancy or to handle extra search traffic, and can be added and removed on demand.

You can specify how many primary shards and replicas are used when creating a new index.

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">PUT /my_index/_settings
{
 "number_of_shards": 1,
 "number_of_replicas": 2
}</code></pre>
</div>
</div>

Replicas and High-Availability

Replica shards are not only capable of serving search traffic, but they also provide a level of protection against data loss. If a node hosting a primary shard is taken offline for some reason, Elasticsearch will promote its replica to a primary role, if a replica exists. However, if the index’s replication is set to 0, then it is not in a High-Availability configuration. In the event of a data loss incident, the data will simply be lost.

Replicas are a multiplier on the primary shards, and the total is calculated as primary * (1+replicas). In other words, if you create an index with 3 primary shards and 2 replicas, you will have 9 total shards, not 5 or 6.
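
A tiny helper makes the multiplier explicit:

def total_shards(primaries: int, replicas: int) -> int:
    """Total shards an index occupies: the primaries plus every replica copy."""
    return primaries * (1 + replicas)

print(total_shards(3, 2))  # 9 -- a 3x2 index is nine shards, not five or six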

Measuring your cluster’s index and shard usage

Elasticsearch offers some API endpoints to explore the state of your indices and shards. The `_cat` APIs are helpful for human interaction. You can view your index states by visiting `/_cat/indices`, which will show index names, primary shards and replicas. You can also inspect individual shard states and statistics by visiting `/_cat/shards`. See example output below:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ curl -s https://user:password@bonsai-12345.bonsai.io/_cat/indices?v
health status index  pri rep docs.count docs.deleted store.size pri.store.size
green  open   images   1   0          0            0       130b           130b
green  open   videos   1   0          0            0       130b           130b
green  open   notes    1   0          0            0       130b           130b

$ curl -s https://user:password@bonsai-12345.bonsai.io/_cat/shards?v
index  shard pri rep state   docs store ip              node      
images 0     p       STARTED    0  130b XXX.XXX.XXX.XXX Sugar Man
notes  0     p       STARTED    0  130b XXX.XXX.XXX.XXX Sugar Man
videos 0     p       STARTED    0  130b XXX.XXX.XXX.XXX Sugar Man</code></pre>
</div>

We’ve also made this easy by creating a live interactive console for you. Just visit your cluster’s dashboard console, choose `GET` from the dropdown, and run `/_cat/indices?v` or `/_cat/shards?v`.

Frequently Asked Questions About Shards

Q. How Many Indices Can I Put On A Shard?

Short Answer: An index can have many shards, but any given shard can only belong to one index.

Long Answer: It is possible to “fake” multiple indices per shard by using “type” fields and index aliases. There is a section of the Elasticsearch Guide dedicated to this approach. Essentially, if you have several data models, you could put two or more into a single index instead of separate indices, and use aliases to perform automated filtering of queries.

The downside to this approach is that it requires a fair amount of work. Most frameworks and content management systems with integrated Elasticsearch clients will not take this approach by default. Developers will need to create and manage the index settings manually instead of relying on automated tools.

Q. How Many Shards Do I Need?

Answer: It depends. Because replicas can be added/removed at will, the real question is how many primary shards are needed. Increasing the number of primary shards for an index is one way to improve performance because it allows the query to be processed in parallel. But there are a lot of reasons to not have a kajillion primary shards.

Generally, we recommend that if you don’t expect data to grow significantly, then:

  • One primary shard is fine if you have less than 100K documents
  • One primary shard per node is good if you have over 100K documents
  • One primary shard per CPU core is good if you have at least a couple million documents

If you anticipate lots of growth – orders of magnitude within a short time – then the problem is a little more complicated. Take a read through the Ideal Elasticsearch Index post, especially the benchmarking section. Or shoot an email to support@bonsai.io.

Q. How Do I Reduce The Number of Shards I’m Using?

Answer: There is an article dealing with this very subject: Reducing Shard Usage.

Shard Primer

Throughout the Bonsai documentation, we use a standard terminology for discussing Elasticsearch-related subjects. This terminology is described in the Elasticsearch documentation, but we’ll highlight and discuss some very common terms here.

Node

A node is simply a running instance of Elasticsearch. While it’s possible to run several instances of Elasticsearch on the same hardware, it really is a best practice to limit a server to a single running instance of Elasticsearch. On Bonsai, a node is technically a virtual server running in the cloud. Bonsai follows the best practice of one Elasticsearch instance per server.

Cluster

A cluster is a collection of nodes running Elasticsearch. From the Elasticsearch documentation: “A cluster consists of one or more nodes which share the same cluster name. Each cluster has a single master node which is chosen automatically by the cluster and which can be replaced if the current master node fails.”

Index

In Elasticsearch parlance, the word “index” can either be used as a verb or a noun. This can sometimes be confusing for users new to Elasticsearch, and especially for users for whom English is not their first language. The intended meaning is usually understood through syntax and context clues.

Index (noun)

From the Elasticsearch documentation: “An index is like a table in a relational database… [It] is a logical namespace which maps to one or more primary shards and can have zero or more replica shards.”

To add to this, an index is sort of an abstraction because the shards (discussed in the section on Shards) are the “real” search engines. Queries to an index’s contents are routed to its shards, each of which is actually a Lucene instance. Because this happens in the background, it can appear that the index is performing the actual work, and the shards are more akin to a Heroku dyno or perhaps are simply there to add compute resources.

“How Many Indices Can I Put on a Shard?”

Because an Elasticsearch index is both abstract and opaque with regards to data storage and retrieval, sometimes new users miss the connection between indices and shards. A common support question that arises is “how do I put more indices on my shard?” The answer is that the index is composed of shards, not vice-versa.

In other words, an index can have many shards, but each shard can only belong to one index. This relationship precludes collocating multiple indices on a single shard.

Index (verb)

The process of populating an Elasticsearch index (noun) with data. It is most commonly used as a transitive verb with the data as the direct object, rather than the index (noun) being populated. In other words, the process is performed on the data, so that you would say: “I need to index my data,” and not “I need to index my index.”

Indexing is a critical step in running a search engine. Without indexing your content, you will not be able to query it using Elasticsearch, or take advantage of any of the powerful search features Elasticsearch offers.

Elasticsearch does not have any mechanism for extracting the contents of your website or database on its own. Indexing is something that must be managed by the application and/or the Elasticsearch client (defined below). Refer to the documentation for your platform as necessary for details.

Elasticsearch is not a Crawler

Elasticsearch is a search engine, not a content crawler. If your use case involves scanning domains or websites, you’ll need to use a crawler like Apache Nutch. In this case, the crawler would be responsible for scraping content and indexing it into your cluster.

Shards

From the Elasticsearch documentation: “Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard.”

To add to this, each shard is technically a standalone search engine. There are two types of shards: primary and replica. We have some documentation on what shards are and where they come from in What are shards and replicas?, but we’ll also elaborate on some of the differences here.

Primary Shards

According to the Elasticsearch documentation: “Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard.”

Another way to think about primary shards is “the number of ways your data is split up.” One reason to have an index composed of X primary shards is that each shard will contain 1/X of your data. Queries can then be performed in parallel. This is useful and can definitely improve performance if you have millions of documents.

So Why Not a Kagillion Shards?

Elasticsearch has an entire article dedicated to the tradeoffs and limitations of a large number of primary shards. The really short version is that primary shards offer diminishing returns while the overhead increases sharply.

The takeaway is that you never want to use a large number of primary shards without some serious capacity planning and testing. If you are wondering how many primary shards you’ll need, you can check out The Ideal Elasticsearch Index (specifically the benchmarking section), or simply shoot us an email.

Primary Shards Can Not Be Added/Removed Later

From one of our more popular blog entries: “Elasticsearch uses a naive hashing algorithm to route documents to a given primary shard. This design choice allows documents to be randomly distributed in a reproducible way. This avoids “hot spots” that affect performance and overallocation.” However, it has one major downside, which is that the number of primary shards can not be changed after an index has been created. Replicas can be added and removed at will, but the number of primary shards is basically written in stone.

Replica Shards

The Elasticsearch definition for replica shards sums it up nicely:

A replica is a copy of the primary shard, and has two purposes:

  1. Increase failover: a replica shard can be promoted to a primary shard if the primary fails
  2. Increase performance: get and search requests can be handled by primary or replica shards. By default, each primary shard has one replica, but the number of replicas can be changed dynamically on an existing index. A replica shard will never be started on the same node as its primary shard.

Another way to think about replica shards is “the number of redundant copies of your data.” If your index has 1 primary shard and 2 replica shards, then you can think of the cluster as having 3 total copies of the data. If the primary shard is lost – for example, if the server running it dies, or there is a network partition – then the cluster can recover automatically by using one of the replicas.

Replication is a Best Practice

If you’re running a production application, all of your indices should have a replication factor of at least 1. Otherwise, you’re exposed to data loss if anything unexpected happens.

Replicas are Easy

In contrast to primary shards, which can not be added/removed after the index is created, replicas can be added and removed at any time.

Document(s)

From the Elasticsearch documentation: “A document is a JSON document which is stored in Elasticsearch. It is like a row in a table in a relational database.”

As referenced, an analogue for an Elasticsearch document would be a database record. It is a collection of related fields and values. For example, it might look something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
 "title" : "Hello, World!",
 "author" : "John Doe",
 "description" : "This is a JSON document!"
}</code></pre>
</div>

You may be thinking to yourself: “My data is sitting in Postgres/MySQL/whatever, which is most certainly not in a JSON format! How do I turn the contents of a DB table into something Elasticsearch can read?”

Normally this work is handled automatically by a client (see below). Most users will never need to worry about translating the database contents into the Elasticsearch documents, as it will be handled automatically and “behind the scenes,” so to speak.

What about Word/Excel/PDF…?

Sometimes new users are confused about the term “document,” because their mental model (and possibly even the data they want to index) involves file formats like Word, PDF, Excel, RTF, PPT, and others. In Elasticsearch terminology, these formats are sometimes called “rich text documents,” and are completely different from Elasticsearch documents.

One reason this can be confusing is that Elasticsearch can index and search rich text documents. There are some plugins – mapper-attachments and ingest-attachment (both supported on Bonsai) – which use the Apache Tika toolkit to extract the contents of rich text documents and push them into Elasticsearch.

That said, if the data you’re trying to index resides in a rich text format, you can still index it into Elasticsearch. But when reading the documentation, keep in mind that “document” refers to the JSON representation of the content, while some variant of “rich text” refers to the file you’re trying to index.

Elasticsearch Client

A client is software that sits between your application and Elasticsearch cluster. It’s used to facilitate communication between your application and Elasticsearch. At a minimum, it will take data from the application, translate it into something Elasticsearch can understand, and then push that data into the cluster.

Most clients will also handle a lot of other Elasticsearch-related tasks, such as:

  • Automatically creating indices with the correct mappings/settings
  • Handling search queries and responses
  • Setting up and managing aliases

It is unlikely that you will need to create your own client. Elasticsearch maintains a list of language-specific clients that are well-documented and in wide use. Clients also exist for many popular frameworks and content management systems.

In short, there is probably already an open sourced, well-documented client available.

Elasticsearch Core Concepts

A shard is the basic unit of work in Elasticsearch. If you haven’t read the Shards section in Elasticsearch Core Concepts, that would be a good place to start. Bonsai meters on the total number of shards in a cluster. That means both primary and replica shards count towards the limit.

The relationship between the shard scheme and your cluster’s usage is not always readily apparent. For example, if you have an index with a 3x2 sharding scheme (3 primaries, 2 replicas), that’s not 5 or 6 shards, it’s 9. If this is confusing, read our Shard Primer documentation for some nice illustrations.

Managing shards is a basic skill for any Elasticsearch user. Shards carry system overhead and potentially stale data. Keeping your cluster clean by pruning old shards can both improve performance and reduce your server costs.

In this guide we cover a few different ways of reducing shard usage:

  1. What is a Shard Overage?
  2. Delete Unneeded Indices
  3. Use a Different Sharding Scheme
  4. Reduce replication
  5. Data Collocation
  6. Upgrade the Subscription

This guide will make frequent references to the Elasticsearch API using the command line tool `curl`. Interacting with your own cluster can also be done via `curl`, or via a web browser. You can also use the interactive Bonsai Console in your cluster dashboard.

What is a Shard Overage?

A shard overage occurs when your cluster has more shards than the subscription allows. In practice, this usually means one of three things:

  1. There are extraneous indices.
  2. The sharding scheme is not optimal.
  3. Replication is set too high.

It is also possible that the cluster and its data are already configured according to best practices. In that case, you may need to get creative with aliases and data collocation in order to remain on your current subscription.

Delete Unneeded Indices

There are some cases where one or more indices are created on a cluster for testing purposes, and are not actually being used for anything. These will count towards the shard limits; if you’re getting overage notifications, then you should delete these indices.

There are also some clients that will use aliases to roll out changes to mappings. This is a really nice feature that allows for zero-downtime reindexing and immediate rollbacks if there’s a problem, but it can also result in lots of stale indices and data accumulating in the cluster.

To determine if you have extraneous indices, use the `/_cat/indices` endpoint to get a list of all the indices you have in your cluster:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">&lt;script> console.log('hello'); &lt;/script>GET /_cat/indices
green open test1           1 1 0 0  318b  159b
green open test2           1 1 0 0  318b  159b
green open test3           1 1 0 0  318b  159b
green open test4           1 1 0 0  318b  159b
green open test5           1 1 0 0  318b  159b
green open test6           1 1 0 0  318b  159b</code></pre>
</div>

If you see indices that you don’t need, you can simply delete them:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript"># Delete a single index:
DELETE /test1

# Delete a group of indices:
DELETE /test2,test3,test4,test5</code></pre>
</div>
</div>

Use a Different Sharding Scheme

It’s possible that for some reason one or more indices were created with far more shards than necessary. For example, a check of `/_cat/indices` shows something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /_cat/indices
green open test1           5 2 0 0  318b  159b
green open test2           5 2 0 0  318b  159b</code></pre>
</div>

That 5x2 shard scheme results in 15 shards per index, or 30 total for this cluster. This is probably really overprovisioned for these indices. Choosing a more conservative shard scheme of 1x1 would reduce this cluster’s usage from 30 shards down to 4.

Unfortunately, the number of primary shards can not be changed once an index has been created. To fix this, you will need to manually create a new index with the desired shard scheme and reindex the data. If you have not read The Ideal Elasticsearch Index, it has some really nice information on capacity planning and sizing. Check out the sections on Intelligent Sharding and Benchmarking for some tips on what scheme would make more sense for your particular use case.

Reduce replication

Multitenant subscription plans (Hobby and Standard tiers) are required to have at least 1 replica shard in use per index. Indices on these plans cannot have a replica count of 0.

For most use-cases, a single replica is perfectly sufficient for redundancy and load capacity. If any of your indices have been created with more than one replica, you can reduce it to free up shards. An index with more than one replica might look like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">&lt;script> console.log('hello'); &lt;/script>GET /_cat/indices?v
health status index    pri rep docs.count docs.deleted store.size pri.store.size
green  open   test1    5   2   0          0            318b       159b
green  open   test2    5   2   0          0            318b       159b</code></pre>
</div>

Note that the `rep` column shows a 2. That means there are actually 3 copies of the data: one primary shard, and its two replicas. Replicas are a multiplier against primary shards, so if an index has a 5x2 configuration (5 primary shards with 2 replicas), reducing replication to 1 will free up five shards, not just one. See the Shard Primer for more details.

Fortunately, reducing the replica count is just a matter of sending a small JSON body to the index’s `_settings` endpoint:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">PUT /test1,test2/_settings -d '{"index":{"number_of_replicas":1}}'
{"acknowledged":true}

GET /_cat/indices
green open test1           5 1 0 0  318b  159b
green open test2           5 1 0 0  318b  159b</code></pre>
</div>
</div>

That simple request shaved 10 shards off of this cluster’s usage.

Replication == Availability and Redundancy

It might seem like a good money-saving idea to simply set all replicas to 0 so as to fit as many indices into your cluster as possible. However, this is not advisable. This means that your primary data has no live backup. If a node in your cluster goes offline, data loss is basically guaranteed.

Data can be restored from a snapshot, but this is messy and not a great failover plan. The snapshot could be an hour or more old, and any updates to your data since then either need to be reindexed or are lost for good. Additionally, the outage will last much longer.

Having replication of at least 1 protects against all of these problems.

Data Collocation

Another solution for reducing shard usage involves using aliases and custom routing rules to collocate different data models onto the same group of shards.

What is data collocation? Many Elasticsearch clients use an index per model paradigm as the default for data organization. This is analogous to, say, a Postgres database with a table for each type of data being indexed.

Sharding this way makes sense most of the time, but in some rare cases users may benefit from putting all of the data into a single namespace. In the Postgres analogy, this would be like putting all of the data into a single table instead of a table for each model. An attribute (i.e., table column) is then used to filter searches by class.

For example, you might have a cluster that has three indices: `videos`, `images` and `notes`. If each of these has a conservative 1x1 sharding scheme, it would require 6 shards. But this data could potentially be compacted down into a single index, `production`, where the mapping has a `type` field of some kind to indicate whether the document belongs to the `video`, `image` or `note` class.

The latter configuration with the same 1x1 scheme would only require two shards (one primary, one replica) instead of six.
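
As a rough sketch of how this could be wired up, the snippet below creates one collocated index and a filtered alias per logical model. The index, field, and alias names are purely illustrative, the cluster URL is a placeholder, and the mapping syntax assumes a recent (typeless) version of Elasticsearch:

import requests

BASE = "https://user:password@my-cluster-12345.bonsai.io"  # placeholder cluster URL

# One physical index with a conservative 1x1 scheme and a "type" discriminator field.
requests.put(f"{BASE}/production", json={
    "settings": {"number_of_shards": 1, "number_of_replicas": 1},
    "mappings": {"properties": {"type": {"type": "keyword"}}},
})

# One filtered alias per logical model, so each "index" the application queries
# is really a scoped view over the same two shards.
for model in ("video", "image", "note"):
    requests.post(f"{BASE}/_aliases", json={
        "actions": [
            {"add": {"index": "production", "alias": f"{model}s",
                     "filter": {"term": {"type": model}}}}
        ]
    })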

There are several major downsides to this approach. One is that field name collisions become an issue. For example, if there is a field called `published` in two models, but one is defined as a boolean and the other is a datetime, it will create a conflict in the mapping. One will need to be renamed.

Another downside is that it is a pretty large refactor for most users, and may be more trouble than simply upgrading the plan. Overriding the default behavior in the application’s Elasticsearch client may require forking the code and relying on other hacks/workarounds.

There are other downsides as well. Data collocation is mentioned here as a possibility that only works for certain users; it is by no means a recommendation.

Upgrade the Subscription

If you find that you’re unable to reduce shards through the options discussed here, then you will need to upgrade to the next plan.

Reducing Shard Usage

Bonsai implements connection management as a metered plan resource. Our connection management proxy is designed to help model and encourage best practices for using Elasticsearch efficiently at different cluster sizes.

This document outlines several additional strategies that users can take to maximize the performance and throughput for their cluster.

Implement HTTP Keep-alive

Normally when an application makes a request to the cluster, it needs to perform a lot of steps. Very generally, it looks like this:

  • Set up the connection locally
  • Reach the target server
  • Establish a secure communication protocol via SSL/TLS (this step is actually a dozen smaller steps, see The SSL/TLS Handshake for an infographic)
  • Send the authentication credentials and the request
  • Wait for a response
  • Receive the response
  • Tear down the connection
  • Return the response to the application

As you can see, there is quite a bit of overhead to accomplish a simple search request. There is an optimization to be made here, called HTTP keep-alive. Essentially, the application opens a persistent connection to the cluster and may send and receive multiple search requests over it without the need to perform redundant tasks like negotiating a cryptographic protocol.

Persistent connections are “free” on Bonsai; the concurrency allowances only apply to active, in-flight connections. Idle HTTP keep-alive connections are not counted toward this limit. Users should maintain a reasonably sized pool of keep-alive connections to reduce the setup cost of each connection and reduce the average latency.

This strategy saves a significant amount of overhead and reduces overall request latencies substantially. Some benchmarks of applications in production have seen a 20-25% improvement in throughput (requests per second) by implementing HTTP keep-alive.

In the ideal implementation, the application would have access to a local pool of persistent connections, from which it would draw to perform search requests. The local pool would also queue requests when all persistent connections are in use. This implementation would allow the developer to fine-tune the local pool size to minimize latency without exceeding the concurrency allowance for the cluster’s plan.

Consult the documentation for your Elasticsearch client for more details on implementing this feature.
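
With a plain HTTP library, this can be as simple as holding on to one session object so its underlying connection pool is reused across requests; most official Elasticsearch clients expose an equivalent persistent-connection or pool-size setting. A minimal sketch (the cluster URL is a placeholder):

import requests

BASE = "https://user:password@my-cluster-12345.bonsai.io"  # placeholder cluster URL

# One long-lived session per process: connections are kept alive and reused,
# so the TCP and TLS handshakes are paid once per pooled connection, not once per search.
session = requests.Session()

def search(index, query):
    response = session.get(f"{BASE}/{index}/_search", json={"query": query})
    response.raise_for_status()
    return response.json()

results = search("posts-production", {"match": {"title": "hello"}})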

Utilize a Durable Message Queue

For best performance and reliability, updates to data should be pushed into a durable message queue for asynchronous grouping and processing in batches. Some examples of message queues are Apache Kafka, AWS Kinesis, IronMQ, Disque, Redis, Resque, and RabbitMQ.

These services are resistant to network partitions and random node outages. There may be times, particularly on multitenant class clusters, where a node is randomly restarted. Consider a case where an application pushes updates directly into Elasticsearch. Such an interruption, even for a few seconds, could lead to error logs and emails about HTTP 503 responses.

But with a message queue acting as an intermediary, temporary connection issues are resolved automatically, and the application does not need to bubble up errors to the user or administrators. A message queue can also avoid the overhead of single-document updates by passing updates in batches. This strategy is one way to avoid an HTTP 429 response.
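
A sketch of the pattern is below. The queue itself is left abstract (substitute Kafka, Kinesis, Redis, or whatever you use); the retry loop is what absorbs a brief 429 or 503 from the cluster, and the cluster URL is a placeholder:

import time
import requests

BASE = "https://user:password@my-cluster-12345.bonsai.io"  # placeholder cluster URL

def flush_batch(actions_ndjson, max_retries=5):
    """Send one batch of queued updates via the Bulk API, retrying with backoff."""
    for attempt in range(max_retries):
        response = requests.post(
            f"{BASE}/_bulk",
            data=actions_ndjson,  # newline-delimited action/document pairs
            headers={"Content-Type": "application/x-ndjson"},
        )
        if response.status_code not in (429, 502, 503, 504):
            response.raise_for_status()
            return response.json()
        time.sleep(2 ** attempt)  # back off, then retry the whole batch
    raise RuntimeError("cluster still unavailable after retries")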

Prefer Bulk Updates Over Single-Document Updates

It’s worth noting that Bonsai distinguishes Bulk API updates from other writes. The Bulk API is the most efficient way to get data into Elasticsearch and is the preferred method for bulk ingest into the cluster. Single-document updates are helpful for keeping data synchronized over time, but should be avoided for mass ingest.

Batch sizes should be determined experimentally based on your document size. A good heuristic is to aim for a batch size that takes approximately 1 second to complete. If requests are slower than one second, reduce the batch size. If requests are consistently faster than one second, add more documents to the batch. Repeat until bulk updates average about 1 second.
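
One way to apply that heuristic is to adjust the batch size after every bulk request based on how long it took. A sketch of the idea; `bulk_index` stands in for whatever function actually issues your `_bulk` request:

import time

def index_with_tuned_batches(docs, bulk_index, start_size=500, target_seconds=1.0):
    """Index docs in batches, nudging the batch size toward ~1 second per request."""
    size, i = start_size, 0
    while i < len(docs):
        batch = docs[i:i + size]
        started = time.monotonic()
        bulk_index(batch)                       # e.g. POST the batch to /_bulk
        elapsed = time.monotonic() - started
        i += len(batch)
        if elapsed > target_seconds * 1.2:      # too slow: shrink the next batch
            size = max(50, int(size * 0.8))
        elif elapsed < target_seconds * 0.8:    # too fast: grow the next batch
            size = int(size * 1.25)
    return size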

Improving Throughput and Performance

Troubleshooting and resolving connection issues can be time consuming and frustrating. This article aims to reduce the friction of resolving connection problems by offering suggestions for quickly identifying a root cause.

First Steps

If you're seeing connection errors while attempting to reach your Bonsai cluster, the first step is Don't Panic. Virtually all connection issues can be fixed in under 5 minutes once the root cause is identified.

The next step is to review the documentation on Connecting to Bonsai, which contains plenty of information about creating connections and testing the availability of your Bonsai cluster.

If the basic documentation doesn't help resolve the problem, the next step is to read the error message very carefully. Often the error message contains enough information to explain the problem. For example, something like this may show up in your logs:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Faraday::ConnectionFailed (Connection refused - connect(2) for "localhost" port 9200):</code></pre>
</div>

This message tells us that the client tried to reach Elasticsearch over localhost, which is a red flag (discussed below). It means the client is not pointed at your Bonsai cluster.

Next, look for an HTTP error code. Bonsai provides an entire article dedicated to explaining what different HTTP codes mean here: HTTP Error Codes. Like the error message, the HTTP error code often provides enough diagnostic information to identify the underlying problem.

Last, don't make any changes until you understand the error message and HTTP code (if one is present), and have read the relevant documentation. Most of these errors occur during the initial set up and configuration step, so also read the relevant Quickstart guide if one exists for your language / framework.

Common Connection Issues

There are issues which do not return HTTP error codes because the request cannot be made at all, or because the request is not recognized by Elasticsearch. These issues are commonly one of the following:

  • Connection Refused
  • No handler found for URI
  • Name or service not known
  • HTTP 401 / Missing Authentication Headers
  • Missing SSL Root Certificate

Connection Refused

This error indicates that a request was made to a server that is not accepting connections.

This error most commonly occurs when your Elasticsearch client has not been properly configured. By default, many clients will try to connect to something like `localhost:9200`. This is a problem because Bonsai will never be running on your localhost or network, regardless of your platform.

What is Localhost?

Simply put, localhost can be interpreted as "this computer." Which computer is "this" computer is determined by whatever machine is running the code. If you're running the code on your laptop, then localhost is the machine in front of you; if you're running the code on AWS or in Heroku, then localhost is the node running your application.

Bonsai clusters run in a private network that does not include any of your infrastructure, which is why trying to reach it via localhost will always fail.

Why Port 9200?

By default, Elasticsearch runs on port 9200. While you can access your Bonsai cluster over port 9200, this is not recommended due to lack of encryption in transit.

All users need to ensure their Elasticsearch client is pointed at the correct URL and is properly configured.

Ruby / Rails

If you’re using the elasticsearch-rails client, simply add the following gem to your Gemfile:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">gem 'bonsai-elasticsearch-rails'</code></pre>
</div>
</div>

The bonsai-elasticsearch-rails gem is a shim that configures your Elasticsearch client to load the cluster URL from an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>. You can read more about it on the project repository.

If you'd prefer to keep your Gemfile sparse, you can initialize the client yourself like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"># config/initializers/elasticsearch.rb
Elasticsearch::Model.client = Elasticsearch::Client.new url: ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you opt for this method, make sure to add the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> to your environment. It will be automatically created for Heroku users. Users managing their own application environment will need to run something like:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">$ export BONSAI_URL="https://randomuser:randompass@something-12345.us-east-1.bonsaisearch.net"</code></pre>
</div>
</div>

Other Languages / Frameworks

The Elasticsearch client is probably using the default <span class="inline-code"><pre><code>localhost:9200</code></pre></span> or <span class="inline-code"><pre><code>127.0.0.1:9200</code></pre></span> (127.0.0.1 is the IPv4 equivalent of "localhost"). You'll need to make sure that the client is configured to use the correct URL for your cluster, and that this configuration is not being overwritten somewhere.

No handler found for URI

The Elasticsearch API has a variety of endpoints defined, like <span class="inline-code"><pre><code>/_cat/indices</code></pre></span>. Each of these endpoints can be called with a specific HTTP method. This error simply indicates a request was made to an endpoint using the wrong method.

Here are some examples:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">
 # Request to /_cat/indices with GET method:
 $ curl -XGET https://user:pass@something-12345.us-east-1.bonsai.io/_cat/indices
 green open test1           1 1 0 0  318b  159b
 green open test2           1 1 0 0  318b  159b

 # Request to /_cat/indices with PUT method:
 $ curl -XPUT https://user:pass@something-12345.us-east-1.bonsai.io/_cat/indices
 No handler found for uri [/_cat/indices] and method [PUT]</code></pre>
</div>
</div>

The solution for this issue is simple: use the correct HTTP method. The Elasticsearch documentation will offer guidance on which methods pertain to the endpoint you're trying to use.

Name or service not known

The Domain Name System (DNS) is a worldwide, decentralized and distributed directory service which translates human-readable domains like www.example.com into network addresses like 93.184.216.119. When a client makes a request to a URL like "google.com," the application's networking layer will use DNS to translate the domain to an IP address so that it can pass the request along to the right server.

The "Name or service not known" error indicates that there has been a failure in determining an IP address for the URL's domain. This typically implicates one of several root causes:

  1. There is a typo in the URL.
  2. There is a TLD outage.
  3. The Internet Service Provider (ISP) handling your application's traffic may have lost a DNS server, or is having another outage.

The first troubleshooting step is to carefully read the error and double-check that the URL (particularly the domain name) is spelled correctly. If this is your first time accessing the cluster, then a typo is almost certainly the problem.

If this error arises during regular production use, then there is probably a DNS outage. DNS outages are outside of Bonsai's control. There are a couple types of outages, but the ones that have affected our users before are:

TLD outage

A TLD is something like ".com," ".net," ".org," etc. A TLD outage affects all domains under the TLD. So an outage of the ".net" TLD would affect all domains with the ".net" suffix.

Fortunately, all Bonsai clusters can be accessed via either of two domains:

  • bonsai.io
  • bonsaisearch.net

If there is a TLD outage, you should be able to restore service by switching to the other domain. In other words, if your application is sending traffic to <span class="inline-code"><pre><code>https://user:pass@something-12345.us-east-1.bonsai.io</code></pre></span>, and the ".io" TLD goes down, you can switch over to <span class="inline-code"><pre><code>https://user:pass@something-12345.us-east-1.bonsaisearch.net</code></pre></span>, and it will fix the error.
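One way to prepare for this ahead of time is to configure your client with both hostnames, so a request that fails against one domain can be retried against the other. A minimal sketch with the Ruby elasticsearch client follows; the hostnames and credentials are placeholders.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby">require 'elasticsearch'

client = Elasticsearch::Client.new(
  hosts: [
    'https://user:pass@something-12345.us-east-1.bonsai.io:443',
    'https://user:pass@something-12345.us-east-1.bonsaisearch.net:443'
  ],
  retry_on_failure: 2   # on a failure, retry the request against the other host
)</code></pre>
</div>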

ISP Outage

Many ISPs operate their own DNS servers. This way, requests made from a node in their network can get low-latency responses for IP addresses. Most ISPs also have a fleet of DNS servers, at minimum a primary and a secondary. However, this is not a requirement, and there have been instances where an ISP's entire DNS service is taken offline.

There are also multitenant services which have Software Defined Networks (SDN) running Internal DNS (iDNS). Regressions in this software can also lead to application-level DNS name resolution problems.

If you've already confirmed the domain is correct and swapped to another TLD for your cluster, and you're still having issues, then you are probably dealing with an ISP/DNS or SDN/iDNS outage. One way to confirm this is to try making requests to other common domains like google.com. A name resolution error on google.com or example.com points to a local DNS problem.
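One quick way to check name resolution from Ruby's standard library is shown below; the hostnames are placeholders, and comparing a known-good domain against your cluster's domain shows whether resolution is failing locally.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby">require 'resolv'

%w[google.com something-12345.us-east-1.bonsai.io].each do |host|
  begin
    puts "#{host} => #{Resolv.getaddress(host)}"
  rescue Resolv::ResolvError => e
    puts "#{host} => resolution failed (#{e.message})"
  end
end</code></pre>
</div>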

If this happens, there is basically nothing you can do about it as a user, aside from complaining to your application sysadmin.

Missing Authentication Headers

Users who are seeing persistent HTTP 401 Error Codes may be using a client that is not handling authentication properly. As explained in the error code documentation, as well as in Connecting to Bonsai, all Bonsai clusters have a randomly-generated username and password which must be present in the request in order for it to be accepted.

What's less clear from this documentation is that including the complete URL in your Elasticsearch client may not be enough to create a secure connection.

This is due to how HTTP Basic Access Authentication works. In short, there needs to be a request header present. This header has an "Authorization" field and a value of <span class="inline-code"><pre><code>Basic</code></pre></span> followed by a base64 string. The base64 string is a base64 representation of the username and password, concatenated with a colon (":").

Most clients handle the headers for you automatically and in the background. But not all do, especially if the client is part of a bleeding edge language or framework, or if it's something homebrewed/forked/patched.

Here is a basic example in Ruby using <span class="inline-code"><pre><code>Net::HTTP</code></pre></span>, demonstrating how a URL with auth can still receive an HTTP 401 response:

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">require 'base64'
require 'net/http'

# URL with credentials:
uri = URI("https://randomuser:randompass@something-12345.us-east-1.bonsaisearch.net")

# Net::HTTP does not automatically detect the presence of
# authentication credentials and insert the proper Authorization header.
req = Net::HTTP::Get.new(uri)

# This request will fail with an HTTP 401, even though the credentials
# are in the URI:
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
 http.request(req)}

# The proper header must be added manually:
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp

# The request now succeeds
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
 http.request(req)
}</code></pre>
</div>
</div>

From the Ruby example, it's clear that there are some cases where the credentials are simply ignored instead of being automatically put into a header. This causes Basic authentication to fail, and the request receives an HTTP 401.

Simply, if you're seeing HTTP 401 responses even while including the credentials in the URL, and you've confirmed that the credentials are entered correctly and have not expired, then the problem is probably a missing header. You can detect this with tools like socat or wireshark if you're familiar with network traffic inspection. Or, you can try adding the headers manually.

Here are some examples of calculating the base64 string and adding the request header in several different languages:

Android

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-java">  public static Map<String, String> getHeaders() {
 Map<String, String> headers = new HashMap<>();
 String credentials = "randomuser:randompass";
 String auth = "Basic " + Base64.encodeToString(credentials.getBytes(), Base64.NO_WRAP);
 headers.put("Authorization", auth);
 headers.put("Content-type", "application/json");
 return headers;
}</code></pre>
</div>
</div>

Java

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-java">public static Map<String, String> getHeaders() {
 Map<String, String> headers = new HashMap<>();
 String credentials = "randomuser:randompass";
 String auth = "Basic " + Base64.getEncoder().encodeToString(credentials.getBytes());
 headers.put("Authorization", auth);
 headers.put("Content-type", "application/json");
 return headers;
}</code></pre>
</div>
</div>

Ruby

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-ruby">require 'base64'
require 'net/http'

uri = URI("https://something-12345.us-east-1.bonsaisearch.net")
req = Net::HTTP::Get.new(uri)
credentials = "randomuser:randompass"
req['Authorization'] = "Basic " + Base64::encode64(credentials).chomp
res = Net::HTTP.start(uri.hostname, uri.port, :use_ssl => true) {|http|
 http.request(req)
}</code></pre>
</div>
</div>

Missing SSL Root Certificate

As mentioned in the Security documentation, all Bonsai clusters support SSL/TLS. This enables your traffic to be encrypted over the wire. In some rare cases, users may see something like this when trying to access their cluster over HTTPS:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">SSL: CERTIFICATE_VERIFY_FAILED</code></pre>
</div>

This is almost certainly due to a server level misconfiguration.

A complete examination of how SSL works is well outside the scope of this article, but in short: it utilizes cryptographic signatures to facilitate a chain of trust from a root certificate issued by a certificate authority (CA) to a certificate deployed to a server. The latter is used to prove the server's ownership before initiating public key exchange.

Think of a certificate authority as a mediator of sorts. A company like Bonsai goes to the CA and provides proof of identity and ownership of a domain. The CA issues Bonsai a unique certificate cryptographically signed by the CA's root certificate. Then Bonsai configures its servers to respond to SSL/TLS requests using the certificates signed by the CA's root.

When your application reaches out to Bonsai over HTTPS, the Bonsai server will respond with this certificate. The application can then inspect the certificate it receives from Bonsai and determine whether the CA's cryptographic signature is valid. If it's valid, then the application knows it really is talking to Bonsai and not an impersonator. It then performs a key exchange to open an encrypted connection and then sends/receives data.

One weakness of this system is that the CA has to be recognized by both parties; if your application doesn't have a copy of the CA's root certificate, it can't validate that the server it's talking to is genuine. You may start seeing errors like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Could not verify the SSL certificate for...
SSL_connect returned=1 errno=0 state=SSLv3 read server certificate B: certificate verify failed
peer certificate won't be verified in this SSL session
SSL certificate problem, verify that the CA cert is OK
SSL routines:SSL3_GET_SERVER_CERTIFICATE:certificate verify failed</code></pre>
</div>

These are just a sample of the errors you might see, but it all points to the application's inability to verify Bonsai's certificate.

It's possible that this is caused by an active man in the middle (MITM) attack, but the greater probability is that the application does not have access to the right certificates. There are a few things to check:

1. Make sure the CA root certificate for Bonsai is available

Bonsai's nodes use Amazon Trust Services as a certificate authority. Its root certificates need to be added to your trust store. They should be installed automatically in most browsers and operating systems. If you're seeing SSL issues, make sure that you have the correct root certificates in your trust store.

This is sometimes easier said than done. Typically a sysadmin will need to handle this. If you're using Heroku, you'll need to open a request to their support team to verify that the proper root certificate is deployed to their systems. There are some hackish workarounds that involve bundling the certificate in your deploy and configuring the application to read it from somewhere other than the OS' trust store.

2. Make sure the certificate(s) for Bonsai are up to date

Certificate Authorities typically do not issue permanent certificates. Most certificates expire every 1-2 years and need to be renewed. Some browsers and applications will cache SSL certificates to improve latency. It's possible to be using a cached certificate that has passed its expiration date, thus rendering it invalid. Flush your caches and make sure that the Bonsai certificate is up to date.

Connection Issues

An HTTP 402 response indicates that a cluster’s account is behind on payments. All requests to the cluster will return the following message:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
"code": 402,
"message": "Cluster has been disabled due to non-payment. Please update billing info or contact support@bonsai.io for further details."
}</code></pre>
</div>

The cluster cannot be used until the overdue payment is successfully processed. If an account remains unpaid for more than a week, the account will be closed, any active clusters will be terminated, and all data destroyed.

Please refer to this article on how to check an account’s billing information. If you’ve updated the account’s billing information and confirmed that the credit card on file is working, please contact us and let us know that this response still persists.

HTTP 402: Payment Required

The HTTP 504 Gateway Timeout error is returned when a request takes longer than 60 seconds to process, regardless of whether the process is waiting on Elasticsearch or sitting in a connection queue. This can sometimes be due to network issues, and sometimes it can occur when Elasticsearch is IO-bound and unable to process requests quickly. Complex requests are more likely to receive an HTTP 504 error in these cases.

For more information on timeouts, please see our recommendations on Connection Management.

HTTP 504: Gateway Timeout

This response is distinct from the "Cluster not found" message. This message indicates that you're trying to access an index that is not registered with Elasticsearch. For example:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">GET /nonexistent_index/_search?pretty
{
 "error" : {
   "root_cause" : [ {
     "type" : "index_not_found_exception",
     "reason" : "no such index",
     "resource.type" : "index_or_alias",
     "resource.id" : "nonexistent_index",
     "index" : "nonexistent_index"
   } ],
   "type" : "index_not_found_exception",
   "reason" : "no such index",
   "resource.type" : "index_or_alias",
   "resource.id" : "nonexistent_index",
   "index" : "nonexistent_index"
 },
 "status" : 404
}</code></pre>
</div>
</div>

There are a couple reasons you might see this:

  • Race condition. You or your app may be trying to access an index before it was created.
  • Typo. You may have misspelled or only partially copy/pasted the name of the index you're trying to access.
  • The index has been deleted. Trying to access an index that has been deleted will return an HTTP 404 from Elasticsearch.

Important Note on Index Auto-Creation

By default, Elasticsearch has a feature that will automatically create indices. Simply pushing data into a non-existing index will cause that index to be created with mappings inferred from the data. In accordance with Elasticsearch best practices for production applications, we've disabled this feature on Bonsai.

However, some popular tools such as Kibana and Logstash do not support explicit index creation, and rely on auto-creation being available. To accommodate these tools, we've whitelisted popular time-series index names such as <span class="inline-code"><pre><code>logstash*</code></pre></span>, <span class="inline-code"><pre><code>requests*</code></pre></span>, <span class="inline-code"><pre><code>events*</code></pre></span>, <span class="inline-code"><pre><code>kibana*</code></pre></span>, and <span class="inline-code"><pre><code>kibana-int*</code></pre></span>.

The solution to this error message is to confirm that the index name is correct. If so, make sure it is properly created (with all the mappings it needs), and try again.
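As a rough sketch, this is what explicit index creation might look like with the Ruby elasticsearch client; the index name, settings, and mappings are placeholders for your own schema, and the mapping format shown assumes Elasticsearch 7.x.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby">require 'elasticsearch'

client = Elasticsearch::Client.new(url: ENV['BONSAI_URL'])

# Create the index explicitly, with mappings, before indexing any documents.
unless client.indices.exists?(index: 'myindex')
  client.indices.create(
    index: 'myindex',
    body: {
      settings: { number_of_shards: 1, number_of_replicas: 1 },
      mappings: {
        properties: {
          title: { type: 'text' },
          views: { type: 'long' }
        }
      }
    }
  )
end</code></pre>
</div>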

HTTP 404: Index Not Found

The proximate cause of HTTP 429 errors is that an app has exceeded its concurrent connection limit for too long. This is often due to a spike in usage -- perhaps a new feature has been deployed, a service is growing quickly, or possibly a regression in the code.

It can also happen when reindexing (for example: when engineers want to push all the data into Elasticsearch or OpenSearch as quickly as possible, which means lots of parallelization). Unusually expensive requests, or other unusual latency and performance degradation within Elasticsearch itself can also cause unexpected queuing and result in 429 errors.

In most cases, 429 errors can be solved by upgrading your plan to a plan with higher connection limits; new connection limits are applied immediately. If that's not viable, then you may need to perform additional batching of your updates (such as queuing and bulk updating) or searches (with multi-search API as one example). We have some suggestions for optimizing your requests that can help point you in the right direction.

HTTP 429: Too Many Requests

This error indicates that the request body to Elasticsearch exceeded the limits of the Bonsai proxy layer. This can be caused by a few things:

  • A request larger than 40MB. Elasticsearch's Query DSL can be fairly verbose JSON, particularly when queries are complex. The 40MB cap is meant to be a safety mechanism to prevent runaway queries from overwhelming the routing layer, while still being an order of magnitude higher than 99.9% of request bodies.
  • Indexing too many documents at once. The Elasticsearch Bulk API allows applications to index groups of documents in a single request. Sending a single batch of millions of documents could easily trigger the HTTP 413 message.
  • Lots of request headers. Metadata about a request can be passed to Elasticsearch in the form of request headers. Bonsai allows up to 16KB for request headers; this should be enough for whatever CORS and content-type specification needs to occur. Note that the TLS and authentication headers in the request are not counted towards this limit.
  • Indexing large files. When Elasticsearch indexes a rich text file like a PDF or Word document, it converts the file into a Base64 string to compress it for transit. Still, it's possible that this string is longer than 40MB, which could trigger the HTTP 413 error.

If you're seeing this error, check that your queries are sane and not 40MB of flattened JSON. Ensure you're not explicitly sending lots of headers to your cluster.

If you're seeing this message during bulk indexing, then decrease your batch sizes by half and try again. Repeat until you can reindex without receiving an HTTP 413.

Finally, if it is indeed a large file causing the problem, then the odds are good that metadata and media in the file are resulting in its huge size. You may need to use a file editing tool to remove the media (images, movies, sounds) and possibly the metadata from the file and then try again. If the files are user-submitted, consider capping the file size your users are able to upload.

Customers who are unable to change their application to accommodate smaller request bodies or payloads should reach out to us. We can lift this limit for customers on Business and Enterprise plans, subject to some caveats on performance.

HTTP 413: Request Too Large

The HTTP 500 Internal Server Error is both rare and often difficult to reproduce. It generally indicates a problem with a server somewhere. It may be Elasticsearch, but it could also be a node in the load balancer or proxy. A process restarting is typically the root cause, which means it will often resolve itself within a few seconds.

The easiest solution is to simply catch and retry HTTP 500's. If you've seen this several times in a short period of time, please send us an email and we will investigate.
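For example, the Ruby elasticsearch client can be told to retry transient failures at the transport level; a minimal sketch follows, and the specific retry settings shown are illustrative rather than required values.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby">require 'elasticsearch'

client = Elasticsearch::Client.new(
  url: ENV['BONSAI_URL'],
  retry_on_status: [500, 502, 503],  # retry requests that return these codes
  retry_on_failure: 3                # also retry dropped connections a few times
)</code></pre>
</div>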

HTTP 500: Internal Server Error

In some rare cases, the Bonsai Ops Team will put a cluster into maintenance mode. There are a lot of reasons this may happen:

  • Load shedding
  • Data migrations
  • Rolling restarts
  • Version upgrades
  • ... and more.

Maintenance mode blocks updates to the cluster, but not searches. If you're seeing this message, it will be temporary; it rarely lasts for more than a minute or two. If your cluster has been in a maintenance state for more than a few minutes, please contact support.

HTTP 403: Maintenance

All Bonsai clusters are provisioned with a randomly generated set of credentials. These must be supplied with every request in order for the request to be processed. An HTTP 401 response indicates the authentication credentials were missing from the request.

To elaborate on this, all Bonsai cluster URLs follow this format:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://username:password@hostname.region.bonsai.io
</code></pre>
</div>

The username and password in this URL are not the credentials used for logging in to Bonsai, but are randomly generated alphanumeric strings. So your URL might look something like:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">https://kjh4k3j:lv9pngn9fs@my-awesome-cluster.us-east-1.bonsai.io
</code></pre>
</div>

The credentials <span class="inline-code"><pre><code>kjh4k3j:lv9pngn9fs</code></pre></span> must be present with all requests to the cluster in order for them to be processed. This is a security precaution to protect your data (on that note, we strongly recommend keeping your full URL a secret, as anyone with the credentials can view or modify your data).

Not All APIs are Available

It's possible to get an HTTP 401 response when attempting to access one of the Unsupported API Endpoints. If you're trying to access server level tools, restart a node, etc, then the request will fail, period. Please read the documentation on unavailable APIs to determine whether the failing request is valid.

I'm including credentials and still getting a 401!

Please ensure that the credentials are correct. You can find this information on your cluster dashboard. Note that there is a tool for rotating credentials. So it's entirely possible to be using an outdated set of credentials.

Heroku users should also inspect the contents of the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> config variable. This can be found in the Heroku app dashboard, or by running <span class="inline-code"><pre><code>heroku config:get BONSAI_URL</code></pre></span>. The contents of this variable should match the URL shown in the Bonsai cluster dashboard exactly.

If you're sure that the credentials are correct and being supplied, send us an email and we will investigate.

HTTP 401: Authorization Required

This error is raised when a request is sent to a cluster that has been disabled. Clusters can be disabled for one of several reasons, but the most common reason is due to an overage.

If you're seeing this error, check on your cluster status and address any overages you see. You can find more information about this in our Metering on Bonsai documentation, specifically the section "Checking on Cluster Status". If you're not seeing any overages and the cluster is still disabled, please contact us and let us know.

HTTP 403: Cluster Disabled

This error is raised when an update request is sent to a cluster that has been placed into read-only mode. Clusters can be placed into read-only mode for one of several reasons, but the most common reason is due to an overage.

If you're seeing this error, check on your cluster status and address any overages you see. You can find more information about this in our Metering on Bonsai documentation, specifically the section "Checking on Cluster Status". If you're not seeing any overages and the cluster is still set to read-only, please contact us and let us know.

HTTP 403: Cluster Read-only

The "Cluster not found"-variant HTTP 404 is distinct from the "Index not found" message. This error message indicates that the routing layer is unable to match your URL to a cluster resource. This can be caused by a few things:

  • A typo in the URL. If you're seeing this in the command line or terminal, then it's possible the hostname is wrong due to a typo or incomplete copy/paste.
  • The cluster has been destroyed. If you deprovision a cluster, it will be destroyed instantly. Further requests to the old URL will result in an HTTP 404 Cluster Not Found response.
  • The cluster has not yet been provisioned. There are a couple cases in which clusters take a few minutes to come online. Attempting to access the cluster before it's operational can lead to an HTTP 404 response.

If you have confirmed that: A) the URL is correct (see Connecting to Bonsai for more information), B) the cluster has not been destroyed, C) the cluster should be up and running, and D) you're still receiving HTTP 404 responses from the cluster, then send us an email and we'll investigate.

HTTP 404: Cluster Not Found

The HTTP 501 Not Implemented error means that the requested feature is not available on Bonsai. Elasticsearch offers a handful of API endpoints that are not exposed on Bonsai for security and performance reasons. You can read more about these in the Unsupported API Endpoints documentation.

HTTP 501: Not Implemented

An HTTP 400 Bad Request can be caused by a variety of problems. However, it is generally a client-side issue. An HTTP 400 implies the problem is not with Elasticsearch, but rather with the request to Elasticsearch.

For example, if you have a mapping that expects a number in a particular field, and then index a document with some other data type in that field, Elasticsearch will reject it with an HTTP 400:

<div class="code-snippet-container">
<a fs-copyclip-element="click" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript"># Create a document with a field called "views" and the number 0
POST /myindex/mytype/1?pretty -d '{"views":0}'
{
 "_index" : "myindex",
 "_type" : "mytype",
 "_id" : "1",
 "_version" : 1,
 "_shards" : {
   "total" : 2,
   "successful" : 2,
   "failed" : 0
 },
 "created" : true
}

# Elasticsearch has automagically determined the "views" field to be a long (integer) data type:
GET /myindex/_mapping?pretty
{
 "myindex" : {
   "mappings" : {
     "mytype" : {
       "properties" : {
         "views" : {
           "type" : "long"
         }
       }
     }
   }
 }
}

# Try to create a new document with a string value instead of a long in the "views" field:
POST /myindex/mytype/2?pretty -d '{"views":"zero"}'
{
 "error" : {
   "root_cause" : [ {
     "type" : "mapper_parsing_exception",
     "reason" : "failed to parse [views]"
   } ],
   "type" : "mapper_parsing_exception",
   "reason" : "failed to parse [views]",
   "caused_by" : {
     "type" : "number_format_exception",
     "reason" : "For input string: \"zero\""
   }
 },
 "status" : 400
}</code></pre>
</div>
</div>

The way to troubleshoot an HTTP 400 error is to read the response carefully and understand which part of the request is raising the exception. That will help you to identify a root cause and remediate.

HTTP 400: Bad Request

An HTTP 502: Bad Gateway error is rare, but when it does happen, there are really only two root causes: a problem with the load balancer, or Elasticsearch is returning a high number of deprecation warnings.

If you are seeing occasional, generic messages about HTTP 502 errors, then the most likely cause is the load balancer. The short explanation is that there are a few cases where the proxy software hits an OOM error and is restarted. This causes the load balancer to send back an HTTP 502. The error message will be very generic, and it will not say anything about Bonsai.io. The easiest solution is to simply catch and retry these HTTP 502's.

If you are seeing frequent, repeated HTTP 502 messages, and those messages say something like "A problem occurred with the Elasticsearch response. Please check status.bonsai.io or contact support@bonsai.io for assistance", then it's likely due to Elasticsearch's response overwhelming the load balancer with HTTP headers. These headers might look something like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">"The [string] field is deprecated, please use [text] or [keyword] instead on [my_field]"</code></pre>
</div>

This usually happens when the client sends a request using a large number of deprecated references. Elasticsearch responds with an HTTP header for each one. If there is a large number (many thousands) of headers, the load balancer will simply close the connection and respond with an HTTP 502 message. The solution is to review your client and application code, and either: A) use smaller bulk requests, or B) update the code so that it's no longer using deprecated features.

If you are having trouble resolving this issue, please let us know.

HTTP 502: Bad Gateway

An HTTP 503: Service Unavailable error indicates a problem with a server somewhere in the network. It is most likely related to a node restart affecting your primary shard(s) before a replica can be promoted.

The easiest solution is to simply catch and retry HTTP 503's. If you've seen this several times in a short period of time, please send us an email and we will investigate.

HTTP 503: Service Unavailable

Pagination Query Parameters

All APIs that support pagination recognize the following request parameters:

<table>
<thead>
<tr>
<th>Parameter</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>page</td><td>The page number, starting at 1</td>
</tr>
<tr>
<td>size</td><td>The size of each page, with a max of 100</td>
</tr>
</tbody>
</table>

Pagination Response Fragment

All API responses which support pagination will include a top level pagination fragment in the JSON response body, which looks like:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">"pagination": {
 "page_number": 1,
 "page_size": 20,
 "total_records": 255
}</code></pre>
</div>
</div>

With the above information, you can infer how many pages there are and iterate until the list is exhausted.
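For example, here is a rough sketch of walking a paginated listing in Ruby; `fetch_page` is a hypothetical helper standing in for however your code calls the API, and the `clusters` key assumes the Clusters API response shape.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby">require 'json'

page    = 1
records = []

loop do
  body       = JSON.parse(fetch_page(page))   # hypothetical helper returning the raw JSON for ?page=N
  pagination = body['pagination']
  records.concat(body['clusters'] || [])      # 'clusters' assumes the Clusters API response shape

  total_pages = (pagination['total_records'].to_f / pagination['page_size']).ceil
  break if page >= total_pages
  page += 1
end</code></pre>
</div>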

API Result Pagination

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

An API Token is required in order to access the API. Tokens are associated to a user within a given account. To create a credential, navigate to your account and click on the API Tokens tab:

Click on the Generate Token button to create a token:

Whenever a token is submitted to the API, a log entry is generated for that token. The logs will indicate which token was used, how it was authenticated, who it belongs to (and the IP address of the requester), and a host of other information:

You can see even more details about a request by clicking on its Details button:

If there is some security concern with the request, there will be a flash message indicating that it was not made safely.

To revoke a token, click on the Revoke button. This will bring up a confirmation dialog. Confirm the request, and the token will be revoked. Once a token is revoked, it can no longer be used to access the API. Requests to the API using a revoked token will result in an HTTP 401: Authorization error.

Creating an API token

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

This introduction to the Bonsai API includes the following sections:

  • Introduction: What is the API and what is required to make it work?
  • Request Data. Sending data to the API.
  • Response Data. Receiving data from the API.
  • Error Handling. What to expect if something fails.
  • API Access Limitations. How often can you hit the API?
  • Breaking Changes. How Bonsai handles changes to the API.
  • Questions, Problems, Feature Requests. Providing feedback about the API.

Introduction

Bonsai provides a REST API at https://api.bonsai.io for managing clusters, exploring plans, and checking out versions and available regions. This API allows customers to create, view, manage and destroy clusters via HTTP calls instead of through the dashboard. The API supports four endpoints:

  • Clusters API. Create, view, manage and destroy clusters on your account.
  • Plans API. View subscription plans available to your account.
  • Spaces API. View locations where your account may provision new clusters.
  • Releases API. View versions of Elasticsearch available to your account.

To interact with the API, users must create an API Token. You can read more about creating those tokens here. An API token will have a key and a secret. The API supports multiple ways of authenticating requests with an API token. All calls to the API using an API token are logged for auditing purposes. Additional constraints on API tokens are in development.

The API generally conforms to RESTful principles. Users interact with their clusters using standard HTTP verbs such as GET, PUT, POST, PATCH and DELETE. The Bonsai API accepts and returns JSON payloads. No other formats are supported at this time.

Request Data

The Bonsai API accepts request bodies in JSON format only. Request bodies that are not in proper JSON will receive an HTTP 422: Unprocessable Entity response code, along with a JSON body containing messages about the problem.

A <span class="inline-code"><pre><code>Content-Type: application/json</code></pre></span> HTTP header is preferred, but not required. Requests may also provide an <span class="inline-code"><pre><code>Accept</code></pre></span> header, as either <span class="inline-code"><pre><code>Accept: */*</code></pre></span> or <span class="inline-code"><pre><code>Accept: application/json</code></pre></span>. Any other accept type will receive an HTTP 422 error.
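As a rough sketch, here is a request built with Ruby's Net::HTTP using the preferred headers. How the API token is attached is an assumption here (shown as HTTP Basic auth with the token's key and secret, read from hypothetical environment variables); see the API token documentation for the authentication options your account supports.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby">require 'json'
require 'net/http'

uri = URI('https://api.bonsai.io/clusters')
req = Net::HTTP::Get.new(uri)
req['Content-Type'] = 'application/json'
req['Accept']       = 'application/json'

# Assumed: HTTP Basic auth with the token key and secret, read from
# hypothetical environment variables.
req.basic_auth(ENV['BONSAI_API_KEY'], ENV['BONSAI_API_SECRET'])

res = Net::HTTP.start(uri.hostname, uri.port, use_ssl: true) { |http| http.request(req) }
puts JSON.parse(res.body)</code></pre>
</div>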

Response Data

The Bonsai API responds with standard HTTP response codes. All HTTP message bodies will be in JSON format. The API documentation for the call will describe the response bodies that a client should expect.

Error Handling

In the event that one or more errors are raised, the API will return a JSON response detailing the problem. The response will have a status code, and an array of error messages. For example:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "errors": [
   "This request has failed authentication. Please read the docs or email us at support@bonsai.io."
 ],
 "status": 401
}</code></pre>
</div>
</div>

Error codes for the API are documented in the API Error Codes section.

API Access Limitations

The Bonsai API limits any given token to 60 requests per minute. Provision requests are limited to 5 per minute. Making too many requests in a short amount of time will result in a temporary period of HTTP 429 responses.
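Below is a minimal Ruby sketch of backing off when the API returns an HTTP 429; `with_backoff` is an illustrative helper, and the block passed to it stands in for however your code performs the request.

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-ruby"># `with_backoff` is an illustrative helper; the block it wraps stands in for
# whatever HTTP call your code makes against api.bonsai.io.
def with_backoff(max_attempts: 5)
  attempts = 0
  loop do
    response = yield
    return response unless response.code.to_i == 429   # not rate limited: done
    attempts += 1
    raise 'Still rate limited after several retries' if attempts >= max_attempts
    sleep(2**attempts)   # wait 2s, 4s, 8s... before trying again
  end
end

# Usage:
# response = with_backoff { Net::HTTP.get_response(URI('https://api.bonsai.io/clusters')) }</code></pre>
</div>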

Additionally, access to the API may be blocked if:

  • Your Bonsai account is cancelled (HTTP 404).
  • Your Bonsai account is suspended due to non-payment (HTTP 402).
  • Your shared secret is revoked (HTTP 401).
  • If fraudulent or bot activity is suspected (HTTP 403).
  • Any violation of our Terms of Service (HTTP 403).

Breaking Changes

The Bonsai team considers the following changes to be backwards compatible, and can be made without advance notice:

  • Adding new API endpoints
  • Non-breaking changes to existing API endpoints:
  • Adding new optional parameters
  • Adding additional attributes to response bodies
  • Adding new, optional features
  • Adding support for new formats (such as XML)
  • Adding support for additional authentication options

Questions, Problems, Feature Requests

The Bonsai team is committed to providing the best, most-reliable platform for deploying and managing Elasticsearch clusters. If you have a question, issue, or just want to submit a feature request, please reach out to our support team at support@bonsai.io.

Introduction to the API

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

The Clusters API provides a means of managing clusters on your account. This API supports the following actions:

  • View all clusters in your account
  • View a single cluster in your account
  • Create a new cluster
  • Update a cluster in your account
  • Destroy a cluster in your account

All calls to the Clusters API must be authenticated with an active API token.

<span id="bonsai-cluster-object"></span>

The Bonsai Cluster Object

The Bonsai API provides a standard format for Cluster objects. A Cluster object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>slug</td><td>A string representing a unique, machine-readable name for the cluster. A cluster slug is based on its name at creation, to which a random integer is concatenated.</td>
</tr>
<tr>
<td>name</td><td>A string representing the human-readable name of the cluster.</td>
</tr>
<tr>
<td>uri</td><td>A URI to get more information about this cluster.</td>
</tr>
<tr>
<td>plan</td><td>An Object with some information about the cluster's current subscription plan. This hash has two keys:

  • slug. The unique, machine-readable name of the plan.
  • uri. A URI to retrieve more information about this plan.

You can see more details about the plan by passing the slug to the Plans API.
</td>
</tr>
<tr>
<td>release</td><td>An Object with some information about the cluster's current release. This hash has five keys:

  • version. The version of the release this cluster is running on.
  • slug. The unique slug of the release.
  • package_name. The package name of the release.
  • service_type. The name of the search service.
  • uri. A URI to retrieve more information about this release.

You can see more details about the release by passing the slug to the Releases API.
</td>
</tr>
<tr>
<td>space</td><td>An Object with some information about where the cluster is running. This has three keys:

  • path. The path to the space. This string maps to a geographic region or data center.
  • region. The geographic region in which the cluster is running.
  • uri. A URI with more information about the space

You can see more details about the space by passing the path to the Spaces API.
</td>
</tr>
<tr>
<td>stats</td><td>An Object with a collection of statistics about the cluster. This hash includes the following keys:

  • docs. The number of documents in the index.
  • shards_used. The number of shards the cluster is using.
  • data_bytes_used. Integer representing the number of bytes the cluster is using on disk.

This attribute should not be used for real-time monitoring! Stats are updated every 10-15 minutes. To monitor real-time metrics, monitor your cluster directly, via the Index Stats API.
</td>
</tr>
<tr>
<td>access</td><td>An Object containing information about connecting to the cluster. This hash has several keys:

  • host. The host name of the cluster.
  • port. The HTTP port the cluster is running on.
  • scheme. The HTTP scheme needed to access the cluster (defaults to "https")

</td>
</tr>
<tr>
<td>state</td><td>A String representing the current state of the cluster. This indicates what the cluster is doing at any given moment. There are 8 defined states:

  • DEPROVISIONED. The cluster has been destroyed.
  • DEPROVISIONING. The cluster is in the process of being destroyed.
  • DISABLED. The cluster has been disabled.
  • MAINTENANCE. The cluster is in maintenance mode.
  • PROVISIONED. The cluster has been created and is ready for use.
  • PROVISIONING. The cluster is in the process of being created.
  • READONLY. The cluster is in read only mode.
  • UPDATING PLAN. The cluster's plan is being updated.

</td>
</tr>
</tbody>
</table>

<span id="view-all-clusters"></span>

View all clusters

The Bonsai API provides a method to get a list of all active clusters on your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/clusters</code></pre></span> endpoint, and Bonsai will return a JSON list of Cluster objects. This API call will not return deprovisioned clusters. This call uses pagination, so you may need to make multiple requests to fetch all clusters.

Supported Parameters (Query String Parameters)

<table>
<thead>
<tr>
<th>Param</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>q</td><td>Optional. A query string for filtering matching clusters. This currently matches against the cluster name.</td>
</tr>
<tr>
<td>tenancy</td><td>Optional. A string which will constrain results to parent or child clusters. Valid values are: parent, child</td>
</tr>
<tr>
<td>location</td><td>Optional. A string representing the account, region, space, or cluster path where the cluster is located. You can get a list of available spaces with the Spaces API. Space path prefixes work here, so you can find all clusters in a given region for a given cloud.</td>
</tr>
</tbody>
</table>

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/clusters</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an HTTP 200: OK code, along with a JSON list representing the clusters on your account:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{  
  "pagination": {    
    "page_number": 1,    
    "page_size": 2,    
    "total_records": 2  
  },  
  "clusters": [    
    {      
      "slug": "first-testing-cluste-1234567890",      
      "name": "first_testing_cluster",      
      "uri": "https://api.bonsai.io/clusters/first-testing-cluste-1234567890",      
      "plan": {        
        "slug": "sandbox-aws-us-east-1",        
        "uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"      
      },      
      "release": {        
        "version": "7.2.0",        
        "slug": "elasticsearch-7.2.0",
        "package_name": "7.2.0",
        "service_type": "elasticsearch",        
        "uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"      
      },      
      "space": {        
        "path": "omc/bonsai/us-east-1/common",        
        "region": "aws-us-east-1",        
        "uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"      
      },      
      "stats": {        
        "docs": 0,        
        "shards_used": 0,        
        "data_bytes_used": 0      
      },      
      "access": {        
        "host": "first-testing-cluste-1234567890.us-east-1.bonsaisearch.net",        
        "port": 443,      
        "scheme": "https"      
      },      
      "state": "PROVISIONED"    
    },    
    {      
      "slug": "second-testing-clust-1234567890",      
      "name": "second_testing_cluster",      
      "uri": "https://api.bonsai.io/clusters/second-testing-clust-1234567890",      
      "plan": {        
        "slug": "sandbox-aws-us-east-1",        
        "uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"      
      },      
      "release": {        
        "version": "7.2.0",        
        "slug": "elasticsearch-7.2.0",
        "package_name": "7.2.0",
        "service_type": "elasticsearch",        
        "uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"      
      },      
      "space": {        
        "path": "omc/bonsai/us-east-1/common",        
        "region": "aws-us-east-1",        
        "uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"      
      },      
      "stats": {        
        "docs": 0,        
        "shards_used": 0,        
        "data_bytes_used": 0      
      },      
      "access": {        
        "host": "second-testing-clust-1234567890.us-east-1.bonsaisearch.net",        
        "port": 443,        
        "scheme": "https"      
      },      
      "state": "PROVISIONED"    
    }  
  ]
}</code></pre>
</div>
</div>

<span id="view-single-cluster"></span>

View a single cluster

The Bonsai API provides a method to retrieve information about a single cluster on your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to  <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Cluster object:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
 "cluster": {
   "slug": "second-testing-clust-1234567890",
   "name": "second_testing_cluster",
   "uri": "https://api.bonsai.io/clusters/second-testing-clust-1234567890",
   "plan": {
     "slug": "sandbox-aws-us-east-1",
     "uri": "https://api.bonsai.io/plans/sandbox-aws-us-east-1"
   },
   "release": {
     "version": "7.2.0",
     "slug": "elasticsearch-7.2.0",
     "package_name": "7.2.0",
     "service_type": "elasticsearch",
     "uri": "https://api.bonsai.io/releases/elasticsearch-7.2.0"
   },
   "space": {
     "path": "omc/bonsai/us-east-1/common",
     "region": "aws-us-east-1",
     "uri": "https://api.bonsai.io/spaces/omc/bonsai/us-east-1/common"
   },
   "stats": {
     "docs": 0,
     "shards_used": 0,
     "data_bytes_used": 0
   },
   "access": {
     "host": "second-testing-clust-1234567890.us-east-1.bonsaisearch.net",
     "port": 443,
     "scheme": "https"
   },
   "state": "PROVISIONED"
 }
}</code></pre>
</div>
</div>

<span id="create-new-cluster"></span>

Create a new cluster

The Bonsai API provides a method to create new clusters on your account. An HTTP POST call is made to the <span class="inline-code"><pre><code>/clusters</code></pre></span> endpoint, and Bonsai will create the cluster.

Supported Parameters

<table>
<thead>
<tr>
<th>Param</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td><td>Required. A String representing the name for your new cluster.</td>
</tr>
<tr>
<td>plan</td><td>A String representing the slug of the new plan for your cluster. You can get a list of available plans via the Plans API.</td>
</tr>
<tr>
<td>space</td><td>A String representing the Space slug where the new cluster will be created. You can get a list of available spaces with the Spaces API.</td>
</tr>
<tr>
<td>release</td><td>A String representing the search service release to use. You can get a list of available versions with the Releases API.</td>
</tr>
</tbody>
</table>

HTTP Request

An HTTP POST call is made to /clusters along with a JSON payload of the supported parameters.
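
As an illustrative sketch, a create request could be sent with curl as follows. The token credentials are placeholders, and the plan, space, and release slugs are taken from the example objects elsewhere in this documentation:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript">curl -s -XPOST "https://API_KEY:API_SECRET@api.bonsai.io/clusters" \
  -H "Content-Type: application/json" \
  -d '{"name": "test_cluster", "plan": "sandbox-aws-us-east-1", "space": "omc/bonsai/us-east-1/common", "release": "elasticsearch-7.2.0"}'</code></pre>
</div>
</div>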

HTTP Response

Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 202: Accepted</code></pre></span> code, along with a short message and details about the cluster that was created:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">{
 "message": "Your cluster is being provisioned.",
 "monitor": "https://api.bonsai.io/clusters/test-5-x-3968320296",
 "access": {
   "user": "utji08pwu6",
   "pass": "18v1fbey2y",
   "host": "test-5-x-3968320296",
   "port": 443,
   "scheme": "https",
   "url": "https://utji08pwu6:18v1fbey2y@test-5-x-3968320296.us-east-1.bonsaisearch.net:443"
 },
 "status": 202
}</code></pre>
</div>
</div>

Error

An <span class="inline-code"><pre><code>HTTP 422: Unprocessable Entity</code></pre></span> error may arise if you are trying to create more Sandbox clusters than your account allows:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">{
 "errors": [
   "The requested plan is not available for provisioning. Solution: Please use the plans endpoint for a list of available plans.",
   "Your request could not be processed. "
 ],
 "status":422
}</code></pre>
</div>
</div>

If you are not creating a Sandbox cluster, please refer to the API Error 422: Unprocessable Entity documentation.

<span id="update-cluster"></span>

Update a cluster

The Bonsai API provides a method to update the name or plan of your cluster. An HTTP PUT call is made to the <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> endpoint, and Bonsai will update the cluster.

Supported Parameters

<table>
<thead>
<tr>
<th>Param</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td><td>A String representing the new name for your cluster. Changing the cluster name will not change its URL.</td>
</tr>
<tr>
<td>plan</td><td>A String representing the slug of the new plan for your cluster. Updating the plan may trigger a data migration. You can get a list of available plans via the Plans API.</td>
</tr>
</tbody>
</table>

HTTP Request

To make a change to an existing cluster, make an HTTP PUT call to <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> with a JSON body for one or more of the supported params.
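
For example, a plan change might be requested with a call like the following sketch (the token credentials and slugs are placeholders drawn from the examples in this document):

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript">curl -s -XPUT "https://API_KEY:API_SECRET@api.bonsai.io/clusters/second-testing-clust-1234567890" \
  -H "Content-Type: application/json" \
  -d '{"plan": "standard-sm"}'</code></pre>
</div>
</div>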

HTTP Response

Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 202: Accepted</code></pre></span> code, along with a short message:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">{
   "message": "Your cluster is being updated.",
   "monitor": "https://api.bonsai.io/clusters/[:slug]",
   "status": 202
}</code></pre>
</div>
</div>

<span id="destroy-cluster"></span>

Destroy a cluster

The Bonsai API provides a method to delete a cluster from your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP DELETE call is made to the <span class="inline-code"><pre><code>/clusters/[:slug]</code></pre></span> endpoint.
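
As a sketch, with placeholder credentials and the example slug used above:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript">curl -s -XDELETE "https://API_KEY:API_SECRET@api.bonsai.io/clusters/second-testing-clust-1234567890"</code></pre>
</div>
</div>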

HTTP Response

Bonsai will respond with an HTTP 202: Accepted code, along with a short message:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">{
   "message": "Your cluster is being deprovisioned.",
   "monitor": "https://api.bonsai.io/clusters/[:slug]",
   "status": 202
}</code></pre>
</div>
</div>

Cluster API Introduction

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

The Plans API gives users the ability to explore the different cluster subscription plans available to their account. This API supports the following actions:

  • View all plans available for your account.
  • View a single plan available for your account.

All calls to the Plans API must be authenticated with an active API token.

The Bonsai Plan Object

The Bonsai API provides a standard format for Plan objects. A Plan object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>slug</td><td>A String representing a machine-readable name for the plan. </td>
</tr>
<tr>
<td>name</td><td>A String representing the human-readable name of the plan.</td>
</tr>
<tr>
<td>price_in_cents</td><td>An Integer representing the plan price in cents.</td>
</tr>
<tr>
<td>billing_interval_in_months</td><td>An Integer representing the plan billing interval in months.</td>
</tr>
<tr>
<td>single_tenant</td><td>A Boolean indicating whether the plan is single-tenant or not. A value of false indicates the Cluster will share hardware with other Clusters. Single tenant environments can be reached via the public Internet. Additional documentation here.</td>
</tr>
<tr>
<td>private_network</td><td>A Boolean indicating whether the plan places the cluster on a private network that is not publicly addressable. Private plans provide environments that cannot be reached by the public Internet. A VPC connection will be needed to communicate with a private cluster.</td>
</tr>
<tr>
<td>available_releases</td><td>An Array with a collection of search release slugs available for the plan. Additional information about a release can be retrieved from the Releases API.</td>
</tr>
<tr>
<td>available_spaces</td><td>An Array with a collection of Space paths available for the plan. Additional information about a space can be retrieved from the Spaces API.</td>
</tr>
</tbody>
</table>

View all plans

The Bonsai API provides a method to get a list of all plans available to your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/plans</code></pre></span> endpoint, and Bonsai will return a JSON list of Plan objects.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/plans</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the Plans available to your account:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
  "plans": [
     {
         "slug": "sandbox-aws-us-east-1",
         "name": "Sandbox",
         "price_in_cents": 0,
         "billing_interval_in_months": 1,
         "single_tenant": false,
         "private_network": false,
         "available_releases": [
             "7.2.0"
         ],
         "available_spaces": [
             "omc/bonsai-gcp/us-east4/common",
             "omc/bonsai/ap-northeast-1/common",
             "omc/bonsai/ap-southeast-2/common",
             "omc/bonsai/eu-central-1/common",
             "omc/bonsai/eu-west-1/common",
             "omc/bonsai/us-east-1/common",
             "omc/bonsai/us-west-2/common"
         ]
     },
     {
        "slug": "standard-sm",
        "name": "Standard Small",
        "price_in_cents": 5000,
        "billing_interval_in_months": 1,
        "single_tenant": false,
        "private_network": false,
        "available_releases": [
           "elasticsearch-5.6.16",
           "elasticsearch-6.8.3",
           "elasticsearch-7.2.0"
        ],
        "available_spaces": [
           "omc/bonsai/ap-northeast-1/common",
           "omc/bonsai/ap-southeast-2/common",
           "omc/bonsai/eu-central-1/common",
           "omc/bonsai/eu-west-1/common",
           "omc/bonsai/us-east-1/common",
           "omc/bonsai/us-west-2/common"
        ]
     }
   ]
 }</code></pre>
</div>
</div>

View a single plan

The Bonsai API provides a method to retrieve information about a single Plan available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/plans/[:plan-slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai will respond with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Plan object:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
   "slug": "sandbox-aws-us-east-1",
   "name": "Sandbox",
   "price_in_cents": 0,
   "billing_interval_in_months": 1,
   "single_tenant": false,
   "private_network": false,
   "available_releases": [
       "elasticsearch-7.2.0"
   ],
   "available_spaces": [
       "omc/bonsai-gcp/us-east4/common",
       "omc/bonsai/ap-northeast-1/common",
       "omc/bonsai/ap-southeast-2/common",
       "omc/bonsai/eu-central-1/common",
       "omc/bonsai/eu-west-1/common",
       "omc/bonsai/us-east-1/common",
       "omc/bonsai/us-west-2/common"
   ]
}
</code></pre>
</div>
</div>

Plans API Introduction

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

The Spaces API provides users a method to explore the server groups and geographic regions available to their account, where clusters may be provisioned. This API supports the following actions:

  • View all available spaces for your account
  • View a single space for your account

All calls to the Spaces API must be authenticated with an active API token.

The Bonsai Space Object

The Bonsai API provides a standard format for Space objects. A Space object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>path</td><td>A String representing a machine-readable name for the server group.</td>
</tr>
<tr>
<td>private_network</td><td>A Boolean indicating whether the space is isolated and inaccessible from the public Internet. A VPC connection will be needed to communicate with a private cluster.</td>
</tr>
<tr>
<td>cloud</td><td>An Object containing details about the cloud provider and region attributes:

  • provider. A String representing a machine-readable name for the cloud provider in which this space is deployed.
  • region. A String representing a machine-readable name for the geographic region of the server group.

</td>
</tr>
</tbody>
</table>

View all available spaces

The Bonsai API provides a method to get a list of all available spaces on your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/spaces</code></pre></span> endpoint, and Bonsai will return a JSON list of Space objects.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/spaces</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the spaces available to your account:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "spaces": [
   {
     "path": "omc/bonsai/us-east-1/common",
     "private_network": false,
     "cloud": {
       "provider": "aws",
       "region": "aws-us-east-1"
     }
   },
   {
     "path": "omc/bonsai/eu-west-1/common",
     "private_network": false,
     "cloud": {
       "provider": "aws",
       "region": "aws-eu-west-1"
     }
   },
   {
     "path": "omc/bonsai/ap-southeast-2/common",
     "private_network": false,
     "cloud": {
       "provider": "aws",
       "region": "aws-ap-southeast-2"
     }
   }
 ]
}</code></pre>
</div>
</div>

View a single space

The Bonsai API provides a method to get information about a single space available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/spaces/[:path]</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Space object:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
 "path": "omc/bonsai/us-east-1/common",
 "private_network": false,
 "cloud": {
   "provider": "aws",
   "region": "aws-us-east-1"
 }
}</code></pre>
</div>
</div>

Spaces API Introduction

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

The Releases API provides users a method to explore the different versions of Elasticsearch available to their account. This API supports the following actions:

  • View all available releases for your account
  • View a single release for your account

All calls to the Releases API must be authenticated with an active API token.

The Bonsai Release Object

The Bonsai API provides a standard format for Release objects. A Release object includes:

<table>
<thead>
<tr>
<th>Attribute</th><th>Description</th>
</tr>
</thead>
<tbody>
<tr>
<td>name</td><td>A String representing the name for the release.</td>
</tr>
<tr>
<td>slug</td><td>A String representing the machine-readable name for the release.</td>
</tr>
<tr>
<td>version</td><td>A String representing the version of the release.</td>
</tr>
<tr>
<td>multitenant</td><td>A Boolean representing whether or not the release is available on multitenant deployments.</td>
</tr>
</tbody>
</table>

View all available releases

The Bonsai API provides a method to get a list of all releases available to your account. An HTTP GET call is made to the <span class="inline-code"><pre><code>/releases</code></pre></span> endpoint, and Bonsai will return a JSON list of Release objects.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/releases</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON list representing the releases available to your account:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "releases": [
   {
     "name": "Elasticsearch 5.6.16",
     "slug": "elasticsearch-5.6.16",
     "service_type": "elasticsearch",
     "version": "5.6.16",
     "multitenant": true
   },
   {
     "name": "Elasticsearch 6.5.4",
     "slug": "elasticsearch-6.5.4",
     "service_type": "elasticsearch",
     "version": "6.5.4",
     "multitenant": true
   },
   {
     "name": "Elasticsearch 7.2.0",
     "slug": "elasticsearch-7.2.0",
     "service_type": "elasticsearch",
     "version": "7.2.0",
     "multitenant": true
   }
 ]
}</code></pre>
</div>
</div>

View a single release

The Bonsai API provides a method to get information about a single release available to your account.

Supported Parameters

No parameters are supported for this action.

HTTP Request

An HTTP GET call is made to <span class="inline-code"><pre><code>/releases/[:slug]</code></pre></span>.

HTTP Response

Upon success, Bonsai responds with an  <span class="inline-code"><pre><code>HTTP 200: OK</code></pre></span> code, along with a JSON body representing the Release object:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">{
 "name": "Elasticsearch 7.2.0",
 "slug": "elasticsearch-7.2.0",
 "service_type": "elasticsearch",
 "version": "7.2.0",
 "multitenant": true
}</code></pre>
</div>
</div>

Releases API Introduction

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

HMAC

The Bonsai API supports a hash-based message authentication code protocol for authenticating requests. This scheme allows the API to simultaneously verify both the integrity and the authenticity of a user’s request.

This authentication protocol requires that all API requests include three HTTP headers:

  • X-BonsaiApi-Time. The current Unix time – seconds since epoch. This value helps to guarantee uniqueness over time, and must be within one minute of our server time to prevent replay attacks.
  • X-BonsaiApi-Key. The API token’s key, as generated within the Bonsai application. This key will also have a corresponding secret.
  • X-BonsaiApi-Hmac. The hexadecimal HMAC-SHA1 digest of the concatenation of the above time and token key, computed using your token secret as the HMAC key.

For example, in Ruby, the X-BonsaiApi-Hmac header can be computed as: OpenSSL::HMAC.hexdigest('sha1', token_secret, "#{time}#{token_key}"), where token_secret is the API token’s secret.
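
The same headers can also be computed from a shell. The sketch below assumes OpenSSL is installed locally; the token key and secret are placeholders:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript"># Compute the three HMAC headers and make an authenticated request (placeholder token values).
TOKEN_KEY="abc123"
TOKEN_SECRET="supersecret"
TIME=$(date +%s)
HMAC=$(printf '%s%s' "$TIME" "$TOKEN_KEY" | openssl dgst -sha1 -hmac "$TOKEN_SECRET" | awk '{print $NF}')

curl -s https://api.bonsai.io/clusters \
  -H "X-BonsaiApi-Time: $TIME" \
  -H "X-BonsaiApi-Key: $TOKEN_KEY" \
  -H "X-BonsaiApi-Hmac: $HMAC"</code></pre>
</div>
</div>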

Using HMAC Authentication

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

The Bonsai API supports two methods of authenticating requests: HTTP Basic Auth, and HMAC. The former is a widely-adopted standard supported by most HTTP clients, but requires an encrypted connection for safe transmission. The latter is an older and slightly more complicated method, but offers some security over unencrypted connections.

Which Scheme Should I Use?

If you’re connecting to the API via https (as most people are), then Basic Auth is fine. The header containing the credentials is encrypted using industry-standard protocols before being sent over the Internet. TLS allows you to authenticate the receiving party (the API) using a trusted certificate authority, rendering MITM attacks highly unlikely. Basic Auth is not secure over unencrypted connections, however. Your credentials could be leaked and read by a third party.

If you can’t use https for some reason, then consider using HMAC. This protocol involves passing some special headers along with your API requests, and it assumes that a third party may be able to read the transmission. It’s slightly more complicated to configure, but it uses a time-based nonce signed with your token secret, which mitigates MITM and replay attacks. A third party could see the data you send and receive with the API, but would not be able to steal your API credentials and interact with the API on your behalf.

Failed Authentication

Requests that do not have the proper authentication will receive an HTTP 401: Unauthorized response. This can happen for a variety of reasons, including (but not limited to):

  1. The token itself has been revoked
  2. The token key cannot be found (perhaps due to a typo)
  3. The token secret does not match the secret provided
  4. One or more HTTP headers have been miscalculated
  5. (When using HMAC) the X-BonsaiApi-Time timestamp deviates more than 60 seconds from the server time
  6. Some other filtering rule has been violated

If you are having trouble authenticating your requests to the API, please reach out to support@bonsai.io.

Authentication

Alpha Stage

The Bonsai API is currently in its Alpha release phase. It may not be feature-complete, and is subject to change without notice. If you have any questions about the roadmap of the API, please reach out to support.

The Bonsai API supports HTTP Basic Authentication over TLS >= 1.2 as one means for authenticating requests. The authentication protocol utilizes an Authorization header whose contents are the word Basic, followed by a space and the Base64 encoding of the token key and secret joined by a colon.

Many tools, such as curl, will construct this header automatically from credentials in a URL. For example, curl will translate https://user:pass@api.bonsai.io/ into https://api.bonsai.io/ with the header Authorization: Basic dXNlcjpwYXNz.

For Basic Auth, the token key corresponds to the "user" parameter and the token secret corresponds to the "password" parameter.
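
As a minimal sketch, the header can also be built by hand; the credentials here are placeholders:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript"># Construct the Authorization header explicitly (placeholder credentials).
curl -s https://api.bonsai.io/clusters \
  -H "Authorization: Basic $(printf '%s' 'user1234:somereallylongpassword' | base64)"</code></pre>
</div>
</div>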

Using Basic Authentication

An HTTP 401: Unauthorized error occurs when a request to the API could not be authenticated. All requests to API resources must use some authentication scheme to prove access rights to the resource.

If you are receiving an HTTP 401: Unauthorized error, there are several possibilities for why it might be occurring:

  • The authentication credentials are missing
  • The authentication credentials are incorrect
  • The authentication credentials belong to a token that has been revoked

Check that the authentication credentials you are passing along in the request are correct and belong to an active token.

Example

A call to the API that results in an HTTP 401: Unauthorized error may look something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "errors": [
   "Authentication failed.",
   "Could not authenticate your request.",
   "This request has failed authentication. Please read the docs or email us at support@bonsai.io."
 ],
 "status": 401
}</code></pre>
</div>
</div>

The <span class="inline-code"><pre><code>"status": 401</code></pre></span> key indicates the HTTP 401: Unauthorized error.

Troubleshooting

The first thing to do is to carefully read the list of errors returned by the API. This will often include some hints about what is happening:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
 "errors": [
   "The 'Authorization' header has no value for the password field.",
   "The API token is missing, inactive or does not exist.",
   "Authentication failed.",
   "Could not authenticate your request.",
   "This request has failed authentication. Please read the docs or email us at support@bonsai.io."
 ],
 "status": 401
}</code></pre>
</div>
</div>

If that doesn't help, then check that the credentials you're sending are correct. You can view the credentials in your account dashboard and cross-reference them with the credentials you're passing to the API.

If you're sure that the credentials are correct, then you may want to try isolating the problem. Try making a curl call to the API and see what happens. For example, using Basic Auth:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">curl -s -vvv -XGET https://user1234:somereallylongpassword@api.bonsai.io/clusters/
{
 "clusters": [],
 "status": 200
}</code></pre>
</div>
</div>

If the request succeeds, then you have eliminated the API Token as the source of the problem, and it's likely an issue with how the application is making the call to the API.

If the request still fails, then you should consult the documentation for the authentication scheme you're using to determine which HTTP headers are needed in the request. You can also use the -vvv flag in curl to see which headers and values are being passed with the request.

If all else fails, you can always contact support and we will be glad to assist.

API Error 401: Unauthorized

An HTTP 429: Too Many Requests error occurs when an API token is used to make too many requests to the API in a given period. Bonsai throttles the number of API calls that can be made by any given token in order to maintain a high level of service and prevent DoS scenarios.

If you are receiving an HTTP 429: Too Many Requests error, then you are hitting the API too frequently. The API documentation introduction describes which limits are in place.

Example

A call to the API that results in an HTTP 429: Too Many Requests error may look something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{  
  "errors": [    
    "You are making too many requests to the API.",    
    "Please read the documentation at https://docs.bonsai.io/article/308-api-error-429-too-many-requests"  
  ],  
  "status": 429
}</code></pre>
</div>
</div>

Troubleshooting

The first step is to look at how often you are polling the API. If you're checking the API while waiting for a cluster to provision or update, then every 3-5 seconds should be adequate.

If you have an adequate amount of sleep time in between API calls, then the next thing to check would be whether you have multiple jobs checking the API using the same token. If you're using some kind of CI system to spin up and tear down clusters during testing, and those jobs all share the same token, then they're likely interfering with each other.

Note that when the API returns an HTTP 429: Too Many Requests error, it will include a header called <span class="inline-code"><pre><code>Retry-After</code></pre></span>, which indicates how many seconds you will need to wait before making your next request. You could add a check for this header to your scripts, so that an HTTP 429 does not cause a hard failure, but instead tells the script how long to sleep before proceeding.
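
A minimal polling sketch that honors Retry-After might look like this; the credentials and cluster slug are placeholders, and the 5-second fallback is an arbitrary choice:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript"># Poll a cluster's status, sleeping for Retry-After seconds whenever a 429 is returned.
while true; do
  body=$(curl -s -D headers.txt "https://API_KEY:API_SECRET@api.bonsai.io/clusters/my-cluster-1234567890")
  status=$(awk 'NR==1 {print $2}' headers.txt)
  if [ "$status" = "429" ]; then
    wait=$(awk 'tolower($1) == "retry-after:" {print $2}' headers.txt | tr -d '\r')
    sleep "${wait:-5}"   # fall back to 5 seconds if the header is missing
    continue
  fi
  echo "$body"
  break
done</code></pre>
</div>
</div>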

If all else fails, you can always contact support and we will be glad to assist.

API Error 429: Too Many Requests

An HTTP 422: Unprocessable Entity error occurs when a request to the API can not be processed. This is a client-side error, meaning the problem is with the request itself, and not the API.

If you are receiving an HTTP 422: Unprocessable Entity error, there are several possibilities for why it might be occurring:

  • You are sending in a payload that is not valid JSON
  • You are sending HTTP headers, such as <span class="inline-code"><pre><code>Content-Type</code></pre></span> or <span class="inline-code"><pre><code>Accept</code></pre></span>, which specify a value other than <span class="inline-code"><pre><code>application/json</code></pre></span>
  • The request may be valid JSON, but there is something wrong with the content. For example, instructing the API to upgrade a cluster to a plan which does not exist.

Example

A call to the API that results in an HTTP 422: Unprocessable Entity error may look something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "errors": [
   "The Content-Type header specifies application/xml.",
   "The Accept header specifies a response format other than application/json.",
   "Your request could not be processed. "
 ],
 "status": 422
}</code></pre>
</div>
</div>

The  <span class="inline-code"><pre><code>"status": 422</code></pre></span> key indicates the HTTP 422: Unprocessable Entity error.

Troubleshooting

The first step in troubleshooting this error is to carefully inspect the response from the API. It will often provide valuable information about what went wrong with the request. For example, if there was a problem creating a cluster because a plan slug was not recognized, you might see something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">{
 "errors": [
   "Plan 'sandboxing' not found.",
   "Please use the /plans endpoint for a list of available plans.",
   "Your request could not be processed. "
 ],
 "status": 422
}</code></pre>
</div>
</div>

If all else fails, you can always contact support and we will be glad to assist.

API Error 422: Unprocessable Entity

An HTTP 402: Payment Required error occurs when your account is past due and you try to make a request to the API. Bonsai only provides API access to accounts which are up to date on payments.

If you are receiving an HTTP 402: Payment Required error, then there is a balance due on your account. You can update your billing information in your account profile. If you run into any issues, there is documentation available here.

Example

A call to the API which results in an HTTP 402: Payment Required error may look something like this:

<div class="code-snippet-container">
<div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "errors": [
   "Your account has been suspended due to non-payment. Please update your billing information or contact support@bonsai.io."
 ],
 "status": 402
}</code></pre></div></div>


The <span class="inline-code"><pre><code>"status": 402</code></pre></span> key indicates the HTTP 402: Payment Required error.

Troubleshooting

The first thing to do is navigate to your billing profile and make sure your account is up to date. You can update your credit card if needed, and review any recent invoices. Additionally, you should check your inbox and spam folders for any billing-related notices from Bonsai.

If everything seems correct with your account, you can always contact support and we will be glad to assist.

API Error 402: Payment Required

An HTTP 403: Forbidden error can occur for one of several reasons. Generally, it communicates that the server understood the request, but is refusing to authorize it. This is distinct from an authentication error (HTTP 401), in that the authentication credentials are correct, but there is some other reason the request is not authorized.

Some examples of situations where a user might see an HTTP 403: Forbidden response from the API:

  • We have detected activity on your account that appears fraudulent or highly suspicious, and have blocked access to the API until further notice.
  • We have identified a Terms of Service violation and are blocking access until it is resolved.
  • The API may be down for maintenance.

Example

A call to the API which results in an HTTP 403: Forbidden response may look something like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">{
 "errors": [
   "Your API access has been suspended due to a Terms of Service violation. Please contact support@bonsai.io."
 ],
 "status": 403
}</code></pre>
</div>
</div>

The <span class="inline-code"><pre><code>"status": 403</code></pre></span> key indicates that the error is indeed an HTTP 403: Forbidden error.

Troubleshooting

The first step for troubleshooting the error is to examine the error messages in detail. If there is a problem that merits contacting support, then you will want to reach out to us for further discussion. Also check your email inbox and spam folders for anything that we may have already sent.

If the error indicates a temporary interruption, such as maintenance mode, then check out our Twitter account for updates, or shoot us an email.

API Error 403: Forbidden

Upgrading Your Bonsai Cluster

Overview

Upgrading your Bonsai cluster to a more powerful one is a straightforward process, designed to minimize downtime and maintain the integrity of your data. This document outlines the steps involved, the expected downtime, and the effort required for a successful upgrade.

Upgrade Process

Bonsai utilizes a two-phase snapshot and restore process to migrate your data from one hardware cluster to another. A snapshot of your data is taken from the current hardware and restored to the new hardware. There is no downtime or production impact to your cluster during this phase. This phase of the upgrade can take time, depending on how much data you have on your existing hardware.

Once this is completed, a second snapshot-restore operation is performed. This operation is simply a delta — covering only the data that has been changed since the phase-1 snapshot was taken. Usually this second phase lasts a few seconds, or maybe a minute. During this phase, your cluster is placed into a read-only mode. Search traffic is not impacted, but writes are blocked until the restore completes.

Once this final restore has completed, the cluster will be running entirely on the new hardware, with no data loss.

Downtime and Effort

  1. Zero Downtime for Searches:
    • Your search operations will not experience any downtime during the upgrade process. This ensures continuous availability for read operations.
  2. Short Window for Writes:
    • There will be a brief period where write operations are paused. This typically lasts less than a minute, depending on the amount of data being transferred.
    • An error message will be returned during this pause: 403 Forbidden, along with a JSON response indicating that the cluster is currently read-only for maintenance and that you should retry your request in a few minutes, checking status.bonsai.io or contacting support@bonsai.io for updates. It's essential to handle this gracefully in your application.
  3. Data Volume Considerations:
    • The time required for the upgrade depends on the size of your data and the rate of updates. Larger datasets may take slightly longer, but the write pause is generally kept under a minute.
  4. Using a Queue and Buffering:
    • As a general rule, we highly recommend using a queue to buffer write operations wherever possible. This could be something as simple as a Redis queue, Kafka, or a similar tool that implements retries and exponential backoff for failed writes to the cluster. This ensures that transient network issues or other connection problems don’t cause lost writes.
    • During the upgrade process, this kind of queue can be paused to ensure that any writes attempted during the brief read-only phase are retried and successfully processed once the cluster is upgraded. A minimal retry sketch follows this list.
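
The sketch below illustrates the retry-with-backoff idea using plain curl; the BONSAI_URL variable, index name, and document body are hypothetical, and a production system would normally delegate this to its queue or client library:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code class="hljs language-javascript"># Retry a write with exponential backoff until it succeeds or we give up.
attempt=0
backoff=1
until curl -fsS -XPOST "$BONSAI_URL/myindex/_doc" \
        -H "Content-Type: application/json" \
        -d '{"title": "example document"}'; do
  attempt=$((attempt + 1))
  if [ "$attempt" -ge 5 ]; then
    echo "Giving up after $attempt attempts" >&2
    break
  fi
  backoff=$((backoff * 2))   # exponential backoff: 2, 4, 8, 16 seconds
  sleep "$backoff"
done</code></pre>
</div>
</div>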

Let’s discuss how to change a cluster’s plan via the Bonsai Cluster Dashboard.


To begin, navigate to the cluster’s Plan under Settings on the cluster dashboard.

If you haven’t added billing information yet, please do so at Account Billing. Detailed steps can be found in Add a Credit Card.

Once there is billing information listed on the account, you will see the different options for changing the cluster’s plan. Select the plan you would like to change to and click the green Change to … Plan button. In this example, this `documentation_cluster` is on a Sandbox plan and it will be upgraded to a Standard Micro plan.

A successful plan change will display a notification at the top reading Plan scheduled for update. In this example, the plan has been upgraded to the Standard Micro plan.

Can't Downgrade?

Please note that downgrading a plan may fail if it puts the cluster into an Overage State for the new plan. For example, downgrading from a Standard Micro plan to a Sandbox plan will fail if there are 11 shards on the cluster (the Sandbox plan has a shard limit of 10).

Upgrading and Changing a Cluster’s Plan

Hold Up!

Bonsai clusters can be provisioned in a few different ways. These instructions are for users who want to be billed directly by Bonsai. If you are a Heroku customer and are adding Bonsai to your app, then you will not need these instructions. Instead, check out:

Creating an account on Bonsai is the first step towards provisioning and scaling an Elasticsearch cluster on demand. The following guide has everything you need to know about signing up.

  1. Fill out account details
  2. Describe your project
  3. Set up your cluster details
  4. Confirm your email
  5. FAQs

1. Fill out account details

To start the sign up process, head to our Sign up page, where you’ll begin by providing account information. Click Create Account to continue.

2. Describe your project

In this section, you can provide some extra project details to allow our project team to best support your cluster. We have a few questions for you to answer to tailor your experience when using Bonsai.

Click Next to move on.

3. Set up your cluster details

Fill out your cluster name, then select a release and a region. All hobby and testing clusters are provisioned with the latest version. More information about our version and region support can be found here.

You don’t have to sign up with payment details - your first testing cluster is on us! If you wish to provision or upgrade to a larger cluster with more resources, you’ll need to add a credit card after confirming your email.

Click Provision Cluster to complete your signup.

4. Confirm your email

If everything checks out, you should receive an email from us shortly with a confirmation button.

Navigate to your email client and click on Confirm email.

You’re all set! Thank you for signing up.

If you don't receive the sign up email, please check your Spam folder. You can also reach out to us for help.

5. FAQs

Q: How does Bonsai validate emails?

A: We validate emails using RFC 2822 and RFC 3696. If you have a non-conforming email address, let us know at support@bonsai.io.

Q: Do you have tips for creating a strong password?

A: Yes! We’re a security-conscious bunch, but we don’t have any arcane rules about what kinds of characters you must use for your password. Why? We’ll let xkcd explain it, but the tl;dr is our password policy simply enforces a minimum length of 10 characters. We also reject common passwords that have been pwned. Sadly, correct horse battery staple appears in our blacklist.

We also highly recommend using a password manager like 1Password to generate unique, secure passwords.

Q: What happens if I try to sign up with an email address that I used before or that is taken?

A: If you have signed up for Bonsai in the past using this email address, you will receive an email directing you to log in using it.

Q: Do you have tips on picking a cluster region?

A: You will want to select a region that’s as close to where your application is hosted as possible to minimize latency. Doing so will ensure the fastest search and best user experience for your application.

Bonsai is built on Amazon Web Services (AWS) and Google Cloud Platform (GCP), and we run clusters in several of their regions. Users are able to provision clusters in the following regions:

Amazon Web Services
  • us-east-1 (Virginia)
  • us-west-2 (Oregon)
  • eu-west-1 (Ireland)
  • ap-southeast-2 (Sydney)
  • eu-central-1 (Frankfurt)
Google Cloud Platform
  • gcp-us-east4 (Virginia)
  • gcp-us-east1 (South Carolina)
  • gcp-us-central1 (Iowa)
  • gcp-us-west1 (Oregon)

These regions are supported due to broad demand. We can support other regions as well, but pricing will vary. Shoot us an email if you’d like to learn more about getting Bonsai running in a region not listed above.

Q: How should I approach naming clusters?

A: Users are able to manage multiple clusters through their Bonsai dashboard once they have confirmed their email. Not only is the cluster name a label for you to distinguish between different applications and environments, but it’s also used as part of the cluster’s unique hostname.

For example, if you name your cluster “Erin’s Exciting Elasticsearch Experiment,” that will be its display name, and the host name in your cluster’s URL will be automatically generated into something like erins-exciting-elasticsea-123456.

Cluster names can be changed later, but the host name that is generated when the cluster is first created is immutable.

Anything not listed here?

As always, feel free to email us at support@bonsai.io if you have further questions.

Signing Up

The Bonsai cluster dashboard’s Overview provides a series of useful metrics about your cluster.

This article will cover the following:

  • Cluster Information
  • Performance
  • Traffic Summary
  • Usage
  • Data Allocation
  • Tenants
  • How to find help

Cluster Information

Overview provides general information about your cluster at the top:

The account name, the cluster’s name, and a health status dot (which will be green, yellow, or red) are found here. Below that, the region the cluster is provisioned in, the version of Elasticsearch it’s running, and the subscription plan tier are displayed.

Performance

The first component is the Performance heatmap:

This heatmap reveals how fast requests are. Each column represents a “slice” of time. Each row, or “bucket”, in the slice represents a range of request durations. The "hotter" a bucket is colored, the more requests there are in that bucket. To further help visualize the difference in the quantity of requests for each bucket, every slice of time can be viewed as a histogram on hover.

You can check out our Metrics documentation for a more detailed dive into cluster metrics.

Traffic Summary

Traffic Summary highlights several statistics in the last 24 hours:

  • Request Count: This is the total number of requests your cluster has served in the past 24 hours. If there is a ‘-’ (hyphen character), then there is no data available to report. Also indicated are request counts for the previous 24 hours under Yesterday, and the percentage change between the two days.
  • Duration Median: This indicates your median request latency. The first number is the median response time for all requests over the past 24 hours. Also indicated are the median request latency for the previous 24 hours under Yesterday, and the percentage change between the two days. The median is an important metric because it’s more resistant to long tail effects and gives a better picture of overall performance than averages.
  • Duration 95th: This shows the 95th percentile in response times. The previous 24 hour period is found under Yesterday, along with the percentage change between the two days. A percentile indicates how much of the data set falls below a particular value. For example, if the p95 response time for a cluster is 100ms, that means 95% of all requests are occurring in 100ms or less. This is an important metric for benchmarking, especially with high traffic volumes.

Usage

This component shows the cluster’s current usage versus the limits of your plan for 3 items:

  • Docs: This is the total number of documents you have in your cluster. We’re counting all documents, which can sometimes lead to confusion when nested documents are involved. If you have a parent document with three child documents, that counts as four documents - not one. Elasticsearch may also report different counts depending on the endpoint queried.
  • Data: This is the disk footprint of your cluster, or the amount of data your cluster is occupying on the server.
  • Shards: This is the number of shards in your cluster across all indices. We’re counting both primary and replica shards here.

Data Allocation

This component indicates how your data is allocated across your cluster. If the allocation seems radically unbalanced, that can be an indication that you should reindex your data with an updated shard scheme. Documentation on this can be found in our Capacity Planning documentation.

Business / Enterprise Plans

If you upgrade to a Business / Enterprise cluster, you may see some extra nodes appear here, and may further observe that these nodes have few or 0 shards allocated. This is expected. These extra nodes will be cleaned up and removed later.

Tenants

Enterprise subscriptions support private multitenancy. For clusters running on these subscriptions, there will also be a Tenants table that lists tenants on the cluster:

Cluster Overview

The Bonsai cluster dashboard’s Metrics page is the place to troubleshoot cluster traffic issues and view performance metrics. This article will cover:

  • Navigating to Metrics
  • Metrics Utilities
  • Metrics Overview

Navigating to Metrics

Metrics is located in each cluster’s dashboard. Log into Bonsai, click on your cluster, and click on Metrics within the left sidebar:

Metrics Utilities

Time Window Selector

Use this selector to choose between four window sizes for metrics:

  • (1h) last 1 hour
  • (1d) last 24 hours
  • (7d) last 7 days
  • (28d) last 28 days

Time Scrubber

Click the left and right arrows to go back or forth in time within the selected window size.

UTC and Local Timezone Toggle

Click on the timezone to toggle between displaying the graph timestamps in UTC time or your local browser timezone.

Highlighting

You can drill down to smaller windows of time on any graph by clicking and dragging to select a time range.

Metrics Overview

More information doesn’t necessarily mean more clarity. When something happens to your traffic and cluster responses, it’s important to know how to see your metrics and draw conclusions.

We’ll cover what each graph displays and some examples of what they will look like given certain use cases (such as periods of high-traffic, or clusters in a normal state compared to ones that are experiencing downtime). We’ll start with the most information-dense graph: the Requests heat map.

Requests (Counts & Duration) heat map

This graph reveals how fast requests are. Each column in the graph represents a “slice” of time. Each row, or “bucket”, in the slice represents a range of request durations. The "hotter" a bucket is colored, the more requests there are in that bucket. To further help visualize the difference in the quantity of requests for each bucket, every slice of time can be viewed as a histogram on hover.

Example 1

This heat map displays a cluster with consistent traffic, where most requests take 200-300ms to complete.

Example 2

This cluster has light, sporadic traffic. It’s important to note that the "heat" color of every bucket is determined relative to the other data in the graph - so a side-by-side comparison of two request heat maps using color won’t be accurate.

Request Counts

This graph shows the number of requests handled by the cluster at a given time.

Request Duration Percentiles

This graph, similar to the Requests heat map, shows a distribution of request speed based on 3 percentiles of the requests in that time slice: p50 (50%), p95 (95%), and p99 (99%). This is helpful in determining where the bulk of your requests sit in terms of speed and how slow the outliers are.

Proxy Queue time

Proxy Queue time is the total amount of time requests were queued (or paused) at our load balancing layer. Queue time is ideally 0; however, in the event that you send many requests in parallel, our load balancer will queue up requests while waiting for executing requests to finish. This is part of our Quality of Service layer.

Concurrency

Concurrency shows the number of requests that are happening at the same time. Since clusters are limited on concurrency, this can be an important one to keep an eye on. When you reach your plan’s max concurrency, you will notice queue time start to consistently increase.

Bandwidth

This graph shows the amount of data crossing the network - going into the cluster (shown in green) and coming from the cluster (in blue).

We expect most bandwidth graphs to look something like the graph below — a relatively higher count of "From Client" data (read requests) compared to "To Client" data (write or indexing requests).

The relationship between green and blue bars in this graph really depends on your use-case. A staging cluster, for example, might see a larger ratio of Write:Read data. It’s important to note that this graph deals exclusively in data - a high-traffic cluster will probably see a lot of data coming “From” the cluster, but a low-traffic cluster with very complicated queries and large request bodies will also have a larger “From Client” data than would otherwise be expected. Therefore, it’s helpful to look at request counts to get a feeling for the average "size" of a request.

Response Codes

This graph can do two things:

  • It will confirm that responses are successful: 2xx responses. This means that everything is moving along well and requests are formed correctly.
  • In the less positive case, it can be a debugging tool that helps figure out where any buggy behavior is coming from. In general, 4xx responses are the result of a malformed query from your app or some client, while 5xx responses indicate a problem on our end.

It’s important to note while reading this graph that 5xx responses don’t necessarily mean that your cluster is down. A common situation on multitenant plans is a cluster that’s getting throttled by a noisy neighbor who’s taking up a lot of resources on the server. This can interrupt some (but not all) normal behavior on your cluster, resulting in a mix of 2xx and 5xx responses.

A few 5xx responses every now and then should be expected with any cloud service. We’re committed to 99.99% uptime for all production clusters (i.e., no more than 0.01% of requests returning 5xx responses), and our track record is often four 9’s or better.

Many of our users are very sensitive to 5xx responses. In these cases, it’s usually best to be on a higher plan or a single tenant plan. Reach out to us at support@bonsai.io if this is something your team needs.

Business and Enterprise Plans - Additional Metrics

Clusters running on Business and Enterprise plans have access to some additional metrics. If you would like to get access to these metrics for your cluster, please reach out and our team will walk you through the process.

System Load

System load is the average number of processes waiting on a resource within a given period of time. This is often reported in 1, 5 and 15 minute windows. The Bonsai dashboard shows the system load average over the past minute.

It is helpful to think of system load as how saturated a node is with tasks; as long as the node's load average is lower than the number of its available CPUs, the node is able to handle all of its work without getting backed up. If the load average is larger than the number of available CPUs, some tasks have to wait to be scheduled. When tasks are delayed like this, performance suffers, and the performance impact is correlated with how high the load average gets.

Elasticsearch Queue

Elasticsearch utilizes a number of thread pools for handling various tasks. These pools are also backed by queues, so that if a task is created and a thread is not available to execute it, the task is queued until a thread becomes available. This metric shows the total number of tasks sitting in an Elasticsearch queue.

It's important to note that this is not the same as a request queue. A single request can result in multiple tasks being created within Elasticsearch. It is also important to note that these queues have a finite length. If the queue is full and an additional task is created, Elasticsearch will reject it with a message like "rejected execution (queue capacity 50)" and an HTTP 429 response.
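If you want to check queue depths yourself, the cat thread pool API reports the active, queued, and rejected task counts for each pool. A quick sketch with curl, using the example cluster URL used elsewhere in these docs (substitute your own):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"># List each thread pool with its active, queued, and rejected task counts
curl -XGET "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_cat/thread_pool?v&h=name,active,queue,rejected"</code></pre></div>

A steadily growing queue column, or a non-zero rejected column, is a good sign that the cluster is receiving more work than it can keep up with.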

Elasticsearch Bulk

This metric shows the number of _bulk requests that have been processed over time. Bulk requests are an efficient way to insert or update data in your cluster. Naturally, your payload sizes need to be more than 1-2 documents in order to get the benefits of bulk updates. Usually batches of 50-500 are ideal.
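For reference, a _bulk request body is newline-delimited JSON that alternates an action line with the document source. A minimal sketch with curl, assuming a hypothetical "products" index and the example cluster URL (substitute your own):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Each action line ("index") is followed by the document to index;
# the body must end with a trailing newline
curl -H "Content-Type: application/x-ndjson" -XPOST "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_bulk" --data-binary '{ "index" : { "_index" : "products" } }
{ "title" : "Funky Farfalle" }
{ "index" : { "_index" : "products" } }
{ "title" : "Penne Paradise" }
'</code></pre></div>

Older Elasticsearch versions also expect a "_type" in the action line; check the Bulk API documentation for the version your cluster runs.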

Elasticsearch Search

This metric shows the number of search requests that have been processed by Elasticsearch. It's important to distinguish between user searches and shard searches. A user search is performed by your application (often in response to some user action or search in the app), and may translate into multiple shard searches. This is true if your indices have multiple primary shards, or if you're searching across multiple indices. A query for the top X results will be passed to all relevant shards; each shard will perform the search and return the top X results. The coordinating node is then responsible for collating those results, sorting, and returning the top X.

Elasticsearch does utilize a thread pool specifically for searches, so depending on the types of searches you're running, and the volume of search traffic, it's possible to get a message like "rejected execution (queue capacity 50)" and an HTTP 429 response.

JVM Heap

Elasticsearch is a Java-based search engine that runs in the Java Virtual Machine (JVM). The JVM has a special area of memory where objects are stored, called "heap space." This space is periodically garbage-collected, meaning objects that are no longer in use are destroyed to free up space in memory.

This metric shows the percentage of heap space that is currently occupied by Elasticsearch's objects and arrays. Lower is ideal, because high heap usage generally means more frequent, longer-lasting garbage collection pauses, which manifest as slower performance and higher latency.

JVM Young GC Time

This metric shows the amount of time in milliseconds that the JVM spent reclaiming memory from new or short-lived objects. This type of garbage collection is expected and shouldn't lead to HTTP 503 or 504 errors. However, if it is chronic and frequent, it may lead to slow performance.

JVM Old GC Time

This metric shows the amount of time in milliseconds the JVM spent reclaiming memory from long-surviving objects. The JVM periodically pauses the application so it can free up heap space, which means some operations are stopped for a period of time. This can result in perceived slow response times, and in some extreme cases can lead to system restarts and HTTP 503 and 504 responses.

CPU IO Wait

This metric shows the percentage of the time that the CPU(s) waited on IO operations. This means that an operation requested IO (like reading or writing to disk) and then had to wait for the system to complete the request. A certain amount of IOWait is expected for any IO operation, and usually it's on the order of nanoseconds. This isn't indicative of a problem on its own.

Excessive wait times are problematic, though. They usually correlate with high system load and a high volume of updates. Sometimes that can be addressed through hardware scaling or better throttling of updates. In rare cases it can indicate a problem with the hardware itself, like an SSD drive failing.

CPU User

This metric shows the percentage of the time that the CPU(s) were executing code in user space, which is where all code outside the operating system's kernel runs. On Bonsai, this space is primarily dedicated to Elasticsearch, so the metric is roughly the amount of time the CPU spent processing instructions from the Elasticsearch code.

The metric can vary widely between clusters, based on application, hardware and use case. There is not necessarily an ideal value or range for this metric. However, large spikes or long periods of high processing times can manifest as poor performance. It often indicates that the hardware is not able to keep up with the demands of the application, although it can sometimes indicate a problem with the hardware itself.

Dashboard Metrics

Bonsai offers support for automatically removing old indices from your cluster to save space. If you’re using Bonsai to host logs or some application that creates regular indices following a time-series naming convention, then you can specify one or more prefix patterns and Bonsai will automatically purge the oldest indices that match each pattern. The pattern will only match from the start of the index name.

This feature is found in the Cluster dashboard, under the Trimmer section:

For example, assume you’re indexing time-series data, say, number of support tickets in a day. You’re putting these in time-series indices like "support_tickets-201801010000", "support_tickets-201801020000", and so on.

With this feature, you could specify a pattern like "support_tickets-", and we’ll auto-prune the oldest indices first when you reach the size limit specified for the pattern. Indices scheduled for removal will be highlighted in red.

Please note we will not purge the last remaining index that matches the pattern, even if the size is above the limit.

Note on Trimmer and deleting documents

The trimmer feature only allows you to delete whole indices, and only if more than one index matches the same trimmer pattern. The trimmer does not delete individual documents within an index. To remove a number of documents in an index, your best option is to use delete_by_query.

Here is an example of deleting the 50 oldest documents, according to a hypothetical "date" field, from a hypothetical index called "my_index" (on newer Elasticsearch versions, "size" is called "max_docs"; check the Delete By Query documentation for your version):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">POST /my_index/_delete_by_query?conflicts=proceed&sort=date:asc
{
  "query": {
    "match_all": {}
  },
  "size": 50
}</code></pre></div>

The Elasticsearch Search API can be used to identify the documents that will be deleted. Use your Console (or an equivalent tool) to build your query with the Search API first, and verify that it matches only the documents you want to delete, before running the delete_by_query call.
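As a sketch, a verification query for the delete_by_query example above might look like this with curl (same hypothetical "my_index" and "date" field, and the example cluster URL); the hits returned should be exactly the 50 documents you expect the delete to remove:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Return the 50 oldest documents, sorted by the "date" field, without deleting anything
curl -H "Content-Type: application/json" -XPOST "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/my_index/_search?size=50&sort=date:asc" -d '
{
  "query": {
    "match_all": {}
  }
}'</code></pre></div>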

Trimmer

To navigate to your personal profile, click on your initials in the upper right corner and select Profile Settings from the dropdown menu. Then navigate to the Security tab.

  1. Single Sign-On
  2. Password Management
  3. Browser Session Management

1. Single Sign-On

Single Sign-On (SSO) is the ability to have a third party service validate your identity. You can enable Google SSO which offers additional security like multi-factor authentication (MFA). Bonsai also supports Okta.

To use this feature, the email address at your identity provider must match your Bonsai.io account email address. For example, if your Google email address is "bob.smith@gmail.com," then your Bonsai.io account must use this same email address in order to verify your identity.

Once you have SSO set up, you will no longer be able to log in with your username/password. Logging in will need to be done through the identity provider.

To revert back to username/password authentication, you will need to disable SSO. To do so, simply click on Disable SSO.

If you see this section greyed out then your account admin has required that you use SSO.

2. Password Management

To update your password, enter your old password and a new password. Bonsai strongly recommends using a password manager like 1Password or LastPass to keep your passwords secure, and to help randomly generate new passwords.

Protip: Use a strong password

We’re a security-conscious bunch, and we don’t have any arcane rules about what kinds of characters you must use for your password. Why? We’ll let xkcd explain it. Tl;dr: our password policy simply enforces a minimum length of 10 characters. We also reject common passwords that have been pwned. Sadly, correct horse battery staple appears in our blacklist.

Note: updating your password will revoke all of your active sessions and force you to log in again.

3. Browser Session Management

View and revoke your active sessions by scrolling down to Active Sessions. If you have a session on another device, you can see its IP address and information about the device.

You can revoke sessions individually, or revoke all. Revoking all sessions will also revoke your current session that you are using to view your profile, and doing so will require you to log in again.

Account Security (SSO, Passwords, Sessions)

You can add or delete a credit card, update your payment profile, or add coupon codes in the Billing tab of Account Settings.

  • Add a credit card
  • Check Card Details
  • Set default credit card
  • Delete credit cards on file
  • Add an Address
  • Coupon Codes
  • Statements

Add a credit card

To begin adding a credit card to the account, click on Add credit card to bring up a new form.

Fill in the credit card information and click Add Card to proceed. Click Cancel to go back.

Your Financial Data, Secured

When you add a credit card in Bonsai, the information is encrypted and passed to our payment processor. We have verified that this service is Level 1 PCI Compliant. This level of compliance is the highest level of security a business can offer. We do not host or store any financial data, and your credit card details can not be accessed by anyone on our team or within the payment processor.

A successfully added credit card will be listed.

Check Card Details

Once there is at least one card listed in Credit Cards, click on the overflow menu icon to reveal a drop-down menu to View Details on that card.

Card Details lists the credit card information.

Set default credit card

You can associate multiple credit cards to your account. If you only have one credit card on file, it will automatically be the default. If you have multiple credit cards, you have the option to update the default credit card for payment.

To set another credit card as default, click on the overflow menu icon to reveal a drop-down menu to Set Default on that card.

Delete credit cards on file

To remove a credit card from an account, click on the overflow menu icon to reveal a drop-down menu to Delete a credit card.

Please note, you can delete a default credit card if all 3 of the following statements are true:

  • it’s the only credit card on file
  • there are no active paid clusters on the account
  • the account balance is $0

If you are having trouble deleting a default card, let us know at support@bonsai.io.

Add an Address

Once a credit card has been successfully added, the Address section will appear under Credit Cards. Adding an address here will aid in applying taxes based on your location. Whether tax is charged, and at what rate, depends on this address. See Bonsai's Policy on Collecting Sales Tax for further information.

Clicking on the Add Address button will take you to a form where you can use an existing address populated from your credit card or fill out a new address.

A successfully added address will take you back to the Billing tab.

Coupon Codes

If you have a coupon code, you can add it in the Coupon Code section. Simply enter the code and click on Apply Coupon.

If the code is accepted, it will be listed under Coupon Codes.

If the code is not valid, there will be an error message.

If you have a code that isn’t working but should, shoot us an email at support@bonsai.io and we’ll check it out.

Statements

Download and view the account statements.

Billing: Settings

You and the people you refer can earn credit that will be applied to your account on paid Bonsai plans. We'll cover how this process works in the following guide.

  1. Overview
  2. Refer new users with an email invite
  3. Refer new users with a shareable referral URL
  4. The new user referral sign up process
  5. FAQs

1. Overview

How it works:

  1. Send your unique referral URL to a new user. You can:
    • Enter the email of the person you are referring under your Referrals tab, and we’ll automatically send them an invite email with everything they need to know, and/or
    • Copy your shareable link, and share your URL via email, social media, or whatever method works for you.
  2. The new user signs up using your referral URL.
  3. The new user pays for their first month on a paid cluster plan.
  4. You earn $50 AND they receive $50 toward a paid Bonsai plan.
  5. You can send your referral URL to as many people as you want, but you will stop earning credit once you have reached the limit of $50.

To begin, navigate to Account Settings then to the Referrals tab.

Under the Referrals tab you will be able to access your unique referral URL and see how many referrals have been sent and accepted:

2. Refer new users with an email invite

Entering a new user's email address and clicking Invite will automatically send them an email with a detailed explanation of how to sign up for Bonsai using your referral URL. A success message on the dashboard will notify you that your invite has been successfully sent.

The email we send to new users looks like this:

3. Refer new users with a shareable referral URL

Clicking Copy Link will automatically copy your referral URL to your clipboard.

Paste your unique referral URL to share it with new users through your personal email, social media, or wherever!

4. The new user referral sign up process

  1. When a new user is directed to our Sign Up page using your referral URL, they will see the following page:
  2. By completing the sign up process through your referral URL, their account will be marked as having been created with your referral code.
  3. Once the new user has upgraded their cluster to a Staging plan or higher, you and the new user will both receive a credit of $50 towards the next monthly bill. Hooray!

5. FAQs

Q: I don't have a Referrals tab in my Account Settings.

A: Only accounts with Standard and up clusters have access to Referrals.

The Referrals tab will show up in your Account Settings once you have at least one paid cluster in your account.

Q: I invited a friend with my referral URL, but I didn't receive any credit.

A: In order to receive the credit, your friend has to both sign up for a new account with your referral URL and upgrade their cluster to Standard, Business, or Enterprise. If you don't see your credit after that, please let us know at support@bonsai.io. We're happy to get this sorted out for you.

Q: Can I accumulate credit and use it in the future?

A: No, your credit will automatically be applied to the following month’s payment.

Q: My referral credit is more than my plan’s cost. Does the remainder carry over to the next billing cycle?

A: No. Since this is a one-time credit, it can only be applied to the following month.

Referral Credit Program

Note on Logs

Logs are real-time: they only appear on this page while requests are happening. Logs do not persist.

Bonsai provides a real-time stream of all the requests hitting your cluster. There is also a subtab for the Top 20 Slow Requests. This will begin to populate if requests slow down measurably.

The streaming logs show a timestamp, HTTP verb and endpoint, along with how long it took Elasticsearch to respond and what HTTP response code was returned. Request/response payloads are not captured.

Bonsai does not expose server logs at this time.

Logs

Credential management is essential for tracking who and what has access to your cluster. At Bonsai, regardless of plan level, every request made to your cluster requires a username and password. Security is a default, not an upgrade.

With the Credential Management section of the cluster dashboard, you can add and remove access credentials. In this guide, we will cover:

  1. Introduction to Credentials Management
  2. Regenerating your master credentials
  3. Creating new credentials
  4. Deciding which credential type to use
  5. Which credential types does your subscription support?

1. Introduction to Credentials Management

You can see your current credentials and generate new ones by logging into your cluster dashboard and navigating to the Credentials section.

Every cluster has a Master credential that is created when the cluster is provisioned. The Master credential can never be revoked, only rotated.

There are three types of credentials you can generate:

  1. Full-access
  2. Read-only
  3. Custom

With custom auth controls you can specify things like:

  • Which indices (or index patterns) are accessible.
  • What Elasticsearch actions may be performed.
  • Where requests are allowed to originate.

2. Regenerating your master credentials

You would want to regenerate your default credentials if your fully-qualified URL has been leaked (say, if it was somehow copy-pasted into an email, GitHub issue, or, perish the thought, StackOverflow). To do that, simply click the yellow Regenerate button. This will instantly generate a new, randomized authentication pair.

The old credentials will remain active for two minutes or so. After that time the old keys are revoked. The purpose of the two minute warning is to give administrators the opportunity to update their application with the new credentials before the old ones expire.

3. Creating new credentials

To create a new credential, click on the Create Credential button.

Choose one of the three types (full-access, read-only, or custom), give it a name, and then click Generate.

Note

We advise giving your credential a human-friendly name, like ACME_frontend, python_indexer, or docs_search_component. It’s an easy way to help you and your teammates remember how each credential is used. When you generate a new credential, Bonsai shows your credential details.

Credential Details

You can view who created the credential (in this example, Leo), the access key and secret (username and password), the allowed settings, and some quick links.

Settings

This displays which indices are accessible, which Elasticsearch actions are allowed, and whether there are any whitelisted IPs or CIDR blocks. If your cluster is on the Business tier or above, these fields are customizable.

URL Quick Links

The Elasticsearch access URL is excellent for pasting into a terminal and executing curl commands. Use Kibana access to launch your Bonsai-hosted Kibana instance, included with every Bonsai cluster. Read more about Bonsai hosted Kibana here.

4. Deciding which credential type to use

Full Access

Full-access tokens are best used for back-end applications that handle indexing or act as a proxy for user queries.

Read Only

When choosing this type, the form will pre-populate with the allowed action (or ‘privilege’) `indices:data/read/*`. This allows read-only actions: count, explain, get, exists, mget, get indexed scripts, more like this, multi percolate/search/termvector, percolate, scroll, clear_scroll, search, suggest, tv.

Custom

If you have specific security needs, generate a custom credential. Strengthen your team’s security posture by using custom credentials for things like limiting index actions to certain IP addresses, or making certain indices search-only. There are three fields for custom credentials: Indices, Actions, and IP/CIDRs.

Indices

This section allows you to list a set of indices that are permitted, or create a pattern such as "logs_2019-12-*":

Leaving this blank will make all indices present on the cluster accessible.

Allowed Actions

Specify access privileges from the searchable dropdown:

If you ever need help figuring out exactly which actions map to your needs, please email support and we’ll point you in the right direction. Leaving this blank will allow all access privileges.

IP Address or CIDR block

Use this section to control where you allow requests to be made from. Whitelist individual IP addresses for monitoring privileges, or write a CIDR block that only allows your company to access an internal-only index or cluster. Leaving this blank will allow any IP address by default.

Further reading:

5. Which credential types does your subscription support?

If you receive a notice to upgrade for access to read-only or custom credential management, you’ll need to navigate to the Manage tab and upgrade your cluster to Standard, Business, or Enterprise:

Clicking on the Upgrade this cluster link will take you to your management dashboard, where you can upgrade to a Business or Enterprise subscription.

Credential Management

Changing a cluster's name is quick and easy! From the cluster dashboard, simply navigate to the Settings link under the Manage header:

Under the "Edit Cluster Name" form, enter the new name for the cluster, and click on "Save."

Changing your cluster's name will not result in downtime or loss of data. It will also not change the cluster's URL. If the name is already taken by an active cluster in your account, you will receive an error.

Change a Cluster's Name

Destroying your cluster is simple. In this doc we cover:

  1. How to deprovision a cluster
  2. FAQ

How to deprovision a cluster

If you need to deprovision (destroy) a cluster, navigate to your cluster, and head to the settings section of the dashboard.

Once you verify your account password in the form, you’ll be able to click the Deprovision button.

Once a cluster has been deprovisioned, it is destroyed immediately. The data is deleted, it will instantly stop responding, and all requests will return an HTTP 404.

When you deprovision a paid cluster on Bonsai, you will automatically be credited with a prorated amount. This credit will apply to future invoices. For example, if you had a $50/mo cluster, which you destroyed after 3 weeks, you would get (roughly) $12.50 in credit applied to your account.

Note

This section does not apply to Heroku users. If you're a Heroku user, check out:

FAQ

I don’t see the Settings section on my dashboard.

If you can’t find the Settings section on a cluster dashboard, then your Role on your team is either Member or Billing, neither of which can deprovision a cluster. Contact your team Admin to change your role, or to handle the deprovisioning for you. Read more about how teams on Bonsai work.

What if I accidentally deprovisioned my cluster?

The data in your cluster will be automatically deleted. Bonsai retains hourly snapshots for the past 24 hours, and daily snapshots for the past 7 days. So if you accidentally deleted data you needed, our team might be able to provide a partial restore, depending on when we are alerted. Contact us for more help.

Deprovision A Cluster

Bonsai offers support for integrating with other services. This menu can be found by clicking on the Integrations tab of your cluster dashboard.

Bonsai currently offers support for the following integrations:

We are always looking for new services to integrate with, so if you would like to see Bonsai add support for another service, please reach out and let us know.

Integrations

The Console is a web-based UI for interacting with your cluster. Think of it like a user-friendly version of curl. If you want to try out some queries or experiment with the Elasticsearch API, this is a great place to start:

The UI allows the user to select an HTTP verb, enter an endpoint and run the command. The results are shown in the navy console on the right. The box below the verb and endpoint boxes can be used to create a request body.

Beware

Some places in the Elasticsearch documentation suggest using a GET request even when passing a request body. An example would be something like:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /_search
{
 "query" : {
   "term" : {
     "user" : "george"
   }
 }
}</code></pre></div>

Without getting bogged down in pedantry, GET is a questionable verb to use in this case; a POST is more appropriate, and it’s what the UI assumes anyway. If you paste in a request body and use a GET verb, the body will be ignored.
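For comparison, here’s the same query sent as a POST from the command line, which avoids the GET-with-body ambiguity entirely (the cluster URL is the example one used elsewhere in these docs; substitute your own):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"># POST carries the request body unambiguously
curl -H "Content-Type: application/json" -XPOST "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_search" -d '
{
  "query" : {
    "term" : {
      "user" : "george"
    }
  }
}'</code></pre></div>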

Console

Need to update a credit card or provide access to your clusters to a team member? Use the sections below to manage all things account-related.

To access your Account Settings, click on the upper right-hand dropdown and select Account Settings.

From your account settings page, you can…

Account Settings

Control security within your whole team in your Account Security tab by enabling Single-sign on for your Account.

Single sign-on (SSO) is the ability to have a third party service, like Google or GitHub, validate your identity. Bonsai currently supports Google SSO, which offers additional security like multi-factor authentication (MFA).

Individual team members can finalize this setup in their Profile Setting's Security tab.

Once Google SSO is enabled on your account, you will be able to require your team members to log in with Google.

Require SSO For All Team Members

Search Clips allow you to query any of your clusters and view readable, real-time responses. The Query Builder helps you navigate the Elasticsearch DSL with auto-completion and syntax error highlighting, and the results are exportable as JSON or CSV.

With Search Clips, you can build and share queries with your team members and collaborate on the query structure while looking at the same results. Using Search Clips this way creates a loop of building queries, learning from the output, and refining, ultimately leading to a well designed query that can then be integrated into an application.

To manage team members on an account, navigate to Managing Team Members.

In this document we will cover the following actions:

  1. View all Clips
  2. Create a Clip
  3. View and edit a Clip
  4. Viewing results
  5. Export a Clip
  6. Delete a Clip

1. View all Clips

Click on the icon to the left of the account drop down on the upper right hand corner of the screen, and click on Search Clips from that menu. You will be able to view and edit Clips related to clusters in any of your accounts created by you or your team members.

Accessing Search Clips for the first time? Click Get started with Search Clips

2. Create a Clip

Click the New Clip button in the upper right hand corner of the Search Clips page.

This will open a modal to enter information for your new Clip. Enter a name and select the cluster and indices you wish to query. All indices can be queried at once by selecting '_all', or select individual indices. Clicking on Create Clip will create and redirect you to the Search Clip.

3. View and edit a Clip

The main sections of the Search Clip view are the Query Editor on the left, and the Response View on the right. Updating the query in the editor will automatically refresh results, or you can click the refresh button on the upper right hand corner of the Response View. The name of the Clip and the name of the cluster that is currently being queried are also shown on this page.

Update the cluster, indices, and name of the Clip by clicking on Settings in the upper right hand corner.

4. Viewing Results

The Response View has tabs for Hits, Aggregations (Aggs), Metadata, and the Raw response. The Hits table shows the returned hits in a readable format, and allows you to toggle which columns you would like to see. If your query has any aggregations, they will be summarized in the Aggs tab.

5. Export a Clip

Click on the Export button to export your Clip results to JSON or CSV.

6. Delete a Clip

Clips can be deleted from the Search Clips page by clicking on the toggle to the right of the Clip you would like to delete, and selecting the Delete option.

Search Clips

Bonsai occasionally sends out updates about new features, and can also send out weekly reports about your cluster's usage and performance. You can manage whether or not you receive these emails in the Notification Preferences tab.

Change Notification Preferences

You can edit your name and company size in the account profile.

Update Your Organization's Profile

You can create a new account from the upper right-hand dropdown by selecting Add Account within the Switch Account drop-down menu.

Give your new account a name and click Create Account.

You will then be prompted to create your free sandbox cluster in the new account. You can switch between your accounts from the drop-down menu:

Creating a New Account

You can invite multiple team members to your Account, each with specialized roles. Account admins can invite new users to join the Account. At least one Billing and one Admin role are needed for each account. If you only have one team member, that person assumes the Admin and Billing role by default.

Click here to go directly to Team Members in the dashboard

Add or update team members

You can edit your name and contact information in the Profile tab.

Edit your Profile details

Need to change your email or password, or change the notifications you receive about clusters? Use the sections below to manage your personal profile.

To navigate to your personal profile, click on your initials in the upper right corner and select Profile Settings from the dropdown menu.

Profile Settings

Navigate to Cancel Account to begin the cancellation process.

If you want to cancel your account, you’ll first need to make sure that any active clusters you have are deprovisioned. If you have any active clusters on your account, you’ll see a notice like this:

For instructions on how to deprovision your clusters, please visit Cluster Dashboard.

Once you have deprovisioned all of the clusters on your account, you will be able to cancel your account completely. You will see a screen requesting your account password to confirm the cancellation.

So long, farewell, auf Wiedersehen, good night!

We’re sorry that you no longer want to have an account with us, but wish you the very best with your application and search. If there’s anything you feel we could do better, please don’t hesitate to send us an email with comments.

Cancel the account

Migrating from a Heroku Account to a Direct Account requires minor configuration changes in the Heroku application. This will require redeployment of the application.

How It Works:

  1. Sign up for an account at Bonsai.io. Please make sure to add your billing information.
  2. Change your application to use ELASTICSEARCH_URL environment variable rather than BONSAI_URL.
  3. Then configure your application in Heroku with the new ELASTICSEARCH_URL shown below:
    <span class="inline-code"><pre><code>heroku config:set ELASTICSEARCH_URL=$(heroku config:get BONSAI_URL)</code></pre></span>
    When you uninstall the Bonsai add-on, Heroku will remove the BONSAI_URL configuration setting. By redeploying your application to use this different environment variable now, you can avoid downtime for your application in later steps.
  4. Email support@bonsai.io and include: A) the email address associated with your new Bonsai account and B) the cluster(s) that you want migrated over.
  5. We’ll perform the migration. Your cluster URLs and all your data will remain intact. You will be billed at the monthly rate once that migration is complete. We’ll let you know once this step is done.
  6. Once we have confirmed that the migration is complete, remove the Bonsai add-on(s) from your Heroku app so you’re not being billed twice! Uninstalling the Bonsai add-on at this step will remove the BONSAI_URL config variable from your app.
  7. You can migrate the rest of your application at your convenience. Any cluster we have migrated will now belong to your Bonsai.io account and can be managed there. Please let us know if your application is not functioning as expected.

That’s it! Migrations are zero-downtime and take only a few minutes once our support team takes the ticket.

Migrating from Heroku to a Direct Account

Hugo is a static site generator written in Go. It is conceptually similar to Jekyll, albeit with far more speed and flexibility. Hugo also supports generating output formats other than HTML, which allows users to pipe content directly into an Elasticsearch cluster.

In this guide, we are going to use this feature to tell Hugo to generate the exact format needed to submit the file to the _bulk endpoint of Elasticsearch.

First Steps

In order to make use of this documentation, you will need Hugo installed and configured on your system.

  1. Make Sure You Have Hugo Installed. This guide assumes you already have Hugo installed and configured on your system. Visit the Hugo Documentation to get started.
  2. Spin Up a Bonsai Elasticsearch Cluster. This guide will use a Bonsai cluster as the Elasticsearch backend.
  3. Create an Index on the Cluster. In this example, we’re going to push data into an index called <span class="inline-code"><pre><code>hugo</code></pre></span>. This index needs to be created before any data can be stored on it. The index can be created either through the Interactive Console, or with a tool like <span class="inline-code"><pre><code>curl</code></pre></span>:

Use the URL for your cluster. A Bonsai URL looks something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -XPUT https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/hugo</code></pre></div>

Configure Hugo to Output to Bonsai Elasticsearch

Hugo’s configuration settings live in a file called <span class="inline-code"><pre><code>config.toml</code></pre></span> by default. This file may also have a <span class="inline-code"><pre><code>.json</code></pre></span>, <span class="inline-code"><pre><code>.yaml</code></pre></span>, or <span class="inline-code"><pre><code>.yml</code></pre></span> extension. Add the following snippet based on your config file format:

TOML:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">[outputs]
home = ["HTML", "RSS", "Bonsai"]

[outputFormats.Bonsai]
baseName = "bonsai"
isPlainText = true
mediaType = "application/json"
notAlternative = true

[params.bonsai]
vars = ["title", "summary", "date", "publishdate", "expirydate", "permalink"]
params = ["categories", "tags"]</code></pre></div>

JSON:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">

{
 "outputs": {
   "home": [
     "HTML",
     "RSS",
     "Bonsai"
   ]  
},
 "outputFormats": {
   "Bonsai": {
     "baseName": "bonsai",
     "isPlainText": true,
     "mediaType": "application/json",
     "notAlternative": true
   }
 },
 "params": {
   "bonsai": {
     "vars": [
       "title",
       "summary",
       "date",
       "publishdate",
       "expirydate",
       "permalink"
     ],
     "params": [
       "categories",
       "tags"
     ]
   }
 }
}</code></pre></div>


YAML:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">outputs:
  home:
    - HTML
    - RSS
    - Bonsai
outputFormats:
  Bonsai:
    baseName: bonsai
    isPlainText: true
    mediaType: application/json
    notAlternative: true
params:
  bonsai:
    vars:
      - title
      - summary
      - date
      - publishdate
      - expirydate
      - permalink
    params:
      - categories
      - tags</code></pre></div>

This snippet defines a new output called “Bonsai”, and specifies some associated variables.

Creating the JSON template

Hugo needs to have a template for rendering data in a way that Elasticsearch will understand. To do this, we will define a JSON template that conforms to the Elasticsearch Bulk API.

Create a template called <span class="inline-code"><pre><code>layouts/_default/list.bonsai.json</code></pre></span> and give it the following content:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">{{/* Generates a valid Elasticsearch _bulk index payload */}}
{{- $section := $.Site.GetPage "section" .Section }}
{{- range .Site.AllPages -}}
 {{- if or (and (.IsDescendant $section) (and (not .Draft) (not .Params.private))) $section.IsHome -}}
   {{/* action / metadata */}}
   {{ (dict "index" (dict "_index" "hugo" "_type" "doc"  "_id" .UniqueID)) | jsonify }}
   {{ (dict "objectID" .UniqueID "date" .Date.UTC.Unix "description" .Description "dir" .Dir "expirydate" .ExpiryDate.UTC.Unix "fuzzywordcount" .FuzzyWordCount "keywords" .Keywords "kind" .Kind "lang" .Lang "lastmod" .Lastmod.UTC.Unix "permalink" .Permalink "publishdate" .PublishDate "readingtime" .ReadingTime "relpermalink" .RelPermalink "summary" .Summary "title" .Title "type" .Type "url" .URL "weight" .Weight "wordcount" .WordCount "section" .Section "tags" .Params.Tags "categories" .Params.Categories "authors" .Params.Authors) | jsonify }}
 {{- end -}}
{{- end }}</code></pre></div>

When the site is generated, this will result in creating a file called public/bonsai.json, which will have the content stored in a way that can be pushed directly into Elasticsearch using the Bulk API.

Push the Data Into Elasticsearch

To get the site’s data into Elasticsearch, render it by running <span class="inline-code"><pre><code>hugo</code></pre></span> on the command line. Then send it to your Bonsai cluster with <span class="inline-code"><pre><code>curl</code></pre></span>:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -H "Content-Type: application/x-ndjson" -XPOST "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_bulk" --data-binary @public/bonsai.json</code></pre></div>

You should now be able to see your data in the Elasticsearch cluster:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ curl -XGET "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/_search"
{"took":1,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"hugo","_type":"doc","_id":...</code></pre></div>

Hugo

Bonsai supports the ElasticHQ monitoring and management interface. This open source software gives you insight into the state of your cluster. If you’re looking to see details about performance, check out Cluster Metrics.

The GitHub repo has tons of documentation and how-to guides. This article lays out some common methods of running ElasticHQ locally.

Using OS X with Pow

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">mkdir -p ~/.pow/elastichq
git clone git@github.com:royrusso/elasticsearch-HQ.git ~/.pow/elastichq/public</code></pre></div>

Navigate your browser to http://elastichq.dev/. You should see the ElasticHQ dashboard. Enter your Bonsai cluster URL in the field for the cluster location and click on “Connect.” The dashboard will bring up information about your cluster.

Using Grunt

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">git clone git@github.com:royrusso/elasticsearch-HQ.git
cd elasticsearch-HQ
git checkout master
npm install
grunt server</code></pre></div>

You should see the ElasticHQ dashboard at <span class="inline-code"><pre><code>http://localhost:9000/</code></pre></span>. Enter your bonsai.io cluster URL in the field for the cluster location and click on “Connect.” The dashboard will bring up information about your cluster.

Using Apache

First, make sure Apache is running. Typically this means you can access a directory via your browser at <span class="inline-code"><pre><code>http://localhost:80/</code></pre></span>. Your install may be different, so use whatever URL/folder is appropriate for you.

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">cd   # The root directory of your HTTP server
git clone git@github.com:royrusso/elasticsearch-HQ.git
cd elasticsearch-HQ/
git checkout master</code></pre></div>

Open up your browser to <span class="inline-code"><pre><code>http://localhost:80/elasticsearch-HQ/</code></pre></span> and you should see the ElasticHQ dashboard. Enter your bonsai.io cluster URL in the field for the cluster location and click on “Connect.” The dashboard will bring up information about your cluster.

Using ElasticHQ with Bonsai

Logstash is a data processing tool for ingesting logs into Elasticsearch. It plays a prominent role in the Elastic suite, and a common question is whether Bonsai offers support for it.

The answer is a qualified “yes.” Logstash is a server-side tool, meaning it runs outside of Bonsai’s infrastructure and Bonsai is not involved in its configuration or management. But as a host, Bonsai is not opinionated about where your cluster’s data comes from. So if you have Logstash running on your servers, you can configure an output to your Bonsai cluster, and it will work.

Connecting your Logstash instance to a Bonsai cluster is as easy as adding an output to the Logstash configuration file like so:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">output {
   elasticsearch {
       # https://randomuser:randompass@something-12345.us-east-1.bonsai.io
       # would be entered as:
       hosts     => ["something-12345.us-east-1.bonsai.io:443"]
       user      => "randomuser"
       password  => "randompass"
       ssl       => true
       index     => ""
   }
}</code></pre></div>

Autocreation and Bonsai

If an application sends data to an index which does not exist, Elasticsearch will create that index and assume its mappings from the data in the payload. This feature is called autocreation, and it is supported in a limited capacity on Bonsai. Certain base names can be used for autocreation. Those base names are:

  • .kibana
  • events
  • filebeat
  • kibana-int
  • logstash
  • requests

This means your Logstash index must start with one of these index names, or it will not be automatically created.
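The base name restriction only applies to automatic index creation. If you want Logstash to write to an index outside of these base names, one option is to create the index yourself before Logstash starts writing to it. A sketch with curl, using a hypothetical index name and the example cluster URL (substitute your own):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"># Create the index ahead of time so Logstash doesn't need to rely on autocreation
curl -XPUT "https://user123:pass456@my-awesome-cluster-1234.us-east-1.bonsai.io/myapp-logs-2024.01.01"</code></pre></div>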

We have run a number of tests and verified that Bonsai is fully compatible with Logstash as of version 1.5+. Older versions of Logstash don’t have support for SSL/TLS or HTTP Basic Auth; these older versions can work with Bonsai, but only without the benefits of encryption or authentication.

If you have any issues getting Logstash to pass data along to Bonsai, check the documentation to make sure it’s set up correctly. If that doesn’t help, feel free to reach out to us at support@bonsai.io and we’ll do our best to get you pointed in the right direction.

Using Logstash with Bonsai

Kibana is an open-source data visualization and dashboard tool built for rich analytics. It takes advantage of Elasticsearch’s full-text querying and aggregation capabilities to build highly flexible and interactive dashboards.

All Bonsai clusters support Kibana out of the box. You can use Kibana in one of several ways: via your Bonsai cluster dashboard, as a free Heroku app, or locally/on a private server.

Cluster Dashboard

Bonsai provides Kibana instances to clusters running on Elasticsearch versions 5.x and up. You can launch your Kibana instance right from your dashboard:

Clicking on the Kibana link will open up your Kibana instance:

Please be patient, as it may take a few seconds for Kibana to load.

As a Free Heroku App

If you have a Heroku account, there is a GitHub project that offers a click to deploy button. Clicking on the button will walk you through the process of deploying a free Heroku app running Kibana, which can be configured with a URL to an Elasticsearch cluster.

If you don’t have a cluster yet, a free Bonsai cluster will be created and attached to the Kibana app. If you already have a Bonsai cluster, you can link to it during the build process.

Locally / Private Server

You may also download Kibana and run it locally or on a private server. Not all versions of Kibana are compatible with all versions of Elasticsearch, so make sure to check the compatibility matrix and download a version that will work with your Bonsai cluster.

(Note: You can also install Kibana using a repository and package manager, but this will likely install the latest version, which may not be compatible with your cluster.)

Once you have Kibana downloaded, you’ll need to configure it to point at your Bonsai cluster. Open up the <span class="inline-code"><pre><code>config/kibana.yml</code></pre></span> file and set the value for <span class="inline-code"><pre><code>elasticsearch_url</code></pre></span>. For example:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">elasticsearch_url: "https://:@something-12345.us-east-1.bonsai.io"</code></pre></div>

In some later versions of Kibana, you may need to separately specify your Bonsai cluster’s username/password as configuration options:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">elasticsearch_url: "https://:@something-12345.us-east-1.bonsai.io:443"
elasticsearch.username: ""
elasticsearch.password: ""</code></pre></div>

Once Kibana has been configured, you can run it with bin/kibana (or bin\kibana.bat on Windows). This will start up the Kibana server with the settings pointing to your Bonsai cluster.

Last, open up a browser to http://localhost:5601 to finish setting up Kibana and get started. Note that if you’re running Kibana on a remote server, you’ll need to replace localhost with the IP address or domain of the remote server.

Using Kibana with Bonsai

You must have an Enterprise account to enable Okta Single Sign-On. Once enabled for your account, all users must log in via Okta. This can be done from the login form with any password, or from the Okta End-User Dashboard.

To enable Okta you must:

  1. Reach out to support@bonsai.io and request Okta for your Bonsai account.
  2. Navigate to your account security page to find a form to enter your metadata from Okta. You will see a screen similar to this:
  3. In Okta, select the Sign On tab for the Bonsai app:
  4. Click on View Setup Instructions, which will open a new page. Scroll to the bottom of the page to find and copy your IDP metadata.
  5. Paste the metadata into the form on the Bonsai account security page and submit by clicking Add Metadata.
  6. Now log into Bonsai via the Okta End-User Dashboard, and Bonsai will activate and require Okta for all users associated with your account.

Okta Single Sign-On

Filebeat is a lightweight shipper for forwarding and centralizing log data. It monitors the log files or locations that you specify, collects log events, and forwards them to Elasticsearch for indexing. A common question is whether Bonsai offers support for it.

The answer is a qualified “yes.” Filebeat is a server-side tool, meaning it runs outside of Bonsai’s infrastructure and Bonsai is not involved in its configuration or management. But as a host, Bonsai is not opinionated about where your cluster’s data comes from. So if you have Filebeat running on your servers, you can configure an output to your Bonsai cluster, and it will work.

To connect Filebeat to a Bonsai cluster, you just need to add your Bonsai URL to the filebeat.yml file like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">output.elasticsearch:
 hosts: ["wp-play-8646224217.us-east-1.bonsaisearch.net:443"]
 protocol: "https"
 username: "aaa" # The randomly-generated username for your cluster
 password: "xxx" # The randomly-generated password for your cluster</code></pre></div>

Autocreation and Bonsai

If an application sends data to an index which does not exist, Elasticsearch will create that index and assume its mappings from the data in the payload. This feature is called autocreation, and it is supported in a limited capacity on Bonsai. Certain base names can be used for autocreation. Those base names are:

  • .kibana
  • events
  • filebeat
  • kibana-int
  • logstash
  • requests

This means your Filebeat index must start with one of these index names, or it will not be automatically created.

It is important to note that Filebeat requires the OSS distribution of Elasticsearch, so for this process to work the OSS version of Filebeat needs to be used.

Using Filebeat with Bonsai

WordNet is a huge lexical database that collects and orders English words into groups of synonyms. It can offer major improvements in relevancy, but it is not at all necessary for many use cases. Make sure you understand the tradeoffs (discussed below) well before setting it up.

There are two ways to use WordNet with Bonsai. Users can add a subset of the list using the Elasticsearch API, or use the WordNet file that comes standard with all Bonsai clusters.

First, a brief background on synonyms and WordNet. If you want to jump around, the main sections of this document are:

  • What Are Synonyms in Elasticsearch
  • How Does WordNet Improve Synonyms?
  • Why Wouldn’t Everyone Want WordNet?
  • Using WordNet via the Elasticsearch API
  • Using the WordNet List File, wn_s.pl
  • Resources

What Are Synonyms in Elasticsearch?

Let’s say that you have an online store with a lot of products. You want users to be able to search for those products, but you want that search to be smart. For example, say that your user searches for “bowtie pasta.” You may have a product called “Funky Farfalle” which is related to their search term but which would not be returned in the results because the title has “farfalle” instead of “bowtie pasta”. How do you address this issue?

Elasticsearch has a mechanism for defining custom synonyms, through the Synonym Token Filter. This lets search administrators define groups of related terms and even corrections to commonly misspelled terms. A solution to this use case might look like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">{
   "settings": {
       "index" : {
           "analysis" : {
               "filter" : {
                   "synonym" : {
                       "type" : "synonym",
                       "synonyms" : [
                           "bowtie pasta, farfalle"
                       ]
                   }
               }
           }
       }
   }
}</code></pre></div>

This is great for solving the proximate issue, but it can get extremely tedious to define every group of related words in your index by hand.

How Does WordNet Improve Synonyms?

WordNet is essentially a text database which places English words into synsets - groups of synonyms - and can be considered as something of a cross between a dictionary and a thesaurus. An entry in WordNet looks something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript"></code></pre></div>

Let’s break it down:

This line expresses that the word ‘kitty’ is a noun, and the first word in synset 102122298 (which includes other terms like “kitty-cat,” “pussycat,” and so on). The line also indicates ‘kitty’ is the fourth most commonly used term according to semantic concordance texts. You can read more about the structure and precise definitions of WordNet entries in the documentation.

WordNet has become extremely useful in text processing applications, including data storage and retrieval. Some use cases require features like synonym processing, for which a lexical grouping of tokens is invaluable.

Why Wouldn’t Everyone Want WordNet?

Relevancy tuning can be a deeply complex subject, and WordNet – especially when the complete file is used – has tradeoffs, just like any other strategy. Synonym expansion can be really tricky and can result in unexpected sorting, lower performance and more disk use. WordNet can introduce all of these issues with varying severity.

When synonyms are expanded at index time, Elasticsearch uses WordNet to generate all tokens related to a given token, and writes everything out to disk. This has several consequences: slower indexing speed, higher load during indexing, and significantly more disk use. Larger index sizes often correspond to memory issues as well.

There is also the problem of updating. If you ever want to change your synonym list, you’ll need to reindex everything from scratch. And WordNet includes multi-term synonyms in its database, which can break phrase queries.

Expanding synonyms at query time resolves some of those issues, but introduces others. Namely, performing expansion and matching at query time adds overhead to your queries in terms of server load and latency. And it still doesn’t really address the problem of multi-word synonyms.
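As a rough sketch of what query-time expansion can look like, the synonym filter can be applied only as a search_analyzer, so documents are stored with the plain standard analyzer and expansion happens when the query is analyzed. The index name, field name and credentials below are placeholders, and the typeless mapping assumes Elasticsearch 7.x:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT https://username:password@my-awesome-cluster.us-east-1.bonsai.io/some_index
{
   "settings": {
       "analysis": {
           "filter": {
               "my_synonym_filter": {
                   "type": "synonym",
                   "synonyms": ["bowtie pasta, farfalle"]
               }
           },
           "analyzer": {
               "my_search_synonyms": {
                   "tokenizer": "standard",
                   "filter": ["lowercase", "my_synonym_filter"]
               }
           }
       }
   },
   "mappings": {
       "properties": {
           "title": {
               "type": "text",
               "analyzer": "standard",
               "search_analyzer": "my_search_synonyms"
           }
       }
   }
}</code></pre></div>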

The Elasticsearch documentation has some really great examples of what this means. The takeaway is that WordNet is not a panacea for relevancy tuning, and it may introduce unexpected results unless you’re doing a lot of preprocessing or additional configuration.

tl;dr: Do not simply assume that chucking a massive synset collection at your cluster will automatically give you better, more relevant results.

Using WordNet via the Elasticsearch API

Elasticsearch supports several different list formats, including the WordNet format. WordNet synonyms are maintained in a Prolog file called <span class="inline-code"><pre><code>wn_s.pl</code></pre></span>. To use these in your cluster, you’ll need to download the WordNet archive and extract the <span class="inline-code"><pre><code>wn_s.pl</code></pre></span> file. You’ll then need to create your synonyms list by reading this file into a request to your cluster.

The target index could be created with settings like so:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT https://randomuser:randompass@something-12345.us-east-1.bonsai.io/some_index

   {
     "settings": {
       "analysis": {
         "filter": {
           "wn_synonym_filter": {
             "type": "synonym",
             "format" : "wordnet",
             "synonyms" : [
                 "s(100000001,1,"abstain",v,1,0).",
                 "s(100000001,2,"refrain",v,1,0).",
                 "s(100000001,3,"desist",v,1,0).",
                 #... more synonyms, read from wn_s.pl file
             ]
           }
         },
         "analyzer": {
           "my_synonyms": {
             "tokenizer": "standard",
             "filter": [
               "lowercase",
               "wn_synonym_filter"
             ]
           }
         }
       }
     }
   }</code></pre></div>

There are a number of ways to generate this request. You could do it programmatically with a language like Python, or Bash scripts with <span class="inline-code"><pre><code>curl</code></pre></span>, or any language with which you feel comfortable.

A benefit of using a subset of the list would be more control over your mappings and data footprint. Depending on when your analyzer is running, you could save IO by not computing unnecessary expansions for terms not in your corpus or search parameters. Reducing the overhead will improve performance overall.
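As a rough sketch of that approach, the snippet below (Python with the requests library) reads wn_s.pl, keeps only the entries for a hypothetical vocabulary of terms that actually appear in your corpus, and creates the index with that subset. The file path, index name and vocabulary are placeholders:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-python"># Build a WordNet synonym subset from wn_s.pl and create the index with it.
# Assumes wn_s.pl is in the working directory and BONSAI_URL is set.
import os
import requests

BONSAI_URL = os.environ["BONSAI_URL"]          # https://user:pass@my-cluster.bonsai.io
VOCABULARY = {"abstain", "refrain", "desist"}  # placeholder: terms from your corpus

# Naive line-by-line parse of the Prolog file; good enough for a sketch.
synonyms = []
with open("wn_s.pl") as f:
    for line in f:
        line = line.strip()
        if not line.startswith("s("):
            continue
        word = line.split(",")[2].strip().strip("'")
        if word.lower() in VOCABULARY:
            synonyms.append(line)

settings = {
    "settings": {
        "analysis": {
            "filter": {
                "wn_synonym_filter": {
                    "type": "synonym",
                    "format": "wordnet",
                    "synonyms": synonyms,
                }
            },
            "analyzer": {
                "my_synonyms": {
                    "tokenizer": "standard",
                    "filter": ["lowercase", "wn_synonym_filter"],
                }
            },
        }
    }
}

resp = requests.put(f"{BONSAI_URL}/some_index", json=settings)
print(resp.status_code, resp.text)</code></pre></div>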

Using the WordNet List File, wn_s.pl

If you would rather use the official WordNet list, it is part of our Elasticsearch deployment. You can follow the official Elasticsearch documentation for WordNet synonyms, and link to the file with <span class="inline-code"><pre><code>analysis/wn_s.pl</code></pre></span>. For example:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">PUT https://username:password@my-awesome-cluster.us-east-1.bonsai.io/some_index
{
   "settings": {
       "index" : {
           "analysis" : {
               "analyzer" : {
                   "synonym" : {
                       "tokenizer" : "whitespace",
                       "format" : "wordnet",
                       "filter" : ["synonym"]
                   }
               },
               "filter" : {
                   "synonym" : {
                       "type" : "synonym",
                       "format" : "wordnet",
                       "synonyms_path" : "analysis/wn_s.pl"
                  }
               }
           }
       }
   }
}</code></pre></div>

Resources

WordNet is a large subject and a great topic to delve deeper into. Here are some links for further reading:

Using Wordnet with Bonsai

Setting up your Java app to use Bonsai Elasticsearch is quick and easy. Just follow these simple steps:

Adding the Elasticsearch library

You’ll need to add the <span class="inline-code"><pre><code>elasticsearch-rest-high-level-client</code></pre></span> library to your pom.xml file like so:


<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-xml"><dependency>
    <groupId>org.elasticsearch.client</groupId>
    <artifactId>elasticsearch-rest-high-level-client</artifactId>
    <version>6.2.3</version>
</dependency></code></pre></div>

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-java">package io.omc;

import org.apache.http.HttpHost;
import org.apache.http.auth.AuthScope;
import org.apache.http.auth.UsernamePasswordCredentials;
import org.apache.http.client.CredentialsProvider;
import org.apache.http.impl.client.BasicCredentialsProvider;
import org.apache.http.impl.client.DefaultConnectionKeepAliveStrategy;
import org.elasticsearch.action.search.SearchRequest;
import org.elasticsearch.action.search.SearchResponse;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;
import org.elasticsearch.index.query.QueryBuilders;
import org.elasticsearch.search.builder.SearchSourceBuilder;

import java.net.URI;

public class Main {

public static void main(String[] args) {
 String connString = System.getenv("BONSAI_URL");
 URI connUri = URI.create(connString);
 String[] auth = connUri.getUserInfo().split(":");

 CredentialsProvider cp = new BasicCredentialsProvider();
 cp.setCredentials(AuthScope.ANY, new UsernamePasswordCredentials(auth[0], auth[1]));

 RestHighLevelClient rhlc = new RestHighLevelClient(
   RestClient.builder(new HttpHost(connUri.getHost(), connUri.getPort(), connUri.getScheme()))
     .setHttpClientConfigCallback(
       httpAsyncClientBuilder -> httpAsyncClientBuilder.setDefaultCredentialsProvider(cp)
         .setKeepAliveStrategy(new DefaultConnectionKeepAliveStrategy())));

 // https://www.elastic.co/guide/en/elasticsearch/client/java-rest/master/java-rest-high-search.html
 SearchRequest searchRequest = new SearchRequest();
 SearchSourceBuilder searchSourceBuilder = new SearchSourceBuilder();
 searchSourceBuilder.query(QueryBuilders.matchAllQuery());
 searchRequest.source(searchSourceBuilder);

 try {
  SearchResponse resp  = rhlc.search(searchRequest);
  // Show that the query worked
  System.out.println(resp.toString());
 } catch (Exception ex) {
  // Log the exception
  System.out.println(ex.toString());
 }

 // Need to close the client so the thread will exit
 try {
  rhlc.close();
 } catch (Exception ex) {

 }
}
}</code></pre></div></div>

Feel free to adapt that code to your particular app. Keep in mind that the core elements here can be moved around, but really shouldn’t be changed or further simplified.

For example, the snippet above parses out the authentication credentials from the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable. This is so the username and password don’t need to be hard coded into your app, where they could pose a security risk.

Java

Bonsai Elasticsearch integrates with your .Net app quickly and easily, whether you’re running on Heroku or self hosting.

First, make sure to add the Elasticsearch.Net and NEST client to your dependencies list:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">...


....</code></pre></div></div>
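If you prefer the command line, the same packages can also be added with the .NET CLI (pin versions as appropriate for your project):

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">$ dotnet add package Elasticsearch.Net
$ dotnet add package NEST</code></pre></div>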

You’ll also want to make sure the client is installed locally for testing purposes:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">$ dotnet restore</code></pre></div></div>

Self-hosted users: when you sign up with Bonsai and create a cluster, it will be provisioned with a custom URL that looks like https://user:pass@my-awesome-cluster.us-east-1.bonsai.io. Make note of this URL because it will be needed momentarily.

Heroku users: you’ll want to make sure that you’ve added Bonsai Elasticsearch to your app. Visit our addons page to see available plans. You can add Bonsai to your app through the dashboard, or by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">$ heroku addons:create bonsai: -a</code></pre></div></div>

Update your Main.cs file with the following:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">using System;
using Nest;

namespace search.con
{
   class Program
   {
       static void Main(string[] args)
       {
           var server = new Uri(Environment.GetEnvironmentVariable("BONSAI_URL"));
           var conn = new ConnectionSettings(server);
           // use gzip to reduce network bandwidth
           conn.EnableHttpCompression();            
           // lets the HttpConnection control this for optimal access (default: 80)
           conn.ConnectionLimit(-1);
           var client = new ElasticClient(conn);

           var resp = client.Ping();
           if (resp.IsValid)
           {
               Console.WriteLine("All is well");
           }
           else
           {
               Console.WriteLine("Elasticsearch cluster is down");
           }
       }
   }
}
</code></pre></div></div>

The code above does several things:

  1. Pulls your Bonsai URL from the environment (you never want to hard-code this value – it’s a bad practice)
  2. References the Elasticsearch.Net and NEST libraries
  3. Instantiates the client using your private Bonsai URL
  4. Pings the cluster to test the connection

Self-hosted users: You will need to take an additional step and add your Bonsai URL to your server/dev environment variables. Something like this should work:

<div class="code-snippet-container"><a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ export BONSAI_URL="https://username:password@my-awesome-cluster-123.us-east-1.bonsai.io"</code></pre></div></div>

Heroku users: You don’t need to worry about this step. The environment variable is automatically populated for you when you add Bonsai Elasticsearch to your app. You can verify that it exists by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">$ heroku config:get BONSAI_URL</code></pre></div></div>

Go ahead and spin up your .Net app with <span class="inline-code"><pre><code>dotnet run</code></pre></span>. Heroku users can also push the changes to Heroku, and monitor the logs with <span class="inline-code"><pre><code>heroku logs --tail</code></pre></span>. Either way, you should see something like this:

All is well
The above output indicates that the client was successfully initiated and was able to contact the cluster.

Questions? Problems? Feel free to ping our support if something isn’t working right and we’ll do what we can to help out.

.Net

Bonsai Elasticsearch integrates with your node.js app quickly and easily, whether you’re running on Heroku or self hosting.

First, make sure to add the Elasticsearch.js client to your dependencies list:

package.json:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">"dependencies": {
 "elasticsearch": "10.1.2"
 ....</code></pre></div></div>

You’ll also want to make sure the client is installed locally for testing purposes:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">$ npm install --save elasticsearch
$ npm install</code></pre></div></div>

Self-hosted users: when you sign up with Bonsai and create a cluster, it will be provisioned with a custom URL that looks like https://user:pass@my-awesome-cluster.us-east-1.bonsai.io. Make note of this URL because it will be needed momentarily.

Heroku users: you’ll want to make sure that you’ve added Bonsai Elasticsearch to your app. Visit our addons page to see available plans. You can add Bonsai to your app through the dashboard, or by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">$ heroku addons:create bonsai: -a</code></pre></div></div>

Update your index.js file with the following:

index.js:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">var bonsai_url    = process.env.BONSAI_URL;
var elasticsearch = require('elasticsearch');
var client        = new elasticsearch.Client({
                           host: bonsai_url,
                           log: 'trace'
                       });

// Test the connection:
// Send a HEAD request to "/" and allow
// up to 30 seconds for it to complete.
client.ping({
 requestTimeout: 30000,
}, function (error) {
 if (error) {
   console.error('elasticsearch cluster is down!');
 } else {
   console.log('All is well');
 }
});</code></pre></div></div>

The code above does several things:

  1. Pulls your Bonsai URL from the environment (you never want to hard-code this value – it’s a bad practice)
  2. Adds the Elasticsearch.js library
  3. Instantiates the client using your private Bonsai URL
  4. Pings the cluster to test the connection

Self-hosted users: You will need to take an additional step and add your Bonsai URL to your server/dev environment. Something like this should work:

<div class="code-snippet-container"><a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ export BONSAI_URL="https://username:password@my-awesome-cluster-123.us-east-1.bonsai.io"</code></pre></div></div>

Heroku users: You don’t need to worry about this step. The environment variable is automatically populated for you when you add Bonsai Elasticsearch to your app. You can verify that it exists by running:

<div class="code-snippet-container"><a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">$ heroku config:get BONSAI_URL</code></pre></div></div>

Go ahead and spin up your node app with <span class="inline-code"><pre><code>npm start</code></pre></span>. Heroku users can also push the changes to Heroku, and monitor the logs with <span class="inline-code"><pre><code>heroku logs --tail</code></pre></span>. Either way, you should see something like this:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-javascript">Elasticsearch INFO: 2016-02-03T23:44:41Z
 Adding connection to https://nodejs-test-012345.us-east-1.bonsai.io/

Elasticsearch DEBUG: 2016-02-03T23:44:41Z
 Request complete

Elasticsearch TRACE: 2016-02-03T23:44:41Z
 -> HEAD https://nodejs-test-012345.us-east-1.bonsai.io:443/?hello=elasticsearch

 <- 200</code></pre></div>

The above output indicates that the client was successfully initiated and was able to contact the cluster.

Questions? Problems? Feel free to ping our support if something isn’t working right and we’ll do what we can to help out.

Node.js

Getting Elasticsearch up and running with a PHP app is fairly straightforward. We recommend using the official PHP client, as it is being actively developed alongside Elasticsearch.

Adding the Elasticsearch library

Like Heroku, we recommend Composer for dependency management in PHP projects, so you’ll need to add the Elasticsearch library to your <span class="inline-code"><pre><code>composer.json</code></pre></span> file:

<div class="code-snippet-container"><a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">{
   "require": {
       "elasticsearch/elasticsearch": "1.0"
   }
}</code></pre></div></div>

Heroku will automatically add the library when you deploy your code.

Using the library

The elasticsearch-php library is configured almost entirely by associative arrays. To initialize your client, use the following code block:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-php">// Initialize the parameters for the cluster
$params = array();
$params['hosts'] = array (
   getenv("BONSAI_URL"),
);

$client = new Elasticsearch\Client($params);</code></pre></div></div>

Note that the host is pulled from the environment. When you add Bonsai to your app, the cluster will be automatically created and a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable will be added to your environment. The initialization process above allows you to access your cluster without needing to hardcode your authentication credentials.

Indexing your documents

Bonsai does not support lazy index creation, so you will need to create your index before you can send over your documents. You can specify a number of parameters to configure your new index; the code below is the minimum needed to get started:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-php">$indexParams = array();
$indexParams['index']  = 'my_index';    //index

$client->indices()->create($indexParams);</code></pre></div></div>

Once your index has been created, you can add a document by specifying a body (an associative array of fields and data), target index, type and (optionally) a document ID. If you don’t specify an ID, Elasticsearch will create one for you:

<div class="code-snippet-container"><a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-php">$params['body']  = array('testField' => 'abc');

$params['index'] = 'my_index';
$params['type']  = 'my_type';
$params['id']    = 'my_id';

// Document will be indexed to my_index/my_type/my_id
$ret = $client->index($params);</code></pre></div></div>

PHP

Warning

The official Elasticsearch Python client is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the client. This change prevents the Python client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch.

If you are receiving an error stating "The client noticed that the server is not a supported distribution of Elasticsearch," then you'll need to downgrade the client to 7.13 or lower.

Setting up your Python app to use Bonsai Elasticsearch is quick and easy. Just follow the steps below:

Adding the Elasticsearch library

You’ll need to add the elasticsearch library to your <span class="inline-code"><pre><code>Pipfile</code></pre></span>, which you can do with pipenv like so:

<div class="code-snippet-container"><a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-python">pipenv install elasticsearch elasticsearch-dsl</code></pre></div></div>

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container"><a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy"><img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt=""><img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt=""></a><div class="code-snippet"><pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">import os, base64, re, logging
from elasticsearch import Elasticsearch

# Log transport details (optional):

logging.basicConfig(level=logging.INFO)

# Parse the auth and host from env:

bonsai = os.environ['BONSAI_URL']
auth = re.search('https\:\/\/(.*)\@', bonsai).group(1).split(':')
host = bonsai.replace('https://%s:%s@' % (auth[0], auth[1]), '')

# Optional port

match = re.search('(:\d+)', host)
if match:
 p = match.group(0)
 host = host.replace(p, '')
 port = int(p.split(':')[1])
else:
 port=443

# Connect to cluster over SSL using auth for best security:

es_header = [{
'host': host,
'port': port,
'use_ssl': True,
'http_auth': (auth[0],auth[1])
}]

# Instantiate the new Elasticsearch connection:

es = Elasticsearch(es_header)

# Verify that Python can talk to Bonsai (optional):

es.ping()

</code></pre></div></div>

Feel free to adapt that code to your particular app. Keep in mind that the core elements here can be moved around, but really shouldn’t be changed or further simplified.

For example, the snippet above parses out the authentication credentials from the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable. This is so the username and password don’t need to be hard coded into your app, where they could pose a security risk. Also, the <span class="inline-code"><pre><code>host</code></pre></span>, <span class="inline-code"><pre><code>port</code></pre></span> and <span class="inline-code"><pre><code>use_ssl</code></pre></span> parameters are important for SSL encryption to work properly. Simply using your <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> as the host will not work because of limitations in urllib. It needs to be a plain URL, with the credentials passed to the client as a tuple.
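Once the client is connected, a quick smoke test might look something like the following. The demo index name and document are placeholders, and this assumes the elasticsearch-py 7.x API used above:

<div class="code-snippet w-richtext"><pre><code fs-codehighlight-element="code" class="hljs language-python"># Index a document, refresh so it becomes searchable, then run a match query.
es.index(index="demo", id="1", body={"title": "Funky Farfalle"})
es.indices.refresh(index="demo")

results = es.search(index="demo", body={"query": {"match": {"title": "farfalle"}}})
print(results["hits"]["total"])</code></pre></div>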

Python

Users of Django/Haystack can easily integrate with Bonsai Elasticsearch! We recommend using the official Python client, as it is being actively developed alongside Elasticsearch.

Note

Haystack does not yet support Elasticsearch 6.x or 7.x, which is the current default on Bonsai. Heroku users can provision a 5.x cluster by adding Bonsai via the command line and passing in a <span class="inline-code"><pre><code>--version</code></pre></span> flag like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-1" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-1" class="hljs language-javascript">heroku addons:create bonsai --version=5</code></pre>
</div>
</div>

Note that some versions require a paid plan. Free clusters are always provisioned on the latest version of Elasticsearch, regardless of which version was requested. See more details here.

Let’s get started:

Adding the Elasticsearch library

You’ll need to add the elasticsearch-py and django-haystack libraries to your requirements.txt file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">elasticsearch>=1.0.0,<2.0.0
django-haystack>=1.0.0,<2.0.0</code></pre>
</div>
</div>

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. We recommend adding the cluster URL to an environment variable, <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>, to avoid hard-coding your authentication credentials.

The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">ES_URL = urlparse(os.environ.get('BONSAI_URL') or 'http://127.0.0.1:9200/')

HAYSTACK_CONNECTIONS = {
   'default': {
       'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
       'URL': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
       'INDEX_NAME': 'haystack',
   },
}

if ES_URL.username:
   HAYSTACK_CONNECTIONS['default']['KWARGS'] = {"http_auth": ES_URL.username + ':' + ES_URL.password}</code></pre>
</div>
</div>

Note about ports

The sample code above uses port 443, which is the default for the https:// protocol. If you’re not using SSL/TLS and want to use http:// instead, change this value to 80.

Common Issues

One of the most common issues users see relates to SSL certs. Bonsai URLs use TLS to secure communications with the cluster, and our certificates are signed by a certificate authority (CA) that has verified our identity. Python needs access to the proper root certificate in order to verify that the app is actually communicating with Bonsai; if it can’t find the certificate it needs, you’ll start seeing exception messages like this:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">Root certificates are missing for certificate</code></pre>
</div>

The fix is straightforward enough. Simply install the certifi package in the environment where your app is hosted. You can do this locally by running <span class="inline-code"><pre><code>pip install certifi</code></pre></span>. Heroku users will also need to modify their <span class="inline-code"><pre><code>requirements.txt</code></pre></span>, per the documentation:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">certifi==0.0.8</code></pre>
</div>
</div>

I can’t install root certificates or certifi

If certifi and root cert management isn’t possible, you can simply bypass verification by modifying your <span class="inline-code"><pre><code>HAYSTACK_CONNECTIONS</code></pre></span> like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">HAYSTACK_CONNECTIONS = {
   'default': {
           'ENGINE': 'haystack.backends.elasticsearch_backend.ElasticsearchSearchEngine',
           'URL': ES_URL.scheme + '://' + ES_URL.hostname + ':443',
           'INDEX_NAME': 'documents',
           'KWARGS': {
             'use_ssl': True,
             'verify_certs': False,
           }
    },
}</code></pre>
</div>
</div>

This instructs Haystack to use TLS without verifying the host. This does allow for the possibility of MITM attacks, but the probability of that happening is pretty low. You’ll need to weigh the expediency of this approach against the unlikely event of someone eavesdropping, and decide whether leaking data in that way is an acceptable security risk.

Django with Haystack

Check Your Version

Haystack does not yet support Elasticsearch 6.x and up. Please ensure your Bonsai cluster version is 5.x or less.

Users of Django/django-elasticsearch-dsl can easily integrate with Bonsai Elasticsearch! This library is built on top of elasticsearch-dsl which is built on top of the low level elasticsearch-py maintained by Elastic.

Let’s get started:

Adding the libraries

You’ll need to add the django-elasticsearch-dsl library to your  <span class="inline-code"><pre><code>requirements.txt</code></pre></span> file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">django-elasticsearch-dsl>=0.5.1
elasticsearch-dsl>=6.0,<6.2</code></pre>
</div>
</div>

Full Instructions can be found here

Connecting to Bonsai

Bonsai requires basic authentication for all read/write requests. You’ll need to configure the client so that it includes the username and password when communicating with the cluster. We recommend adding the cluster URL to an environment variable,  <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span>, to avoid hard-coding your authentication credentials.

The following code is a good starter for integrating Bonsai Elasticsearch into your app:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">ES_URL = urlparse(os.environ.get('BONSAI_URL') or 'http://127.0.0.1:9200/')

ELASTICSEARCH_DSL={
   'default': {
       'hosts': ES_URL.geturl()  # pass the full URL string (credentials included) to the client
   },
}</code></pre>
</div>
</div>

Note about ports

The cluster URL above uses the https:// protocol, which defaults to port 443. If you’re not using SSL/TLS and want to use http:// instead, the connection will default to port 80.

Django with django-elasticsearch-dsl

Getting started with Ruby on Rails and Bonsai Elasticsearch is fast and easy. In this guide, we will cover the steps and the bare minimum amount of code needed to support basic search with Elasticsearch. Users looking for more details and advanced usage should consult the resources at the end of this page.

Throughout this guide, you will see some code examples. These code examples are drawn from a very simple Ruby on Rails application, and are designed to offer some real-world, working code that new users will find useful. The complete demo app can be found in this GitHub repo.

Warning

The official Elasticsearch Ruby client is not supported on Bonsai after version 7.13. This is due to a change introduced in the 7.14 release of the gem. This change prevents the Ruby client from communicating with open-sourced versions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:

<table>
<thead>
<tr><th>Engine Version</th><th>Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14 (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>

If you are receiving a <span class="inline-code"><pre><code>Elasticsearch::UnsupportedProductError</code></pre></span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.

Note

In this example, we are going to connect to Elasticsearch using the official Elasticsearch gems. There are other gems for this, namely SearchKick, which is covered in another set of documentation.

Step 1: Spin up a Bonsai Cluster

Make sure that there is a Bonsai Elasticsearch cluster ready for your app to interact with. This needs to be set up first so you know which version of the gems you need to install; Bonsai supports a large number of Elasticsearch versions, and the gems need to correspond to the version of Elasticsearch you’re running.

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

  • If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
  • Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Dashboard, in the Credentials tab:

Heroku users will also have a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable created when Bonsai is added to the application. This variable will contain the fully-qualified URL to the cluster.

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in Bonsai, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

You can also use the Interactive Console. In the Cluster Dashboard, click on the Console tab. It will load a default view, which includes the version of Elasticsearch. The version of Elasticsearch is called “number” in the JSON response:

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">curl https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443
{
 "name" : "ip-172-31-14-16",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "jVJrINr5R5GVVXHGcRhMdA",
 "version" : {
   "number" : "7.2.0",
   "build_flavor" : "oss",
   "build_type" : "tar",
   "build_hash" : "508c38a",
   "build_date" : "2019-06-20T15:54:18.811730Z",
   "build_snapshot" : false,
   "lucene_version" : "8.0.0",
   "minimum_wire_compatibility_version" : "6.8.0",
   "minimum_index_compatibility_version" : "6.0.0-beta1"
 },
 "tagline" : "You Know, for Search"
}</code></pre>
</div>
</div>

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Add the Gems

There are a few gems you will need in order to make all of this work. Add the following to your Gemfile outside of any blocks:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">gem 'elasticsearch-model', github: 'elastic/elasticsearch-rails', branch: 'master'
gem 'elasticsearch-rails', github: 'elastic/elasticsearch-rails', branch: 'master'
gem 'bonsai-elasticsearch-rails', github: 'omc/bonsai-elasticsearch-rails', branch: 'master'</code></pre>
</div>
</div>

This will install the gems for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

<table>
<thead>
<tr><th>Branch</th><th>Elasticsearch Version</th></tr>
</thead>
<tbody>
<tr><td>0.1</td><td>-> 1.x</td></tr>
<tr><td>2.x</td><td>-> 2.x</td></tr>
<tr><td>5.x</td><td>-> 5.x</td></tr>
<tr><td>6.x</td><td>-> 6.x</td></tr>
<tr><td>master</td><td>-> master</td></tr>
</tbody>
</table>

For example, if your version of Elasticsearch is 6.x, then you would use something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">gem 'elasticsearch-model', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'elasticsearch-rails', github: 'elastic/elasticsearch-rails', branch: '6.x'
gem 'bonsai-elasticsearch-rails', github: 'omc/bonsai-elasticsearch-rails', branch: '6.x'</code></pre>
</div>
</div>

Make sure the branch you choose corresponds to the version of Elasticsearch that your Bonsai cluster is running.

What Do These Gems Do, Anyway?

  • elasticsearch-model. This gem is the only one that is actually required. It does the actual search integration for Ruby on Rails applications.
  • elasticsearch-rails. This optional gem adds some nice tools, such as rake tasks and ActiveSupport instrumentation.
  • bonsai-elasticsearch-rails. This optional gem saves you the step of setting up an initializer. By default, the Elasticsearch client will attempt to create a connection to <span class="inline-code"><pre><code>localhost:9200</code></pre></span>, which is not where your Bonsai cluster is located. This gem simply reads the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable which contains the cluster URL, and uses it to override the defaults.

Once the gems have been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install them.

Step 4: Add Elasticsearch to Your Models

Any model that you want to be searchable with Elasticsearch will need to be configured. You will need to require the `elasticsearch/model` library and include the necessary modules in the model.

For example, this demo app has a <span class="inline-code"><pre><code>User</code></pre></span> model that looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
 include Elasticsearch::Model
 include Elasticsearch::Model::Callbacks
 settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

Your model will need to have these lines included, at a minimum.

What Do These Lines Do?

  • <span class="inline-code"><pre><code>require 'elasticsearch/model'</code></pre></span>require 'elasticsearch/model' loads the Elasticsearch model library, if it has not already been loaded. This isn’t strictly necessary, provided the file is loaded somewhere at runtime.
  • <span class="inline-code"><pre><code>include Elasticsearch::Model</code></pre></span> tells the app that this model will be searchable by Elasticsearch. This is mandatory for the model to be searchable with Elasticsearch.
  • <span class="inline-code"><pre><code>include Elasticsearch::Model::Callbacks</code></pre></span> is an optional line, but injects Elasticsearch into the ActiveRecord lifecycle. So when an ActiveRecord object is created, updated or destroyed, it will also be created/updated/destroyed in Elasticsearch.
  • <span class="inline-code"><pre><code>settings index: { number_of_shards: 1 }</code></pre></span> This is also optional, but strongly recommended. By default, Elasticsearch will create an index for the model, with 5 primary shards and 1 replica. This will actually create 10 shards, and is ludicrously over-provisioned for most apps. This line simply overrides the default and specifies 1 primary shard(a replica will also be created by default).

You can optionally avoid (or increase) replicas by amending the settings like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
 include Elasticsearch::Model
 include Elasticsearch::Model::Callbacks
 settings index: {
   number_of_shards:   1, # 1 primary shard; probably fine for most users
   number_of_replicas: 0  # 0 replicas; fine for dev, not for production.
 }
end</code></pre>
</div>
</div>

If you have more questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

You will need to set up a route to handle searching. There are a few different strategies for this, but we generally recommend using a dedicated controller for it, unrelated to the model(s) being indexed and searched. This is a more flexible approach, and keeps concerns separated. You can always render object-specific partials if your results involve multiple models.

Our Example

In our example Rails app, we have one model, <span class="inline-code"><pre><code>User</code></pre></span>, with a handful of attributes. It looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">require 'elasticsearch/model'

class User < ApplicationRecord
 include Elasticsearch::Model
 include Elasticsearch::Model::Callbacks
 settings index: { number_of_shards: 1 }
end</code></pre>
</div>
</div>

To implement search, we created a file called <span class="inline-code"><pre><code>app/controllers/search_controller.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class SearchController < ApplicationController
 def run
   # This will search all models that have `include Elasticsearch::Model`
   @results = Elasticsearch::Model.search(params[:q]).records
 end
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">post '/search', to: 'search#run'</code></pre>
</div>
</div>

Next, we need some views to render the data we get back from Elasticsearch. The <span class="inline-code"><pre><code>run</code></pre></span> controller action will render a view; create a file called <span class="inline-code"><pre><code>app/views/search/run.html.erb</code></pre></span> and add:

<div class="code-snippet-container">
<a fs-copyclip-element="click-10" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-10" class="hljs language-javascript">  Search Results
 <% if @results.present? %>
   <%= render partial: 'search_result', collection: @results, as: :result %>
 <% else %>
   Nothing here, chief!
 <% end %></code></pre>
</div>
</div>

This way, if there are no results to show, we simply display a banner saying so. If there are results, we iterate over the collection (assigning each one to a local variable called <span class="inline-code"><pre><code>result</code></pre></span>) and pass it off to a partial. Create a file called <span class="inline-code"><pre><code>app/views/search/_search_result.html.erb</code></pre></span> and add:

<div class="code-snippet-container">
<a fs-copyclip-element="click-11" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
   <%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

   <%= result.company %>

   <%= result.company_description %></code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

At this point, the <span class="inline-code"><pre><code>User</code></pre></span> model is configured for searching in Elasticsearch, and has routes for sending a query to Elasticsearch. The next step is to render a form so that a user can actually use this feature. This is possible with a basic <span class="inline-code"><pre><code>form_with</code></pre></span> helper in a Rails view.

In this demo app, we added this to the navigation bar:

<div class="code-snippet-container">
<a fs-copyclip-element="click-12" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-12" class="hljs language-javascript"><%= form_with(url: "/search", method: "post", class: 'form-inline my-2 my-lg-0', local: true) do %>
   <%= text_field_tag(:q, nil, class: "form-control mr-sm-2", placeholder: "Search") %>
   <%= button_tag("Search", class: "btn btn-outline-info my-2 my-sm-0", name: nil) %>
<% end %></code></pre>
</div>
</div>

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use with your application. The ERB scaffold should be easily adapted to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Elasticsearch Where Your Cluster is Located

By default, the gems will try to connect to a cluster running on <span class="inline-code"><pre><code>localhost:9200</code></pre></span>. This is a problem because your Bonsai cluster is not running on a localhost. We need to make sure Elasticsearch is pointed to the correct URL.

If you are using the bonsai-elasticsearch-rails gem, then all you need to do is ensure that there is an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> set in your application environment that points at your Bonsai cluster URL.
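If you haven’t added that gem yet, it goes in the Gemfile like any other. This is just a sketch; check the gem’s listing on RubyGems for the current name and version before relying on it:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Gemfile
gem 'bonsai-elasticsearch-rails'

# then run `bundle install`</code></pre>
</div>
</div>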

Heroku users will already have the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable set, and can skip to the next step. Other users will need to make sure this environment variable is manually set in their application environment. If you have access to the host, you can run this command in your command line:

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

  • You are not using the bonsai-elasticsearch-rails gem for some reason, OR
  • You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

If you need to do this, then you can create a file called <span class="inline-code"><pre><code>config/initializers/elasticsearch.rb</code></pre></span>. Inside this file, you will want to put something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-14" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-14" class="hljs language-javascript"># Assuming you can set the BONSAI_URL variable:
Elasticsearch::Model.client = Elasticsearch::Client.new url: ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-15" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-15" class="hljs language-javascript"># Use your personal URL, not this made-up one:
Elasticsearch::Model.client = Elasticsearch::Client.new url: "https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

If you’re wondering why we prefer to use an environment variable instead of the URL, it’s simply a best practice. The cluster URL is considered sensitive information, in that anyone with the fully-qualified URL is going to have full read/write access to the cluster.

So if you put the URL in an initializer and check it into source control, that creates an attack vector. Many people have been burned by committing sensitive URLs, keys, passwords, and so on to git, and it’s best to avoid it.

Additionally, if you ever need to change your cluster URL, updating the initializer will require another pass through CI and a deployment, whereas with an environment variable you can simply change the value and restart Rails. Environment variables are simply the better way to go.
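If you want a convenient way to manage the variable locally without committing it, one common approach is the dotenv gem. This is a sketch under that assumption; keep the <span class="inline-code"><pre><code>.env</code></pre></span> file out of version control:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Gemfile
gem 'dotenv-rails', groups: [:development, :test]

# .env (add this file to .gitignore so the URL never reaches git)
# BONSAI_URL=https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443</code></pre>
</div>
</div>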

Step 7: Push Data into Elasticsearch

Assuming you have a database populated with data, the next step is to get that data into Elasticsearch.

Something to keep in mind here: Bonsai does not support lazy index creation, and even Elastic does not recommend using this feature. You’ll need to create the indices manually for each model that you want to search before you try to populate the cluster with data.

If you have the <span class="inline-code"><pre><code>elasticsearch-rails</code></pre></span> gem, you can use one of the built-in Rake tasks. To do that, create a file called <span class="inline-code"><pre><code>lib/tasks/elasticsearch.rake</code></pre></span> and simply put this inside:

<div class="code-snippet-container">
<a fs-copyclip-element="click-16" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-16" class="hljs language-javascript">require 'elasticsearch/rails/tasks/import'</code></pre>
</div>
</div>

Now you will be able to use a Rake task that will automatically create (or recreate) the index:

<div class="code-snippet-container">
<a fs-copyclip-element="click-17" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-17" class="hljs language-javascript">bundle exec rake environment elasticsearch:import:model CLASS='User'</code></pre>
</div>
</div>

You’ll need to run that Rake task for each model you want indexed in Elasticsearch. Note that if you’re relying on the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable to configure the Rails client, that variable will need to be present in the environment running the Rake task. Otherwise it will populate your local Elasticsearch instance (or raise some exceptions).
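If exporting the variable globally isn’t an option, you can set it inline for just that command. A sketch, using the same placeholder URL as above:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Substitute your real cluster URL:
BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443" \
  bundle exec rake environment elasticsearch:import:model CLASS='User'</code></pre>
</div>
</div>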

If you don’t use the Rake task, then you can also create and populate the index from within a Rails console with:

<div class="code-snippet-container">
<a fs-copyclip-element="click-18" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-18" class="hljs language-javascript"># Will return nil if the index already exists:
User.__elasticsearch__.create_index!

# Will delete the index if it already exists, then recreate:
User.__elasticsearch__.create_index! force: true

# Import the data into the Elasticsearch index:
User.import</code></pre>
</div>
</div>

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Elasticsearch. In our demo app, we have this simple list of users:

This search box is rendered by a form that will pass the query to the <span class="inline-code"><pre><code>SearchController#run</code></pre></span> action, via the route set up in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>:

This query will reach the <span class="inline-code"><pre><code>SearchController#run</code></pre></span> action, where it will be passed to Elasticsearch. Elasticsearch will search all of the indices it has (just <span class="inline-code"><pre><code>users</code></pre></span> at this point), and return any hits to an instance variable called <span class="inline-code"><pre><code>@results</code></pre></span>.
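If you later add more searchable models and only want to query a subset of them, the elasticsearch-model gem also accepts a list of classes. A minimal sketch:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Search only the User index, rather than every model that includes Elasticsearch::Model:
@results = Elasticsearch::Model.search(params[:q], [User]).records

# Or search a single model directly:
@results = User.search(params[:q]).records</code></pre>
</div>
</div>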

The SearchController will then render the appropriate views. Each result will be rendered by the partial <span class="inline-code"><pre><code>app/views/search/_search_result.html.erb</code></pre></span>. It looks something like this:

Congratulations! You have implemented Elasticsearch in Rails!

Final Thoughts

This documentation demonstrated how to quickly get Elasticsearch added to a basic Rails application. We added the gems, configured the model, set up the search route, and created the views and partials needed to render the results. Then we set up the connection to Elasticsearch and pushed the data into the cluster. Finally, we were able to search that data through our app.

Hopefully this was enough to get you up and running with Elasticsearch on Rails. This documentation is not exhaustive; there are a number of other use cases not discussed here, as well as additional changes and customizations that can make search more accurate and resilient.

You can find information on these additional subjects below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

Ruby on Rails

Here’s how to get started with Bonsai Elasticsearch and Ruby on Rails using Chewy.

Warning

Chewy is built on top of the official Elasticsearch Ruby client; versions of that client later than 7.13 are not supported on Bonsai. This is due to a change introduced in the 7.14 release of the gem, which prevents the Ruby client from communicating with open-source distributions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:

<table>
<thead>
<tr><th>Engine</th><th>Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14+ (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>


If you are receiving a <span class="inline-code"><pre><code>Elasticsearch::UnsupportedProductError</code></pre></span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.
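One way to stay on a supported client is to pin the low-level elasticsearch gem in your application’s Gemfile. This is a sketch; pick the constraint that matches your engine per the table above, and note that your Chewy version’s own dependency constraints still apply:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Gemfile: keep the Elasticsearch Ruby client at 7.13.x
gem 'elasticsearch', '~> 7.13.0'</code></pre>
</div>
</div>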

Note

As of January 2021, Chewy supports up to Elasticsearch 5.x. Users wanting to use Chewy will need to ensure they are not running anything later than 5.x. Support for later versions is planned.

Add Chewy to the Gemfile

In order to use the Chewy gem, add the gem to the Gemfile like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">gem 'chewy'
</code></pre>
</div>
</div>

Then run <span class="inline-code"><pre><code>bundle install</code></pre></span> to install it.

Write an Initializer

A Chewy initializer will need to be written so that the Rails app can connect to your Bonsai cluster.

You can include your Bonsai cluster URL (located in the Credentials tab of your Bonsai dashboard) in the initializer, but hard-coding your credentials is not recommended. Instead, export the URL to an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> in your application environment.

We recommend using something like dotenv for this, but you can also set it manually like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"># Substitute with your own Bonsai cluster URL:
export BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

Heroku users will not need to do so, as their Bonsai cluster URL will already be in a Config Var of the same name.

Create a file called <span class="inline-code"><pre><code>config/initializers/chewy.rb</code></pre></span>. With the environment variable in place, save the initializer with the following:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">Chewy.settings = {
  host: ENV['BONSAI_URL']
}</code></pre>
</div>
</div>

The official Chewy documentation has more details on ways to modify and refine Chewy’s behavior.

Configure Your Models

Any Rails model that you want to be searchable with Elasticsearch needs to be configured by creating a Chewy index definition and adding model-observing code.

For example, here’s how we created an index definition for our demo app in <span class="inline-code"><pre><code>app/chewy/users_index.rb</code></pre></span>:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">class UsersIndex < Chewy::Index
  define_type User
end</code></pre>
</div>
</div>

Then, the User model is written like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">class User < ApplicationRecord
  update_index('users#user') { self }
end</code></pre>
</div>
</div>

You can read more about configuring how a model will be ingested by Elasticsearch in the official Chewy documentation.
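As a sketch of what that configuration can look like (assuming the demo’s <span class="inline-code"><pre><code>User</code></pre></span> attributes and the same define_type-style DSL used above), you can control which fields get indexed inside the index definition:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># app/chewy/users_index.rb
class UsersIndex < Chewy::Index
  define_type User do
    field :first_name, :last_name, :email
    field :company, :company_description
  end
end</code></pre>
</div>
</div>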

Indexing Your Documents

The Chewy gem comes with some Rake tasks for getting your documents into the cluster. Once you have set up your models as desired, run the following in your application environment:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">bundle exec rake chewy:deploy
</code></pre>
</div>
</div>

That will push your data into Elasticsearch and keep the indices synchronized.
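Chewy also ships with more granular Rake tasks. A couple of commonly used ones, as a sketch (see the Chewy README for the full list):

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Recreate and reimport every index:
bundle exec rake chewy:reset

# Recreate and reimport a single index (the users index in this example):
bundle exec rake chewy:reset[users]</code></pre>
</div>
</div>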

Next Steps

Chewy is a mature ODM with plenty of great features. Toptal (the maintainers of Chewy) have some great documentation exploring some of what Chewy can do: Elasticsearch for Ruby on Rails: A Tutorial to the Chewy Gem.

Chewy

Getting started with Ruby on Rails and Bonsai Elasticsearch is fast and easy with Searchkick. In this guide, we will start with a very basic Ruby on Rails application and add the bare minimum amount of code needed to support basic search with Elasticsearch. Users looking for more details and advanced usage should consult the resources at the end of this page.

Throughout this guide, you will see some code examples. These code examples are drawn from a very simple Ruby on Rails application, and are designed to offer some real-world, working code that new users will find useful. The complete demo app can be found in this GitHub repo.

<div class="callout-warning">
<h3>Warning</h3>
<p>Searchkick uses the official Elasticsearch Ruby client; versions of that client later than 7.13 are not supported on Bonsai. This is due to a change introduced in the 7.14 release of the gem, which prevents the Ruby client from communicating with open-source distributions of Elasticsearch 7.x, as well as any version of OpenSearch. The table below indicates compatibility:</p>
<table>
<thead>
<tr><th>Engine</th><th>Highest Compatible Gem Version</th></tr>
</thead>
<tbody>
<tr><td>Elasticsearch 5.x</td><td>7.13</td></tr>
<tr><td>Elasticsearch 6.x</td><td>7.14+ (sic)</td></tr>
<tr><td>Elasticsearch 7.x</td><td>7.13</td></tr>
<tr><td>OpenSearch 1.x</td><td>7.13</td></tr>
</tbody>
</table>
<p>If you are receiving a <span class="inline-code warning">Elasticsearch::UnsupportedProductError</span>, then you'll need to ensure you're using a supported version of the Elasticsearch Ruby client.</p>
</div>

<div class="callout-note">
<h3>Note</h3>
<p>In this example, we are going to connect to Elasticsearch using the Searchkick gem. There are also the official Elasticsearch gems for Rails, which are covered in another set of documentation.</p>
</div>

Step 1: Spin up a Bonsai Cluster

Make sure that there is a Bonsai Elasticsearch cluster ready for your app to interact with. This needs to be set up first so you know which version of the gems you need to install; Bonsai supports a large number of Elasticsearch versions, and the gems need to correspond to the version of Elasticsearch you’re running.

Bonsai clusters can be created in a few different ways, and the documentation for each path varies. If you need help creating your cluster, check out the link that pertains to your situation:

  • If you’ve signed up with us at bonsai.io, you will want to follow the directions here.
  • Heroku users should follow these directions.

The Cluster URL

When you have successfully created your cluster, it will be given a semi-random URL called the Elasticsearch Access URL. You can find this in the Cluster Overview, in the Credentials tab:

Heroku users will also have a <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable created when Bonsai is added to the application. This variable will contain the fully-qualified URL to the cluster.

Step 2: Confirm the Version of Elasticsearch Your Cluster is On

When you have a Bonsai Elasticsearch cluster, there are a few ways to check the version that it is running. These are outlined below:

Option 1: Via the Cluster Dashboard Details

The easiest is to simply get it from the Cluster Dashboard. When you view your cluster overview in Bonsai UI, you will see some details which include the version of Elasticsearch the cluster is running:

Option 2: Interactive Console

You can also use the Interactive Console. In the Cluster Dashboard, click on the Console tab. It will load a default view, which includes the version of Elasticsearch. The version of Elasticsearch is called “number” in the JSON response:

Option 3: Using a Browser or <span class="inline-code"><pre><code>curl</code></pre></span>

You can copy/paste your cluster URL into a browser or into a tool like <span class="inline-code"><pre><code>curl</code></pre></span>. Either way, you will get a response like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">curl https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443
{
 "name" : "ip-172-31-14-16",
 "cluster_name" : "elasticsearch",
 "cluster_uuid" : "jVJrINr5R5GVVXHGcRhMdA",
 "version" : {
   "number" : "7.2.0",
   "build_flavor" : "oss",
   "build_type" : "tar",
   "build_hash" : "508c38a",
   "build_date" : "2019-06-20T15:54:18.811730Z",
   "build_snapshot" : false,
   "lucene_version" : "8.0.0",
   "minimum_wire_compatibility_version" : "6.8.0",
   "minimum_index_compatibility_version" : "6.0.0-beta1"
 },
 "tagline" : "You Know, for Search"
}</code></pre>
</div>
</div>

The version of Elasticsearch is called “number” in the JSON response.

Step 3: Install the Gem

To install Searchkick, you will need the searchkick gem. Add the following to your Gemfile outside of any blocks:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">gem 'searchkick'
</code></pre>
</div>
</div>

This will install the gem for the latest major version of Elasticsearch. If you have an older version of Elasticsearch, then you should follow this table:

<table>
<thead>
<tr><th>Elasticsearch Version</th><th>Searchkick Version</th></tr>
</thead>
<tbody>
<tr><td>1.x</td><td>1.5.1</td></tr>
<tr><td>2.x</td><td>2.5.0</td></tr>
<tr><td>5.x</td><td>3.1.3 (additional notes)</td></tr>
<tr><td>6.x and up</td><td>4.0 and up</td></tr>
</tbody>
</table>

If you need a specific version of Searchkick to accommodate your Elasticsearch cluster, you can specify it in your Gemfile like so:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">gem 'searchkick', '3.1.3'  # For Elasticsearch 5.x
</code></pre>
</div>
</div>

Once the gem has been added to your Gemfile, run <span class="inline-code"><pre><code>bundle install</code></pre></span>.

Step 4: Add Searchkick to Your Models

Any model that you want to be searchable with Elasticsearch will need to be configured by adding the <span class="inline-code"><pre><code>searchkick</code></pre></span> keyword to it.

For example, our demo app has a User model that looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">class User < ApplicationRecord
 searchkick
end</code></pre>
</div>
</div>

Adding the searchkick keyword makes our User model searchable with Searchkick.

Searchkick provides a number of reasonable settings out of the box, but you can also pass in a hash of settings if you want to override the defaults. The hash keys generally correspond to the official Create Indices API. For example, this will allow you to create an index with 0 replicas:

<div class="code-snippet-container">
<a fs-copyclip-element="click-6" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-6" class="hljs language-javascript">class User < ApplicationRecord
 searchkick settings: { number_of_replicas: 0 }
end</code></pre>
</div>
</div>

If you have questions about shards and how many is enough, check out our Shard Primer and our documentation on Capacity Planning.

Step 5: Create a Search Route

You will need to set up a route to handle searching. The easiest way to do this with Searchkick is to have a search route per model. This involves updating your models’ corresponding controller, and defining routes in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>. You’ll also need to have some views that handle rendering the results, and a form that posts data to the controller(s). Take a look at how we implemented this in our demo app for some examples of how this is done:

Our Example

In our example Rails app, we have one model, <span class="inline-code"><pre><code>User</code></pre></span>, with <span class="inline-code"><pre><code>searchkick</code></pre></span>. It looks something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-7" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-7" class="hljs language-javascript">class User < ApplicationRecord
 searchkick
end</code></pre>
</div>
</div>

To implement search, we updated the file <span class="inline-code"><pre><code>app/controllers/users_controller.rb</code></pre></span> and added this code:

<div class="code-snippet-container">
<a fs-copyclip-element="click-8" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-8" class="hljs language-javascript">class UsersController < ApplicationController
 #... a bunch of controller actions, removed for brevity
 def search
   @results = User.search(params[:q])
 end
 #... more controller actions removed for brevity
end</code></pre>
</div>
</div>

We then created a route in the <span class="inline-code"><pre><code>config/routes.rb</code></pre></span> file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-9" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-9" class="hljs language-javascript">Rails.application.routes.draw do
 resources :users do
   collection do
     post :search  # creates the search_users route helper
   end
 end
end</code></pre>
</div>
</div>

Next, we need some views to render the data we get back from Elasticsearch. The <span class="inline-code"><pre><code>search</code></pre></span> controller action will render a view; create a file called <span class="inline-code"><pre><code>app/views/users/search.html.erb</code></pre></span> and add:

<div class="code-snippet-container">
<a fs-copyclip-element="click-10" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-10" class="hljs language-javascript">Search Results

 <% if @results.present? %>
   <%= render partial: 'search_result', collection: @results, as: :result %>
 <% else %>
   Nothing here, chief!
 <% end %>

</code></pre>
</div>
</div>

This way, if there are no results to show, we simply display a banner saying so. If there are results, we iterate over the collection (assigning each one to a local variable called <span class="inline-code"><pre><code>result</code></pre></span>) and pass it off to a partial. We also created a file for a partial called <span class="inline-code"><pre><code>app/views/users/_search_result.html.erb</code></pre></span> and added:

<div class="code-snippet-container">
<a fs-copyclip-element="click-11" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-11" class="hljs language-javascript">
   <%= link_to "#{result.first_name} #{result.last_name} <#{result.email}>", user_path(result) %>

   <%= result.company %>

   <%= result.company_description %>

   </code></pre>
</div>
</div>

This partial simply renders a search result using some of the data of the matching ActiveRecord objects.

At this point, the <span class="inline-code"><pre><code>User</code></pre></span> model is configured for searching in Elasticsearch, and has routes for sending a query to Elasticsearch. The next step is to render a form so that a user can actually use this feature. This is possible with a basic <span class="inline-code"><pre><code>form_with</code></pre></span> helper.

In this demo app, we added this to the navigation bar:

<div class="code-snippet-container">
<a fs-copyclip-element="click-12" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-12" class="hljs language-javascript"><%= form_with(url: search_users_path, method: "post", class: 'form-inline my-2 my-lg-0', local: true) do %>
   <%= text_field_tag(:q, nil, class: "form-control mr-sm-2", placeholder: "Search") %>
   <%= button_tag("Search", class: "btn btn-outline-info my-2 my-sm-0", name: nil) %>
<% end %></code></pre>
</div>
</div>

This code renders a form that looks like this:

Please note that these classes use Bootstrap, which may not be in use with your application. The ERB scaffold should be easily adapted to your purposes.

We’re close to finishing up. We just need to tell the app where the Bonsai cluster is located, then push our data into that cluster.

Step 6: Tell Searchkick Where Your Cluster is Located

Searchkick looks for an environment variable called <span class="inline-code"><pre><code>ELASTICSEARCH_URL</code></pre></span>, and if it doesn’t find it, it uses <span class="inline-code"><pre><code>localhost:9200</code></pre></span>. This is a problem because your Bonsai cluster is not running on a localhost. We need to make sure Searchkick is pointed to the correct URL.

Bonsai does offer a gem, bonsai-searchkick, which populates the necessary environment variable automatically. If you're using this gem, then all you need to do is ensure that there is an environment variable called <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> set in your application environment that points at your Bonsai cluster URL.

Heroku users will have this already, and can skip to the next step. Other users will need to make sure this environment variable is manually set in their application environment. If you have access to the host, you can run this command in your command line:

<div class="code-snippet-container">
<a fs-copyclip-element="click-13" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-13" class="hljs language-javascript"># Substitute with your cluster URL, obviously:
export BONSAI_URL="https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

Writing an Initializer

You will only need to write an initializer if:

  • You are not using the bonsai-searchkick gem for some reason, OR
  • You are not able to set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> environment variable in your application environment

If you need to do this, then you can create a file called <span class="inline-code"><pre><code>config/initializers/elasticsearch.rb</code></pre></span>. Inside this file, you will want to put something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-14" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-14" class="hljs language-javascript"># Assuming you can set the BONSAI_URL variable:
ENV["ELASTICSEARCH_URL"] = ENV['BONSAI_URL']</code></pre>
</div>
</div>

If you’re one of the few who can’t set the <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> variable, then you’ll need to do something like this:

<div class="code-snippet-container">
<a fs-copyclip-element="click-15" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-15" class="hljs language-javascript"># Use your personal URL, not this made-up one:
ENV["ELASTICSEARCH_URL"] = "https://abcd123:efg456@my-cluster-123456.us-west-2.bonsaisearch.net:443"</code></pre>
</div>
</div>

If you’re wondering why we prefer to use an environment variable instead of the URL, it’s simply a best practice. The cluster URL is considered sensitive information, in that anyone with the fully-qualified URL is going to have full read/write access to the cluster.

So if you put the URL in an initializer and check it into source control, that creates an attack vector. Many people have been burned by committing sensitive URLs, keys, passwords, and so on to git, and it’s best to avoid it.

Additionally, if you ever need to change your cluster URL, updating the initializer will require another pass through CI and a deployment, whereas with an environment variable you can simply change the value and restart Rails. Environment variables are simply the better way to go.

Step 7: Push Data into Elasticsearch

Now that the app has everything it needs to query the cluster and render the results, we need to push data into the cluster. There are a few ways to do this.

One method is to open up a Rails console and run <span class="inline-code"><pre><code>.reindex</code></pre></span> on the model. So if you want to reindex a model called <span class="inline-code"><pre><code>User</code></pre></span>, you would run <span class="inline-code"><pre><code>User.reindex</code></pre></span>.

Another method is to use Rake tasks from the command line. If you wanted to reindex that same <span class="inline-code"><pre><code>User</code></pre></span> model, you could run: <span class="inline-code"><pre><code>bundle exec rake searchkick:reindex CLASS=User</code></pre></span>. Alternatively, if you have multiple Searchkick-enabled models, you could run <span class="inline-code"><pre><code>rake searchkick:reindex:all</code></pre></span>.

Regardless of how you do it, Searchkick will create an index named after the ActiveRecord table of the model, the environment, and a timestamp. So reindexing the <span class="inline-code"><pre><code>User</code></pre></span> model in a development environment might result in an index called <span class="inline-code"><pre><code>users_development_20191029111649033</code></pre></span>. This allows Searchkick to provide zero-downtime updates to settings and mappings.
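To sanity-check the index from a Rails console, a minimal sketch (the query string and attribute are illustrative) looks like this:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># rails console
User.reindex                  # builds the timestamped index and points the alias at it
results = User.search("jane") # run a query through Searchkick
results.size                  # number of hits
results.map(&:email)          # results iterate over the matching User records</code></pre>
</div>
</div>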

Step 8: Put it All Together

At this point you should have all of the pieces you need to search your data using Searchkick. In our demo app, we have this simple list of users:

This search box is rendered by a form that will pass the query to the <span class="inline-code"><pre><code>UsersController#search</code></pre></span> action, via the route set up in <span class="inline-code"><pre><code>config/routes.rb</code></pre></span>:

This query will reach the <span class="inline-code"><pre><code>UsersController#search</code></pre></span> action, where it will be passed to Searchkick, which queries Elasticsearch. Elasticsearch will search the <span class="inline-code"><pre><code>users_development_20191029111649033</code></pre></span> index, and return any hits to an instance variable called <span class="inline-code"><pre><code>@results</code></pre></span>.

The UsersController will then ensure the appropriate views are rendered. Each result will be rendered by the partial <span class="inline-code"><pre><code>app/views/users/_search_result.html.erb</code></pre></span>. It looks something like this:

Congratulations! You have implemented Searchkick in Rails!

Final Thoughts

This documentation demonstrated how to quickly get Elasticsearch added to a basic Rails application. We installed the Searchkick gem, added it to a model, set up the search route, and created the views and partials needed to render the results. Then we set up the connection to Elasticsearch and pushed the data into the cluster. Finally, we were able to search that data through our app.

Hopefully this was enough to get you up and running with Searchkick. This documentation is not exhaustive, and there are a lot of really cool features that Searchkick offers, along with additional changes and customizations that can make search more accurate and resilient.

You can find information on additional subjects in the section below. And if you have any ideas or requests for additional content, please don’t hesitate to let us know!

Additional Resources

Searchkick

Jekyll is a static site generator written in Ruby. Jekyll supports a plugin model that Searchyll uses to read your site’s content and then index it into an Elasticsearch cluster.

In this guide, we are going to use this feature to tell Jekyll to index all of the content into a configured instance of Elasticsearch.

First Steps

In order to make use of this documentation, you will need Jekyll installed and configured on your system.

  1. Make sure you have Jekyll installed. This guide assumes you already have Jekyll installed and configured on your system. Visit the Jekyll Documentation to get started.
  2. Spin up a Bonsai Elasticsearch Cluster. This guide will use a Bonsai cluster as the Elasticsearch backend. Guides are available for setting up a cluster through bonsai.io or Heroku.

Configure Jekyll to output to Bonsai Elasticsearch

Jekyll’s configuration lives in a file called `_config.yml` by default. Add the following snippet to that file:

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">elasticsearch:
 url: "https://:@my-awesome-cluster-1234.us-east-1.bonsai.io"
 index_name: jekyll-posts

plugins:
 - searchyll</code></pre>
</div>
</div>

We also need to add the `searchyll` gem to the Gemfile and install it:

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript"># Add the Searchyll gem to your Gemfile:
#   gem 'searchyll'
# Then install it:
bundle install</code></pre>
</div>
</div>

Push the Data Into Elasticsearch

To get the site's data into Elasticsearch, simply run the Jekyll build command:

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">jekyll build
</code></pre>
</div>
</div>

You should now be able to see your data in the Elasticsearch cluster:

<div class="code-snippet-container">
<a fs-copyclip-element="click-5" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-5" class="hljs language-javascript">$ curl -XGET "https://:@my-awesome-cluster-1234.us-east-1.bonsai.io/_search"  

{"took":1,"timed_out":false,"_shards":{"total":2,"successful":2,"failed":0},"hits":{"total":1,"max_score":1.0,"hits":[{"_index":"hugo","_type":"doc","_id":...</code></pre>
</div>
</div>

Jekyll

At Bonsai, security and privacy are always a top concern. We are constantly evaluating the service and our vendors for vulnerabilities and flaws, and we will immediately address anything that could put our customers at risk. This is how we keep your data secure:

  • Access Controls. All Bonsai clusters are provisioned with a unique, randomized URL and have HTTP Basic Authentication enabled by default, using a randomly generated set of credentials. Under this scheme, it would take the world’s fastest supercomputer around 23.5 quadrillion years to guess the credentials.
  • Encrypted communications. All Bonsai clusters support SSL/TLS for encryption in transit. We use industry standard strength encryption to ensure your data is safe over the wire.
  • Encrypted at rest. Bonsai clusters are provisioned on hardware that is encrypted at rest by default. In addition to Amazon’s physical security controls, this means your data is safe from physical theft.
  • Regular Snapshots. All paid Bonsai clusters receive regular snapshots, which are stored in an offsite, encrypted S3 bucket in the same region as the cluster.
  • Firewalled. All Bonsai clusters are accessed via a custom-built, high-performance layer 7 routing proxy, and sit behind a tightly controlled firewall. This helps to ensure that the cluster and data are protected from port scans and unauthorized persons.
  • Advanced Networking. Bonsai can support IP whitelisting and VPC Peering for users on single tenant clusters.

How Does Bonsai Secure Data?

Note: The following article applies to Business and Enterprise tiers with dedicated single tenant clusters. These watermark alerts do not apply to Sandbox, Staging, or Standard tiers on shared multitenant clusters; instead, these tiers will be notified once an overage is detected on at least one metered metric.

Elasticsearch protects itself from data loss with the disk-based shard allocator. This mechanism attempts to strike a balance between minimizing disk usage across all nodes while also minimizing the number of shard reallocation processes. This ensures that all nodes have as much disk headroom as possible, with minimal impact to cluster performance.

Elasticsearch is constantly checking each node’s disk availability to make decisions about where to allocate and move shards. There are 4 important “stages” that help dictate this decision-making process:

  1. Alert watermark. When a node in a single tenant cluster reaches 70% disk or higher, Bonsai sends out an alert to the user.
  2. Low watermark. When a node reaches its low watermark stage (Bonsai defaults to 70% disk used), the cluster will no longer allocate new shards to this node.
  3. High watermark. When a node reaches its high watermark stage (Bonsai defaults to 75% disk used), the cluster will actively try to move shards off the node.
  4. Flood stage. When a node reaches its flood stage watermark (Bonsai uses 95% disk used), the cluster will put all open indices into a state that only allows for reads and deletes.

The low and high watermark stages are when Business and Enterprise tiers receive an emailed notification from the Bonsai Support team with suggestions for scaling disk capacity. Clusters that are over the 75% threshold can start to experience performance issues that may include but are not limited to: increased bulk queue times, higher CPU and/or load usage, or slower than normal searches.
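These stages correspond to Elasticsearch’s own disk allocation settings (<span class="inline-code"><pre><code>cluster.routing.allocation.disk.watermark.*</code></pre></span>). On plans where the cluster settings API is available to you, you can inspect the effective values with a read-only request. A sketch, assuming <span class="inline-code"><pre><code>BONSAI_URL</code></pre></span> is exported as in the guides above; the exact response shape varies by Elasticsearch version:

<div class="code-snippet-container">
<div class="code-snippet">
<pre><code># Read-only: show the cluster's effective disk allocation settings
curl -s "$BONSAI_URL/_cluster/settings?include_defaults=true&filter_path=**.disk*"</code></pre>
</div>
</div>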

How Bonsai Manages Watermarks

Bonsai offers two main architecture classes: multitenant and single tenant. The multitenant class – sometimes called “shared” – is designed to allow clusters to share hardware resources while still being securely sandboxed from one another. This allows us to provide unparalleled performance per dollar at smaller scales. All Hobby and Standard plans use multitenant architecture.

The single tenant class – sometimes called “dedicated” – maps one cluster to a private set of hardware resources. Because these resources are not shared with any other cluster, single tenant configurations provide maximum performance, security and customization. All Business and Enterprise plans use dedicated architecture.

Multi Tenant Class

Bonsai’s multitenant class utilizes some sandboxing features built into Elasticsearch. This allows the service to run multiple clusters on a single instance of Elasticsearch per node. This approach saves substantial hardware and network resources, and allows for radical cost savings especially for students, hobbyists, startups, projects in development, small businesses, and so on.

A great benefit of this approach is that Bonsai is able to provide some really nice features out of the box, for no additional cost: all multitenant clusters – even the free ones – are running on 3 nodes. They also get industry standard SSL/TLS and HTTP Basic authentication (see Security for more information), which keeps your data safe. Plus, the Bonsai dashboard offers plenty of tools for monitoring, managing, and engaging with your cluster.
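
For example, connecting to a multitenant cluster is just standard HTTPS with HTTP Basic authentication. A minimal sketch – the URL and credentials below are hypothetical placeholders; your cluster’s real URL is shown on its dashboard:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">curl -s "https://abc123:xyz789@example-cluster-1234567.us-east-1.bonsaisearch.net:443/_cluster/health"</code>
</pre></div>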

Because these clusters are running on a shared Elasticsearch instance, there are also a few limitations. For one, certain API endpoints and actions are unavailable for security and performance reasons. Snapshots and plugins are not manageable by users, to avoid collisions and regressions. And usage is metered roughly as you’d expect for a free or low-cost SaaS.

Clusters on the multitenant class can often be identified by their plan name. “Hobby”, “Staging”, “Production”, and “Shared” are all terms used by plans running on a multitenant architecture.

Single Tenant Class

Bonsai’s single tenant class has a fairly standard configuration. The cluster is simply one or more nodes (three by default), each running Elasticsearch. These nodes are physically isolated and on a different network than those running multitenant clusters. This means that all available IO on the nodes is always 100% allocated to your cluster.

This approach offers all of the same benefits as the multitenant class: you get the same industry standard SSL/TLS and HTTP Basic authentication (see Security for details), and the Bonsai dashboard. In addition, the isolated environment is suitable for encryption at rest and VPC Peering (for applications with stringent security requirements).

Furthermore, this class offers extremely flexible deployments and scaling. Need it in a region we don’t support on the multitenant class? No problem! Have a plugin or script that is vital to your operation? We can package it into our deployment! Let us know what you need, and we can provide a quote.

Also, because this class is highly secure and customizable, we are able to support amendments to our terms of service and privacy policy, custom SLAs, and more. Contact us for questions and quotes.

Trade Offs

This article details the differences between the two architectures we offer. There are a number of trade offs between these classes, summarized below:

<table>
<thead>
<tr><th>Class</th><th>Pros</th><th>Cons</th></tr>
</thead>
<tbody>
<tr><td>Multi-tenant</td><td>- Extremely cost effective
- Great performance for the money
- Can scale up or down on demand</td><td>- Limits on usage (disk, memory, connections, etc)
- Cannot install and run arbitrary plugins
- Noisy neighbors*
- VPC Peering and at-rest encryption not available
- Subject to general terms of service and privacy policy</td></tr>
<tr><td>Single Tenant</td><td>- Extremely powerful
- No metering on usage
- Can deploy arbitrary plugins
- No noisy neighbors*
- Can have at-rest encryption
- VPC Peering
- Custom terms, SLAs, etc</td><td>- More expensive to operate
- Scaling can be more difficult</td></tr>
</tbody>
</table>

* The Noisy Neighbor Problem is a well-known issue in multitenant architectures. In this context, one or more clusters may inadvertently monopolize shared resources (CPU, network, disk IO, memory, etc.), which can adversely affect other users on the same nodes. Bonsai actively monitors and addresses these situations when they come up, although the issue is frequently transient and resolves itself within a few minutes. Single tenant architectures do not suffer from this issue.

Why Not Containers / VMs?

A frequent question that comes up when talking about our various service architectures is why we don’t use container or virtualization technologies. A service that incorporates these technologies would offer some nice benefits, like allowing users to install their own plugins and manage their own snapshots.

The simple answer is “overhead.” Running containers – and especially VMs – requires system resources for the orchestration daemon or hypervisor. And simply running multiple instances of Elasticsearch on a node would require multiple JVMs, which wastes resources through duplication.

Any resources that are allocated towards management of environments are therefore unavailable for Elasticsearch to use. In comparison to an architecture that doesn’t have this overhead, the provider must either offer less performance for the money, or charge more money for the same performance.

There is also a practical aspect to Bonsai eschewing containers and virtual machines. It is impossible to provide both great support and absolute customization. For example, users who install their own plugins can introduce a variety of regressions into Elasticsearch’s performance and behavior. When they open a support ticket, the agent must either spend time bug squashing or decline assistance altogether. Being opinionated allows our team to focus on depth of knowledge rather than breadth, which leads to faster, higher quality resolutions.

Finally, there is a philosophical motivation for how we built the service. We want to make Elasticsearch accessible to people at all stages of development; from the hobbyists and students, all the way up to the billion dollar unicorns. And we want to make sure that it’s the best possible experience. This means being opinionated about certain features, and taking a more active role in managing the infrastructure.

Which Versions Bonsai Supports

Bonsai supports a large number of Elasticsearch versions in regions all over the world. Multitenant class clusters usually have more limited options in terms of available regions and versions, while single tenant class clusters have far more options.

If you need a version or geographic region that is not listed here, reach out to our support and let us know what you need. We can get you a quote and timeline for getting up and running.

Multitenant Class

Bonsai operates a fleet of shared resource nodes in Oregon, Virginia, Ireland, Frankfurt, and Sydney. The Elasticsearch versions available on these nodes do not change often.

Sandbox clusters are limited to the most recent version of Elasticsearch and may be subject to automatic upgrades when a new version is released.

For all other multitenant plans, when Bonsai adds support for a new version, we will create a new server group rather than upgrading an existing group. The exception to this is a potential patch upgrade in response to some critical vulnerability. In other words, users will not be upgraded in place unless there is a vulnerability to address or if they are on a Sandbox plan. Read How Bonsai Handles Elasticsearch Releases for more information.
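
If you are unsure which version your cluster is currently running, a GET on the cluster root returns the version metadata; the response includes a version.number field (for example, 7.10.2):

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">GET /</code>
</pre></div>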

This table shows which versions of Elasticsearch are available for multitenant plans. Last updated: 2023-08-10.

<table>
<thead>
<tr><th>Plan</th><th>Elasticsearch Versions </th><th>OpenSearch Versions</th></tr>
</thead>
<tbody>
<tr><td>Sandbox</td><td>7.10.2</td><td>2.6.0</td></tr>
<tr><td>All other multitenant </td><td>5.6.16 / 6.8.21 / 7.10.2 </td><td>2.6.0</td></tr>
</tbody></table>

Multitenant Version Support by Region

Multitenant plans are supported in a handful of popular regions, although the versions available to free plans are limited. Additional regions are available to Business and Enterprise subscriptions.

<table>
<thead>
<tr><th>Cloud</th><th>Region</th><th>Location</th><th>Elasticsearch 5.6.16</th><th>Elasticsearch 6.8.21</th><th>Elasticsearch 7.10.2</th><th>OpenSearch 2.6.0</th></tr>
</thead>
<tbody>
<tr><td>AWS</td><td>us-east-1</td><td>Virginia, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>AWS</td><td>us-west-2</td><td>Oregon, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>AWS</td><td>eu-west-1</td><td>Ireland, EU</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
<tr><td>GCP</td><td>us-east1</td><td>Virginia, USA</td><td>Standard Plans</td><td>Standard Plans</td><td>Standard, Free Plans</td><td>Standard, Free Plans</td></tr>
</tbody>
</table>

Single Tenant Class

Single tenant clusters can be deployed in Oregon, Virginia, Ireland, Frankfurt, Sydney, and Tokyo. We support a variety of Elasticsearch versions for these kinds of clusters and will default to the most recent minor version unless something else is specified.

This table shows which versions of Elasticsearch are available for single tenant plans. Last updated: 2023-08-10.

<table>
<thead>
<tr><th>Plan</th><th>Elasticsearch Versions </th><th>OpenSearch Versions</th></tr>
</thead>
<tbody>
<tr><td>All single tenant </td><td>2.4.0 to 7.10.2</td><td>1.2.4 to 2.6.0</td></tr>
</tbody></table>

Enterprise Deployments

Bonsai can deploy and manage whichever version of Elasticsearch or OpenSearch your use case needs. Please reach out to us at support@bonsai.io.

Older Search Engine Version Pricing

Clusters running major search engine versions behind Bonsai’s current primary supported versions (OpenSearch 2.x and Elasticsearch 7.x) are charged an operational and maintenance fee. This fee ensures that we can continue to support these older versions, and gives your organization the time it needs to decide when to upgrade to a new major version.

The fee will be assessed based on the following table:

<table>
<thead>
<tr><th>Search Engine Version</th><th>Operational Fee</th></tr>
</thead>
<tbody>
<tr><td>OpenSearch 1.x or Elasticsearch 6.x</td><td>20%</td></tr>
<tr><td>Elasticsearch 5.x</td><td>15% additional</td></tr>
<tr><td>Elasticsearch 2.x</td><td>10% additional</td></tr>
<tr><td>Elasticsearch 1.x</td><td>5% additional</td></tr>
</tbody>
</table>

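As an illustrative example only – assuming the fee is applied to the cluster’s base monthly price, which is an assumption of this sketch rather than a statement of policy – a hypothetical $50/month cluster running Elasticsearch 6.x would incur a 20% operational fee of $10, for a total of $60/month.
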
Unsupported API Endpoints

Bonsai clusters support most Elasticsearch APIs out of the box, with a few exceptions. This article details those exceptions, along with a brief explanation of why they’re in place. Here is what we will cover:

  • _all and wildcard destructive actions
  • Tasks API
  • Node Hot Threads
  • Node Shutdown & Restart
  • Snapshots
  • Reindex
  • Cluster Shard Reroute
  • Cluster Settings
  • Index Search Warmers
  • Static Scripts
  • Update By Query API

While many of the following endpoints can be helpful for power users, the majority of applications don’t directly need them. If, however, you find yourself stuck without one of these available, please email us and we’ll be happy to help.

_all and wildcard destructive actions

Wildcard delete actions are typically used on clusters with a large number of indices, and can be useful for completely wiping out a cluster and starting over. Wildcard and _all destructive actions were initially available on shared tier clusters. However, we received an increasing number of support requests from distressed developers who had accidentally deleted their entire production clusters.

After some internal discussion, we decided to disable wildcard actions. Removing the ability to sweepingly delete everything forces users to slow down and identify exactly what they’re deleting, reducing the risk of accidental and permanent data loss.

Examples of _all and wildcard destructive actions:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">DELETE /*
DELETE /_all</code>
</pre></div>
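
If you do need to clear out data, the same result can be achieved by naming the indices explicitly; the index names below are hypothetical examples:

<div class="code-snippet w-richtext">
<pre><code fs-codehighlight-element="code" class="hljs language-javascript">DELETE /products
DELETE /logs-2023-08,logs-2023-09</code>
</pre></div>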

Tasks API

Elasticsearch 5.x and later provide an API for viewing information about currently running tasks, as well as the ability to cancel them. This API is disabled for multitenant plans. It can be enabled by request on Business and Enterprise plans through an email to support.

<div class="code-snippet-container">
<a fs-copyclip-element="click-2" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-2" class="hljs language-javascript">GET /_tasks
GET /_tasks/{task_id}
POST /_tasks/{task_id}/_cancel</code></pre>
</div>
</div>

Node Hot Threads

A Java thread that uses lots of CPU and runs for an unusually long period of time is known as a hot thread. Elasticsearch provides an API to get the current hot threads on each node in the cluster. This information can be useful in forming a holistic picture of potential problems within the cluster.

Bonsai doesn’t support these endpoints on our shared tier to ensure user activity isn’t exposed to others. For a detailed explanation of why this is a concern, please see the article on Architecture Classes. Additionally, Bonsai is a managed service, so it’s really the responsibility of our Ops Team to investigate node-level issues.

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

<div class="code-snippet-container">
<a fs-copyclip-element="click-3" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-3" class="hljs language-javascript">GET /_cluster/nodes/hotthreads
GET /_cluster/nodes/hot_threads
GET /_cluster/nodes/{node_id}/hotthreads
GET /_cluster/nodes/{node_id}/hot_threads
GET /_nodes/hotthreads
GET /_nodes/hot_threads
GET /_nodes/{node_id}/hotthreads
GET /_nodes/{node_id}/hot_threads</code></pre>
</div>
</div>

Node Shutdown & Restart

Elasticsearch provides an API for shutting down and restarting nodes. This functionality is unsupported across the platform. On our multitenant architecture, this prevents a user from shutting down a node or set of nodes that may be in use by another user. That action would have an adverse effect on other users, which is why it is unsupported. This also prevents users from exacerbating whatever problem they’re trying to resolve.

If you think there is a problem with your cluster that you need help troubleshooting, please email support.

<div class="code-snippet-container">
<a fs-copyclip-element="click-4" href="#" class="btn w-button code-copy-button" title="Copy">
<img class="copy-image" src="https://global-uploads.webflow.com/63c81e4decde60c281417feb/6483934eeefb356710a1d2e9_icon-copy.svg" loading="lazy" alt="">
<img class="copied-image" src="https://cdn.prod.website-files.com/63c81e4decde60c281417feb/64839e207c2860eb9e6aa572_icon-copied.svg" loading="lazy" alt="">
</a>
<div class="code-snippet">
<pre><code fs-codehighlight-element="code" fs-copyclip-element="copy-this-4" class="hljs language-javascript">POST /_cluster/nodes/_restart
POST /_cluster/nodes/_shutdown
POST /_cluster/nodes/{node_id}/_restart
POST /_cluster/nodes/{node_id}/_shutdown
POST /_shutdown</code></pre>
</div>
</div>

Snapshots

The Snapshot API allows users to create and restore snapshots of indices and cluster data. It’s useful as a backup tool and for recovering from problems or data loss. On Bonsai, we’re