FAQ

Do you offer high availability indexes?

Yes! Our production clusters are composed of a minimum of three servers, running in redundant data centers (AWS Availability Zones). These clusters gracefully tolerate the loss of any single node or data center, whether caused by an outage or by normal maintenance. Production indexes may have their data replicated onto multiple servers in the cluster, for multi-data-center high availability.

How often do you back up cluster data?

Production clusters are backed up hourly to Amazon S3. We retain hourly snapshots for the last 24 hours, and daily snapshots for the last 14 days.

Can I create multiple indexes?

Yes. All of our plans allow you to create multiple indexes within your cluster. Multiple indexes are common and useful for supporting a number of use cases, including multiple application environments, zero-downtime hot reindexing, or for partitioning your index data by customer or by model.

Index and shard limits are soft limits in our API. If you need more indexes than your current plan allows, contact us with some details about your use case, and we can work with you to create an appropriate plan.

What are shards, why do I need them, and how many do I need?

Shards are the fundamental unit of an Elasticsearch index. When you need to search more documents or index a larger volume than can efficiently be handled by a single server, an index can be split into multiple primary shards, allowing you to partition your data and distribute your workload across multiple servers.

Shards can be replicated according to your replica level. For example, an index created with two primary shards and one replica will have four total shards.
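
The arithmetic behind that example can be sketched in a few lines of Python. The function name `total_shards` is made up for illustration and is not part of any Elasticsearch or Bonsai API.

```python
def total_shards(primary_shards: int, replicas: int) -> int:
    """Return the total shard count for an index.

    Every primary shard gets `replicas` copies, so the total is
    primaries * (1 + replicas).
    """
    if primary_shards < 1 or replicas < 0:
        raise ValueError("need at least one primary and a non-negative replica count")
    return primary_shards * (1 + replicas)


print(total_shards(2, 1))  # 4: two primaries plus one replica of each
```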

If your search or indexing performance starts to suffer as the size of your index grows, you should consider partitioning your index across more primary shards. You can approximate an optimal number of primary shards by indexing documents into an index with a single primary shard, measuring the size or traffic rate at which your test queries degrade in performance, and extrapolating from there.

Our rough recommendation is to consider creating one shard per 500,000 to 1,000,000 documents or 500 MB to 1 GB of indexed data. Another useful heuristic, on dedicated server clusters, is to use one shard per node, or one shard per CPU core.
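
As a rough illustration, the document-count and data-size heuristic can be turned into a small calculator. This is only a sketch under a few assumptions of ours: the helper name `suggested_primary_shards` is hypothetical, sizes use decimal units (1 MB = 10^6 bytes), and when the two measures disagree we take the larger shard count. Treat the result as a starting point for the measurements described above, not as a rule.

```python
from typing import Tuple


def _ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, avoiding floating-point rounding."""
    return -(-a // b)


def suggested_primary_shards(doc_count: int, index_bytes: int) -> Tuple[int, int]:
    """Return a (low, high) range of primary shard counts.

    low  uses the generous end: 1,000,000 documents or 1 GB per shard.
    high uses the cautious end:   500,000 documents or 500 MB per shard.
    Taking the larger of the two measures is an assumption of this sketch.
    """
    low = max(1, _ceil_div(doc_count, 1_000_000), _ceil_div(index_bytes, 10**9))
    high = max(1, _ceil_div(doc_count, 500_000), _ceil_div(index_bytes, 500 * 10**6))
    return low, high


# 3 million documents occupying 2 GB of indexed data
print(suggested_primary_shards(3_000_000, 2 * 10**9))  # (3, 6)
```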

How many shards am I allowed?

Because each shard carries some fixed overhead in a cluster, the number of total shards allowed varies per plan. As with indexes, we are still measuring real-world usage of shards to determine useful plan defaults. If you need more shards than your current plan allows, contact us so we can help design a good plan for your needs.

How many requests per second can I execute?

We limit the number of concurrent requests. In practice, the actual requests per second this allows is based on the speed of the requests you are executing.
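
To make that relationship concrete, here is a back-of-the-envelope estimate in Python. It assumes a steady state in which every concurrent slot stays busy and every request takes about the same time; the function name and the example numbers are hypothetical and do not describe any particular plan.

```python
def max_requests_per_second(concurrent_limit: int, avg_request_ms: float) -> float:
    """Estimate the throughput ceiling implied by a concurrency limit.

    Each slot can finish 1000 / avg_request_ms requests per second, so the
    ceiling is concurrent_limit times that.
    """
    if concurrent_limit < 1 or avg_request_ms <= 0:
        raise ValueError("limit must be >= 1 and request time must be positive")
    return concurrent_limit * 1000 / avg_request_ms


print(max_requests_per_second(10, 50))   # 200.0 requests/second with fast requests
print(max_requests_per_second(10, 500))  # 20.0 requests/second with slow requests
```

The same concurrency limit yields ten times the throughput when requests are ten times faster, which is why speeding up slow queries is often the first thing to try.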

Request limits vary at different plan levels. We are still making changes and measuring real-world limits to determine sensible plan defaults. Rate-limited requests will fail with an HTTP 429 error indicating that you should contact us so that we can work with you to accommodate your usage.
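
If your application occasionally hits a 429, one common client-side pattern is to wait and retry with exponential backoff. The sketch below is a generic illustration, not an official client behavior: `send` stands in for whatever function in your application issues the request and returns the HTTP status code, and the attempt count and delays are arbitrary defaults. Persistent 429s are still a sign that you should get in touch.

```python
import time
from typing import Callable


def send_with_backoff(
    send: Callable[[], int],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Call `send` until it stops returning 429 or attempts run out.

    Waits base_delay, then 2x, 4x, ... between attempts and returns the
    last status code seen.
    """
    status = send()
    for attempt in range(1, max_attempts):
        if status != 429:
            break
        sleep(base_delay * 2 ** (attempt - 1))
        status = send()
    return status
```

Passing `sleep` as a parameter keeps the function easy to test without real delays.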

Updates in particular should be batched to run no more than one update request per second. Elasticsearch has an excellent Bulk API, which we meter separately from other types of requests to encourage its use.
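
For example, rather than sending one request per changed document, you can collect a second's worth of changes and send them as a single bulk request. The sketch below builds the newline-delimited JSON body that the Elasticsearch `_bulk` endpoint expects for partial updates. The helper name, index name, and documents are hypothetical, and older Elasticsearch versions also expect a `_type` in each action line, so check the documentation for the version you are running.

```python
import json
from typing import Iterable, Tuple


def build_bulk_update_body(index: str, updates: Iterable[Tuple[str, dict]]) -> str:
    """Build a newline-delimited JSON body for a batch of partial updates.

    Each update becomes two lines: an action line naming the document,
    then a line carrying the fields to change. A non-empty body ends with
    a trailing newline, as the bulk format requires.
    """
    lines = []
    for doc_id, partial_doc in updates:
        lines.append(json.dumps({"update": {"_index": index, "_id": doc_id}}))
        lines.append(json.dumps({"doc": partial_doc}))
    return "\n".join(lines) + "\n" if lines else ""


body = build_bulk_update_body(
    "products",  # hypothetical index name
    [("1", {"price": 10}), ("2", {"in_stock": False})],
)
print(body, end="")
```

The resulting string is what you would POST to `/_bulk` with your HTTP client of choice.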

What version of Elasticsearch do you support?

Generally our clusters run on the latest, or a very recent, version of Elasticsearch. We have designed our systems to support only one or two recent versions of Elasticsearch, rather than provide support for deploying many arbitrary versions.

You can check our current Elasticsearch version at a regional cluster endpoint, such as http://us-east-1.bonsai.io/, or at the root of one of your provisioned clusters.
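
The root endpoint of an Elasticsearch cluster responds with a small JSON document that includes the version under `version.number`. As a minimal sketch, the Python below pulls that field out of a response body you have already fetched; the sample body and its version number are made up for illustration and do not reflect what our clusters currently run.

```python
import json


def parse_elasticsearch_version(root_response_body: str) -> str:
    """Extract the version string from a cluster's root endpoint response."""
    info = json.loads(root_response_body)
    return info["version"]["number"]


# Hypothetical, abbreviated response body for illustration only.
sample_body = '{"name": "example-node", "version": {"number": "1.2.3"}, "tagline": "You Know, for Search"}'
print(parse_elasticsearch_version(sample_body))  # 1.2.3
```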

As a managed service provider, we believe in keeping your clusters continually updated, so you can benefit from the constant improvements being integrated by the larger Elasticsearch and Lucene user communities. We also work to balance that with the responsibility to smoothly and reliably manage change over time, in order to prevent or mitigate the risk of service interruptions.

That said, one feature of our single-tenant dedicated server plans is coordinated version upgrades and long-term version support. If you need a specific version of Elasticsearch, we can likely accommodate that for you on one of our single-tenant cluster plans.

If you ever have questions about our schedule for upgrading to a new version, you can ask us at support@bonsai.io — be sure to let us know what features or changes your app needs supported in a particular new version.

Do you support all of Elasticsearch's features and APIs?

While we support most of Elasticsearch's features and APIs, we do not necessarily expose all of them. You can find an annotated list of unsupported endpoints here.

Our philosophy is to expose the Elasticsearch API on a whitelist basis, explicitly allowing functionality that we have reviewed and deemed appropriate for a managed hosting environment.

This approach allows us deeper understanding and finer-grained control over how the system is being used. Ultimately, that means better performance and availability for all of our customers. It also helps protect your data against accidental abusive behavior, as well as unexpected security issues.

Conversely, we may not always perfectly match the behavior you see from your local development server. And while we make a priority of keeping Elasticsearch up to date with the latest release, our support for some new features may be delayed as we take time to evaluate and integrate them.

Our API will generally return graceful error messages when a certain feature is not supported. If you would like to know our plans for supporting such features, please contact support and we can clarify that for you. In some cases we may be able to suggest alternate approaches for short-term help.

Do you support indexing rich content (e.g., PDF, Word) attachments?

Yes! We deploy a recent version of the elasticsearch-mapper-attachments plugin.

Do you support Elasticsearch River plugins?

We do not support Elasticsearch Rivers. Rivers are an example of a feature that we have chosen not to ever support due to their underlying risk of introducing instability in a distributed environment.

As of this writing, Rivers have been officially deprecated and will be removed from future versions of Elasticsearch. We recommend that your application use one of the official client libraries to coordinate data ingest and synchronization into your Elasticsearch cluster.

Do I have access to the Elasticsearch Snapshot and Restore API?

We use the Snapshot and Restore API internally for our hourly backups, which prevents us from exposing those APIs to customers at this time. In some cases we can design a cluster that performs its snapshots into an S3 bucket that you supply. Contact us for details.

I need a feature that you will not support in your shared clusters. What are my options?

Contact us to ask about our dedicated cluster hosting plans, which allow unfettered access to the entire Elasticsearch API.

Do you offer non-profit discounts? Can you create custom plans?

Yes!

We have chosen certain price points for the sake of simplifying the decision-making process. However, we can certainly create customized plans tailored to various business needs. This can be helpful for, e.g., non-profit organizations; large-data, low-traffic, low-availability internal clusters; or high-end big data dedicated clusters.

Contact info@bonsai.io if you would like to discuss a custom plan for your organization.

Do you support synonyms?

Yes. There are two ways to use synonyms in Elasticsearch: you can point to a synonyms file with synonyms_path, or you can define the synonyms directly in the filter. On our multitenant shared clusters, users need to use the REST API and define the synonyms directly. If your word list is prohibitively large (our maximum request size is 20 MB), then we do have the option to manually install it for you on a dedicated cluster.
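
As an illustration of defining synonyms directly, the Python sketch below builds index settings with an inline synonym filter and checks the serialized size against the 20 MB request limit. The filter and analyzer names and the sample word list are placeholders of ours, and the byte count assumes binary megabytes; you would send the resulting JSON as the body of your index creation request.

```python
import json
from typing import List

MAX_REQUEST_BYTES = 20 * 1024 * 1024  # 20 MB; binary megabytes assumed


def synonym_index_settings(synonyms: List[str]) -> dict:
    """Build index settings that define synonyms inline in a token filter."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Placeholder filter name
                    "my_synonyms": {"type": "synonym", "synonyms": synonyms}
                },
                "analyzer": {
                    # Placeholder analyzer name
                    "my_synonym_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        }
    }


def fits_in_one_request(settings: dict) -> bool:
    """Return True if the serialized settings fit within the request limit."""
    return len(json.dumps(settings).encode("utf-8")) <= MAX_REQUEST_BYTES


settings = synonym_index_settings(["laptop, notebook", "tv, television"])
print(fits_in_one_request(settings))  # True
```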

I am in Europe and I can't have you exporting my data to the US.

Awesome, because we don't do that. When you provision a cluster in any region, we don't transfer it out of that region. Our tooling and automations don't even have that ability. The only exception would be if you explicitly asked us to migrate your cluster elsewhere; we can accommodate these requests, but it requires scheduling and manual intervention on our part. It would never, ever happen otherwise.

Why is my question not answered here?

Good question! There are real people behind Bonsai, and we welcome—nay, covet—your questions and feedback. Please get in touch and let us know what that question is. :-)
