Yes! Our production clusters are composed of a minimum of three servers, running in redundant data centers (AWS Availability Zones). These clusters gracefully tolerate partitioning of any single node or data center, caused either by outages or by normal maintenance. Production indices may have their data replicated onto multiple servers in the cluster, for multi-data-center high availability.
Production clusters are backed up hourly to Amazon S3. We retain hourly snapshots for the last 24 hours, and daily snapshots for the last 14 days.
Yes. All of our plans allow you to create multiple indexes within your cluster. Multiple indexes are common and useful for supporting a number of use cases, including multiple application environments, zero-downtime hot reindexing, or for partitioning your index data by customer or by model.
Index and shard limits are soft limits in our API. If you need to create more indexes than your current plan allows, contact us with some details about your use case and we can work with you to create an appropriate plan.
Shards are the fundamental unit of an Elasticsearch index. When you need to search more documents or index a larger volume than can efficiently be handled by a single server, an index can be split into multiple primary shards, allowing you to partition your data and distribute your workload across multiple servers.
Shards can be replicated according to your replica level. For example, an index created with two primary shards and one replica will have four total shards.
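The arithmetic behind that example can be sketched in a few lines. The function name is illustrative only; the formula simply counts each primary plus one copy of it per replica.

```python
def total_shards(primaries: int, replicas: int) -> int:
    """Each primary shard exists once, plus one copy per replica level."""
    return primaries * (1 + replicas)


# Two primary shards with one replica, as in the example above.
print(total_shards(2, 1))  # 4
```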
If your search or indexing performance starts to suffer as the size of your index grows, you should consider partitioning your index across more primary shards. You can approximate an optimal number of primary shards by indexing documents into an index with a single primary shard, measuring the size or traffic rate at which your test queries degrade in performance, and extrapolating from there.
Our rough recommendation is to consider creating one shard per 500,000 to 1,000,000 documents or 500 MB to 1 GB of indexed data. Another useful heuristic, on dedicated server clusters, is to use one shard per node, or one shard per CPU core.
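As a rough illustration of the document-count and size heuristic, the sketch below takes the larger of the two estimates. It assumes the upper end of each range (1,000,000 documents and 1 GB per shard) as defaults and treats 1 GB as 1024^3 bytes; the function name and parameters are hypothetical, and a measured benchmark on your own data should take precedence over this estimate.

```python
import math


def estimate_primary_shards(
    doc_count: int,
    index_bytes: int,
    docs_per_shard: int = 1_000_000,      # assumed: upper end of the range
    bytes_per_shard: int = 1024 ** 3,     # assumed: 1 GB = 1024^3 bytes
) -> int:
    """Return a starting-point primary shard count from the rough heuristic."""
    by_docs = math.ceil(doc_count / docs_per_shard)
    by_size = math.ceil(index_bytes / bytes_per_shard)
    # An index always needs at least one primary shard.
    return max(1, by_docs, by_size)


# 3.5 million documents occupying 2 GB: the document count dominates.
print(estimate_primary_shards(3_500_000, 2 * 1024 ** 3))  # 4
```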
Because each shard carries some fixed overhead in a cluster, the number of total shards allowed varies per plan. As with indexes, we are still measuring real-world usage of shards to determine useful plan defaults. If you need more shards than your current plan allows, contact us so we can help design a good plan for your needs.
We limit the number of concurrent requests. In practice, the actual requests per second this allows is based on the speed of the requests you are executing.
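To see how a concurrency limit translates into requests per second, a back-of-the-envelope estimate is the limit divided by the average request duration. The numbers below are hypothetical and are not actual plan limits.

```python
def max_requests_per_second(concurrency_limit: int, avg_latency_ms: float) -> float:
    """Approximate sustained throughput for a given concurrency limit."""
    if avg_latency_ms <= 0:
        raise ValueError("avg_latency_ms must be positive")
    return concurrency_limit * 1000.0 / avg_latency_ms


# Hypothetical limit of 10 concurrent requests:
print(max_requests_per_second(10, 50))   # fast 50 ms queries  -> 200.0
print(max_requests_per_second(10, 500))  # slow 500 ms queries -> 20.0
```

The same limit allows ten times the throughput when requests are ten times faster, which is why speeding up slow queries is often the first thing to try.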
Request limits vary at different plan levels. We are still making changes and measuring real-world limits to determine sensible plan defaults. Rate-limited requests will fail with an HTTP 429 error. If you see these errors, contact us so that we can work with you to accommodate your usage.
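On the client side, occasional 429 responses can be absorbed by retrying with a growing delay. The sketch below is one possible approach, not a documented requirement of the service: `send` is a hypothetical callable that performs one request and returns a `(status, body)` pair, and the retry counts and delays are arbitrary. Persistent 429s still mean it is time to get in touch.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Retry `send` while it returns HTTP 429, doubling the wait each time."""
    status, body = 0, ""
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        # Wait before the next attempt, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(base_delay * (2 ** attempt))
    # Still rate limited after all attempts: hand the 429 back to the caller.
    return status, body
```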
Updates in particular should be batched so that you send no more than one update request per second. Elasticsearch has an excellent Bulk API, which we meter separately from other types of requests to encourage its use.
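For illustration, a bulk request body is newline-delimited JSON: an action line followed by the document source, repeated per document, with a trailing newline. The sketch below only builds that body; the index name, type name, and documents are made up, and sending it to the `_bulk` endpoint is left to whichever HTTP or Elasticsearch client you already use.

```python
import json
from typing import Any, Dict, Iterable, Tuple


def build_bulk_body(
    index: str,
    doc_type: str,
    docs: Iterable[Tuple[str, Dict[str, Any]]],
) -> str:
    """Build an NDJSON body that indexes each (id, source) pair."""
    lines = []
    for doc_id, source in docs:
        # Action line: which index, type, and id this document goes to.
        lines.append(json.dumps(
            {"index": {"_index": index, "_type": doc_type, "_id": doc_id}}
        ))
        # Source line: the document itself.
        lines.append(json.dumps(source))
    # The bulk format requires the body to end with a newline.
    return "\n".join(lines) + "\n"


# Hypothetical batch collected over one second, sent as a single request.
body = build_bulk_body(
    "products",
    "product",
    [("1", {"name": "bonsai tree"}), ("2", {"name": "pruning shears"})],
)
```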
The production versions we support are 2.4.0, 5.1.1, 5.3.2, and 5.4.3. Non-production clusters are provisioned by default on the most recent version, 5.4.3.
Generally our clusters run on the latest, or a very recent, version of Elasticsearch. We have designed our systems to support only one or two recent versions of Elasticsearch, rather than provide support for deploying many arbitrary versions.
As a managed service provider, we believe in keeping your clusters continually updated, so you can benefit from the constant improvements being integrated by the larger Elasticsearch and Lucene user communities. We also work to balance that with the responsibility to smoothly and reliably manage change over time, in order to prevent or mitigate the risk of service interruptions.
That said, one feature of our single-tenant dedicated server plans is coordinated version upgrades and long-term version support. If you need a specific version of Elasticsearch, we can likely accommodate that for you on one of our single-tenant cluster plans.
If you ever have questions about our schedule for upgrading to a new version, you can ask us at firstname.lastname@example.org — be sure to let us know what features or changes your app needs supported in a particular new version.
While we support MOST of Elasticsearch’s features and APIs, we do not necessarily expose all of them. You can find an annotated list of unsupported endpoints here.
Our philosophy is to expose the Elasticsearch API on a whitelist basis, explicitly allowing functionality that we have reviewed and deemed appropriate for a managed hosting environment.
This approach allows us deeper understanding and finer-grained control over how the system is being used. Ultimately, that means better performance and availability for all of our customers. It also helps protect your data against accidental abusive behavior, as well as unexpected security issues.
Conversely, we may not always perfectly match the behavior you see from your local development server. And while we make a priority of keeping Elasticsearch up to date with the latest release, our support for some new features may be delayed as we take time to evaluate and integrate them.
Our API will generally return graceful error messages when a certain feature is not supported. If you would like to know our plans for supporting such features, please contact support and we can clarify that for you. In some cases we may be able to suggest alternate approaches for short-term help.
Yes! We deploy a recent version of the elasticsearch-mapper-attachments plugin.
We do not support Elasticsearch Rivers. Rivers are an example of a feature that we have chosen not to ever support due to their underlying risk of introducing instability in a distributed environment.
As of this writing, Rivers have been officially deprecated and will be removed from future versions of Elasticsearch. It is recommended that applications use one of the official client libraries to coordinate data ingest and synchronization into your Elasticsearch cluster.
We use the Snapshot and Restore API internally for our hourly backups, which prevents us from exposing those APIs to customers at this time. In some cases we can design a cluster that performs its snapshots into an S3 bucket that you supply. Contact us for details.
Contact us to ask about our dedicated cluster hosting plans, which allow unfettered access to the entire Elasticsearch API.
We have chosen certain price points for the sake of simplifying the decision-making process. However, we can certainly create customized plans tailored to various business needs. This can be helpful for, e.g., non-profit organizations; large-data, low-traffic, low-availability internal clusters; or high-end big data dedicated clusters.
Contact email@example.com if you would like to discuss a custom plan for your organization.
Yes. There are two ways to use synonyms in Elasticsearch: you can point synonyms_path at a synonyms file, or you can define the synonyms directly in the filter. On our multitenant shared clusters, users need to use the REST API and define the synonyms directly. If your word list is prohibitively large (our maximum request size is 20 MB), we have the option of manually installing it for you on a dedicated cluster.
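As a sketch of the inline approach, the index settings below define a synonym filter directly and wire it into a custom analyzer. The filter and analyzer names (`my_synonyms`, `my_synonym_analyzer`) and the sample synonyms are placeholders, and the size check assumes 20 MB means 20 * 1024 * 1024 bytes. The resulting JSON would be sent as the body of an index-creation request.

```python
import json
from typing import Any, Dict, Iterable

MAX_REQUEST_BYTES = 20 * 1024 * 1024  # assumed interpretation of "20 MB"


def synonym_index_settings(synonyms: Iterable[str]) -> Dict[str, Any]:
    """Index settings with synonyms defined inline in the filter."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonyms": {
                        "type": "synonym",
                        "synonyms": list(synonyms),
                    }
                },
                "analyzer": {
                    "my_synonym_analyzer": {
                        "type": "custom",
                        "tokenizer": "standard",
                        "filter": ["lowercase", "my_synonyms"],
                    }
                },
            }
        }
    }


def fits_in_request(settings: Dict[str, Any]) -> bool:
    """Check the serialized settings against the assumed request size limit."""
    return len(json.dumps(settings).encode("utf-8")) <= MAX_REQUEST_BYTES


settings = synonym_index_settings(["tv, television", "couch, sofa"])
print(fits_in_request(settings))  # True
```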
Awesome, because we don’t do that. When you provision a cluster in ANY region, we don’t transfer it out of that region. Our tooling and automations don’t even have that ability. The only exception would be if you explicitly asked us to migrate your cluster elsewhere; we can accommodate these requests, but it requires scheduling and manual intervention on our part. It would never, ever happen otherwise.
Good question! There are real people behind Bonsai, and we welcome—nay, covet—your questions and feedback. Please get in touch and let us know what that question is.
Many questions can also be answered at our blog and documentation sites: