How to Download Elasticsearch and OpenSearch Data Without Snapshots

Last updated
June 4, 2024

The Problem

One of the biggest hurdles a search developer comes across is getting data from one cluster into a new one. In a perfect world we would have fast and reliable reindexing scripts to quickly tear down and/or rebuild indices. A good example of this pattern is the elasticsearch-rails gem’s import tasks. See also the more in-depth example of an Indexer class in searchyll, the search gem for Jekyll.

The best case isn’t always possible, whether because of accumulated tech debt or contextual constraints. For those on our Sandbox and Standard plans, the problem is compounded: to keep these plans accessible, the Snapshot API is not available on demand. Read more in our write-up here: https://bonsai.io/docs/snapshots-on-bonsai. And on non-production plans such as our Sandbox plan, backups aren’t taken regularly. In this case, what options are available? Let’s explore a couple of strategies.

Possible Solutions

There are two ways to reindex and/or migrate your cluster when the Snapshot API isn’t available. The first is to use the elasticsearch-dump library, and the second is to manage it with a custom solution. Whichever you choose, you’ll need to follow this larger process:

  1. Download your mappings.
  2. Download a copy of your old cluster data, or design and implement indexing scripts to do it on your own.
  3. Re-create your indices and the mappings on your new cluster.
  4. Index your data on the new cluster.
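
For the custom route, steps 2 and 4 can be sketched in a few lines of Python. This is a minimal sketch rather than a definitive implementation. It assumes a cluster recent enough to accept a JSON body on the scroll endpoint and credentials embedded in the cluster URL. Every function name here is illustrative and not part of any library.

```python
import base64
import json
import urllib.request
from urllib.parse import urlsplit


def split_credentials(cluster_url):
    """Return (base URL without credentials, HTTP Basic auth header value)."""
    parts = urlsplit(cluster_url)
    token = base64.b64encode(f"{parts.username}:{parts.password}".encode()).decode()
    return f"{parts.scheme}://{parts.netloc.rpartition('@')[2]}", f"Basic {token}"


def make_send(cluster_url):
    """Build send(method, path, body) -> parsed JSON response for one cluster."""
    base, auth = split_credentials(cluster_url)

    def send(method, path, body):
        # dict bodies are JSON-encoded; str bodies (bulk NDJSON) are sent as-is.
        data = body.encode() if isinstance(body, str) else json.dumps(body).encode()
        request = urllib.request.Request(
            base + path, data=data, method=method,
            headers={"Authorization": auth, "Content-Type": "application/json"},
        )
        with urllib.request.urlopen(request) as response:
            return json.load(response)

    return send


def scroll_hits(send, index, page_size=500):
    """Step 2: yield every hit in `index`, one scroll page at a time."""
    page = send("POST", f"/{index}/_search?scroll=1m",
                {"size": page_size, "sort": ["_doc"], "query": {"match_all": {}}})
    while page["hits"]["hits"]:
        yield from page["hits"]["hits"]
        page = send("POST", "/_search/scroll",
                    {"scroll": "1m", "scroll_id": page["_scroll_id"]})


def bulk_body(hits, index):
    """Step 4: build the NDJSON payload for POST /_bulk that indexes `hits`."""
    lines = []
    for hit in hits:
        lines.append(json.dumps({"index": {"_index": index, "_id": hit["_id"]}}))
        lines.append(json.dumps(hit["_source"]))
    return "".join(line + "\n" for line in lines)
```

A run might create `send_old = make_send(OLD_URL)` and `send_new = make_send(NEW_URL)`, then call `send_new("POST", "/_bulk", bulk_body(batch, "my_index"))` for each batch of a few hundred hits taken from `scroll_hits(send_old, "my_index")`. OLD_URL and NEW_URL are placeholders for your full cluster URLs, credentials included. It is worth checking the `errors` field of each bulk response, since a bulk request can succeed overall while individual documents fail.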

elasticsearch-dump

elasticsearch-dump is a mature JavaScript library that has been around through nearly every release of Elasticsearch. It can download data and mappings, migrate between clusters directly, and do all sorts of imports and exports necessary for the search engineer’s workflow.

The process for getting started is simple:

  1. Install the library (globally, so the elasticdump command is on your PATH):
npm install -g elasticdump
  2. Copy mappings and data over to the new cluster, either through a download and reindex or directly via URLs.

Here’s an example of what a migration might look like:

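# Back up the index mapping to a file:
elasticdump \
  --input=https://key:secret@fir-123.us-east-1.bonsaisearch.net:443/my_index \
  --output=/data/my_index_mapping.json \
  --type=mapping

# Back up the index data to a file:
elasticdump \
  --input=https://key:secret@fir-123.us-east-1.bonsaisearch.net:443/my_index \
  --output=/data/my_index.json \
  --type=data

# Load the mapping, then the data, into your new cluster (use its URL here):
elasticdump \
  --input=/data/my_index_mapping.json \
  --output=https://key:secret@fir-123.us-east-1.bonsaisearch.net:443/my_index \
  --type=mapping
elasticdump \
  --input=/data/my_index.json \
  --output=https://key:secret@fir-123.us-east-1.bonsaisearch.net:443/my_index \
  --type=data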

You’ll need to use your cluster credentials to access your index from a terminal session. See our docs on Cluster Credentials here: https://bonsai.io/docs/credential-management.

Managing your own reindex

Much of what elasticdump does can be manually written if necessary, using curl or whatever language you prefer. For example, downloading mappings can be done using curl:

curl -XGET "https://key:secret@fir-123.us-east-1.bonsaisearch.net:443/_mapping?pretty=true" > mappings.json

And later, on the new cluster, you can PUT each mapping to its corresponding index:

curl -XPUT "https://key:secret@fir-123.us-east-1.bonsaisearch.net:443/index_name/_mapping" \
  -H 'Content-Type: application/json' \
  -d @mappings.json

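Note that the downloaded mappings can’t be PUT back as-is. The _mapping response is keyed by index name, with each index’s definition nested under a "mappings" key, so it has to be edited or pieced apart into one request body per index. Each index must also already exist on the new cluster before you PUT its mapping. As for the data, either dump it with elasticdump as shown above, or create scripts to reindex straight from your database.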
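
One way to piece the mappings apart is a short Python script. This is a minimal sketch that assumes the typeless response shape returned by Elasticsearch 7.x and later and by OpenSearch (index name, then "mappings"). On older clusters the mapping is nested one level deeper under a type name and would need extra handling. The function names and the <index>_mapping.json file naming are illustrative, and skipping dot-prefixed indices is a choice made for this example.

```python
import json
from pathlib import Path


def split_mappings(response):
    """Turn a GET /_mapping response into {index name: PUT /<index>/_mapping body}."""
    return {
        index: body.get("mappings", {})
        for index, body in response.items()
        if not index.startswith(".")  # assumption: skip dot-prefixed system indices
    }


def write_mapping_files(response, out_dir="."):
    """Write one <index>_mapping.json file per index and return the paths."""
    paths = []
    for index, mapping in split_mappings(response).items():
        path = Path(out_dir) / f"{index}_mapping.json"
        path.write_text(json.dumps(mapping, indent=2))
        paths.append(path)
    return paths
```

Calling `write_mapping_files(json.loads(Path("mappings.json").read_text()))` on the file downloaded earlier produces one file per index, and each file can then be passed to the curl PUT with `-d @my_index_mapping.json`.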

Further Resources

Depending on how many versions you are upgrading across, you’ll need to navigate breaking changes between versions, such as the phasing out of mapping types that began in v6.x. There is extensive coverage of breaking changes in the Elasticsearch documentation. See also our guides on moving between major versions.

We’ve seen it all and are here to help. Please reach out to support@bonsai.io and we’ll point you in the right direction. Cheers!
