HCL Commerce Version 9.1.15.0 or later

Backup snapshot and restore for Elasticsearch

Elasticsearch includes a snapshot capability that can be used to back up indexes to, and restore them from, a variety of external snapshot repositories, such as a Google Cloud Storage (GCS) bucket.

Refer to the Elasticsearch documentation on how to Register a Snapshot Repository. For illustration purposes, this topic uses Google Cloud Storage as the custom snapshot repository.

Configuring Elasticsearch to work with an existing Google Cloud Storage account

The standard Elasticsearch image must be customized to enable the repository-gcs plugin and to install the GCS service account key JSON file, which grants access to the GCS bucket where the index snapshots are stored.

The following commands build a custom Elasticsearch 7.17.10 image from the sample Dockerfile and push it to a registry:

docker build --pull -t us.gcr.io/commerce-product/performance/elastic/elasticsearch:7.17.10 .
docker push us.gcr.io/commerce-product/performance/elastic/elasticsearch:7.17.10

Sample Dockerfile:

FROM docker.elastic.co/elasticsearch/elasticsearch:7.17.10
RUN bin/elasticsearch-plugin install --batch repository-gcs
COPY sa-wc-es-snapshot.json /tmp/sa-wc-es-snapshot.json
RUN bin/elasticsearch-keystore add-file --force gcs.client.commerce.credentials_file /tmp/sa-wc-es-snapshot.json
RUN rm /tmp/sa-wc-es-snapshot.json

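Note that sa-wc-es-snapshot.json is your GCS service account key JSON file. The client name embedded in the keystore setting (commerce in gcs.client.commerce.credentials_file) is the value to use as your GCS client name when you register the repository. For more information on how to set up your service account credentials, refer to Using a Service Account.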

Creating and registering your index snapshot repository

Once your custom Elasticsearch image with GCS enabled is running, issue the following API call to create your index snapshot repository using your GCS service account.
Note: Only the cluster that creates the snapshots should use the default (write) access mode when registering the repository.

Example:

PUT /_snapshot/<your-repository-name>
{
    "type": "gcs",
    "settings": {
        "client": "<your-gcs-client-name>",
        "bucket": "<your-gcs-bucket-name>",
        "base_path": "<your-repository-name>"
    }
}

Any additional cluster that connects to the same repository must register it as read-only to avoid concurrent modification errors. For example:

PUT /_snapshot/<your-repository-name>
{
    "type": "gcs",
    "settings": {
        "client": "<your-gcs-client-name>",
        "bucket": "<your-gcs-bucket-name>",
        "base_path": "<your-repository-name>",
        "readonly": true
    }
}

For more information about the parameters used in this example, refer to the Elasticsearch Repository Settings documentation.
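
The two registration requests differ only in the readonly setting, so both can be generated from one helper. The following is a minimal Python sketch rather than part of the product: the function name repository_request and the sample repository, client, and bucket values are assumptions made for illustration. It only builds the request and does not contact a cluster.

```python
import json


def repository_request(name, client, bucket, readonly=False):
    """Build (method, path, body) for registering a GCS snapshot repository."""
    settings = {"client": client, "bucket": bucket, "base_path": name}
    if readonly:
        # Only clusters that restore from the repository set this flag.
        settings["readonly"] = True
    return "PUT", "/_snapshot/" + name, {"type": "gcs", "settings": settings}


if __name__ == "__main__":
    # Hypothetical sample values; replace them with your own names.
    for readonly in (False, True):
        method, path, body = repository_request(
            "commerce-snapshots", "commerce", "my-es-bucket", readonly
        )
        print(method, path)
        print(json.dumps(body, indent=4))
```

The snapshot-originating cluster would send the first request, and every other cluster the second.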

To retrieve the definitions of all registered snapshot repositories, issue the following call. To retrieve a single definition, append your repository name to the path.

GET /_snapshot/

Backing up your indexes as a snapshot

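Your system is now ready to create new snapshots. The following example defines a snapshot lifecycle management (SLM) policy that takes a nightly snapshot at 1:30 AM, expires snapshots after 30 days, and always keeps at least 5 and at most 50 snapshots. Alternatively, you can create a snapshot manually with the Create Snapshot API.

PUT /_slm/policy/<your-policy-name>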
{
    "schedule": "0 30 1 * * ?",
    "name": "<nightly-snap-{now/d}>",
    "repository": "<your-repository-name>",
    "config": {
        "indices": "live.*,.live.*",
        "include_global_state": true
    },
    "retention": {
        "expire_after": "30d",
        "min_count": 5,
        "max_count": 50
    }
}
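
For a one-off snapshot outside any schedule, the Elasticsearch Create Snapshot API (PUT /_snapshot/<repository>/<snapshot>) can be called directly. The Python sketch below only builds such a request. The helper name create_snapshot_request and the sample repository name are assumptions, and the index pattern is copied from the example above. The snapshot name is percent-encoded so that a date-math name such as <nightly-snap-{now/d}> is valid in a URL path.

```python
import json
from urllib.parse import quote


def create_snapshot_request(repository, snapshot, indices="live.*,.live.*"):
    """Build (method, path, body) for a one-off Create Snapshot call."""
    # safe="" also encodes "/" so the date math in {now/d} stays one path segment.
    path = "/_snapshot/{}/{}".format(
        quote(repository, safe=""), quote(snapshot, safe="")
    )
    body = {"indices": indices, "include_global_state": True}
    return "PUT", path, body


if __name__ == "__main__":
    # Hypothetical repository name; the snapshot name uses Elasticsearch date math.
    method, path, body = create_snapshot_request(
        "commerce-snapshots", "<nightly-snap-{now/d}>"
    )
    print(method, path)
    print(json.dumps(body, indent=4))
```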

Listing available Snapshots from your Repository

You can list the snapshots available in a given repository in either a detailed or a tabulated format. To fetch the detailed format, issue the following API call:

GET /_snapshot/<your-repository-name>/*?verbose=true

For the same information in tabulated format, issue the following call:

GET /_cat/snapshots/<your-repository-name>?v=true&s=id&pretty
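
When choosing a snapshot to restore, it can help to filter the detailed listing for the most recent snapshot that completed successfully. The following Python sketch assumes the detailed response is a JSON object with a snapshots array whose entries carry snapshot, state, and start_time_in_millis fields. The function name and the sample data are hand-written for illustration and are not real cluster output.

```python
def latest_successful_snapshot(response):
    """Return the name of the newest snapshot whose state is SUCCESS, or None."""
    done = [s for s in response.get("snapshots", []) if s.get("state") == "SUCCESS"]
    if not done:
        return None
    newest = max(done, key=lambda s: s.get("start_time_in_millis", 0))
    return newest["snapshot"]


if __name__ == "__main__":
    # Hand-written sample in the assumed response shape; not real output.
    sample = {
        "snapshots": [
            {
                "snapshot": "nightly-snap-2023.06.01",
                "state": "SUCCESS",
                "start_time_in_millis": 1685583000000,
            },
            {
                "snapshot": "nightly-snap-2023.06.02",
                "state": "PARTIAL",
                "start_time_in_millis": 1685669400000,
            },
        ]
    }
    print(latest_successful_snapshot(sample))
```

The newer snapshot in the sample is skipped because its state is PARTIAL, so the helper returns the older, complete one.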

Restoring your indexes from a Snapshot

To restore a snapshot from your repository, use the following API call. It restores the indexes in the snapshot that match the given index name patterns.

POST /_snapshot/<your-repository-name>/<your-snapshot-name>/_restore
{
  "indices": "*,.*"
}

A complete description of the parameters used in this example can be found in the Elasticsearch Restore Snapshot API documentation.

Note that this operation only restores the necessary index files. The index aliases do not yet point to the restored indexes.
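
If you repoint aliases by hand, one option is the Elasticsearch Update Aliases API (POST /_aliases), which accepts a list of add actions. The Python sketch below builds such a request body from an alias-to-index mapping. The helper name alias_actions and the alias and index names are hypothetical; substitute the names used in your own environment, and treat this as an illustration rather than the product's prescribed procedure.

```python
import json


def alias_actions(alias_to_index):
    """Build the POST /_aliases body that points each alias at its index."""
    actions = [
        {"add": {"index": index, "alias": alias}}
        for alias, index in sorted(alias_to_index.items())
    ]
    return {"actions": actions}


if __name__ == "__main__":
    # Hypothetical alias and index names, for illustration only.
    body = alias_actions({"live.example.alias": ".live.example.index"})
    print("POST /_aliases")
    print(json.dumps(body, indent=4))
```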

Removing a Snapshot

Use the following API to explicitly remove a snapshot.

DELETE /_snapshot/<your-repository-name>/<your-snapshot-name>