Persisting search index data to a Gluster file system

Learn how to use Kubernetes and GlusterFS to deploy HCL Commerce with a persistent volume for the search index.

Before you begin

  • Ensure that you have HCL Commerce running on Kubernetes. This sample tutorial does not cover how to deploy HCL Commerce on Kubernetes. For more information, see Deploying HCL Commerce Version 9 on Kubernetes.
  • Ensure that you are using a CentOS or Red Hat Linux distribution. The procedure in this sample tutorial is based on CentOS Linux. Modify any code as needed to fit your distribution.

About this task

By default, when you build the search index in your HCL Commerce Version 9 runtime environment, the indexed files are stored in the Search server Docker container. This poses an issue: if the search container is removed, for example when you upgrade your environment, all of the existing data in the container is lost, and you must rebuild the search index to restore your store’s full functionality. To avoid rebuilding the index, you can persist the indexed files to a network file system. There are many ways to set up a network file system; this sample tutorial uses the following tools:

A) GlusterFS: An open source, scalable network file system. It can create various types of volumes, such as distributed, replicated, distributed replicated, dispersed, and distributed dispersed.

B) Heketi: A provisioning tool that you can use to manage the lifecycle of GlusterFS volumes through RESTful APIs.

C) Kubernetes: Provides two API resources to manage storage for persisting data: PersistentVolume (PV) and PersistentVolumeClaim (PVC).

– PersistentVolume is a piece of storage in the cluster, which has a lifecycle independent of any individual pod.

– PersistentVolumeClaim is a request for storage, which consumes PV resources.

Kubernetes supports two methods of provisioning a PV: static and dynamic. With static provisioning, the PersistentVolume is created in advance by an administrator. With dynamic provisioning, the PersistentVolume is created on demand, based on the requirements in the PVC. In this sample, you will learn how to integrate Gluster storage for Kubernetes with Heketi to deploy HCL Commerce with a persistent volume that is backed by a network file system.
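
For contrast, the following is a minimal sketch of what the static method might look like for a GlusterFS-backed volume; the volume name, Endpoints name, and capacity are illustrative placeholders only. With static provisioning you would also have to create the Gluster volume and the matching Endpoints object manually, which is the work that Heketi automates for you in the dynamic method used in this tutorial.

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: search-master-static-pv        # illustrative name
  spec:
    capacity:
      storage: 2Gi
    accessModes:
      - ReadWriteMany
    glusterfs:
      endpoints: glusterfs-cluster       # an Endpoints object that lists the Gluster node IPs
      path: search-master-volume         # an existing Gluster volume that was created manually
      readOnly: false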

Procedure

Set up a Gluster cluster and Heketi
  1. Prepare three nodes. This sample solution uses three nodes to create a Gluster cluster. If you do not want to use three nodes, use at least two.
  2. Add the GlusterFS repository to your system.
    1. Create the following file.
      /etc/yum.repos.d/CentOS-Gluster-4.0.repo
    2. Add the following content to the CentOS-Gluster-4.0.repo file.
      #CentOS-Gluster-4.0.repo
      # Please see http://wiki.centos.org/SpecialInterestGroup/Storage for more information
      [centos-gluster40]
      name=CentOS-$releasever - Gluster 4.0
      baseurl=http://buildlogs.centos.org/centos/7/storage/x86_64/gluster-4.0
      gpgcheck=1
      enabled=1
      gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-Storage
      [centos-gluster40-test]
      name=CentOS-$releasever - Gluster 4.0 Testing
      baseurl=http://debuginfo.centos.org/centos/7/storage/x86_64/
      gpgcheck=0
      enabled=0
      gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-Storage
  3. Install the Gluster server and client.
    yum install --nogpgcheck glusterfs glusterfs-server glusterfs-fuse  glusterfs-rdma nfs-ganesha-gluster
  4. Add a new disk to each Gluster server node. After the new disk is added, use the following command to identify the new disk name; the new disk has no entries under "Device Boot". This tutorial uses the new disk /dev/sdb as an example.
    fdisk -l
    For example:
    Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk label type: dos
    Disk identifier: 0x000adb00
     Device Boot Start End Blocks Id System
    /dev/sda1 * 2048 1026047 512000 83 Linux
    /dev/sda2 1026048 201326591 100150272 8e Linux LVM
    /dev/sda3 201326592 209715199 4194304 82 Linux swap / Solaris
    Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
  5. Create a partition on the new disk and format it with XFS. The fdisk command is interactive; create a single primary partition that uses the whole disk, then write the partition table.
    fdisk /dev/sdb
    mkfs.xfs /dev/sdb1
  6. Enable the Gluster service.
    systemctl enable  --now glusterd
  7. Repeat steps 2-6 on the other nodes.
  8. Install the Heketi server and client on one Gluster server.
    yum install --nogpgcheck -y heketi heketi-client
  9. Generate an SSH key for the heketi user and use all default values. Then, copy the public key to the root user on each Gluster node (for example, with ssh-copy-id) so that Heketi can run commands on the nodes over SSH.
    sudo -u heketi ssh-keygen
  10. Modify the /etc/heketi/heketi.json configuration file. Configure "ssh" as the executor, and set the keyfile and user in the sshexec section. Here is a sample configuration.
    "_glusterfs_comment": "GlusterFS Configuration",
      "glusterfs": {
        "_executor_comment": [
          "Execute plugin. Possible choices: mock, ssh",
          "mock: This setting is used for testing and development.",
          "      It will not send commands to any node.",
          "ssh:  This setting will notify Heketi to ssh to the nodes.",
          "      It will need the values in sshexec to be configured.",
          "kubernetes: Communicate with GlusterFS containers over",
          "            Kubernetes exec api."
        ],
        "executor": "ssh",
        "_sshexec_comment": "SSH username and private key file information",
        "sshexec": {
          "keyfile": "/var/lib/heketi/.ssh/id_rsa",
          "user": "root",
          "port": "22",
          "fstab": "/etc/fstab"
        },
  11. Edit the /usr/lib/systemd/system/heketi.service file. Change the start option so that "config" is preceded by two dashes instead of one: change -config=/etc/heketi/heketi.json to --config=/etc/heketi/heketi.json.
  12. Export the Heketi server address variable.
    export HEKETI_CLI_SERVER=http://localhost:8080
  13. Enable and start the Heketi service.
    systemctl enable --now heketi
  14. Create a Gluster cluster on the Heketi server and get the cluster ID from the command output. This creates a cluster named wcs.
    heketi-cli cluster create wcs
  15. Add the Gluster nodes to the cluster. You can get the node IDs from the command output.
    
        heketi-cli node add --cluster={clusterId} --zone=1 --management-host-name={gluster-node1-hostname} --storage-host-name={gluster-node1-hostname}
        heketi-cli node add --cluster={clusterId} --zone=1 --management-host-name={gluster-node2-hostname} --storage-host-name={gluster-node2-hostname}
        heketi-cli node add --cluster={clusterId} --zone=1 --management-host-name={gluster-node3-hostname} --storage-host-name={gluster-node3-hostname}
    
  16. Allocate devices for these nodes.
    heketi-cli device add --name=/dev/sdb1 --node={nodeID1}
    heketi-cli device add --name=/dev/sdb1 --node={nodeID2}
    heketi-cli device add --name=/dev/sdb1 --node={nodeID3}
  17. Install the GlusterFS client on all Kubernetes worker nodes.
    yum install --nogpgcheck glusterfs-fuse
A GlusterFS-Heketi environment is now ready. You can use the environment with Kubernetes to provision volumes dynamically.
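
Optionally, before you update the Kubernetes environment, you can sanity-check the setup. The following is a minimal check, assuming that the HEKETI_CLI_SERVER variable from step 12 is still exported; {volumeId} is a placeholder for the ID that the volume create command prints.

    # On any Gluster node, confirm that all of the nodes joined the trusted storage pool.
    gluster peer status

    # Confirm that Heketi can see the cluster, nodes, and devices that you registered.
    heketi-cli cluster list
    heketi-cli topology info

    # Optionally, create and then delete a small test volume to verify end-to-end provisioning.
    heketi-cli volume create --size=1
    heketi-cli volume delete {volumeId}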

Update Kubernetes environment

  1. Register the Gluster storage class on the Kubernetes system.
    1. Create a file that is named gluster-storage.yaml with the following content. This content defines a GlusterFS storage class.
      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: glusterfs
        annotations:
          storageclass.kubernetes.io/is-default-class: "true"
      provisioner: kubernetes.io/glusterfs
      parameters:
        resturl: "http://{heketi-server-address}:8080"
        restuser: ""
        secretNamespace: ""
        secretName: ""
        # Use the distributed volume type ('none').
        # Note: The 'replicate' type fails due to a Gluster issue. When Gluster replicates
        #       the file, the ctime is modified, which causes Solr validation to fail.
        volumetype: "none"
    2. Run the following command to create the storage class.
      kubectl create -f gluster-storage.yaml
  2. Verify that the storage class was created successfully.
    kubectl get sc
  3. Create a PersistentVolumeClaim.
    1. Create a file that is named search-master-pvc.yaml with the following content.
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: {tenantName}-{environment}-search-master-gluster-volume
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 2Gi
        storageClassName: glusterfs
    2. Run the following command to create the PersistentVolumeClaim.
      kubectl create -f search-master-pvc.yaml
  4. Verify that the PersistentVolumeClaim and PersistentVolume were created successfully.
    kubectl get pvc
    kubectl get pv
  5. Update your existing search-master Helm chart to include a volume that references the PersistentVolumeClaim and a matching volume mount in the search container. For example:
     volumes:
     - name: searchvol
       persistentVolumeClaim:
         claimName: {tenantName}-{environment}-search-master-gluster-volume
    Here is an example of a search-master Helm chart with the volumes definition, based on HCL Commerce Version 9.0.0.2 Docker containers.
    Note: If you are using HCL Commerce Version 9.0.0.0, change - name: "CONFIGURE_MODE" to - name: "OVERRIDE_PRECONFIG". Starting with HCL Commerce Version 9.0.0.2, the startup configuration parameter was changed from OVERRIDE_PRECONFIG to CONFIGURE_MODE.
     apiVersion: extensions/v1beta1
     kind: Deployment
     metadata:
       name: {{ .Values.Common.Tenant }}{{ .Values.Common.Environment_name }}{{ .Values.Searchapp.Name }}
       namespace: {{ .Values.Common.Namespace }}
     spec:
       replicas: {{ .Values.Searchapp.Replica }}
       template:
         metadata:
           labels:
             component: {{ .Values.Common.Tenant }}{{ .Values.Common.Environment_name }}{{ .Values.Searchapp.Name }}
         spec:
           initContainers:
           - name: ts-app-check
             image: "{{ .Values.Common.Image_repo }}{{ .Values.InitC.Image }}:{{ .Values.InitC.Tag }}"
             imagePullPolicy: IfNotPresent
             env:
             - name: "COMPONENT"
               value: "ts-search"
             - name: "NAMESPACE"
               value: "{{ .Values.Common.Namespace }}"
             - name: "TENANT"
               value: "{{ .Values.Common.Tenant }}"
             - name: "ENVNAME"
               value: "{{ .Values.Common.Environment_name }}"
           containers:
           - name: search-app-master
             image: "{{ .Values.Common.Image_repo }}{{ .Values.Searchapp.Image }}:{{ .Values.Searchapp.Tag }}"
             resources:
               requests:
                 cpu: 0.5m
                 memory: 2048Mi
               limits:
                 cpu: 2
                 memory: 4096Mi
             env:
             - name: "VAULT_URL"
               value: "{{ .Values.Common.Vault_URL }}"
             - name: "WORKAREA"
               value: "/search"
             - name: "ENVIRONMENT"
               value: "{{ .Values.Common.Environment_name }}"
             - name: "VAULT_TOKEN"
               value: "{{ .Values.Common.Vault_token }}"
             - name: "TENANT"
               value: "{{ .Values.Common.Tenant }}"
             - name: "SOLR_MASTER"
               value: "true"
             - name: "SOLR_SLAVE"
               value: "false"
             - name: "ENVTYPE"
               value: "auth"
             - name: "LICENSE"
               value: "accept"
             - name: "VAULT_CA"
               value: "true"
             - name: "CONFIGURE_MODE"
               value: "Vault"
             ports:
             - containerPort: 3737
               name: port3737
             - containerPort: 3738
               name: port3738
               protocol: TCP
             readinessProbe:
               httpGet:
                 path: /search/admin/resources/health/status?type=container
                 port: 3737
                 httpHeaders:
                 - name: Authorization
                   value: Basic {{ .Values.Common.SPIUser_PWD }}
               initialDelaySeconds: 5
               periodSeconds: 5
             livenessProbe:
               tcpSocket:
                 port: 3737
               initialDelaySeconds: 600
               timeoutSeconds: 300
             volumeMounts:
             - name: searchvol
               mountPath: /search
           nodeSelector:
             mounttype: glusterfs
           volumes:
           - name: searchvol
             persistentVolumeClaim:
               claimName: {tenantName}-{environment}-search-master-gluster-volume
  6. Redeploy HCL Commerce by installing the new helm chart. For example, helm install <parameters> <helm_repo_name>/<helm_package>.
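
After the chart is installed, you can optionally confirm that the claim is bound and that the search container mounts the Gluster volume before you build the index. In the following sketch, {namespace} and {search-master-pod} are placeholders for the names in your own deployment.

    # Check that the claim is bound to a dynamically provisioned volume.
    kubectl get pvc -n {namespace}

    # Find the search-master pod and confirm that /search is mounted from the Gluster volume.
    kubectl get pods -n {namespace}
    kubectl exec -n {namespace} {search-master-pod} -- df -h /search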

Results

You now have HCL Commerce Version 9 running on Kubernetes with a persistent volume. When you re-create the containers, the search index data persists on the GlusterFS volume and you do not have to rebuild the search index. To verify this behavior, complete the following steps.
  1. Build the search index for your store. Perform these steps only when all containers are ready to use. For more information, see Deploying HCL Commerce Version 9 on Kubernetes.
  2. Visit the store to ensure that everything displays as expected.
  3. Delete the deployment.
    helm delete --purge <chartName>
  4. Redeploy the HCL Commerce environment. You do not need to re-create the PersistentVolumeClaim; the existing claim and its data are reused.
    helm install <newChart>
  5. Visit the store again. Everything should display as expected without needing to rebuild the search index. If you do not see categories or products, then the search index was not persisted. Review the steps to ensure that you completed everything.
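
If categories or products are missing, one way to check whether the index files actually reached the GlusterFS volume is to mount the volume directly on a node that has the GlusterFS client installed. The following is a rough sketch; {volumeName} is a placeholder for the vol_... name that Heketi reports for the search volume.

    # List the Heketi-managed volumes and note the name of the search volume.
    heketi-cli volume list

    # Mount the volume and inspect the persisted index files.
    mount -t glusterfs {gluster-node1-hostname}:{volumeName} /mnt
    ls /mnt
    umount /mnt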