Persisting search index data to a Gluster file system

Learn how to use Kubernetes and GlusterFS to deploy HCL Commerce with a persistent volume for the search index.

Before you begin

  • Ensure that you have HCL Commerce running on Kubernetes. This sample tutorial does not cover how to deploy HCL Commerce on Kubernetes. For more information, see Deploying HCL Commerce Version 9 on Kubernetes.
  • Ensure that you are using a CentOS or Red Hat Linux distribution. The procedure in this sample tutorial is based on CentOS Linux. Modify any code as needed to fit your distribution.

About this task

By default, when you build the search index in your HCL Commerce Version 9 runtime environment, the indexed files are stored in the Search server Docker container. This poses an issue: if the search container is removed, for example when you upgrade your environment, all of the existing data in the container is lost, and you must rebuild the search index to restore your store’s full functionality. To avoid rebuilding the index, you can persist the indexed files to a network file system. There are many ways to set up a network file system; this sample tutorial uses the following tools:

A) GlusterFS: An open source, scalable network file system. It can create various types of volumes, such as distributed, replicated, distributed replicated, dispersed, and distributed dispersed.

B) Heketi: A provisioning tool that you can use to manage the lifecycle of GlusterFS volumes through RESTful APIs.

C) Kubernetes: Provides two API resources to manage storage for persisting data: PersistentVolume (PV) and PersistentVolumeClaim (PVC).

– PersistentVolume is a piece of storage in the cluster, which has a lifecycle independent of any individual pod.

– PersistentVolumeClaim is a request for storage, which consumes PV resources.

Kubernetes supports two methods of provisioning a PV: static and dynamic. With static provisioning, the PersistentVolume is created in advance by an administrator. With dynamic provisioning, the PersistentVolume is created on demand, based on the requirements in the PVC. In this sample, you will learn how to integrate Gluster storage for Kubernetes with Heketi to deploy HCL Commerce with a persistent volume that is backed by a network file system.
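
For contrast, the following is a minimal sketch of what the static method might look like for a GlusterFS-backed volume; the volume name, Endpoints name, and capacity are illustrative placeholders only. With static provisioning you would also have to create the Gluster volume and the matching Endpoints object manually, which is the work that Heketi automates for you in the dynamic method used in this tutorial.

  apiVersion: v1
  kind: PersistentVolume
  metadata:
    name: search-master-static-pv        # illustrative name
  spec:
    capacity:
      storage: 2Gi
    accessModes:
      - ReadWriteMany
    glusterfs:
      endpoints: glusterfs-cluster       # an Endpoints object that lists the Gluster node IPs
      path: search-master-volume         # an existing Gluster volume that was created manually
      readOnly: false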

Procedure

Set up a Gluster cluster and Heketi
  1. Prepare three nodes. This sample solution uses three nodes to create a Gluster cluster. If you do not want to use three nodes, use at least two.
  2. Add the GlusterFS repository to your system.
    1. Create the following file.
      /etc/yum.repos.d/CentOS-Gluster-4.0.repo
    2. Add the following content to the CentOS-Gluster-4.0.repo file.
      #CentOS-Gluster-4.0.repo
      # Please see http://wiki.centos.org/SpecialInterestGroup/Storage for more information
      [centos-gluster40]
      name=CentOS-$releasever - Gluster 4.0
      baseurl=http://buildlogs.centos.org/centos/7/storage/x86_64/gluster-4.0
      gpgcheck=1
      enabled=1
      gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-Storage
      [centos-gluster40-test]
      name=CentOS-$releasever - Gluster 4.0 Testing
      baseurl=http://debuginfo.centos.org/centos/7/storage/x86_64/
      gpgcheck=0
      enabled=0
      gpgkey=file:///etc/pki/rpm-gpg/RPM-GPG-KEY-CentOS-SIG-Storage
  3. Install the Gluster server and client.
    yum install --nogpgcheck glusterfs glusterfs-server glusterfs-fuse  glusterfs-rdma nfs-ganesha-gluster
  4. Add a new disk to each Gluster server node. After the new disk is added, use the following command to identify the new disk name; the new disk has no entries under "Device Boot". This tutorial uses the new disk /dev/sdb as an example.
    fdisk -l
    For example:
    Disk /dev/sda: 107.4 GB, 107374182400 bytes, 209715200 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk label type: dos
    Disk identifier: 0x000adb00
     Device Boot Start End Blocks Id System
    /dev/sda1 * 2048 1026047 512000 83 Linux
    /dev/sda2 1026048 201326591 100150272 8e Linux LVM
    /dev/sda3 201326592 209715199 4194304 82 Linux swap / Solaris
    Disk /dev/sdb: 21.5 GB, 21474836480 bytes, 41943040 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
  5. Create a partition on the new disk and format it with XFS. The fdisk command is interactive; create a single primary partition that uses the whole disk, then write the partition table.
    fdisk /dev/sdb
    mkfs.xfs /dev/sdb1
  6. Enable the Gluster service.
    systemctl enable  --now glusterd
  7. Repeat steps 2-6 on the other nodes.
  8. Install the Heketi server and client on one Gluster server.
    yum install --nogpgcheck -y heketi heketi-client
  9. Generate an SSH key for the heketi user and use all default values. Then, copy the public key to the root user on each Gluster node (for example, with ssh-copy-id) so that Heketi can run commands on the nodes over SSH.
    sudo -u heketi ssh-keygen
  10. Modify the /etc/heketi/heketi.json configuration file. Configure "ssh" as the executor, and set the keyfile and user in the sshexec section. Here is a sample configuration.
    "_glusterfs_comment": "GlusterFS Configuration",
      "glusterfs": {
        "_executor_comment": [
          "Execute plugin. Possible choices: mock, ssh",
          "mock: This setting is used for testing and development.",
          "      It will not send commands to any node.",
          "ssh:  This setting will notify Heketi to ssh to the nodes.",
          "      It will need the values in sshexec to be configured.",
          "kubernetes: Communicate with GlusterFS containers over",
          "            Kubernetes exec api."
        ],
        "executor": "ssh",
        "_sshexec_comment": "SSH username and private key file information",
        "sshexec": {
          "keyfile": "/var/lib/heketi/.ssh/id_rsa",
          "user": "root",
          "port": "22",
          "fstab": "/etc/fstab"
        },
  11. Edit the /usr/lib/systemd/system/heketi.service file. Change the start option so that "config" is preceded by two dashes instead of one: change -config=/etc/heketi/heketi.json to --config=/etc/heketi/heketi.json.
  12. Export the Heketi server address variable.
    export HEKETI_CLI_SERVER=http://localhost:8080
  13. Enable and start the Heketi service.
    systemctl enable --now heketi
  14. Create a Gluster cluster on the Heketi server and get the cluster ID from the command output. This creates a cluster named wcs.
    heketi-cli cluster create wcs
  15. Add the Gluster nodes to the cluster. You can get the node IDs from the command output.
    
        heketi-cli node add --cluster={clusterId} --zone=1 --management-host-name={gluster-node1-hostname} --storage-host-name={gluster-node1-hostname}
        heketi-cli node add --cluster={clusterId} --zone=1 --management-host-name={gluster-node2-hostname} --storage-host-name={gluster-node2-hostname}
        heketi-cli node add --cluster={clusterId} --zone=1 --management-host-name={gluster-node3-hostname} --storage-host-name={gluster-node3-hostname}
    
  16. Allocate devices for these nodes.
    heketi-cli device add --name=/dev/sdb1 --node={nodeID1}
    heketi-cli device add --name=/dev/sdb1 --node={nodeID2}
    heketi-cli device add --name=/dev/sdb1 --node={nodeID3}
  17. Install the GlusterFS client on all Kubernetes worker nodes.
    yum install --nogpgcheck glusterfs-fuse
A GlusterFS-Heketi environment is now ready. You can use the environment with Kubernetes to provision volumes dynamically.
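
Optionally, before you update the Kubernetes environment, you can sanity-check the setup. The following is a minimal check, assuming that the HEKETI_CLI_SERVER variable from step 12 is still exported; {volumeId} is a placeholder for the ID that the volume create command prints.

    # On any Gluster node, confirm that all of the nodes joined the trusted storage pool.
    gluster peer status

    # Confirm that Heketi can see the cluster, nodes, and devices that you registered.
    heketi-cli cluster list
    heketi-cli topology info

    # Optionally, create and then delete a small test volume to verify end-to-end provisioning.
    heketi-cli volume create --size=1
    heketi-cli volume delete {volumeId}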

Update Kubernetes environment

  1. Register the Gluster storage class on the Kubernetes system.
    1. Create a file that is named gluster-storage.yaml with the following content. This content defines a GlusterFS storage class.
      apiVersion: storage.k8s.io/v1
      kind: StorageClass
      metadata:
        name: glusterfs
        annotations:
          storageclass.kubernetes.io/is-default-class: "true"
      provisioner: kubernetes.io/glusterfs
      parameters:
        resturl: "http://{heketi-server-address}:8080"
        restuser: ""
        secretNamespace: ""
        secretName: ""
        # Use the distributed volume type ('none').
        # Note: The 'replicate' type fails due to a Gluster issue. When Gluster replicates
        #       the file, the ctime is modified, which causes Solr validation to fail.
        volumetype: "none"
    2. Run the following command to create the storage class.
      kubectl create -f gluster-storage.yaml
  2. Verify that the storage class was created successfully.
    kubectl get sc
  3. Create a PersistentVolumeClaim.
    1. Create a file that is named search-master-pvc.yaml with the following content.
      apiVersion: v1
      kind: PersistentVolumeClaim
      metadata:
        name: {tenantName}-{environment}-search-master-gluster-volume
      spec:
        accessModes:
          - ReadWriteMany
        resources:
          requests:
            storage: 2Gi
        storageClassName: glusterfs
    2. Run the following command to create the PersistentVolumeClaim.
      kubectl create -f search-master-pvc.yaml
  4. Verify that the PersistentVolumeClaim and PersistentVolume were created successfully.
    kubectl get pvc
    kubectl get pv
  5. Update your existing search-master Helm chart to include a volume that references the PersistentVolumeClaim and a matching volume mount in the search container. For example:
     volumes:
     - name: searchvol
       persistentVolumeClaim:
         claimName: {tenantName}-{environment}-search-master-gluster-volume
    Here is an example of a search-master Helm chart with the volumes definition, based on HCL Commerce Version 9.0.0.2 Docker containers.
    Note: If you are using HCL Commerce Version 9.0.0.0, change - name: "CONFIGURE_MODE" to - name: "OVERRIDE_PRECONFIG". Starting with HCL Commerce Version 9.0.0.2, the startup configuration parameter was changed from OVERRIDE_PRECONFIG to CONFIGURE_MODE.
     apiVersion: extensions/v1beta1
     kind: Deployment
     metadata:
       name: {{ .Values.Common.Tenant }}{{ .Values.Common.Environment_name }}{{ .Values.Searchapp.Name }}
       namespace: {{ .Values.Common.Namespace }}
     spec:
       replicas: {{ .Values.Searchapp.Replica }}
       template:
         metadata:
           labels:
             component: {{ .Values.Common.Tenant }}{{ .Values.Common.Environment_name }}{{ .Values.Searchapp.Name }}
         spec:
           initContainers:
           - name: ts-app-check
             image: "{{ .Values.Common.Image_repo }}{{ .Values.InitC.Image }}:{{ .Values.InitC.Tag }}"
             imagePullPolicy: IfNotPresent
             env:
             - name: "COMPONENT"
               value: "ts-search"
             - name: "NAMESPACE"
               value: "{{ .Values.Common.Namespace }}"
             - name: "TENANT"
               value: "{{ .Values.Common.Tenant }}"
             - name: "ENVNAME"
               value: "{{ .Values.Common.Environment_name }}"
           containers:
           - name: search-app-master
             image: "{{ .Values.Common.Image_repo }}{{ .Values.Searchapp.Image }}:{{ .Values.Searchapp.Tag }}"
             resources:
               requests:
                 cpu: 0.5m
                 memory: 2048Mi
               limits:
                 cpu: 2
                 memory: 4096Mi
             env:
             - name: "VAULT_URL"
               value: "{{ .Values.Common.Vault_URL }}"
             - name: "WORKAREA"
               value: "/search"
             - name: "ENVIRONMENT"
               value: "{{ .Values.Common.Environment_name }}"
             - name: "VAULT_TOKEN"
               value: "{{ .Values.Common.Vault_token }}"
             - name: "TENANT"
               value: "{{ .Values.Common.Tenant }}"
             - name: "SOLR_MASTER"
               value: "true"
             - name: "SOLR_SLAVE"
               value: "false"
             - name: "ENVTYPE"
               value: "auth"
             - name: "LICENSE"
               value: "accept"
             - name: "VAULT_CA"
               value: "true"
             - name: "CONFIGURE_MODE"
               value: "Vault"
             ports:
             - containerPort: 3737
               name: port3737
             - containerPort: 3738
               name: port3738
               protocol: TCP
             readinessProbe:
               httpGet:
                 path: /search/admin/resources/health/status?type=container
                 port: 3737
                 httpHeaders:
                 - name: Authorization
                   value: Basic {{ .Values.Common.SPIUser_PWD }}
               initialDelaySeconds: 5
               periodSeconds: 5
             livenessProbe:
               tcpSocket:
                 port: 3737
               initialDelaySeconds: 600
               timeoutSeconds: 300
             volumeMounts:
             - name: searchvol
               mountPath: /search
           nodeSelector:
             mounttype: glusterfs
           volumes:
           - name: searchvol
             persistentVolumeClaim:
               claimName: {tenantName}-{environment}-search-master-gluster-volume
  6. Redeploy HCL Commerce by installing the new helm chart. For example, helm install <parameters> <helm_repo_name>/<helm_package>.
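
After the chart is installed, you can optionally confirm that the claim is bound and that the search container mounts the Gluster volume before you build the index. In the following sketch, {namespace} and {search-master-pod} are placeholders for the names in your own deployment.

    # Check that the claim is bound to a dynamically provisioned volume.
    kubectl get pvc -n {namespace}

    # Find the search-master pod and confirm that /search is mounted from the Gluster volume.
    kubectl get pods -n {namespace}
    kubectl exec -n {namespace} {search-master-pod} -- df -h /search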

Results

You now have HCL Commerce Version 9 running on Kubernetes with a persistent volume. When you re-create the containers, the search index data persists on the GlusterFS volume and you do not have to rebuild the search index. To verify this behavior, complete the following steps.
  1. Build the search index for your store. Perform these steps only when all containers are ready to use. For more information, see Deploying HCL Commerce Version 9 on Kubernetes.
  2. Visit the store to ensure that everything displays as expected.
  3. Delete the deployment.
    helm delete --purge <chartName>
  4. Redeploy the HCL Commerce environment. You do not need to re-create the PersistentVolumeClaim; the existing claim and its data are reused.
    helm install <newChart>
  5. Visit the store again. Everything should display as expected without needing to rebuild the search index. If you do not see categories or products, then the search index was not persisted. Review the steps to ensure that you completed everything.
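
If categories or products are missing, one way to check whether the index files actually reached the GlusterFS volume is to mount the volume directly on a node that has the GlusterFS client installed. The following is a rough sketch; {volumeName} is a placeholder for the vol_... name that Heketi reports for the search volume.

    # List the Heketi-managed volumes and note the name of the search volume.
    heketi-cli volume list

    # Mount the volume and inspect the persisted index files.
    mount -t glusterfs {gluster-node1-hostname}:{volumeName} /mnt
    ls /mnt
    umount /mnt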