Disaster recovery runbook for Kubernetes deployment

This article explains how to recover your cluster and its persistent data when the HiveMQ cluster is down or when you observe the following warning in your hivemq.log:

Not all replicas are currently reachable. More nodes than the replication factor of X have left the cluster in a too short time frame.

Instructions

1. Create a separate Pod to Access PVCs

Create a manifest file (pvc-access-pod.yaml) for a temporary pod that mounts the PVCs so you can access their data.
Note: Make sure the pod has at least 4 CPUs, 4 GB of RAM, and sufficient disk space (at least 2.5 times the size of all data folders combined).
The following example manifest is for a 2-node HiveMQ cluster:

apiVersion: v1
kind: Pod
metadata:
  name: pvc-access-pod
spec:
  containers:
  - name: pvc-access-container
    image: busybox
    command: ["/bin/sh", "-c", "while true; do sleep 3600; done"]
    resources:
      requests:
        memory: "4Gi"          # Request minimum 4GB of memory
        cpu: "4"              # Request minimum 4 CPUs
        ephemeral-storage: "100Gi"  # Request 100Gi of ephemeral storage
      limits:
        memory: "4Gi"          # Limit to 4GB of memory
        cpu: "4"              # Limit to 4 CPUs
        ephemeral-storage: "150Gi"  # Limit to 150Gi of ephemeral storage
    volumeMounts:
    - name: data0
      mountPath: /mnt/data0
    - name: data1
      mountPath: /mnt/data1
  volumes:
  - name: data0
    persistentVolumeClaim:
      claimName: data-hivemq-0
  - name: data1
    persistentVolumeClaim:
      claimName: data-hivemq-1

Apply the manifest:

kubectl apply -f pvc-access-pod.yaml -n <namespace>
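Before moving on, it can help to block until the pod actually reports Ready. The following is a minimal sketch using kubectl wait; the helper name wait_for_pod and the 300s default timeout are illustrative assumptions, not part of the original procedure.

```shell
# Hypothetical helper: wait until the given pod is Ready.
# Arguments: <pod-name> <namespace> [timeout, default 300s (an assumed value)]
wait_for_pod() {
  pod_name=$1
  namespace=$2
  timeout=${3:-300s}
  kubectl wait --for=condition=Ready "pod/$pod_name" -n "$namespace" --timeout="$timeout"
}

# Example:
#   wait_for_pod pvc-access-pod <namespace>
```

If the wait times out, kubectl describe pod pvc-access-pod -n <namespace> usually shows whether the pod is stuck on scheduling or on mounting the volumes.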

2. Access the Pod

Once the pod is running, open a shell session to it:

kubectl exec -it pvc-access-pod -n <namespace> -- sh
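Once inside the pod, the 2.5x disk-space guideline from step 1 can be checked against the real data size. The sketch below sums the size of the mounted folders and prints 2.5 times that total in KiB; the function name required_kb is a hypothetical helper, and the mount paths are the ones from the example manifest.

```shell
# Hypothetical helper: print 2.5x the combined size (in KiB) of the given directories.
required_kb() {
  total_kb=0
  for dir in "$@"; do
    # du -sk prints "<size-in-KiB> <path>"; keep only the size.
    kb=$(du -sk "$dir" | awk '{print $1}')
    total_kb=$((total_kb + kb))
  done
  # Multiply by 2.5 with integer arithmetic, rounding up.
  echo $(( (total_kb * 5 + 1) / 2 ))
}

# Example (inside the temporary pod):
#   required_kb /mnt/data0 /mnt/data1
#   df -k /tmp    # compare with the available space where the archives will be written
```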

3. Archive the PVC Data

Inside the pod, run the following commands to archive the data from the mounted PVCs:

tar -cf /tmp/data0.tar -C /mnt/data0 .
tar -cf /tmp/data1.tar -C /mnt/data1 .

These commands create two tar files, data0.tar and data1.tar, in the /tmp directory of the pod.
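For clusters with more than two nodes, the same tar command can be applied in a loop. This is a sketch only: archive_mounts is a hypothetical helper name, and it assumes every PVC is mounted under its own directory, as in the example manifest.

```shell
# Hypothetical helper: archive each given directory into <out_dir>/<dir-name>.tar.
# Arguments: <out_dir> <dir> [<dir> ...]
archive_mounts() {
  out_dir=$1
  shift
  for dir in "$@"; do
    name=$(basename "$dir")
    # Same invocation as the manual commands: archive the directory contents, not the path.
    tar -cf "$out_dir/$name.tar" -C "$dir" . || return 1
  done
}

# Example (inside the temporary pod):
#   archive_mounts /tmp /mnt/data0 /mnt/data1
```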

4. Copy the Tar Files Locally

Exit the pod shell and use the kubectl cp command to copy the tar files from the pod to your local machine (replace <namespace> with the namespace used in step 1):

kubectl cp <namespace>/pvc-access-pod:/tmp/data0.tar ./data0.tar
kubectl cp <namespace>/pvc-access-pod:/tmp/data1.tar ./data1.tar

Verify the files are present in your current directory:

ls -ls data0.tar data1.tar
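Beyond checking that the files exist, it may be worth confirming that each copied archive is readable before the temporary pod is deleted. A minimal sketch, assuming a local tar binary; verify_tar is a hypothetical helper name.

```shell
# Hypothetical helper: succeed only if tar can list the archive's contents.
verify_tar() {
  tar -tf "$1" > /dev/null 2>&1
}

# Example:
#   for f in data0.tar data1.tar; do
#     verify_tar "$f" && echo "$f: OK" || echo "$f: unreadable"
#   done
```

Listing the archive only confirms that the tar structure is intact; it does not compare the contents with the data on the PVC.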

5. Extract the Tar Files Locally

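Create one folder per archive to store the extracted data:

mkdir -p extracted_data/data0 extracted_data/data1

Extract the contents of each tar file into its own folder:

tar -xf data0.tar -C extracted_data/data0
tar -xf data1.tar -C extracted_data/data1

This places each node's data in its own subdirectory of extracted_data, so that files from one node do not overwrite files from the other.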

6. Run the HiveMQ Disaster Recovery Tool

Download and set up the HiveMQ Disaster Recovery tool if you haven't already.

Run the tool with the extracted data directories:

./hivemq-recovery-tool-4.35.0/bin/hivemq-recovery-tool -i extracted_data/data0 -i extracted_data/data1 -e export/

This command processes the PVC data and exports the backup file to the export/ directory.

Note: This step can take anywhere from minutes to hours, depending on the amount of data that has to be restored.

7. Clean Up

Once the process is complete, clean up the resources to avoid unnecessary costs.

Delete the temporary pod:

kubectl delete pod pvc-access-pod -n <namespace>

8. Upload Backup

Once the recovery command has finished, you will find a new folder containing a backup file in your export folder.
You now have a backup file ready to be restored on the running HiveMQ cluster. There are two ways to restore the backup: via the Control Center web UI or via the REST API.
1. If you choose to restore via the Control Center, follow these steps:
  1. Upload the backup file generated by the recovery tool via the browser on the Admin > Backup page in HiveMQ's Control Center.
  2. Import progress is shown in the Control Center, and a message is displayed once the import is complete. You can also verify import progress in your monitoring dashboard.

HiveMQ Knowledge Base - Self Managed Offering