While replication is in progress, all members of a HiveMQ cluster output log messages to indicate this:

Code Block
INFO  - Starting cluster replication process. This may take a while. Please do not shut down HiveMQ.
[...]
INFO  - Replication is still in progress. Please do not shut down HiveMQ.

While these messages are being logged, the cluster is at risk of data loss if more than (replica count - 1) brokers are removed from the cluster.
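The threshold above can be written as a small check. This is a minimal sketch, assuming the replica count is the value configured for the cluster; the function names are hypothetical and not part of HiveMQ.

```python
def max_safe_removals(replica_count: int) -> int:
    """Largest number of brokers that may leave during replication
    without exceeding the (replica count - 1) threshold."""
    if replica_count < 1:
        raise ValueError("replica_count must be at least 1")
    return replica_count - 1


def risks_data_loss(removed_brokers: int, replica_count: int) -> bool:
    """True if more than (replica count - 1) brokers were removed
    while replication was still in progress."""
    return removed_brokers > max_safe_removals(replica_count)
```

For example, with a replica count of 2, removing one broker during replication stays within the threshold, while removing two exceeds it.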

Once all necessary data has been exchanged, the brokers log the following:

Code Block
INFO  - Finished cluster replication successfully in 30000 ms.
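The log messages shown above can also be checked programmatically, for example in an upgrade script. The following is a minimal sketch, assuming the log lines contain the message texts exactly as shown; the function name and the state labels are hypothetical.

```python
import re
from typing import Iterable, Optional, Tuple

# Message fragments taken from the log lines shown above.
STARTED = "Starting cluster replication process"
ONGOING = "Replication is still in progress"
FINISHED = re.compile(r"Finished cluster replication successfully in (\d+) ms")


def replication_status(lines: Iterable[str]) -> Tuple[str, Optional[int]]:
    """Return (state, duration_ms) based on the last replication message seen.

    state is "unknown", "in_progress", or "finished"; duration_ms is only
    set when the most recent replication message is the finished line.
    """
    state, duration_ms = "unknown", None
    for line in lines:
        match = FINISHED.search(line)
        if match:
            state, duration_ms = "finished", int(match.group(1))
        elif STARTED in line or ONGOING in line:
            # A later start or progress message resets a previous finish.
            state, duration_ms = "in_progress", None
    return state, duration_ms
```

The function accepts any iterable of strings, so it can be applied to an open log file as well as to a list of lines.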

...

Info

To confirm that the cluster has returned to its baseline load for the traffic it is experiencing, the metrics below give further insight. Observing these metrics in addition to the log line mentioned above can be helpful when performing rolling upgrades in clusters that operate close to the limits of their hardware’s capabilities.

Waiting until replication-related tasks have completed before rotating the next node is the least disruptive way to perform such topology changes.

  1. A join process (where a fresh broker with no state joins the cluster) is complete once the following queued-task metrics have returned to 0:

Code Block
com.hivemq.internal.singlewriter.*
com.hivemq.internal.singlewriter.topic-tree.remove-locally.queued
com.hivemq.internal.singlewriter.client-session-subscription-persistence.remove-locally.queued
com.hivemq.internal.singlewriter.client-session-persistence.remove-locally.queued
com.hivemq.internal.singlewriter.client-queue-persistence.remove-local.queued
com.hivemq.internal.singlewriter.client-event-persistence.remove-bucket.queued

  2. Further, the replication batch metrics should also have returned to 0:

Code Block
com.hivemq.replication.batches-queued
com.hivemq.replication.batches-sent
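The two checks above can be combined into one helper that reports which metrics have not yet returned to 0. This is a minimal sketch: how the metric values are collected is left to the caller (for example, from whatever monitoring system is in use), the function name is hypothetical, and the topic-tree metric is assumed to share the `com.hivemq.internal.singlewriter` prefix of the other entries.

```python
from typing import List, Mapping

# Metric names from the lists above. The full topic-tree name is an
# assumption based on the prefix shared by the other entries.
WATCHED_METRICS = [
    "com.hivemq.internal.singlewriter.topic-tree.remove-locally.queued",
    "com.hivemq.internal.singlewriter.client-session-subscription-persistence.remove-locally.queued",
    "com.hivemq.internal.singlewriter.client-session-persistence.remove-locally.queued",
    "com.hivemq.internal.singlewriter.client-queue-persistence.remove-local.queued",
    "com.hivemq.internal.singlewriter.client-event-persistence.remove-bucket.queued",
    "com.hivemq.replication.batches-queued",
    "com.hivemq.replication.batches-sent",
]


def pending_metrics(values: Mapping[str, float]) -> List[str]:
    """Return the watched metrics that are missing or not yet back at 0.

    A missing metric is treated as pending, because its state is unknown.
    An empty result suggests the next node can be rotated.
    """
    return [name for name in WATCHED_METRICS if values.get(name) != 0]
```

Calling `pending_metrics` with the latest values before rotating each node gives a simple gate: proceed only when the returned list is empty and the finished log line has also been observed.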


...