Restart cluster statefully when join replication process is taking too long

 Problem

This problem can occur with HiveMQ broker versions before 4.18 when Client Event History is enabled and the cluster has been running for a long time with few client connections.

The problem may show up during the cluster replication process as a joining time that is longer than expected and, in extreme cases, appears never to end.

 Solution

  1. Check the metrics and confirm that the Client Event Count keeps growing while the node is in the joining state:
    com_hivemq_persistence_executor_client_events_tasks
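
The check in step 1 can be sketched as a small script. This is a minimal sketch, not an official HiveMQ tool. It assumes the metric has already been scraped several times as Prometheus-style text (for example through a monitoring extension) while the node is joining; how the text is fetched is left out. The helper names `read_metric` and `is_growing` are hypothetical, and treating "growing" as "every sample strictly greater than the previous one" is an assumption.

```python
import re
from typing import Iterable, Optional

# Metric name taken from step 1 of this article.
METRIC = "com_hivemq_persistence_executor_client_events_tasks"


def read_metric(exposition_text: str, name: str = METRIC) -> Optional[float]:
    """Return the first sample value of `name` from Prometheus-style text, or None.

    Assumption: samples look like `name value` or `name{labels} value`.
    """
    pattern = re.compile(r"^" + re.escape(name) + r"(?:\{[^}]*\})?\s+(\S+)")
    for line in exposition_text.splitlines():
        if line.startswith("#"):  # skip HELP/TYPE comment lines
            continue
        match = pattern.match(line)
        if match:
            return float(match.group(1))
    return None


def is_growing(values: Iterable[float]) -> bool:
    """True if there are at least two samples and each one exceeds the previous."""
    samples = list(values)
    return len(samples) >= 2 and all(b > a for a, b in zip(samples, samples[1:]))
```

To use it, collect a few scrapes some time apart while the node is in the joining state, pass each one through `read_metric`, and hand the resulting values to `is_growing`. A `True` result is consistent with the situation described in the Problem section.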