Slow Rolling Restart After Removing Declared Shared Subscriptions in HiveMQ

Issue Summary

When performing a rolling restart of a HiveMQ cluster after removing declared shared subscriptions, users may experience unusually slow restart times. This article outlines the scenario, provides steps to identify and resolve the issue, and offers suggestions to prevent recurrence.

Background

HiveMQ supports declared shared subscriptions: preconfigured shared subscriptions defined in a declared-shared-subscriptions.xml file in the conf directory of each node. They allow messages to be enqueued for a shared subscription as soon as HiveMQ starts, even if no clients are connected to that subscription yet. Changes to declared shared subscriptions take full effect only after a rolling restart of all cluster nodes, which is also required to maintain message loss protection and cluster consistency.
Reference: Declared Shared Subscriptions

Symptoms

  • A rolling restart of the HiveMQ cluster is significantly slower than expected after entries are removed from declared-shared-subscriptions.xml.

  • Individual cluster nodes take longer than usual to complete startup or reach a healthy status.

Root Cause

Removing declared shared subscriptions can trigger additional processing during restart, particularly when the configuration files differ between cluster nodes or when the removal leads to reconciliation or cleanup of the affected message queues. HiveMQ logs a warning when the configurations are inconsistent, and such inconsistencies can affect both cluster startup and message loss protection.
Reference: Declared Shared Subscriptions Concepts

Identification Steps

  1. Review the hivemq.log files on each node for warnings or errors related to declared shared subscriptions or configuration inconsistencies.

  2. Confirm that the declared-shared-subscriptions.xml file is identical on all nodes before performing the rolling restart.

  3. Check the cluster health and status after each node restart to confirm that the nodes are synchronized and the cluster is fully operational.
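
The log review in step 1 can be partly automated. The following is a minimal sketch, not an official HiveMQ tool. It assumes that each log line carries its severity as the bare word WARN or ERROR and that relevant messages contain the phrase "shared subscription". Both patterns are assumptions and should be adjusted to the wording that actually appears in your hivemq.log.

```python
import re
from pathlib import Path

# Assumption: severity appears as the bare word WARN or ERROR on each line,
# and relevant messages mention "shared subscription" (or "shared-subscription").
# Adjust both patterns to match the contents of your own hivemq.log.
SEVERITY = re.compile(r"\b(WARN|ERROR)\b")
TOPIC = re.compile(r"shared[ -]subscription", re.IGNORECASE)


def find_suspect_lines(log_text):
    """Return (line_number, line) pairs that have a WARN/ERROR severity
    and mention shared subscriptions."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if SEVERITY.search(line) and TOPIC.search(line):
            hits.append((number, line.strip()))
    return hits


def scan_log_file(path):
    """Read one hivemq.log file and return its suspect lines."""
    text = Path(path).read_text(encoding="utf-8", errors="replace")
    return find_suspect_lines(text)
```

Run scan_log_file against the hivemq.log of each node and compare the results. A node that reports matches the other nodes do not is a good place to start checking for a configuration mismatch.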

Resolution Steps

  1. Ensure all nodes have the same updated declared-shared-subscriptions.xml file in their conf directory.

  2. Perform a rolling restart of each node, one at a time, allowing each node to become fully operational before proceeding to the next.

  3. Monitor the logs for any errors or warnings. Address any inconsistencies reported by HiveMQ.

  4. After the restart, verify that the cluster is healthy and that all expected shared subscriptions are correctly configured or removed.
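
The one-node-at-a-time procedure in step 2 can be sketched as a small loop. This is an illustration under stated assumptions, not a HiveMQ-provided script: restart_node and is_healthy are hypothetical callables that you supply (for example, wrappers around your service manager and whatever health check you already use), and the timeout and polling values are placeholders.

```python
import time


def rolling_restart(nodes, restart_node, is_healthy,
                    timeout_s=600.0, poll_s=5.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Restart nodes one at a time, waiting for each to become healthy.

    restart_node(node) and is_healthy(node) are hypothetical callables
    supplied by the caller. Returns the list of nodes restarted so far.
    Raises TimeoutError and stops the rollout if a node does not become
    healthy within timeout_s seconds, so later nodes are left untouched.
    """
    done = []
    for node in nodes:
        restart_node(node)
        deadline = clock() + timeout_s
        # Poll until the node reports healthy or the deadline passes.
        while not is_healthy(node):
            if clock() >= deadline:
                raise TimeoutError(
                    f"{node} not healthy after {timeout_s}s; "
                    f"restarted so far: {done}"
                )
            sleep(poll_s)
        done.append(node)
    return done
```

Stopping at the first node that fails to recover keeps the remaining nodes running on their current state, which leaves room to investigate the logs before continuing.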

Future Prevention Suggestions

  • Always synchronize the declared-shared-subscriptions.xml configuration across all nodes before restarting.

  • Use version control or configuration management tools to manage configuration files.

  • After modifying declared shared subscriptions, perform a rolling restart as recommended to ensure all nodes are updated and to prevent message loss.

  • Regularly monitor logs for warnings about configuration inconsistencies.
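
The synchronization check in the first suggestion can be run as a gate before every rolling restart. The sketch below assumes that a copy of each node's declared-shared-subscriptions.xml has already been fetched to the local machine. How the copies are collected, and the node names used as keys, are left to your own tooling.

```python
import hashlib
from pathlib import Path


def fingerprint(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def group_by_fingerprint(copies):
    """Group node names by the digest of their config copy.

    `copies` maps a node name to the local path of that node's
    declared-shared-subscriptions.xml.
    """
    groups = {}
    for node, path in sorted(copies.items()):
        groups.setdefault(fingerprint(path), []).append(node)
    return groups


def is_consistent(copies):
    """True when every node's copy is byte-for-byte identical."""
    return len(group_by_fingerprint(copies)) <= 1
```

Because the comparison is byte-for-byte, files that differ only in whitespace or line endings are reported as different. That is usually acceptable when the file is distributed from a single source in version control. If is_consistent returns False, group_by_fingerprint shows which nodes share a version of the file.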

Additional Assistance

If you encounter persistent issues or need further support, please create a support request through the HiveMQ customer portal.