When a node behaves unexpectedly or consumes excessive memory, it is advisable to isolate the node from the cluster. Isolation lets you analyze the problem further by creating heap dumps or thread dumps. This guide walks you through the steps to isolate a node.

Note

Isolating a node may increase the load (memory, CPU, and network) on the other cluster members. If heap consumption is already excessively high, this could lead to a total cluster outage. Run this script on your production system only after receiving confirmation from a member of the HiveMQ Support team.

📘 Instructions

...

  1. Create a Bash script named "isolate_node.sh" with the following content:

    Code Block
    #!/bin/bash
    
    # Block incoming traffic on port 7800
    iptables -A INPUT -p tcp --dport 7800 -j DROP
    
    # Block outgoing traffic on port 7800
    iptables -A OUTPUT -p tcp --dport 7800 -j DROP
    
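    # The rules above take effect immediately. iptables-save only prints the
    # active rule set (discarded here); it does not write the rules to a file.
    iptables-save > /dev/null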
    
  2. Run the script with sudo permissions on the affected node:

    Code Block
    sudo chmod +x isolate_node.sh
    sudo ./isolate_node.sh
  3. After executing the script, check the "hivemq.log" file of the node where the command was run. You should observe that the cluster size has decreased to 1. Additionally, the "hivemq.log" of other nodes should indicate that the cluster size has been reduced by one.

  4. You can also verify the changes in the Control Center or in your monitoring dashboard.

  5. With the node isolated, you can proceed to create heap dumps or thread dumps for analysis.

  6. Once the required data is collected from the node, remove the iptables rules to allow the node to rejoin the cluster. Note that the following command flushes all rules in the filter table, including any rules that were in place before the isolation script ran:

    Code Block
    sudo iptables -F
  7. After executing the above command, the isolated node will rejoin the cluster. If you prefer to shut down this node and create a new one instead, you can skip step 6.
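
Flushing with `iptables -F` also removes any unrelated rules on the host. As a sketch of a narrower alternative, the functions below add and delete only the two port 7800 rules from step 1. `IPTABLES`, `CLUSTER_PORT`, and the function names are illustrative and not HiveMQ settings; the commands still need root privileges.

```shell
#!/bin/bash
# Sketch only: manage just the two isolation rules, leaving other rules untouched.
# CLUSTER_PORT and IPTABLES are illustrative variables (assumptions, not HiveMQ settings).
CLUSTER_PORT="${CLUSTER_PORT:-7800}"
IPTABLES="${IPTABLES:-iptables}"

# Add the DROP rule to INPUT and OUTPUT unless an identical rule already exists.
# "iptables -C" checks whether a matching rule is present in the chain.
isolate_node() {
    for chain in INPUT OUTPUT; do
        $IPTABLES -C "$chain" -p tcp --dport "$CLUSTER_PORT" -j DROP 2>/dev/null ||
            $IPTABLES -A "$chain" -p tcp --dport "$CLUSTER_PORT" -j DROP
    done
}

# Delete the DROP rule from INPUT and OUTPUT if it is present.
# "iptables -D" removes one rule that matches the given specification.
rejoin_node() {
    for chain in INPUT OUTPUT; do
        if $IPTABLES -C "$chain" -p tcp --dport "$CLUSTER_PORT" -j DROP 2>/dev/null; then
            $IPTABLES -D "$chain" -p tcp --dport "$CLUSTER_PORT" -j DROP
        fi
    done
}
```

For example, call `isolate_node` as root before collecting data and `rejoin_node` afterwards. If the script from step 1 was run more than once, duplicate rules exist and `rejoin_node` has to be called once per duplicate.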

...
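
With the node isolated, the dumps mentioned in step 5 can be collected with the JDK's `jcmd` tool. The following is a minimal sketch, assuming that `jcmd` is on the PATH, that it runs as the same user as the HiveMQ process, and that you supply the JVM's process ID yourself. `JCMD`, `DUMP_DIR`, and the function names are illustrative.

```shell
#!/bin/bash
# Sketch only: collect a thread dump and a heap dump from a running JVM.
# JCMD and DUMP_DIR are illustrative variables (assumptions, not HiveMQ settings).
JCMD="${JCMD:-jcmd}"
DUMP_DIR="${DUMP_DIR:-/tmp}"

# Print a timestamped file path: dump_path <prefix> <extension>
dump_path() {
    echo "${DUMP_DIR}/$1-$(date +%Y%m%d-%H%M%S).$2"
}

# Write a thread dump of the given process ID to a text file and print the file path.
collect_thread_dump() {
    pid="$1"
    out="$(dump_path threads txt)"
    $JCMD "$pid" Thread.print > "$out" && echo "$out"
}

# Ask the JVM to write a heap dump and print the file path.
# The JVM itself writes the file, so the path should be absolute and
# writable by the user that runs the JVM.
collect_heap_dump() {
    pid="$1"
    out="$(dump_path heap hprof)"
    $JCMD "$pid" GC.heap_dump "$out" > /dev/null && echo "$out"
}
```

For example, run `collect_thread_dump 12345` followed by `collect_heap_dump 12345`, where 12345 stands in for the actual process ID. A heap dump can be roughly as large as the used heap, so check the free disk space in `DUMP_DIR` first.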
