How to Configure Pod Affinity in Kubernetes for High Availability

This article explains how to set up Kubernetes affinity rules to ensure your HiveMQ platform runs with optimal high availability. By configuring proper node affinity and pod anti-affinity rules, you can ensure your HiveMQ instances are distributed across nodes and availability zones for maximum resilience.

What This Article Covers

  • Understanding Kubernetes affinity concepts

  • Setting up node affinity to select appropriate nodes

  • Configuring pod anti-affinity for resilience

Instructions

1. Understanding Affinity Rules

Kubernetes offers three types of affinity rules:

  • Node Affinity: Controls which nodes your pods can run on

  • Pod Affinity: Attracts pods to nodes running certain other pods

  • Pod Anti-Affinity: Prevents pods from running on nodes with certain other pods

For HiveMQ high availability, we use node affinity to select appropriate nodes and pod anti-affinity to distribute pods across nodes and availability zones.
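All of the rules discussed below are nested under the affinity field of a pod template. As a quick orientation, here is a minimal, illustrative pod spec (the name, image, and command are placeholders and not part of the HiveMQ configuration) showing where each rule type lives; in the HiveMQ Platform Helm chart these rules are set through the podScheduling section shown in step 2:

# Minimal, illustrative pod spec: all three affinity rule types sit under spec.affinity.
apiVersion: v1
kind: Pod
metadata:
  name: affinity-demo            # placeholder name
spec:
  affinity:
    nodeAffinity: {}             # "only run on certain nodes"
    podAffinity: {}              # "run close to certain pods"
    podAntiAffinity: {}          # "run away from certain pods"
  containers:
    - name: demo
      image: busybox:1.36        # placeholder image
      command: ["sleep", "3600"]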

Node Affinity In-Depth

Node affinity rules determine which nodes your pods can be scheduled on. In our example:

nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
      - matchExpressions:
          - key: designation
            operator: In
            values:
              - "true"

This means:

  • Pods must be scheduled on nodes labeled with designation=true

  • requiredDuringSchedulingIgnoredDuringExecution - This is a hard requirement for pod scheduling, but if node labels change after a pod is running, the pod won't be evicted

  • If no suitable nodes exist, pods will remain in a Pending state (a softer, preference-based variant of this rule is sketched below)
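If a hard node requirement is too strict for your environment, node affinity also has a preferred form. The following is a minimal sketch (an alternative, not part of the HiveMQ example above) that asks the scheduler to favour designated nodes while still allowing pods to land elsewhere when no labeled node is available:

# Alternative sketch: soft node preference instead of a hard requirement.
# Pods prefer nodes labeled designation=true but can still be scheduled elsewhere.
nodeAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
    - weight: 100
      preference:
        matchExpressions:
          - key: designation
            operator: In
            values:
              - "true"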

Pod Anti-Affinity In-Depth

Pod anti-affinity prevents pods from being co-located based on labels. Our configuration includes two types:

  1. Required Anti-Affinity:

podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
          - key: app.kubernetes.io/instance
            operator: In
            values:
              - hivemq-platform
      topologyKey: kubernetes.io/hostname

This ensures:

  • No two pods with label app.kubernetes.io/instance=hivemq-platform can run on the same node

  • kubernetes.io/hostname defines the scope (different physical/virtual machines)

  • This is a hard requirement that cannot be violated (an equivalent matchLabels form of the selector is sketched below)
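For reference, the same required rule can also be written with the matchLabels shorthand; a matchLabels entry is equivalent to a matchExpressions term that uses the In operator with a single value. This is only an illustration of the selector syntax, not an additional rule:

# Equivalent form of the required anti-affinity rule, using matchLabels.
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchLabels:
          app.kubernetes.io/instance: hivemq-platform
      topologyKey: kubernetes.io/hostname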

  2. Preferred Anti-Affinity:

preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    podAffinityTerm:
      labelSelector:
        matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
              - hivemq-platform
      topologyKey: topology.kubernetes.io/zone

This means:

  • Kubernetes will try to schedule pods with app.kubernetes.io/name=hivemq-platform across different availability zones

  • weight: 100 gives this preference the highest possible priority (valid weights range from 1 to 100)

  • topology.kubernetes.io/zone defines the domain as availability zones

  • This is a soft requirement that the scheduler can violate if necessary (the sketch below shows how zones appear as node labels)
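The topology.kubernetes.io/zone key refers to a well-known label that cloud providers and most managed Kubernetes services set on every node. The sketch below shows how such labels typically appear on a node object; the node name and zone value are hypothetical:

# Hypothetical node object showing the labels the topology keys refer to.
apiVersion: v1
kind: Node
metadata:
  name: worker-1                                 # hypothetical node name
  labels:
    kubernetes.io/hostname: worker-1             # used by the required rule
    topology.kubernetes.io/zone: eu-central-1a   # used by the preferred rule (hypothetical zone)
    designation: "true"                          # added manually in step 3 of this article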

2. Configure Affinity Rules in HiveMQ Deployment

Add the following affinity configuration to your hivemq-platform’s values.yaml:

podScheduling:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: designation
                operator: In
                values:
                  - "true"
    podAntiAffinity:
      # all pods of a given hive must be on different nodes
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/instance
                operator: In
                values:
                  - hivemq-platform
          topologyKey: kubernetes.io/hostname
      # hive pods should be distributed across AZs
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                    - hivemq-platform
            topologyKey: topology.kubernetes.io/zone

Important Note: The requiredDuringSchedulingIgnoredDuringExecution setting is a hard requirement. If no nodes meet the criteria, your pods will remain in a pending state until suitable nodes become available.

3. Label Your Designated Nodes

Label your nodes that should run HiveMQ pods:

# Get a list of your cluster nodes
kubectl get nodes

# Label your selected nodes
kubectl label node <node-name> designation=true

Repeat this for each node you want to include in your HiveMQ node pool.

4. Verify Your Configuration

Check that your nodes are properly labeled:

# Verify node labels
kubectl get nodes --show-labels | grep designation

# Check if pods are scheduled as expected
kubectl get pods -l app.kubernetes.io/name=hivemq-platform -o wide

Ensure your HiveMQ pods are distributed across different nodes and, preferably, across different availability zones.

5. Monitor and Adjust

Monitor your deployment to ensure proper distribution:

# Get pod distribution by node
kubectl get pods -l app.kubernetes.io/name=hivemq-platform -o wide | awk '{print $7}' | sort | uniq -c

# Get pod distribution by zone: check which zone each node belongs to,
# then match against the node column of the pod listing above
kubectl get nodes -L topology.kubernetes.io/zone

If you notice uneven distribution, consider adding more labeled nodes or adjusting your affinity rules.

Best Practice: For production environments, ensure you have at least one designated node per availability zone to achieve maximum resilience against zone failures.

What These Affinity Rules Accomplish

  1. Node Affinity: Ensures HiveMQ pods only run on nodes explicitly designated for this purpose

  2. Required Pod Anti-Affinity: Guarantees no two HiveMQ instances run on the same node

  3. Preferred Pod Anti-Affinity: Attempts to distribute HiveMQ instances across different availability zones

This configuration maximizes resilience by:

  • Preventing single-node failures from affecting multiple instances

  • Minimizing the impact of availability zone outages

  • Ensuring HiveMQ runs only on nodes with appropriate resources and configurations

Comparison of Affinity Types

Affinity Type | Purpose | Example Use Case | Our Configuration
------------- | ------- | ---------------- | ------------------
Node Affinity | Restrict pods to specific nodes | Schedule on nodes with specialized hardware or in specific regions | designation=true required
Pod Affinity | Co-locate related pods | Place frontend and backend of the same app together | Not used in our setup
Pod Anti-Affinity | Keep pods apart | Distribute database replicas for HA | Required: different nodes; Preferred: different AZs

Advanced Tip: The "required" vs "preferred" distinction is important. Use "required" rules carefully, as they can prevent pods from scheduling if conditions cannot be met. "Preferred" rules offer a balance between optimal placement and ensuring pods can still run.

Troubleshooting Affinity Issues

Common Issues and Solutions

Issue | Possible Cause | Solution
----- | -------------- | --------
Pods stuck in "Pending" state | No nodes match the required affinity rules | Check node labels with kubectl get nodes --show-labels and add the designation=true label to appropriate nodes
All pods running in the same AZ despite preferred anti-affinity | Insufficient designated nodes in other AZs | Add more designated nodes in different availability zones
Pods not spreading evenly across nodes | Uneven resource utilization | Check node resource usage with kubectl describe nodes and consider adding resource requests/limits to your pods (see the sketch below)
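If uneven spread is caused by resource pressure rather than affinity rules, explicit requests and limits help the scheduler place pods predictably. The values below are placeholders, and the exact location of the resources settings depends on your HiveMQ Platform chart version, so treat this as a hedged sketch rather than a drop-in configuration:

# Placeholder container resource settings; adjust to your workload and check
# your chart's values.yaml for where resources are configured.
resources:
  requests:
    cpu: "1"
    memory: 2Gi
  limits:
    cpu: "2"
    memory: 2Gi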

Debugging Commands

# Check if any pods are pending due to affinity rules
kubectl get pods -l app.kubernetes.io/name=hivemq-platform | grep Pending

# See detailed scheduling information for a pending pod
kubectl describe pod <pod-name> | grep -A10 "Events:"

# Check if nodes have the required labels
kubectl get nodes -L designation
