How to Configure Pod Affinity in Kubernetes for High Availability
This article explains how to set up Kubernetes affinity rules so that your HiveMQ Platform deployment achieves high availability. By configuring appropriate node affinity and pod anti-affinity rules, you can distribute your HiveMQ instances across nodes and availability zones for maximum resilience.
What This Article Covers
Understanding Kubernetes affinity concepts
Setting up node affinity to select appropriate nodes
Configuring pod anti-affinity for resilience
Instructions
1. Understanding Affinity Rules
Kubernetes offers three types of affinity rules:
Node Affinity: Controls which nodes your pods can run on
Pod Affinity: Attracts pods to nodes running certain other pods
Pod Anti-Affinity: Prevents pods from running on nodes with certain other pods
For HiveMQ high availability, we use node affinity to select appropriate nodes and pod anti-affinity to distribute pods across nodes and availability zones.
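For orientation, all three rule types are declared under spec.affinity of a pod (or pod template). The following minimal skeleton only shows where each block lives; the pod name and image are placeholders, and the HiveMQ-specific rules are filled in throughout the rest of this article:
# Minimal pod spec showing where each affinity type is declared
apiVersion: v1
kind: Pod
metadata:
  name: affinity-example   # placeholder name
spec:
  affinity:
    nodeAffinity: {}       # which nodes the pod may run on
    podAffinity: {}        # attract to nodes running certain pods
    podAntiAffinity: {}    # avoid nodes running certain pods
  containers:
    - name: app
      image: nginx         # placeholder image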
Node Affinity In-Depth
Node affinity rules determine which nodes your pods can be scheduled on. In our example:
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
      - matchExpressions:
          - key: designation
            operator: In
            values:
              - "true"
This means:
Pods must be scheduled on nodes labeled with designation=true.
requiredDuringSchedulingIgnoredDuringExecution is a hard requirement for pod scheduling; however, if node labels change after a pod is already running, the pod will not be evicted.
If no suitable nodes exist, pods will remain in a Pending state.
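Before deploying, you can check which nodes currently satisfy this rule (this assumes the nodes have already been labeled as described in step 3 below):
# Show only the nodes carrying the designation=true label
kubectl get nodes -l designation=true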
Pod Anti-Affinity In-Depth
Pod anti-affinity prevents pods from being co-located based on labels. Our configuration includes two types:
Required Anti-Affinity:
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
          - key: app.kubernetes.io/instance
            operator: In
            values:
              - hivemq-platform
      topologyKey: kubernetes.io/hostname
This ensures:
No two pods with the label app.kubernetes.io/instance=hivemq-platform can run on the same node.
kubernetes.io/hostname defines the scope, so the pods are placed on different physical or virtual machines.
This is a hard requirement that cannot be violated.
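To see which pods this selector matches, list the HiveMQ pods together with their labels; the app.kubernetes.io/instance label is typically set to the Helm release name, which is hivemq-platform in this example:
# Show the HiveMQ pods and the labels the anti-affinity rule matches on
kubectl get pods -l app.kubernetes.io/instance=hivemq-platform --show-labels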
Preferred Anti-Affinity:
preferredDuringSchedulingIgnoredDuringExecution:
  - weight: 100
    podAffinityTerm:
      labelSelector:
        matchExpressions:
          - key: app.kubernetes.io/name
            operator: In
            values:
              - hivemq-platform
      topologyKey: topology.kubernetes.io/zone
This means:
Kubernetes will try to schedule pods with app.kubernetes.io/name=hivemq-platform across different availability zones.
weight: 100 gives this preference the highest priority (on a scale of 1-100).
topology.kubernetes.io/zone defines the topology domain as availability zones.
This is a soft requirement that can be violated if necessary.
2. Configure Affinity Rules in HiveMQ Deployment
Add the following affinity configuration to your hivemq-platform’s values.yaml:
podScheduling:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
          - matchExpressions:
              - key: designation
                operator: In
                values:
                  - "true"
    podAntiAffinity:
      # all pods of a given hive must run on different nodes
      requiredDuringSchedulingIgnoredDuringExecution:
        - labelSelector:
            matchExpressions:
              - key: app.kubernetes.io/instance
                operator: In
                values:
                  - hivemq-platform
          topologyKey: kubernetes.io/hostname
      # hive pods should be distributed across AZs
      preferredDuringSchedulingIgnoredDuringExecution:
        - weight: 100
          podAffinityTerm:
            labelSelector:
              matchExpressions:
                - key: app.kubernetes.io/name
                  operator: In
                  values:
                    - hivemq-platform
            topologyKey: topology.kubernetes.io/zone
Important Note: The requiredDuringSchedulingIgnoredDuringExecution setting is a hard requirement. If no nodes meet the criteria, your pods will remain in a Pending state until suitable nodes become available.
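Once the values are in place, roll them out with a Helm upgrade. The command below is a sketch that assumes your release is named hivemq-platform and your chart repository alias is hivemq; adjust both to match your environment:
# Apply the updated values.yaml to the existing HiveMQ Platform release (release and repo names are examples)
helm upgrade hivemq-platform hivemq/hivemq-platform -f values.yaml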
3. Label Your Designated Nodes
Label your nodes that should run HiveMQ pods:
# Get a list of your cluster nodes
kubectl get nodes
# Label your selected nodes
kubectl label node <node-name> designation=true
Repeat this for each node you want to include in your HiveMQ node pool.
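If you later need to take a node out of the HiveMQ pool, you can remove the label again. Note that, because of the IgnoredDuringExecution behavior described above, pods already running on that node are not evicted:
# Remove the designation label from a node (the trailing dash deletes the label)
kubectl label node <node-name> designation-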
4. Verify Your Configuration
Check that your nodes are properly labeled:
# Verify node labels
kubectl get nodes --show-labels | grep designation
# Check if pods are scheduled as expected
kubectl get pods -l app.kubernetes.io/name=hivemq-platform -o wide
Ensure your HiveMQ pods are distributed across different nodes and, preferably, across different availability zones.
5. Monitor and Adjust
Monitor your deployment to ensure proper distribution:
# Get pod distribution by node
kubectl get pods -l app.kubernetes.io/name=hivemq-platform -o wide | awk '{print $7}' | sort | uniq -c
# Get pod distribution by zone (the zone is a node label, so look up the zone of each pod's node)
kubectl get pods -l app.kubernetes.io/name=hivemq-platform -o wide --no-headers | awk '{print $7}' | xargs -I{} kubectl get node {} --no-headers -L topology.kubernetes.io/zone
If you notice uneven distribution, consider adding more labeled nodes or adjusting your affinity rules.
Best Practice: For production environments, ensure you have at least one designated node per availability zone to achieve maximum resilience against zone failures.
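One way to verify this, assuming your nodes are labeled as in step 3, is to list the designated nodes together with their zone and confirm that every availability zone appears at least once:
# List designated nodes with their availability zone
kubectl get nodes -l designation=true -L topology.kubernetes.io/zone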
What These Affinity Rules Accomplish
Node Affinity: Ensures HiveMQ pods only run on nodes explicitly designated for this purpose
Required Pod Anti-Affinity: Guarantees no two HiveMQ instances run on the same node
Preferred Pod Anti-Affinity: Attempts to distribute HiveMQ instances across different availability zones
This configuration maximizes resilience by:
Preventing single-node failures from affecting multiple instances
Minimizing the impact of availability zone outages
Ensuring HiveMQ runs only on nodes with appropriate resources and configurations
Comparison of Affinity Types
Affinity Type | Purpose | Example Use Case | Our Configuration |
---|---|---|---|
Node Affinity | Restrict pods to specific nodes | Schedule on nodes with specialized hardware or in specific regions | Required: nodes labeled designation=true |
Pod Affinity | Co-locate related pods | Place the frontend and backend of the same app together | Not used in our setup |
Pod Anti-Affinity | Keep pods apart | Distribute database replicas for HA | Required: different nodes<br>Preferred: different AZs |
Advanced Tip: The "required" vs "preferred" distinction is important. Use "required" rules carefully, as they can prevent pods from scheduling if conditions cannot be met. "Preferred" rules offer a balance between optimal placement and ensuring pods can still run.
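For example, if your cluster may temporarily have fewer eligible nodes than HiveMQ replicas (such as during node maintenance), a softer variant of the hostname rule keeps pods schedulable. The sketch below moves the hostname rule from required to preferred; it is an alternative to, not part of, the configuration shown above:
podAntiAffinity:
  preferredDuringSchedulingIgnoredDuringExecution:
    # Prefer, but do not require, one HiveMQ pod per node
    - weight: 50
      podAffinityTerm:
        labelSelector:
          matchExpressions:
            - key: app.kubernetes.io/instance
              operator: In
              values:
                - hivemq-platform
        topologyKey: kubernetes.io/hostname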
Troubleshooting Affinity Issues
Common Issues and Solutions
Issue | Possible Cause | Solution |
---|---|---|
Pods stuck in "Pending" state | No nodes match the required affinity rules | Check node labels with kubectl get nodes -L designation |
All pods running in the same AZ despite preferred anti-affinity | Insufficient designated nodes in other AZs | Add more designated nodes in different availability zones |
Pods not spreading evenly across nodes | Uneven resource utilization | Check node resource usage with kubectl top nodes |
Debugging Commands
# Check if any pods are pending due to affinity rules
kubectl get pods -l app.kubernetes.io/name=hivemq-platform | grep Pending
# See detailed scheduling information for a pending pod
kubectl describe pod <pod-name> | grep -A10 "Events:"
# Check if nodes have the required labels
kubectl get nodes -L designation
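If a pod stays Pending, the scheduler also records the reason as an event; you can query those events directly as an additional check:
# List recent scheduling failures, newest last
kubectl get events --field-selector reason=FailedScheduling --sort-by=.lastTimestamp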