Alexandre Vazquez in Kubernetes

Optimizing Kubernetes Scheduling with Node Affinity Rules: Trade-offs and Best Practices

Understanding Node Affinity and its Benefits

Node affinity rules are an essential feature in Kubernetes that allow you to control the scheduling of pods based on node properties. By using node affinity rules, you can specify constraints on which nodes pods can be scheduled, enabling you to optimize resource allocation and enhance performance.

Node affinity works by allowing you to define rules for pod scheduling based on node labels. When defining node affinity rules, you have two options: required and preferred rules. Required rules ensure that pods are scheduled only on nodes that satisfy the defined criteria. If no suitable node is available, the pod remains unscheduled. On the other hand, preferred rules provide a soft constraint and attempt to schedule pods on nodes that match the specified criteria. However, if no such node is available, the pod can still be scheduled on other nodes.

Node affinity rules are an “expanded” option of the simply way by using node selectors. Node selectors are a simple form of node affinity that allows you to assign labels to nodes and match those labels with selectors defined in the pod specification. By specifying a node selector, you can ensure that pods are scheduled only on nodes with matching labels. Node selectors are useful for basic affinity requirements but lack the flexibility and fine-grained control provided by more advanced affinity options.

Challenges and Trade-offs: Worst Case Scenario with Node Affinity Rules

But this awesome capability has some trade-offs that you need to take in consideration because nothing comes with a price that you need to be aware of, so, let’s go to the important question, what is the worst case scenario of using any of those options?

Let’s imagine that you are deploying a workloads consisting on three replicas that are sharing the load and providing resiliency and fault-tolerance, there are three replicas because they use a consensus protocol that requiren an odd number of replicas. So you decide to define a set of nodes for this workload and use node affinity rules to ensure the pods are scheduled to those nodes. And, you need to think: should I use the preferred mode or the requiredMode?

Let’s say that you go with the required option and you define it like this, what happen if one of your nodes goes down? The pod will be try to be rescheduled again and unless there are another node “with same label” to that, it cannot be deployed? If you additional defined a pod anti-affinity rule to ensure each of the replicas is in a different host to ensure that in case that one node is going down you lose only a single replica, you’re losing the option to rescheudle the workload even if you have another nodes without the label available. So, you’re not in a so reliable option.

Ok, so you go with the preferred to ensure that you workload is for sure scheduled even if it is in another node, and in that case you can end up on the situation that those nodes are scheduled on other nodes keeping those nodes with the proper label without the workload that they should have, making the situation strange and more difficult to administer because you cannot ensure your workloads is on the nodes that you expected to be.

Additional to that, if the nodes has even taints to ensure other workloads cannot be placed there, you can end up in a situation that the “labeled-pods” are scheduled on non-labeled nodes, and the non-labeled pods cannot use the nodes because they’re tainted and can be not be able to use the un-labeled ones if there are not enough resources. So you’re generating an impact on the other workloasd and potentially affecting the schedulling of the other workloads.

Preparing for Unexpected Outages with Node Affinity

So, as you can see, each decision has some disadvatanges that you need to take in consdieration before defining those rules, because if you don’t, you will figure it out when this happen on an production enviornment probably as a result of some unexpected outage, because we all know that in the meantime that nothing bad happens everything works as expected, but the potential of these solutions and its reason to be used is exactly to provide the tools and the options to be prepared when bad things happens.

So, next time that you need to define a node affinity rule try to think about the disadvantages of each of the option and try to select that one that works best for you and mitigate the problems that it can bring to the table of your production environment.

Node affinity

Alexandre Vazquez:

Understanding Node Affinity and its Benefits

Challenges and Trade-offs: Worst Case Scenario with Node Affinity Rules

Preparing for Unexpected Outages with Node Affinity

Related articles: