Por qué no deberías añadir una cuota de recursos usando el valor límite?

por Alexandre Vazquez
2025-11-03

woman in blue denim jacket holding white paper

Si No Tienes Cuidado, Puedes Bloquear La Escalabilidad De Tus Cargas De Trabajo

Una de las grandes cosas sobre los desarrollos basados en contenedores es definir espacios de aislamiento donde tienes recursos garantizados como CPU y memoria. Esto también se extiende en entornos basados en Kubernetes a nivel de namespace, por lo que puedes tener diferentes entornos virtuales que no pueden exceder el uso de recursos a un nivel especificado.

Para definir eso, tienes el concepto de Cuota de Recursos que funciona a nivel de namespace. Basado en su propia definición (https://kubernetes.io/docs/concepts/policy/resource-quotas/)

Una cuota de recursos, definida por un objeto ResourceQuota, proporciona restricciones que limitan el consumo agregado de recursos por namespace. Puede limitar la cantidad de objetos que se pueden crear en un namespace por tipo, así como la cantidad total de recursos de cómputo que pueden ser consumidos por recursos en ese namespace.

Tienes varias opciones para definir el uso de estas Cuotas de Recursos, pero nos centraremos en este artículo en las principales de la siguiente manera:

limits.cpu: En todos los pods en un estado no terminal, la suma de los límites de CPU no puede exceder este valor.
limits.memory: En todos los pods en un estado no terminal, la suma de los límites de memoria no puede exceder este valor.
requests.cpu: En todos los pods en un estado no terminal, la suma de las solicitudes de CPU no puede exceder este valor.
requests.memory: En todos los pods en un estado no terminal, la suma de las solicitudes de memoria no puede exceder este valor.

Así que puedes pensar que esta es una gran opción para definir una cuota de limit.cpu y limit.memory, para asegurarte de que no excederás esa cantidad de uso por ningún medio. Pero necesitas tener cuidado con lo que esto significa, y para ilustrarlo usaré un ejemplo.

Tengo una sola carga de trabajo con un solo pod con la siguiente limitación de recursos:

requests.cpu: 500m
- limits.cpu: 1
- requests.memory: 500Mi
- limits.memory: 1 GB

Tu aplicación es una aplicación basada en Java que expone un Servicio REST y ha configurado una regla de Escalador Horizontal de Pods para escalar cuando la cantidad de CPU excede su 50%.

Entonces, comenzamos en la situación primaria: con una sola instancia que requiere ejecutar 150 m de vCPU y 200 RAM, un poco menos del 50% para evitar el escalador. Pero tenemos una Cuota de Recursos sobre los límites del pod (1 vCPU y 1 GB) por lo que hemos bloqueado eso. Tenemos más solicitudes y necesitamos escalar a dos instancias. Para simplificar los cálculos, asumiremos que usaremos la misma cantidad de recursos para cada una de las instancias, y continuaremos de esa manera hasta llegar a 8 instancias. Así que veamos cómo cambian los límites definidos (el que limitará la cantidad de objetos que puedo crear en mi namespace) y la cantidad real de recursos que estoy usando:

Entonces, para una cantidad de recursos usados de 1.6 vCPU he bloqueado 8 vCPU, y en caso de que ese fuera mi Límite de Recursos, no puedo crear más instancias, aunque tengo 6.4 vCPU no usados que he permitido desplegar debido a este tipo de limitación no puedo hacerlo.

Sí, puedo asegurar el principio de que nunca usaré más de 8 vCPU, pero me he bloqueado muy temprano en esa tendencia afectando el comportamiento y la escalabilidad de mis cargas de trabajo.

Debido a eso, necesitas ser muy cuidadoso cuando estás definiendo este tipo de límites y asegurarte de lo que estás tratando de lograr porque al resolver un problema puedes estar generando otro.

Espero que esto pueda ayudarte a prevenir que este problema ocurra en tu trabajo diario o al menos tenerlo en cuenta cuando te enfrentes a escenarios similares.

Etiquetas:Resource Quota

Why You Shouldn’t Add a Resource Quota Using The Limit Value?

por Alexandre Vazquez
2022-03-282022-05-06

If You Aren’t Careful You Can Block The Scalability Of Your Workloads

One of the great things about container-based developments idefiningne isolation spaces where you have guaranteed resources such as CPU and memory. This is also extended on Kubernetes-based environments at the namespace level, so you can have different virtual environments that cannot exceed the usage of resources at a specified level.

To define that you have the concept of Resource Quota that works at the namespace level. Based on its own definition (https://kubernetes.io/docs/concepts/policy/resource-quotas/)

A resource quota, defined by a ResourceQuota object, provides constraints that limit aggregate resource consumption per namespace. It can limit the quantity of objects that can be created in a namespace by type, as well as the total amount of compute resources that may be consumed by resources in that namespace.

You have several options to define the usage of these Resource Quotas but we will focus on this article on the main ones as follows:

limits.cpu: Across all pods in a non-terminal state, the sum of CPU limits cannot exceed this value.
limits.memory: Across all pods in a non-terminal state, the sum of memory limits cannot exceed this value.
requests.cpu: Across all pods in a non-terminal state, the sum of CPU requests cannot exceed this value.
requests.memory: Across all pods in a non-terminal state, the sum of memory requests cannot exceed this value.

So you can think that this is a great option to define a limit.cpu and limit.memory quota, so you make sure that you will not extend that amount of usage by any means. But you need to be careful about what this means, and to illustrate that I will use a sample.

I have a single workload with a single pod with the following resource limitation:

requests.cpu: 500m
- limits.cpu: 1
- requests.memory: 500Mi
- limits.memory: 1 GB

Your application is a Java-based application that exposes a REST Service ant has configured a Horizontal Pod Autoscaler rule to scale when the amount of CPU exceeds its 50%.

So, we start in the primary situation: with a single instance that requires to run 150 m of vCPU and 200 RAM, so a little bit less than 50% to avoid the autoscaler. But we have a Resource Quota about the limits of the pod (1 vCPU and 1 GB) so we have blocked that. We have more requests and we need to scale to two instances. To simplify the calculations, we will assume that we will use the same amount of resources for each of the instances, and we will continue that way until we reach 8 instances. So lets’ see how it changes the limits defined (the one that will limit the number of objects I can create in my namespace) and the actual amount of resources that I am using:

So, for resources used amount of 1.6 vCPU I have blocked 8 vCPU, and in case that was my Resource Limit, I cannot create more instances, even though I have 6.4 vCPU not used that I have allowed deploying because of this kind of limitation I cannot do it.

Yes, I am able to ensure the principle that I never will use more than 8 vCPU, but I’ve been blocked very early on that trend affecting the behavior and scalability of my workloads.

Because of that, you need to be very careful when you are defining these kinds of limits and making sure what you are trying to achieve because to solve one problem you can be generating another one.

I hope this can help you to prevent this issue for happening in your daily work or at least to keep it in mind when you are facing similar scenarios.

Etiquetas:Kuberentes Resource Quota

Por qué no deberías añadir una cuota de recursos usando el valor límite?

Si No Tienes Cuidado, Puedes Bloquear La Escalabilidad De Tus Cargas De Trabajo

Related articles:

Why You Shouldn’t Add a Resource Quota Using The Limit Value?

If You Aren’t Careful You Can Block The Scalability Of Your Workloads

Related articles:

Deja una respuesta