Event-Driven Autoscaling in Kubernetes with KEDA (Beyond CPU and Memory Metrics)

KEDA provides a flexible way to scale your applications beyond the traditional HPA approach based on CPU and memory metrics

Autoscaling is one of the great benefits of cloud-native environments and helps us optimize the use of our resources. Kubernetes provides several options to do that, one of them being the Horizontal Pod Autoscaler (HPA) approach.

HPA is the mechanism Kubernetes uses to detect whether any of our workloads need to be scaled, and it is based on metrics such as CPU or memory usage.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
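
For reference, a classic CPU-based HPA definition looks similar to the following minimal sketch (the Deployment name my-app and the 80% target are placeholders, not values from this article):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app                   # placeholder: the workload to scale
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out when average CPU usage goes above 80%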

Sometimes those metrics are not enough to decide whether the number of replicas we have available is sufficient. Other metrics, such as the number of requests or the number of pending events, can provide a better perspective.

Kubernetes Event-Driven Autoscaling (KEDA)

Here is where KEDA comes to help. KEDA stands for Kubernetes Event-Driven Autoscaling, and it provides a more flexible approach to scaling our pods inside a Kubernetes cluster.

It is based on scalers that implement different sources to measure the number of requests or events we receive from messaging systems such as Apache Kafka, AWS Kinesis, or Azure Event Hubs, and from other systems such as InfluxDB or Prometheus.

KEDA works as it is shown in the picture below:

[Figure: how KEDA works]

We have a ScaledObject that links our external event source (e.g., Apache Kafka or Prometheus) with the Kubernetes Deployment we would like to scale, and we register it in the Kubernetes cluster.

KEDA will monitor the external source and, based on the metrics gathered, will instruct the Horizontal Pod Autoscaler to scale the workload as defined.
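
As an illustration, a minimal ScaledObject linking a Kafka topic to a Deployment could look like the sketch below (the Deployment name, broker address, consumer group, and topic are placeholders):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-consumer-scaler
spec:
  scaleTargetRef:
    name: my-consumer              # placeholder: Deployment to scale
  minReplicaCount: 1
  maxReplicaCount: 10
  triggers:
    - type: kafka
      metadata:
        bootstrapServers: kafka:9092   # placeholder broker address
        consumerGroup: my-group        # placeholder consumer group
        topic: my-topic                # placeholder topic
        lagThreshold: "50"             # scale out when the consumer lag exceeds 50 messages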

Testing the Approach with a Use Case

So, now that we know how it works, we will run some tests to see it live. We are going to show how quickly we can scale one of our applications using this technology, and the first thing we need to do is define our scenario.

In our case, the scenario will be a simple cloud-native application developed with Flogo that exposes a REST service.

The first step is to deploy KEDA in our Kubernetes cluster, and there are several options to do that: Helm charts, the Operator, or plain YAML files. In this case, we are going to use the Helm charts approach.

So, we are going to type the following commands to add the helm repository and update the charts available, and then deploy KEDA as part of our cluster configuration:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda

After running these commands, KEDA is deployed in our K8S cluster, and typing the command kubectl get all will show a situation similar to this one:

NAME                                                READY   STATUS    RESTARTS   AGE
pod/keda-operator-66db4bc7bb-nttpz                    2/2     Running   1          10m
pod/keda-operator-metrics-apiserver-5945c57f94-dhxth  2/2     Running   1          10m

Now, we are going to deploy our application. As mentioned, we are going to use our Flogo application, and the flow will be as simple as this one:

[Figure: Flogo application listening to the requests]
  • The application exposes a REST service using /hello as the resource.
  • Received requests are printed to the standard output, and a message is returned to the requester, as sketched below.
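
As a reference, the kind of Deployment and Service behind this scenario could look like the following sketch (the container image, port, and labels are hypothetical, not taken from the article):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: flogo-hello
spec:
  replicas: 1
  selector:
    matchLabels:
      app: flogo-hello
  template:
    metadata:
      labels:
        app: flogo-hello
    spec:
      containers:
        - name: flogo-hello
          image: myregistry/flogo-hello:latest   # hypothetical image name
          ports:
            - containerPort: 9999                # hypothetical port exposing the /hello resource
---
apiVersion: v1
kind: Service
metadata:
  name: flogo-hello
spec:
  selector:
    app: flogo-hello
  ports:
    - port: 80
      targetPort: 9999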

Once we have our application deployed on our Kubernetes cluster, we need to create a ScaledObject that is responsible for managing the scalability of that component:

[Figure: ScaledObject configuration for the application]

We use Prometheus as a trigger, and because of that, we need to configure where our Prometheus server is hosted and which query we would like to run to manage the scalability of our component.

In our sample, we will use flogo_flow_execution_count, the metric that counts the number of requests received by this component; when its rate goes higher than 100, KEDA will launch a new replica.
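
A ScaledObject for this scenario could look similar to the following sketch (the Deployment name flogo-hello and the Prometheus address are assumptions; the query and threshold follow the values described above):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: flogo-hello-scaler
spec:
  scaleTargetRef:
    name: flogo-hello              # assumed name of the application Deployment
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc:9090   # assumed Prometheus location
        query: sum(rate(flogo_flow_execution_count[1m]))              # request rate over the last minute
        threshold: "100"           # launch new replicas when the rate goes above 100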

After hitting the service with a load test, we can see that as soon as the service reaches the threshold, it launches a new replica to start handling requests, as expected.
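
To reproduce the test, any load-testing tool works; as one option, a throwaway Kubernetes Job like the sketch below (assuming the hypothetical flogo-hello Service from the earlier sketch) can generate a steady stream of requests:

apiVersion: batch/v1
kind: Job
metadata:
  name: hello-load
spec:
  parallelism: 4                   # four pods hitting the service at the same time
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: load
          image: busybox
          # loop of GET requests against the /hello resource (Service name is an assumption)
          command: ["/bin/sh", "-c", "for i in $(seq 1 10000); do wget -q -O- http://flogo-hello/hello >/dev/null; done"]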

[Figure: Autoscaling being done using Prometheus metrics]

All of the code and resources used in this post are hosted in a GitHub repository.

Summary

This post has shown that we have many options when deciding how to scale our workloads. We can use standard metrics like CPU and memory, but if we need to go beyond that, we can use different external sources of information to trigger the autoscaling.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Event-Driven Architecture: Enhancing the Responsiveness of Your Enterprise To Succeed

Event-Driven Architecture provides more agility to meet the changes of an increasingly demanding customer ecosystem.

The market is shifting at a speed that requires us to be ready to change very quickly. Customers are becoming more and more demanding, and we need to be able to deliver what they expect. To do so, we need an architecture that is responsive enough to adapt at the required pace.

Event-Driven Architectures (usually just referred to as EDA) are architectures where events are the crucial part, and we design components ready to handle those events in the most efficient way: an architecture that is ready to react to what is happening around us instead of just setting a specific path for our customers.

This approach provides a lot of benefits to enterprises because of its characteristics, but at the same time it requires a different mindset and a different set of components in place.

What is an Event?

Let’s start at the beginning. An event is anything that can happen and is important to you. If you think about a scenario where a user is navigating through an e-commerce website, everything he does is an event. If he lands on the e-commerce site because he followed a referral link, that is an event.

Events not only happen in virtual life but in real life too. A person walking into the lobby of a hotel is an event, going to the reception desk to check in is another, walking to his room is another… everything is an event.

Events in isolation provide a small piece of information, but together they can provide a lot of valuable information about customers: their preferences, their expectations, and their needs. All of that helps us provide the most customized experience to each one of our customers.

EDA vs Traditional Architectures

Traditional architectures work in pull mode: a consumer sends a request to a service, that service may need other components to execute its logic, it gets the answer, and it replies back. Everything is pre-defined.

Events work in a different way because they follow a push mode. Events are sent, and that’s it; an event could trigger one action, many actions, or none. You have a series of components waiting and listening until the event, or the sequence of events, they need to activate appears in front of them. When it does, each component triggers its logic and, as part of that execution, generates one or more events that can be consumed again.

[Figure: Pull vs Push mode for communication]

To build an Event-Driven Architecture, the first thing we need is Event-Driven Components: software components that are activated by events and also generate events as part of their processing logic. At the same time, this sequence of events becomes the way to complete complex flows in a cooperative mode, without the need for a mastermind component that is aware of the whole flow from end to end.

You just have components that know that when something happens, they need to do their part of the job, and other components will listen to the output of those components and be activated.

This approach is called Choreography because it works the same way as a ballet company, where each of the dancers can be doing different moves, but each of them knows exactly what they should do, and all together, in sync, they generate the whole piece.

Layers of an Event-Driven Architecture

Now that we have software components that are activated using events, we need some structure around them in our architecture to cover all the needs of event management, so we need to handle the following layers:

[Figure: Layers of the Event-Driven Architecture]
  • Event Ingestion: We need a series of components that help us introduce and receive events in our systems. As we explained, there are tons of ways to send events, so it is important that we offer flexibility and options in that process. Adapters and APIs are crucial here to make sure all the events can be gathered and become part of the system.
  • Event Distribution: We need an Event Bus that acts like our Event Ocean, where all the events flow across, able to activate all the components that are listening for each event.
  • Event Processing: We need a series of components that listen to all the events being sent and make them meaningful. These components act as security guards: they filter out the events that are not important, they enrich the events they receive with context information from other systems or data sources, and they transform the format of some events to make them easy to understand for all the components waiting for them.
  • Event Action: We need a series of components listening to those events and ready to react to what they see on the Event Bus: as soon as they detect the event they expect, they start executing their logic and send their output back to the bus to be used by somebody else.

Summary

Event-Driven Architecture can provide a much more agile and flexible ecosystem where companies can address current challenges and offer a compelling experience to users and customers. At the same time, it provides more agility to technical teams, which can create components that work in collaboration but remain loosely coupled, making both the components and the teams more autonomous.