Kubernetes Archives - Page 6 Of 8

Level-Up Your Deployment Strategy with Canarying in Kubernetes

2026-01-142021-05-04 by Alexandre Vazquez

Save time and money on your application platform deploying applications differently in Kubernetes.

We have come from a time where we deploy an application using an apparent and straight-line process. The traditional way is pretty much like this:

Wait until a weekend or some time where the load is low, and business can tolerate some service unavailability.
We schedule the change and warn all the teams involved for that time to be ready to manage the impact.
We deploy the new version and we have all the teams during the functional test they need to do to ensure that is working fine, and we wait for the real load to happen.
We monitor during the first hours to see if something wrong happens, and in case it does, we establish a rollback process.
As soon as everything goes fine, we wait until the next release in 3–4 months.

But this is not valid anymore. Business demands IT to be agile, change quickly, and not afford to do that kind of resource effort each week or, even worse, each day. Do you think that it’s possible to gather all teams each night to deploy the latest changes? It is not feasible at all. So, technology advance to help us solve that issue better, and here is where Canarying came here to help us.

Introducing Canary Deployments

Canary deployments (or just Canarying as you prefer) are not something new, and a lot of people has been talking a lot about it:

bliki: CanaryRelease

A canary release occurs when you roll out a new version of some software to a small subset of your user base to see if there are any problems before you make it available to everyone.

Google - Site Reliability Engineering

Release engineering is a term we use to describe all the processes and artifacts related to getting code from a repository into a running production system. Automating releases can help avoid many of the traditional pitfalls associated with release engineering: the toil of repetitive and manual task…

It has been here for some time, but before, it was neither easy nor practical to implement it. Basically is based on deploying the new version into production, but you still keeping the traffic pointing to the old version of the application and you just start shifting some of the traffic to the new version.

Based on that small subset of requests you monitor how the new version performs at different levels, functional level, performance level, and so on. Once you feel comfortable with the performance that is providing you just shift all the traffic to the new version, and you deprecate the old version

The benefits that come with this approach are huge:

You don’t need a big staging environment as before because you can do some of the tests with real data into the production while not affecting your business and the availability of your services.
You can reduce time to market and increase the frequency of deployments because you can do it with less effort and people involved.
Your deployment window has been extended a lot as you do not need to wait for a specific time window, and because of that, you can deploy new functionality more frequently.

Implementing Canary Deployment in Kubernetes

To implement Canary Deployment in Kubernetes, we need to provide more flexibility to how the traffic is routed among our internal components, which is one of the capabilities that get extended from using a Service Mesh. We already discussed the benefits of using a Service Mesh as part of your environment, but if you would like to retake a look, please refer to this article:

Technology wars: API Management Solution vs Service Mesh

Service Mesh vs. API Management Solution: is it the same? Are they compatible? Are they rivals? Photo by Alvaro Reyes on Unsplash When we talk about communication in a distributed cloud-native world and especially when we are talking about container-based architectures based on Kubernetes platform like AKS, EKS, Openshift, and so on, two technologies generate a lot […]

We have several technology components that can provide those capabilities, but this is how you will be able to create the traffic routes to implement this. To see how you can take a look at the following article about one of the default options that is Istio:

Integrating Istio with BWCE Applications

Introduction Services Mesh is one the “greatest new thing” in our PaaS environments. No matter if you’re working with K8S, Docker Swarm, pure-cloud with EKS or AWS, you’ve heard and probably tried to know how can be used this new thing that has so many advantages because it provides a lot of options in handling […]

But be able to route the traffic is not enough to implement a complete canary deployment approach. We also need to be able to monitor and act based on those metrics to avoid manual intervention. To do this, we need to include different tools to provide those capabilities: Prometheus is the de-facto option to monitor workloads deployed on the Kubernetes environment, and here you can get more info about how both projects play together.

Kubernetes Service Discovery for Prometheus

In previous posts, we described how to set up Prometheus to work with your TIBCO BusinessWorks Container Edition apps, and you can read more about it here. In that post, we described that there were several ways to update Prometheus about the services that ready to monitor. And we choose the most simple at that […]

And to manage the overall process, you can have a Continuous Deployment tool to put some governance around that using options like Spinnaker or using our of the extensions for the Continuous integration tools like GitLab or GitHub:

Using Spinnaker for Automated Canary Analysis

Automated canary analysis lets you partially roll out a change then evaluate it against the current deployment to assess its performance.

Summary

In this article, we covered how we can evolve a traditional deployment model to keep pace with innovation that businesses require today and how canary deployment techniques can help us on that journey, and the technology components needed to set up this strategy in your own environment.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Best Kubernetes Management Tools: Dashboard, CLI, and GUI Options Compared

2026-01-162021-04-26 by Alexandre Vazquez

Maximizing the productivity of working with Kubernetes Environment with a tool for each persona

We all know that Kubernetes is the default environment for all our new applications we developed we will build. The flavor of that Kubernetes platform can be of different ways and forms, but one thing is clear, it is complex.

The reasons behind this complexity is being able to provide all the flexibility but it is also true that the k8s project never has put much effort to provide a simple way to manage your clusters and kubectl is the point of access to send commands leaving this door open to the community to provide its own solution and these are the things that we are going to discuss today.

Kubernetes Dashboard: The Default Option

Kubernetes Dashboard is the default option for most of the installations. It is a web-based interface that is part of the K8s project but not deployed by default when you install the cluster

K9S: The CLI option

K9S is one of the most common options for the ones that love a very powerful command-line interface with a lot of options at your disposal

It is a mix between all the power of a command-line interface with all the keyboard options at your disposals with a very fancy graphical view to have a quick overview of the status of your cluster at glance.

Lens — The Graphical Option

The lens is a very vitaminized GUI option that goes beyond that just showing the status of the K8S cluster or allowing modifications on the components. With integration with other projects such as Helm or support for the CRD. It provides a very pleasant experience of managing clusters with multi-cluster support as well. To know more about Lens you can take a look at this article that we cover its main features:

Lens Could Be the Tool That You Were Missing to Master Kubernetes-Based Development and Management

Find the greatest way to manage your Kubernetes development cluster Photo by Agence Olloweb on Unsplash. I need to start this article by admitting that I am an advocate of Graphical User Interfaces and everything that provides a way to speed up the way we do things and be more productive. So when we talk […]

Octant — The Web Option

Octant provides an improved experience compared with the default web option discussed in this article using the Kubernetes dashboard. Built for extension with a plug-in system that allows you to extend or customize the behavior of octant to maximize your productivity managing K8S clusters. Including CRD support and graphical visualization of dependencies provides an awesome experience.

Summary

The have provided in this article different tools that will help you during the important task to manage or inspect your Kubernetes cluster. Each of them with its own characteristics and each of them focuses on different ways to provide the information (CLI, GUI and Web) so you can always find one that works best for your situation and preferences.

Event-Driven Autoscaling in Kubernetes with KEDA (Beyond CPU and Memory Metrics)

2026-01-162021-04-12 by Alexandre Vazquez

KEDA provides a rich environment to scale your application apart from the traditional HPA approach using CPU and Memory

Autoscaling is one of the great things of cloud-native environments and helps us to provide an optimized use of the operations. Kubernetes provides many options to do that being one of those the Horizontal Pod Autoscaler (HPA) approach.

HPA is the way Kubernetes has to detect if it is needed to scale any of the pods, and it is based on the metrics such as CPU usage or memory.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Sometimes those metrics are not enough to decide if the number of replicas we have available is enough. Other metrics can provide a better perspective, such as the number of requests or the number of pending events.

Kubernetes Event-Driven Autoscaling (KEDA)

Here is where KEDA comes to help. KEDA stands for Kubernetes Event-Driven Autoscaling and provides a more flexible approach to scale our pods inside a Kubernetes cluster.

It is based on scalers that can implement different sources to measure the number of requests or events that we receive from different messaging systems such as Apache Kafka, AWS Kinesis, Azure EventHub, and other systems as InfluxDB or Prometheus.

KEDA works as it is shown in the picture below:

We have our ScaledObject that links our external event source (i.e., Apache Kafka, Prometheus ..) with the Kubernetes Deployment we would like to scale and register that in the Kubernetes cluster.

KEDA will monitor the external source, and based on the metrics gathered, will communicate the Horizontal Pod Autoscaler to scale the workload as defined.

Testing the Approach with a Use-Case

So, now that we know how that works, we will do some tests to see it live. We are going to show how we can quickly scale one of our applications using this technology. And to do that, the first thing we need to do is to define our scenario.

In our case, the scenario will be a simple cloud-native application developed using a Flogo application exposing a REST service.

The first step we need to do is to deploy KEDA in our Kubernetes cluster, and there are several options to do that: Helm charts, Operation, or YAML files. In this case, we are going to use the Helm charts approach.

So, we are going to type the following commands to add the helm repository and update the charts available, and then deploy KEDA as part of our cluster configuration:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda

After running this command, KEDA is deployed in our K8S cluster, and it types the following command kubectl get all will provide a situation similar to this one:

pod/keda-operator-66db4bc7bb-nttpz 2/2 Running 1 10m
pod/keda-operator-metrics-apiserver-5945c57f94-dhxth 2/2 Running 1 10m

Now, we are going to deploy our application. As already commented to do that we are going to use our Flogo Application, and the flow will be as simple as this one:

The application exposes a REST service using the /hello as the resource.
Received requests are printed to the standard output and returned a message to the requester

Once we have our application deployed on our Kubernetes application, we need to create a ScaledObject that is responsible for managing the scalability of that component:

We use Prometheus as a trigger, and because of that, we need to configure where our Prometheus server is hosted and what query we would like to do to manage the scalability of our component.

In our sample, we will use the flogo_flow_execution_count that is the metric that counts the number of requests that are received by this component, and when this has a rate higher than 100, it will launch a new replica.

After hitting the service with a Load Test, we can see that as soon as the service reaches the threshold, it launch a new replica to start handling requests as expected.

All of the code and resources are hosted in the GitHub repository shown below:

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/

Summary

This post has shown that we have unlimited options in deciding the scalability options for our workloads. We can use the standard metrics like CPU and memory, but if we need to go beyond that, we can use different external sources of information to trigger that autoscaling.

Kubernetes Health Checks Explained: Simplify Cluster Diagnostics with KubeEye

2026-01-162021-04-02 by Alexandre Vazquez

KubeEye supports you in the task of ensuring that your cluster is performing well and ensure all your best practices are being followed.

Kubernetes has become the new normal to deploy our applications and other serverless options, so the administration of these clusters has become critical for most enterprises, and doing a proper Kubernetes Health Check is becoming critical.

This task is clear that it is not an easy task. As always, the flexibility and power that technology provides to the users (in this case, the developers) also came with a trade-off with the operation and management’s complexity. And this is not an exception to that.

We have evolved, including managed options that simplify all the underlying setup and low-level management of the infrastructure behind it. However, many things need to be done for the cluster administration to have a happy experience in the journey of a Kubernetes Administrator.

A lot of concepts to deal with: namespaces, resource limits, quotas, ingress, services, routes, crd… Any help that we can get is welcome. And with this purpose in mind, KubeEye has been born.

GitHub – kubesphere/kubeeye: KubeEye aims to find various problems on Kubernetes, such as application misconfiguration, unhealthy cluster components and node problems.

KubeEye aims to find various problems on Kubernetes, such as application misconfiguration, unhealthy cluster components and node problems. – GitHub – kubesphere/kubeeye: KubeEye aims to find variou…

KubeEye is an open-source project that helps to identify some issues in our Kubernetes Clusters. Using their creators’ words:

KubeEye aims to find various problems on Kubernetes, such as application misconfiguration(using Polaris), cluster components unhealthy and node problems(using Node-Problem-Detector). Besides predefined rules, it also supports custom defined rules.

So we can think like a buddy that is checking the environment to make sure that everything is well configured and healthy. Also, it allows us to define custom rules to make sure that all the actions that the different dev teams are doing are according to the predefined standards and best practices.

So let’s see how we can include KubeEye to do a health check of our environment. The first thing we need to do is to install it. At this moment, KubeEye only offers a release for Linux-based system, so if you are using other systems like me, you need to follow another approach and type the following commands:

git clone https://github.com/kubesphere/kubeeye.git
cd kubeeye
make install

After doing that, we end up with a new binary in our PATH named `ke`, and this is the only component needed to work with the app. The second step we need to do to get more detail on those diagnostics is to install the node problem detector component.

This component is a component installed in each node of the cluster. It helps to make more visible to the upstream layers issues regarding the behavior of the Kubernetes cluster. This is an optional step, but it will provide more meaningful data, and install that, we need to run the following command.

ke install npd

And now we’re ready to start checking our environment, and the order is as easy as this one.

ke diag

This will provide an output similar to this that is compounded by two different tables. The first one will be focused on the Pod and the issues and events raised as part of the platform’s status, and the other will focus on the rest of the elements and kinds of objects for the Kubernetes Clusters.

The table for the issues at the pod level has the following fields:

Namespace where the pod belongs to.
Severity of the issue.
Pod Name that is responsible for the issue
EventTime of where this event has been raised
Reason for the issue
Message with the detailed description of the issue

The second table for the other objects has the following structure:

Namespace where the object that has an issue that is being detected is deployed.
Severity of the issue.
Name of the component
Kind of the component
Time of where this issue has been raised
Message with the detailed description of the issue

Command’s output can also show other tables if some issues are detected at the node level.

Today we cover a fascinating topic as it is the Kubernetes Administration and introduce a new tool that helps your daily task.

I truly expect that this tool can be added to your toolbox and ease the path for a happy and healthy Kubernetes Cluster administration!

Optimize Prometheus Disk Usage: Practical TSDB Tuning and Retention Strategies

2026-01-162021-03-12 by Alexandre Vazquez

Check out the properties that will let you an optimized use of your disk storage and savings storing your monitoring data

Prometheus has become a standard component in our cloud architectures and Prometheus storage is becoming a critical aspect. So I am going to guess that if you are reading this you already know what Prometheus is. If this is not the case, please take your time to take a look at other articles that I have created:

Prometheus Monitoring for Microservices using TIBCO

We’re living a world with constant changes and this is even more true in the Enterprise Application world. I’ll not spend much time talking about things you already know, but just say that the microservices architecture approach and the PaaS solutions have been a game-changer for all enterprise integration technologies. This time I’d like to […]

Kubernetes Service Discovery for Prometheus

We know that usually when we monitor using Prometheus we have so many exporters available at our disposal and also that each of them exposes a lot of very relevant metrics that we need to track everything we need to and that lead to very intensive usage of the storage available if we do not manage accordingly.

There are two factors that affect this. The first one is to optimize the number of metrics that we are storing and we already provide tips to do that in other articles as the ones shown below:

How it optimize the disk usage in the Prometheus database?

Learn some tricks to analyze and optimize the usage that you are doing of the TSDB and save money on your cloud deployment. Photo by Markus Spiske on Unsplash In previous posts, we discussed how the storage layer worked for Prometheus and how effective it was. But in the current times, we are of cloud computing […]

The other one is how long we store the metrics called the “retention period in Prometheus.” And this property has suffered a lot of changes during the different versions. If you would like to see all the history please take a look at this article from Robust Perception:

Configuring Prometheus storage retention – Robust Perception | Prometheus Monitoring Experts

How can you control how much history Prometheus keeps?

The main properties that you can configure are the following ones:

storage.tsdb.retention.time: Number of days to store the metrics by default to 15d. This property replaces the deprecated one storage.tsdb.retention.
storage.tsdb.retention.size: You can specify the limit of size to be used. This is not a hard limit but a minimum so please define some margin here. Units supported: B, KB, MB, GB, TB, PB, EB. Ex: “512MB”. This property is experimental so far as you can see in the official documentation:

https://prometheus.io/docs/prometheus/latest/storage

What about setting this configuration in the operator for Kubernetes? In that case, you also have similar options available in the values.yaml configuration file for the chart as you can see in the image below:

This should help you get an optimized deployment of Prometheus that ensures all the features that Prometheus has but at the same time an optimal use of the resources at your disposal.

Additional to that, you should also check the Managed Service options that some providers have regarding Prometheus, such as the Amazon Managed Services for Prometheus, as you can see in the link below:

Amazon Prometheus Service to Provide More Availability to Your Monitoring Solution

Learn what Amazon Managed Service for Prometheus provides and how you can benefit from it. Photo by Casey Horner on Unsplash Monitoring is one of the hot topics when we talk about cloud-native architectures. Prometheus is a graduated Cloud Native Computing Foundation (CNCF) open-source project and one of the industry-standard solutions when it comes to monitoring your […]

Loki vs ELK Stack: Lightweight Log Aggregation for Kubernetes and Cloud-Native

2026-01-162021-03-08 by Alexandre Vazquez

Learn about the new horizontally-scalable, highly available, multi-tenant log aggregation system inspired by Prometheus that can be the best fit for your logging architecture

Loki vs ELK is something you are reading and hearing each time more often as from some time it is a raise on the dispute of becoming the de-factor standard for log aggregation architectures.

When we talk about Cloud-Native Architecture, log aggregation is something key that you need to consider. The old practices that we followed in the on-premises virtual machine approach for logging are not valid anymore.

We already cover this topic in my previous post that I recommend you to talk a look in case you haven’t read it yet, but this is not the topic for today.

Three reasons why you need a Log Aggregation Architecture today

Log Aggregation are not more a commodity but a critical component in container-based platforms Photo by Olav Ahrens Røtne on Unsplash Log Management doesn’t seem like a very fantastic topic. It is not the topic that you see and says: “Oh! Amazing! This is what I was dreaming about my whole life”. No, I’m aware that […]

Elasticsearch as the core and the different derívate de stacks like ELK/EFK had gained popularity in the last years, being pretty much the default open-source option when we talked about log aggregation and one of the options. The main public cloud providers have also adopted this solution as part of their own offering as the Amazon Elasticsearch Service provides.

But Elasticsearch is not perfect. If you have already used it, you probably know about it. Still, because their features are so awesome, especially on the searching and indexing capabilities, it has been the kind of leader today. But other topics like the storage use, the amount of power you need to handle it, and the architecture with different kinds of nodes (master, data, ingester) increase its complexity for cases when we need something smaller.

And to fill this gap is where our main character for today’s post arrives: Loki or Grafana Loki.

Loki is a logging management system created as part of the Grafana project, and it has been created with a different approach in mind than Elasticsearch.

Loki is a horizontally-scalable, highly-available, multi-tenant log aggregation system inspired by Prometheus. It is designed to be very cost effective and easy to operate. It does not index the contents of the logs, but rather a set of labels for each log stream.

So as we can read in the definition from their own page above, it covers several interesting topics in comparison with Elasticsearch:

First of all, it addresses some of the usual pain points for ELK customers: It is very cost-effective and easy to operate.
It clearly says that the approach is not the same as ELK, you are not going to have a complete index of the payload for the events, but it is based on different labels that you can define for each log stream.
Prometheus inspires that, which is critical because it enabled the idea to use log traces as metrics to empower our monitoring solutions.

Let’s start with the initial questions when we show an interesting new technology, and we would like to start testing it.

How can I install Loki?

Loki is distributed in different flavors to be installed in your environment in the way you need it.

SaaS: provided as part of the hosting solution of Grafana Cloud.
On-Premises: Provided as a normal binary to be download to run in an on-premises mode.
Cloud: Provided a docker image or even a Helm Chart to be deployed into your Kubernetes-based environment.

GrafanaLabs teams also provide Enterprise Support for Loki if you would like to use it on production mode in your company. Still, at the same time, all the code is licensed using Apache License 2.0, so you can take a look at all the code and contribute to it.

How does Loki work?

Architecture wise is very similar to the ELK/EFK stack and follow the same approach of “collectors” and “indexers” as ELK has:

Loki itself is the central node of the architecture responsible for storing the log traces and their labels and provided an API to search among them based on their own language LogQL (a similar approach to the PromQL from Prometheus).
promtail is the agent component that runs in the edge getting all those log traces that we need that can be running on a machine on-prem or a DaemonSet fashion in our own Kubernetes cluster. It plays the same role as Logstash/Fluent-bit/Fluentd works in the ELK/EFK stack. Promtail provides the usual plugin mode to filter and transforms our log traces as the other solutions provide. At the same time, it provides an interesting feature to convert those log traces into Prometheus metrics that can be scraped directly by your Prometheus server.
Grafana is the UI for the whole stack and plays a similar role as Kibana in the ELK/EFK stack. Grafana, among other plugins, provides direct integration with Loki as a Datasource to explore those traces and include them in the Dashboards.

Summary

Grafana Loki can be a great solution for your logging architecture to cover address two points: Provide a Lightweight log aggregation solution for your environment and at the same time enable your log traces as a source for your metrics, allowing you to create detailed, more business-oriented metrics that use in your dashboards and your monitoring systems.

Amazon Managed Service for Prometheus Explained: High-Availability Monitoring on AWS

2026-01-162021-01-17 by Alexandre Vazquez

Learn what Amazon Managed Service for Prometheus provides and how you can benefit from it.

Monitoring is one of the hot topics when we talk about cloud-native architectures. Prometheus is a graduated Cloud Native Computing Foundation (CNCF) open-source project and one of the industry-standard solutions when it comes to monitoring your cloud-native deployment, especially when Kubernetes is involved.

Following its own philosophy of providing a managed service for some of the most used open-source projects but fully integrated with the AWS ecosystem, AWS releases a general preview (at the time of writing this article): Amazon Managed Service for Prometheus (AMP).

The first thing is to define what Amazon Managed Service for Prometheus is and what features provide. So, this is the Amazon definition of the service:

A fully managed Prometheus-compatible monitoring service that makes it easy to monitor containerized applications securely and at scale.

And I would like to spend some time on some parts of this sentence.

Fully managed service: So, this will be hosted and handle by Amazon, and we are just going to interact with it using API as we do with other Amazon services like EKS, RDS, MSK, SQS/SNS, and so on.
Prometheus-compatible: So, that means that even if this is not a pure-Prometheus installation, the API is going to be compatible. So the Prometheus clients who can use Grafana or others to get the information from Prometheus will work without changing their interfaces.
Service at-scale: Amazon, as part of the managed service, will take care of the solution’s scalability. You don’t need to define an instance-type or how much RAM or CPU you do need. This is going to be handled by AWS.

So, that sounds perfect. So you can think that you are going to delete your Prometheus server, and it will start using this service. Maybe you are even typing something like helm delete prom… WAIT WAIT!!

Because at this point, this is not going to replace your local Prometheus server, but it will allow the integration with it. So, that means that your Prometheus server is going to act like a scraper for the whole monitoring scalable solution that AMP is providing, something as you can see in the picture below:

So, you are still going to need a Prometheus server, that is right, but all the complexity are going to be avoided and leverage to the managed service: Storage configuration, High availability, API optimization, and so on is going to be just provided to you out of the box.

Ingesting information into Amazon Managed Service for Prometheus

At this moment, there is two way to ingest data into the Amazon Prometheus Service:

From an existing Prometheus server using the remote_write capability and configuration, so that means that each series that is scraped by the local Prometheus is going to be sent to the Amazon Prometheus Service.
Using AWS Distro for OpenTelemetry to integrate with this service using the Prometheus Receiver and the AWS Prometheus Remote Write Exporter components to get that.

Summary

So this is a way to provide an enterprise-grade installation leveraging on all the knowledge that AWS has hosting and managing this solution at scale and optimized in terms of performance. You can focus on the components you need to get the metrics ingested into the service.

I am sure this will not be the last movement from AWS in observability and metrics management topics. I am sure they will continue to provide more tools to the developer’s and architects’ hands to define optimized solutions as easily as possible.

Kubernetes Autoscaling: Learn How to Scale Your Kubernetes Deployments Dynamically

2026-01-142020-12-29 by Alexandre Vazquez

Discover the different options to scale your platform based on the traffic load you receive

When talking about Kubernetes, you’re always talking about the flexibility options that it provides. Usually, one of the topics that come into the discussion is the elasticity options that come with the platform — especially when working on a public cloud provider. But how can we really implement it?

Before we start to show how to scale our Kubernetes platform, we need to do a quick recap of the options that are available to us:

Cluster Autoscaler: When the load of the whole infrastructure reaches its peak, we can improve it by creating new worker nodes to host more service instances.
Horizontal Pod Autoscaling: When the load for a specific pod or set of pods reaches its peak, we deploy a new instance to ensure that we can have the global availability that we need.

Let’s see how we can implement these using one of the most popular Kubernetes-managed services, Amazon’s Elastic Kubernetes Services (EKS).

Setup

The first thing that we’re going to do is create a cluster with a single worker node to demonstrate the scalability behavior easily. And to do that, we’re going to use the command-line tool eksctl to manage an EKS cluster easily.

To be able to create the cluster, we’re going to do it with the following command:

eksctl create cluster --name=eks-scalability --nodes=1 --region=eu-west-2 --node-type=m5.large --version 1.17 --managed --asg-access

After a few minutes, we will have our own Kubernetes cluster with a single node to deploy applications on top of it.

Now we’re going to create a sample application to generate load. We’re going to use TIBCO BusinessWorks Application Container Edition to generate a simple application. It will be a REST API that will execute a loop of 100,000 iterations acting as a counter and return a result.

BusinessWorks sample application to show the scalability options

And we will use the resources available in this GitHub repository:

GitHub – alexandrev/testeks

Contribute to alexandrev/testeks development by creating an account on GitHub.

We will build the container image and push it to a container registry. In my case, I am going to use my Amazon ECR instance to do so, and I will use the following commands:

docker build -t testeks:1.0 .
docker tag testeks:1.0 938784100097.dkr.ecr.eu-west-2.amazonaws.com/testeks:1.0
docker push 938784100097.dkr.ecr.eu-west-2.amazonaws.com/testeks:1.0

And once that the image is pushed into the registry, we will deploy the application on top of the Kubernetes cluster using this command:

kubectl apply -f .\testeks.yaml

After that, we will have our application deployed there, as you can see in the picture below:

Image deployed on the Kubernetes cluster

So, now we can test the application. To do so, I will make port 8080 available using a port-forward command like this one:

kubectl port-forward pod/testeks-v1-869948fbb-j5jh7 8080:8080

With that, I can see and test the sample application using the browser, as shown below:

Swagger UI tester for the Kubernetes sample application

Horizontal pod autoscaling

Now, we need to start defining the autoscale rules, and we will start with the Horizontal Pod Autoscaler (HPA) rule. We will need to choose the resource that we would like to use to scale our pod. In this test, I will use the CPU utilization to do so, and I will use the following command:

kubectl autoscale deployment testeks-v1 --min=1 --max=5 --cpu-percent=80

That command will scale the replica set testeks from one (1) instance to five (5) instances when the CPU utilization percent is higher than 80%.

If now we check the status of the components, we will get something similar to the image below:

HPA rule definition for the application using CPU utilization as the key metric

If we check the TARGETS column, we will see this value: <unknown>/80%. That means that 80% is the target to trigger the new instances and the current usage is <unknown>.

We do not have anything deployed on the cluster to get the metrics for each of the pods. To solve that, we need to deploy the Metrics Server. To do so, we will follow the Amazon AWS documentation:

Installing the Kubernetes Metrics Server – Amazon EKS

The Kubernetes Metrics Server is an aggregator of resource usage data in your cluster, and it is not deployed by default in Amazon EKS clusters. For more information, see Kubernetes Metrics Server on GitHub. The Metrics Server is commonly used by other Kubernetes add ons, such as the

So, running the following command, we will have the Metrics Server installed.

kubectl apply -f https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.3.7/components.yaml

And after doing that, if we check again, we can see that the current user has replaced the <unknown>:

Current resource utilization after installing the metrics-server on the Kubernetes cluster — Current resource utilization after installing the Metrics Server on the Kubernetes cluster

If that works, I am going to start sending requests using a Load Test inside the cluster. I will use the sample app defined below:

Auto Scaling Capacity with HPA – Ultimate Kubernetes Bootcamp

With Horizontal Pod Autoscaling, Kubernetes automatically scales the number of pods in a replication controller, deployment or replica set based on observed CPU utilization (or, with alpha support, on some other, application-provided metrics).

To deploy, we will use a YAML file with the following content:

https://gist.github.com/BetterProgramming/53181f3aa7bee7b7e3adda7c4ed8ca40#file-deploy-yaml

And we will deploy it using the following command:

kubectl apply -f tester.yaml

After doing that, we will see that the current utilization is being increased. After a few seconds, it will start spinning new instances until it meets the maximum number of pods defined in the HPA rule.

Pods increasing when the load exceeds the target defined in previous steps.

Then, as soon as the load also decreases, the number of instances will be deleted.

Pods are deleted as soon as the load decreases.

Cluster autoscaling

Now, we need to see how we can implement the Cluster Autoscaler using EKS. We will use the information that Amazon provides:

https://github.com/alexandrev/testeks

The first step is to deploy the cluster autoscaling, and we will do it using the following command:

kubectl apply -f https://raw.githubusercontent.com/kubernetes/autoscaler/master/cluster-autoscaler/cloudprovider/aws/examples/cluster-autoscaler-autodiscover.yaml

Then we will run this command:

kubectl -n kube-system annotate deployment.apps/cluster-autoscaler cluster-autoscaler.kubernetes.io/safe-to-evict=”false”

And we will edit the deployment to provide the current name of the cluster that we are managing. To do that, and we will run the following command:

kubectl -n kube-system edit deployment.apps/cluster-autoscaler

When your default text editor opens with the text content, you need to make the following changes:

Set your cluster name in the placeholder available.
Add these additional properties:

- --balance-similar-node-groups
- --skip-nodes-with-system-pods=false

Deployment edits that are needed to configure the Cluster Autoscaler

Now we need to run the following command:

kubectl -n kube-system set image deployment.apps/cluster-autoscaler cluster-autoscaler=eu.gcr.io/k8s-artifacts-prod/autoscaling/cluster-autoscaler:v1.17.4

The only thing that is left is to define the AutoScaling policy. To do that, we will use the AWS Services portal:

Enter into the EC service page on the region in which we have deployed the cluster.
Select the Auto Scaling Group options.
Select the Auto Scaling Group that has been created as part of the EKS cluster-creating process.
Go to the Automatic Scaling tab and click on the Add Policy button available.

Autoscaling policy option in the EC2 Service console

Then we should define the policy. We will use the Average CPU utilization as the metric and set the target value to 50%:

To validate the behavior, we will generate load using the tester as we did in the previous test and validate the node load using the following command:

kubectl top nodes

Now we deploy the tester again. As we already have it deployed in this cluster, we need to delete it first to deploy it again:

kubectl delete -f .\tester.yaml
kubectl apply -f .\tester.yaml

As soon as the load starts, new nodes are created, as shown in the image below:

After the load finishes, we go back to the previous situation:

kubectl top nodes showing how nodes have been scaled down

Summary

In this article, we have shown how we can scale a Kubernetes cluster in a dynamic way both at the worker node level using the Cluster Autoscaler capability and at the pod level using the Horizontal Pod Autoscaler. That gives us all the options needed to create a truly elastic and flexible environment able to adapt to each moment’s needs with the most efficient approach.

Auto Scaling Capacity with HPA – Ultimate Kubernetes Bootcamp

Kubernetes Distributions Explained: What They Are and the Top Platforms to Choose

2026-01-162020-12-28 by Alexandre Vazquez

Learn what Kubernetes Distributions are, why it matters to you and who are the top best players available in the market today

Table Of Contents

Introduction
- GitHub – kubernetes/kubernetes: Production-Grade Container Scheduling and Management
What Are the Main Components Shipped in a Kubernetes Distribution?
Who Are the Top Players?
Summary

Introduction

One of the biggest announcements from the latest AWS re:Invent 2020 sessions was the release of EKS-D from Amazon. EKS-D is their open-source Kubernetes Distribution that’s now available for everyone to start using in their cloud provider or even on-premises.

It’s based on past findings and the entire process Amazon has undergone in managing their Kubernetes managed platform, Amazon EKS.

These announcements have many people asking themselves: “OK, I know Kubernetes, but what’s a Kubernetes distribution? And why should I care?”

So I’ll try to answer that with the knowledge I have, and I always try to use the same approach: a Kubernetes versus Linux model comparison.

Kubernetes is an open-source project, as you know, started by Google and is now being managed by the community and the Cloud Native Computing Foundation (CNCF), and you can find all the code available here:

GitHub – kubernetes/kubernetes: Production-Grade Container Scheduling and Management

Production-Grade Container Scheduling and Management – GitHub – kubernetes/kubernetes: Production-Grade Container Scheduling and Management

But let’s be honest: Not many of us are pulling that repo and trying to compile it to provide a cluster. That’s not how we usually work. If you follow the code path — downloading it, building it, and so on — this is usually named vanilla Kubernetes.

If we start with the Linux comparison, it’s the same situation as we have with the Linux kernel that most of the Linux distribution ships, but this is already compiled and available with a bunch of other tools all working together via the usual approach.

So that’s what a Kubernetes distribution is. They build Kubernetes. They provide other tools and components to enhance or provide more features and to focus on additional aspects like a security focus, a DevOps focus, or another focus. Another concept that usually is raised is the purity of distribution, and we try to talk about distribution that’s pure.

We call a distribution pure when it’s building Kubernetes, and that’s it. It leaves everything else to the developers or users to decide what they want to use on top of it.

What Are the Main Components Shipped in a Kubernetes Distribution?

The main components that can differ when we’re talking about a Kubernetes distribution are the following:

Container runtime and registries

We all know there’s more that one container runtime, and even if you weren’t aware of that, you’ve probably read all of the articles regarding the removal of Docker support in Kubernetes v1.20, as you can read in this awesome article from Edgar Rodriguez.

Kubernetes Just Deprecated Docker Support. What Now?

Will this kill Docker?

At this moment, it seems all runtimes should support the existing Container Runtime Interface, and runtimes like CRI-O, Containerd, or Kata seem to be the default options now.

Networking

Another topic that often differs when we’re talking about Kubernetes distributions is how they manage their network, and this is one of the most critical aspects of the whole platform.

As we have with the container runtime, a standard specification exists to cover that topic, and that’s the Container Network Interface (CNI). Several projects exist on this topic, like Flannel, Calico, Canal, and Wave. Also, some platforms provide their own component, like the Openshift SDN operator.

Storage

How to handle storage in Kubernetes is also very important, especially as we embrace this model in deployments that require stateful models. Different platforms can support different storage options, like file systems and so on.

Who Are the Top Players?

The first thing we need to be aware of is there are a huge number of Kubernetes distributions out there.

We’ll count the ones with a CNCF certification, and you can take a look at all of them here. At the moment of writing this article, we’re talking about 72 certified distributions.

Logos of the various CNCF-certified distributions. — Image via the Cloud Native Computing Foundation

These are the ones that I’d like to highlight today:

Red Hat OpenShift

The Red Hat OpenShift platform could be one of the most used platforms, especially in a private-cloud fashion. It could include most of the Red Hat services regarding storage, like GlusterFS and networking with OpenShift DNS. It has OKD as the open-source project that backs and contributes to the OpenShift platform. Check this article to see how to set up Openshift locally to test it

Mirantis

The former Docker enterprise that’s been acquired by Mirantis is another of the usual choices when we’re talking about supported platforms.

VMware Tanzu

VMware Tanzu, also coming from the acquisition of Pivotal from VMware, is a Kubernetes platform.

Canonical

Canonical (open source) is a platform from the company that develops and maintains Ubuntu. It’s another one of the important choices here and provides a variety of options, focusing not only on the common central mode but also on edge Kubernetes deployments with projects like MicroK8S and more options.

Rancher

Rancher (open source) is another one of the big players, focusing on following and extending the CNCF standards and also offering a big push for edge deployment with K3S. It also offers automated upgrades.

Summary

So, as you can see, the number of options available out there is huge. They all differ, so it’s important to take your time when you’re deciding your target platform based on your criteria for your project or your company.

And that’s without covering the managed platforms available out there that are becoming one of the more preferred options for companies so they can get all the flexibility from Kubernetes while not needing to handle the complexity of managing a Kubernetes platform themselves. But that’s a topic for another article — hopefully soon.

This article at least has provided you with more clarity about what a Kubernetes distribution is, the main differences among them, and a quick look at some of the key actors in this spectrum. Enjoy your day, and enjoy your life.

Lens Kubernetes IDE Explained: Improve Development and Cluster Management Productivity

2026-01-162020-12-09 by Alexandre Vazquez

Find the greatest way to manage your Kubernetes development cluster

I need to start this article by admitting that I am an advocate of Graphical User Interfaces and everything that provides a way to speed up the way we do things and be more productive.

So when we talk about how to manage our Kubernetes cluster mainly for development purposes, you can imagine that I am one of those people who tries any available tool to make that journey easier. The ones who’ve started using Portainer to manage their local Docker engine or are a fan of the new dashboard in Docker for Windows/Mac. But that is far from reality.

In terms of Kubernetes management, I got used to typing all the commands to check the pods, the logs, the status of the cluster to do the port-forwards, etc. Any task I did was with a terminal, and I felt that it was the right thing to do. I did not even use a Kubernetes dashboard to have a web page for my Kubernetes environment. All of that changed last week when I met with a colleague who showed me what Lens could do.

Lens is a totally different story. I am not praising it because I am being paid to do so. This is an open source project that you can find on GitHub. But the way that it does the job is just awesome!

Image of Len showing the status of a Kubernetes cluster — Screenshot by the author.

The first thing I would like to mention regarding Lens is that it has multi-context support, so you can have all the different Kubernetes contexts available to switch following a Slack approach when we switch from different workspaces. It just reads your .kube/config file and makes all those contexts available to you to connect to the one you would like.

Once we have connected to one of these clusters, we have different options to see the status of it, but the first one is to check the Workloads using the Overview option:

Then, you can drill down to any pod or different object inside Kubernetes to check its status and at the same time do the main actions you usually do when you deal with a pod, such as check the logs, execute a terminal to one of the containers that belong to that pod, or even edit the YAML for that pod.

But Lens goes beyond the usual Kubernetes tasks because it also has a Helm integration, so you can check the releases that you have there, the version of the status, and so on:

The experience of managing everything feels perfect. You are more productive as well. Even those who love the CLI and terminals need to admit that to do regular tasks, the Graphical approach and the mouse are faster than the keyboard — even for the defenders of the mechanical keyboard like myself.

So, I encourage you to download Lens and start using it right now. To do so, go to their main web page and download it:

Thanks for reading!