Kubernetes Init Containers Explained: Advanced Patterns for Dependencies and Startup


Discover how Init Containers can provide additional capabilities to your workloads in Kubernetes

This new, much more distributed and collaborative way of developing software brings a lot of new challenges, and how we manage the dependencies between components is crucial to our success.

Kubernetes and its dedicated distributions have become the standard way to deploy our cloud-native applications, and they provide many features to manage those dependencies. The most common resource you will use for that purpose is probes.

Kubernetes provides different kinds of probes that let the platform know the status of your app: whether the application is "alive" (liveness probe), whether it has started (startup probe), and whether it is ready to process requests (readiness probe).

Kubernetes Probes lifecycle by Andrew Lock (https://andrewlock.net/deploying-asp-net-core-applications-to-kubernetes-part-6-adding-health-checks-with-liveness-readiness-and-startup-probes/)

Kubernetes Probes are the standard way of doing so, and if you have deployed any workload to a Kubernetes cluster, you have probably used one. But sometimes they are not enough.

That can be because the check you would like to perform is too complex, or because you would like to define a startup order between your components. In those cases, you rely on another tool: Init Containers.

Init Containers are a different kind of container: they run from their own image, which can include any tool you need to perform the checks or probes you want to run before the main container starts.

They have the following unique characteristics:

  • Init containers always run to completion.
  • Each init container must complete successfully before the next one starts.

Why would you use init containers?

#1.- Manage Dependencies

The first use case for init containers is to define a dependency relationship between two components, such as Deployments, where one needs the other to be available before it starts. Imagine the following situation:

We have two components, a web app and a database, both managed as containers on the platform. If you deploy them the usual way, both will try to start simultaneously, or the Kubernetes scheduler will decide the order, so you could end up with the web app trying to start while the database is not yet available.

You could think this is not an issue, because this is exactly why you have readiness and liveness probes on your containers, and you are right: the Pod will not be marked as ready until the database is ready. But there are several things to note here:

  • Both probes have a failure threshold; once it is exceeded, the container is restarted and the Pod ends up in a CrashLoopBackOff state, with increasingly long delays between restart attempts.
  • The web app Pod will consume resources even though you know the app cannot start yet, so in the end you are wasting resources during the whole process.

So defining an init container as part of the web app Deployment that checks whether the database is available, for example by including a database client that quickly verifies the database and its tables are properly populated, is enough to solve both situations, as sketched below.
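As a minimal sketch of that idea (the names, images, and connection details here are illustrative assumptions, not taken from the original setup), the web app Deployment could declare an init container that waits until the database accepts connections:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: web-app                       # hypothetical component name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: web-app
  template:
    metadata:
      labels:
        app: web-app
    spec:
      initContainers:
        - name: wait-for-database
          image: postgres:15          # used only for its pg_isready client
          command: ['sh', '-c', 'until pg_isready -h database-service -p 5432; do echo waiting for database; sleep 2; done']
      containers:
        - name: web-app
          image: my-web-app:1.0       # hypothetical application image
          ports:
            - containerPort: 8080

Compared with relying only on probes, the Pod here simply does not start its main container, and therefore does not consume its resources, until the check passes.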

#2.- Optimizing Resources

One critical thing when you define your containers is to ensure that they have everything they need to perform their task and that they don’t have anything that is not required for that purpose.

So instead of adding more tools to the main image just to check another component's behavior, especially if that check only runs at specific times, you can offload it to an init container and keep the main container optimized in terms of size and resources used.

#3.- Preloading Data

Sometimes you need to perform some activities at the start of your application that can be separated from its usual work, and you would like to avoid adding logic to check whether the component has already been initialized.

Using this pattern, an init container manages all the initialization work and ensures it has been completed before the main container is executed; a sketch of this pattern follows the figure below.

Initialization process sample (https://www.magalix.com/blog/kubernetes-patterns-the-init-container-pattern)
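As an illustrative sketch of this pattern (the image, URL, and paths are assumptions made for the example), an init container can fetch or generate the initial data into a shared volume that the main container then reads:

apiVersion: v1
kind: Pod
metadata:
  name: preload-demo
spec:
  initContainers:
    - name: fetch-initial-data
      image: busybox:1.28
      # download seed data into the shared volume before the app starts
      command: ['sh', '-c', 'wget -O /data/initial-config.json https://example.com/initial-config.json']
      volumeMounts:
        - name: app-data
          mountPath: /data
  containers:
    - name: app
      image: my-app:1.0               # hypothetical application image
      volumeMounts:
        - name: app-data
          mountPath: /app/data
          readOnly: true
  volumes:
    - name: app-data
      emptyDir: {}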

How to define an Init Container?

To define an init container, you use a dedicated section inside the spec of your YAML file, as shown in the example below:

apiVersion: v1
kind: Pod
metadata:
  name: myapp-pod
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-container
    image: busybox:1.28
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup myservice.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for myservice; sleep 2; done"]
  - name: init-mydb
    image: busybox:1.28
    command: ['sh', '-c', "until nslookup mydb.$(cat /var/run/secrets/kubernetes.io/serviceaccount/namespace).svc.cluster.local; do echo waiting for mydb; sleep 2; done"]

You can define as many init containers as you need, but note that they all run in sequence, in the order they appear in the YAML file, and an init container is only executed once the previous one has completed successfully.

Summary

I hope this article has shown you a new way to manage dependencies between workloads in Kubernetes and, at the same time, other great use cases where init containers can provide value to your workloads.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Kubernetes AccessModes Explained: Choosing the Right Mode for Stateful Workloads


Kubernetes AccessModes provide a lot of flexibility in how the different pods can access shared volumes

All companies are going through a transformation, moving their current workloads from application servers running on virtual machines in a data center to a cloud-native architecture, where applications are decomposed into different services that run as isolated components in containers and are managed by a Kubernetes-based platform.

We started with the easiest use cases and workloads, moving our online services, mainly REST APIs that work in a load-balanced mode, but the issues began when we moved other workloads along the same transformation journey.

The Kubernetes platform was not ready for them at the time, and most of its improvements since then have been made to support more use cases. Does that mean that a REST API is much more cloud-native than an application that requires a file-storage solution? Absolutely not!

We were confusing different things. Cloud-native patterns are valid independently of those decisions. It is true that on the journey to the cloud, and even before, there were some patterns we tried to replace, especially file-based ones. But that was not because of the use of files themselves; it was more about the batch approach closely tied to the use of files, which we tried to replace for several reasons, such as the ones below:

  • The online approach reduces time to action: Updates and notifications come faster to the target, so components are current.
  • File-based solutions reduce the solution's scalability: you create a dependency on a central component whose scalability is more complex to solve.

But this path is getting easier, and the latest update on that journey is the access modes introduced by Kubernetes. An access mode defines how the different pods will interact with one specific persistent volume. The access modes are the ones shown below.

  • ReadWriteOnce — the volume can be mounted as read-write by a single node.
ReadWriteOnce AccessMode Graphical Representation
  • ReadOnlyMany — the volume can be mounted read-only by many nodes.
ReadOnlyMany AccessMode Graphical Representation
  • ReadWriteMany — the volume can be mounted as read-write by many nodes.
ReadWriteMany AccessMode Graphical Representation
  • ReadWriteOncePod — the volume can be mounted as read-write by a single Pod. This is only supported for CSI volumes and Kubernetes version 1.22+.
ReadWriteOncePod AccessMode Graphical Representation

You can define the access mode as one of the properties of your PVs and PVCs, as shown in the sample below:

kind: PersistentVolumeClaim
apiVersion: v1
metadata:
  name: single-writer-only
spec:
  accessModes:
    - ReadWriteOncePod # Allow only a single pod to access single-writer-only.
  resources:
    requests:
      storage: 1Gi
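For the PersistentVolume side, the supported access modes are declared in the same way. The sketch below is illustrative and assumes an NFS backend, one of the volume types that can actually offer ReadWriteMany:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: shared-volume
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteMany                  # many nodes can mount it read-write
  nfs:                               # hypothetical NFS server and export path
    server: nfs.example.com
    path: /exports/shared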

All of this will help us on our journey to bring every kind of workload along and get all the benefits of the digital transformation, allowing us as architects or developers to choose the right pattern for our use case without being restricted at all.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

GraalVM for Microservices: Improve Startup Time and Reduce Memory Usage


GraalVM provides the capabilities to make Java a first-class language for creating microservices, on the same level as Go, Rust, Node.js, and others.

Java has been the leading language for generations. Pretty much every kind of software has been created with Java: web servers, messaging systems, enterprise applications, development frameworks, and so on. This predominance shows in the most important indexes, like the TIOBE index shown below:

TIOBE index image from https://www.tiobe.com/tiobe-index/

But Java has always come with some trade-offs. The promise of code portability exists because the JVM allows us to run the same code on different operating systems and ecosystems. At the same time, that interpretation layer adds a bit of overhead compared with compiled options like C.

That overhead was never a problem until we went down the microservices route. With a server-based approach, an overhead of 100–200 MB is not a big deal compared with all the benefits it provides. But if we transform that server into, for example, hundreds of services, each with a 50 MB overhead, it starts to become something to worry about.

Another trade-off was startup time: the abstraction layer makes startup slower. In a client-server architecture that was not an important issue if we needed a few more seconds to start serving requests, but today, in the scalability era, it becomes critical. Startup times measured in milliseconds rather than seconds provide better scalability and more optimized infrastructure usage.

So, how do we keep all the benefits of Java and still solve these trade-offs that are now becoming an issue? GraalVM is the answer to all of this.

GraalVM is, in its own words, "a high-performance JDK distribution designed to accelerate the execution of applications written in Java and other JVM languages." It provides an ahead-of-time compilation process that generates a native binary from Java code, removing the traditional overhead of a running JVM.

Microservices are a specific focus of the project, and the promise of around 50x faster startup and a 5x smaller memory footprint is just amazing. This is why GraalVM has become the foundation for high-level Java microservice frameworks like Quarkus from Red Hat, Micronaut, or even the GraalVM-powered version of Spring Boot.

So you are probably asking: how can I start using this? The first thing we need to do is go to the GitHub release page of the project, find the version for our OS, and follow the instructions provided there:

Once we have it installed, it is time to start testing it, and what better way to do so than creating a REST/JSON service and comparing it with a traditional OpenJDK 11-powered solution?

To keep this REST service as simple as possible and focus on the difference between both modes, I will use the Spark Java framework, a minimal framework for creating REST services.

I will share all the code in this GitHub repository, so if you would like to take a look, clone it from here:

The code that we are going to use looks very simple, just a single line to create a REST service:

(Screenshot: the Spark Java code of the REST service)

Then, we will use a GraalVM Maven plugin for the whole compilation process. You can check all its options here:

(Screenshot: configuration of the GraalVM Maven plugin)

The compilation process takes a while (around 1–2 minutes), but you need to understand that it compiles everything into a native binary: the only output you get is a single executable (named rest-service-test in my case) that contains everything you need to run your application.

(Screenshot: output of the native compilation process)

And finally, we will have a single binary that is everything that we need to run our application:

(Screenshot: the generated rest-service-test binary)

This binary is an exceptional one because it does not require any JVM on your local machine, and it can start in a few milliseconds. The binary takes 32 MB on disk and uses less than 5 MB of RAM.

(Screenshot: the binary running, with its size and memory usage)

The output of this first tiny application is straightforward, as you saw, but I think you get the point. Let's see it in action: I will launch a small load test from my computer, with 16 threads sending requests to this endpoint:

(Screenshot: load test results against the native binary)

As you can see, this is just incredible. Even considering there is almost no network latency, because the load is generated from the same machine, a single service sustains more than 1,400 requests/sec over 1 minute, with a response time of 2 ms for each request.

And how does that compare with a normal JAR-based application running exactly the same code? You can see the comparison in the table below:

(Table: comparison between the native binary and the JAR-based application)

In a nutshell, we have seen how tools such as GraalVM can make our JVM-based programs ready for a microservices environment, avoiding the usual issues of a high memory footprint or slow startup time that become critical when we adopt a full cloud-native strategy in our companies or projects.

But the truth must be told: it is not always as simple as this sample. Depending on the libraries you use, generating the native image can be much more complex, require a lot of configuration, or simply be impossible. So not everything is solved yet, but the future looks bright and full of hope.

Reduce Docker Disk Usage Locally: Analyze and Clean Up Images, Containers, and Cache


Discover the options you have at your disposal to make efficient use of disk space in your Docker installation

The rise of containers has been a game-changer for all of us, not only on the server side, where pretty much any new workload we deploy runs as a container, but also in our local environments, where the same change is happening.

We embrace containers to easily manage the different dependencies we need to handle as developers, even if the task at hand is not container-related. Do you need a database up and running? You use a containerized version of it. Do you need a messaging system to test one of your applications? You quickly start a container that provides that functionality.

And as soon as you don't need them, those containers are killed, and your system stays as clean as it was before you started the task. But there are always things we need to take care of, even with a wonderful solution in front of us, and in the case of a local Docker environment, disk usage is one of the most critical.

This process of launching new things over and over and then getting rid of them is only partly true, because all the images we needed and all the containers we launched are still there in our system, waiting for a new round and using our disk space in the meantime, as you can see in a recent picture of my local Docker environment with more than 60 GB used for that purpose.

Docker dashboard settings page image showing the amount of disk Docker is using.

The first thing we need to do is check what is using all that space to see if we can release some of it. To do that, we can rely on the docker system df command the Docker CLI provides:

The output of the execution of the docker system df command

As you can see, of the 61 GB in use, 20.88 GB corresponds to the images I have, 21.03 MB to the containers I have defined, 1.25 GB to the local volumes, and 21.07 GB to the build cache. As only 18 of the 26 defined images are active, I can reclaim up to 9.3 GB, which is a significant amount.

If we would like to get more details about this data, we can always append the verbose option to the command, as you can see in the picture below:

Detailed verbose output of the docker system df -v command

So, after getting all this information, we can go ahead and prune the system. This will get rid of any unused containers and images, and to do it you only need to type this:

docker system prune -af

It has several options to tune the execution a little, which you can check on the official Docker web page:

In my case, that helped me recover up to 40.8 GB on my system, as you can see in the picture below.

(Screenshot: result of the prune reclaiming 40.8 GB)

But if you would like to go one step further, you can also tune some properties that are taken into account when you execute this prune. For example, defaultKeepStorage helps you define how much disk you want to dedicate to the build-cache system, which optimizes the amount of network usage when building images with common layers.

To do that, you need to have the following snippet in your Docker Engine configuration section, as shown in the image below:

Docker Engine configuration with the defaultKeepStorage up to 20GB
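The snippet in that screenshot corresponds to the builder garbage-collection section of the Docker Engine daemon configuration; a sketch of what it typically looks like is shown below (the 20GB value matches the caption above, and the rest of your existing configuration stays as it is):

{
  "builder": {
    "gc": {
      "enabled": true,
      "defaultKeepStorage": "20GB"
    }
  }
}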

I hope this housekeeping process helps your local environment shine again and lets you get the most out of it without wasting a lot of resources in the process.

Why Apache NetBeans Is Still a Great Java IDE in 2025 (Despite IntelliJ’s Popularity)


Discover the reasons why, to me, Apache NetBeans is still the best Java IDE you can use

Let me start from the beginning. I have been a Java developer since my time at university. Even though I first learned a lesser-known programming language (Modula-2), I quickly jumped to Java for all the different assignments and pretty much every task on my journey as a student and later as a software engineer.

I was always looking for the best IDE I could find to speed up my programming tasks. The main choice at university was Eclipse, but I have never been an Eclipse fan, and that became a problem.

If you are in the enterprise software industry, you have noticed that pretty much every developer tool is based on Eclipse, because its licensing and the community behind it make it the obvious choice. But I never thought Eclipse was a great IDE: it was too flexible and, at the same time, too complex.

That is when I discovered NetBeans. I think the first version I tried was in the 3.x branch, when Sun Microsystems was still developing it. To me it was much better than Eclipse. True, the number of plugins available was not comparable with Eclipse, but the things it did, it did awesomely.

If I had to explain why, at that time, NetBeans was better than Eclipse, the main reasons would probably be these:

  • Simplicity in the run configuration: I still think most Java IDEs make things too complex just to run the code. NetBeans lets you simply run, without needing to create and configure a run configuration (you can do it, but you are not forced to).
  • Better Look & Feel: This is more based on a personal preference, but I prefer the default configuration from NetBeans compared with Eclipse.

Because of that, NetBeans became my default app for Java programming, but then Oracle came, and things changed a little. With Oracle's acquisition of Sun Microsystems, NetBeans stalled like many other open-source projects: for years there were few updates and little progress.

It is not that they deprecated the product, but Oracle already had a different IDE at the time, JDeveloper, which was its main choice. That is easy to understand. I remained loyal to NetBeans even though there was another big player in the competition: IntelliJ IDEA.

IntelliJ IDEA is the fancy option, the one most developers use today for Java programming, and I can understand why. I have tried it several times, trying to feel what others felt and reading the different articles, and I acknowledge some of the advantages of the solution:

  • Better performance: it is clear that the IDE's response time is better in IntelliJ IDEA than in NetBeans, because it does not carry an almost 20-year-long legacy and could start from scratch using modern approaches for the GUI.
  • Fewer memory resources: let's be honest, all IDEs consume tons of memory. No one does great here (unless we are talking about text editors with a Java compiler, which is a different story), but NetBeans does indeed require more resources to run properly.

So I made the switch and started using the JetBrains solution, but it never stuck with me, because to me it is still too complex: a lot of fancy things, but less focus on the ones I need. Or maybe, because I was so used to how NetBeans does things, I could not make the mental switch required to adopt a new tool.

And then… when everything seemed lost, something awesome happened: NetBeans was donated to the Apache Foundation and became Apache NetBeans. It felt like a new life for the tool, bringing simple things like dark mode and keeping the solution up to date with the progress in Java development.

So, today, Apache NetBeans is still my preferred IDE, and I could not vouch more for this awesome tool. These are the main points I would like to raise here:

  • Better Maven management: to me, the simplicity with which you can manage your Maven projects in NetBeans is in a league of its own. You can add a new dependency without going to the pom.xml file and update dependencies on the fly.
  • Run configuration: again, this is still a differentiator. When I am quickly coding some new utility, I don't like wasting time creating a run configuration or adding a Maven exec plugin to my pom.xml just to run the software I have just written. I just need to click Run, a green button, and let the magic begin.
  • No need for anything else: things evolve very fast in the Java programming world, but even today I have never felt I was missing some capability in NetBeans that I could only get by moving to a more modern alternative. So, no trade-offs at this level.

I am aware that my choice probably comes from a biased view of the situation. After all, this has been my main solution for more than a decade now, and I am just used to it. But I consider myself an open person, and if I saw a clear difference, I would not have second thoughts about ditching NetBeans, as I did with many other tools in the past (Evernote, OneNote, Apple Mail, Gmail, KDE Basket, Things, Wunderlist...).

So, if you are curious about how Apache NetBeans has progressed, please take a look at the latest version and give it a try. Or, if you feel you don't connect with your current tool, give it another try. Maybe you have the same biased view as I have!!!

Portainer Explained: Evolution, Use Cases, and Why It Still Matters for Kubernetes


Discover the current state of one of the first graphical interfaces for Docker containers and how it provides a solution for modern container platforms

I want to start this article with a story that I am not sure all of you, incredible readers, know. There was a time when there were no graphical interfaces to monitor your containers. It was a long time ago, as long as time goes in the container world: maybe 2014-2015, when Kubernetes was in its initial stage and Docker Swarm had just been released and seemed the most reliable solution.

Most of us didn't have a container platform as such. We just ran our containers on our own laptops, or on small servers in the case of cutting-edge companies, using docker commands directly and with no more help than the CLI tool. As you can see, things have changed a lot since then, and if you would like to refresh that view, you can check the article shared below:

And at that time, an open-source project provided the most incredible solution, one we didn't know we needed until we used it, and that option was Portainer. Portainer provides an awesome web interface where you can see all the Docker containers deployed on your Docker host and manage them as you would on a full platform.

Web page of portainer in 2017 from https://ostechnix.com/portainer-an-easiest-way-to-manage-docker/

It was the first of its kind and generated a tremendous impact; it even inspired a series of other projects that were named "the portainer of…", like dodo, the portainer of Kubernetes infrastructure at that time.

But maybe you ask: how is Portainer doing? Is Portainer still a thing? It is still alive and kicking, as you can see on its GitHub project page: https://github.com/portainer/portainer, with its latest release at the end of May 2021.

They now have a Business edition, but there is still a Community Edition, which is the one I am going to analyze here in more detail in another article. Still, I would like to provide some initial highlights:

  • The installation process still follows the same approach as the initial releases: Portainer runs as another component of your cluster. The options to run it on Docker, Docker Swarm, or Kubernetes cover all the main solutions enterprises use.
  • It now provides a list of application templates, similar to the OpenShift catalog, and you can also create your own. This is very useful for companies that rely on these templates to give developers a common deployment approach without needing to do all the work themselves.
Portainer 2.5.1 Application Template view
  • Team management capabilities let you define users with access to the platform and group those users into teams for more granular permission management.
  • Multi-registry support: by default it integrates with Docker Hub, but you can add your own registries as well and pull images from them directly from the GUI.

In summary, this is a great evolution of the Portainer tool that keeps the same spirit all the old users loved back then, simplicity and a focus on what an administrator or developer needs to know, while adding more features and capabilities to keep pace with the evolution of the container platform industry.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Promtail Explained: Turning Logs into Metrics for Prometheus and Loki


Promtail is the solution when the metrics you need are only present in the log traces of the software you monitor and you still want a consistent monitoring platform

It is a common understanding that three pillars in the observability world help us to get a complete view of the status of our own platforms and systems: Logs, Traces, and Metrics.

To provide a summary of the differences between each of them:

  • Metrics are the counters describing the state of the different components from both a technical and a business point of view, so here we see things like CPU consumption, the number of requests, memory, or disk usage.
  • Logs are the messages that each piece of software in our platform emits so we can understand its current behavior and detect unexpected situations.
  • Traces are the data about the end-to-end request flow across the platform: the services and systems that have been part of that flow and the data related to that concrete request.

We have solutions that claim to address all of them, mainly in the enterprise space with Dynatrace, AppDynamics, and similar tools. On the other hand, we can choose a specific solution for each of them that we can easily integrate together, and we have discussed those options a lot in previous articles.

But some software does not follow this path, because we live in the most heterogeneous era and we all embrace, at some level, a polyglot approach on the new platforms. In some cases, software uses log traces to provide data that really belongs to metrics or other concerns, and this is when we need to rely on tools that help us "fix" that situation. Promtail does exactly that.

Promtail is mainly a log forwarder, similar to others like Fluentd or Fluent Bit from the CNCF, or Logstash from the ELK stack. In this case, it is the solution from Grafana Labs and, as you can imagine, it is part of the Grafana stack, with Loki as the "mastermind" we covered in this article that I recommend you take a look at if you haven't read it yet:

Promtail has two main roles in this architecture. The first one is very similar to other tools in this space, as we commented before: it ships the log traces from our containers to a central location, which will mainly be Loki but can be a different one, and provides the usual options to manipulate and transform those traces, as we can do in other solutions. You can look at all the options in the link below, but as you can imagine, this includes transformation, filtering, parsing, and so on.

But what makes Promtail so different is one specific action you can perform, and that action is metrics. The metrics stage provides a way to create, based on the data we read from the logs, Prometheus metrics that a Prometheus server can scrape. That means you can take log traces you are processing that look something like this:

[2021–06–06 22:02.12] New request received for customer_id: 123
[2021–06–06 22:02.12] New request received for customer_id: 191
[2021–06–06 22:02.12] New request received for customer_id: 522

With this information, apart from sending those log lines to the central location, you can create a metric called, for example, `total_request_count` that is generated and exposed by the Promtail agent itself, so you can use a metrics approach even for systems or components that don't provide a standard way to do so, such as a formal metrics API.

And the way to do this is well integrated into the configuration. It is done with an additional stage (that is what the actions we can perform in Promtail are called) named metrics.

The schema of the metrics stage is straightforward, and if you are familiar with Prometheus, you will see how directly a Prometheus metric definition maps to this snippet:

# A map where the key is the name of the metric and the value is a specific
# metric type.
metrics:
  [<string>: [ <metric_counter> | <metric_gauge> | <metric_histogram> ] ...]

So we start by defining the kind of metric we would like to create, and we have the usual ones: counter, gauge, or histogram. For each of them, we have a set of options to declare our metric, as you can see here for a counter metric:

# The metric type. Must be Counter.
type: Counter

# Describes the metric.
[description: <string>]

# Defines custom prefix name for the metric. If undefined, default name "promtail_custom_" will be prefixed.
[prefix: <string>]

# Key from the extracted data map to use for the metric,
# defaulting to the metric's name if not present.
[source: <string>]

# Label values on metrics are dynamic which can cause exported metrics
# to go stale (for example when a stream stops receiving logs).
# To prevent unbounded growth of the /metrics endpoint any metrics which
# have not been updated within this time will be removed.
# Must be greater than or equal to '1s', if undefined default is '5m'
[max_idle_duration: <string>]

config:
  # If present and true all log lines will be counted without
  # attempting to match the source to the extract map.
  # It is an error to specify `match_all: true` and also specify a `value`
  [match_all: <bool>]

  # If present and true all log line bytes will be counted.
  # It is an error to specify `count_entry_bytes: true` without specifying `match_all: true`
  # It is an error to specify `count_entry_bytes: true` without specifying `action: add`
  [count_entry_bytes: <bool>]

  # Filters down source data and only changes the metric
  # if the targeted value exactly matches the provided string.
  # If not present, all data will match.
  [value: <string>]

  # Must be either "inc" or "add" (case insensitive). If
  # inc is chosen, the metric value will increase by 1 for each
  # log line received that passed the filter. If add is chosen,
  # the extracted value most be convertible to a positive float
  # and its value will be added to the metric.
  action: <string>
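Tying this back to the sample log lines above, a pipeline could look roughly like the following sketch (the regular expression and metric name are assumptions based on those example traces; because of the default prefix, the metric is exposed as promtail_custom_total_request_count):

pipeline_stages:
  - regex:
      # extract the customer_id from lines like:
      # [2021-06-06 22:02.12] New request received for customer_id: 123
      expression: 'New request received for customer_id: (?P<customer_id>\d+)'
  - metrics:
      total_request_count:
        type: Counter
        description: "Total number of requests seen in the log traces"
        source: customer_id
        config:
          action: inc            # add 1 for every log line that matched the regex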

And with that, you will have your metric created and exposed, just waiting for a Prometheus server to scrape it. If you would like to see all the available options, the documentation is available on the Grafana Labs site, which you can check at the link:

I hope you find this interesting and a useful way to keep all your observability information properly managed using the right solution, even for those pieces of software that don't follow your paradigm.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Level-Up Your Deployment Strategy with Canarying in Kubernetes


Save time and money on your application platform by deploying applications differently in Kubernetes.

We come from a time when we deployed an application using a seemingly simple, straight-line process. The traditional way is pretty much like this:
  • We wait until a weekend or a time when the load is low and the business can tolerate some service unavailability.
  • We schedule the change and warn all the teams involved so they are ready to manage the impact at that time.
  • We deploy the new version, all the teams run the functional tests they need to ensure it is working fine, and we wait for the real load to arrive.
  • We monitor during the first hours to see if something goes wrong, and if it does, we trigger a rollback process.
  • As soon as everything goes fine, we wait until the next release in 3–4 months.
But this is not valid anymore. Business demands that IT be agile and change quickly, and we cannot afford that kind of effort every week or, even worse, every day. Do you think it is possible to gather all the teams each night to deploy the latest changes? It is not feasible at all. So technology advanced to help us solve that issue in a better way, and this is where canarying comes in to help us.

Introducing Canary Deployments

Canary deployments (or just canarying, if you prefer) are not something new, and a lot of people have been talking about them for a long time. The technique has been around for a while, but before, it was neither easy nor practical to implement. Basically, it is based on deploying the new version into production while keeping the traffic pointing to the old version of the application, and then starting to shift some of the traffic to the new version.
Canary release in Kubernetes environment graphical representation
Based on that small subset of requests, you monitor how the new version performs at different levels: functional, performance, and so on. Once you feel comfortable with how it behaves, you shift all the traffic to the new version and you deprecate the old one.
Removal of old version after all traffic has been shifted to the newly deployed version.
The benefits that come with this approach are huge:
  • You don't need as big a staging environment as before, because you can run some of the tests with real data in production without affecting your business or the availability of your services.
  • You can reduce time to market and increase the frequency of deployments, because each deployment requires less effort and fewer people.
  • Your deployment window is much wider, as you no longer need to wait for a specific time slot, so you can deploy new functionality more frequently.

Implementing Canary Deployment in Kubernetes

To implement canary deployments in Kubernetes, we need more flexibility in how traffic is routed among our internal components, which is one of the capabilities you gain by using a service mesh. We already discussed the benefits of using a service mesh as part of your environment, but if you would like to take another look, please refer to this article:

Several technology components can provide those capabilities and let you create the traffic routes needed to implement this. To see how, you can take a look at the following article about one of the default options, Istio:

But being able to route the traffic is not enough to implement a complete canary deployment approach. We also need to monitor those metrics and act on them to avoid manual intervention, and for that we need additional tools. Prometheus is the de-facto option to monitor workloads deployed on a Kubernetes environment, and here you can get more info about how both projects play together.

And to manage the overall process, you can use a continuous deployment tool to put some governance around it, with options like Spinnaker or one of the extensions for continuous integration tools like GitLab or GitHub:
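To make the traffic-splitting part more concrete, this is a sketch of how a weighted route looks with Istio (the host and subset names are assumptions; the subsets would be defined in a matching DestinationRule):

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: my-app
spec:
  hosts:
    - my-app
  http:
    - route:
        - destination:
            host: my-app
            subset: v1             # current stable version
          weight: 90
        - destination:
            host: my-app
            subset: v2             # canary version receiving 10% of the traffic
          weight: 10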

Summary

In this article, we covered how we can evolve a traditional deployment model to keep pace with the innovation businesses require today, how canary deployment techniques can help us on that journey, and the technology components needed to set up this strategy in your own environment.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Best Kubernetes Management Tools: Dashboard, CLI, and GUI Options Compared


Maximizing the productivity of working with a Kubernetes environment with a tool for each persona

We all know that Kubernetes is the default environment for all the new applications we build. That Kubernetes platform can come in different shapes and flavors, but one thing is clear: it is complex.

The reason behind this complexity is the need to provide all that flexibility, but it is also true that the k8s project has never put much effort into providing a simple way to manage your clusters: kubectl is the point of access to send commands, leaving the door open for the community to provide its own solutions, and those are the things we are going to discuss today.

Kubernetes Dashboard: The Default Option

Kubernetes Dashboard is the default option for most installations. It is a web-based interface that is part of the K8s project, but it is not deployed by default when you install the cluster.

(Screenshot: Kubernetes Dashboard web interface)

K9S: The CLI option

K9s is one of the most common options for those who love a very powerful command-line interface with a lot of options at their disposal.

(Screenshot: K9s command-line interface)

It mixes all the power of a command-line interface, with every keyboard shortcut at your disposal, with a fancy graphical view that gives you a quick overview of the status of your cluster at a glance.

Lens — The Graphical Option

Lens is a supercharged GUI option that goes beyond just showing the status of the K8s cluster or allowing modifications to its components, with integration with other projects such as Helm and support for CRDs. It provides a very pleasant cluster management experience, with multi-cluster support as well. To know more about Lens, you can take a look at this article where we cover its main features:

Octant — The Web Option

Octant provides an improved experience compared with the default web option discussed earlier in this article, the Kubernetes Dashboard. It is built for extension, with a plug-in system that allows you to extend or customize Octant's behavior to maximize your productivity when managing K8s clusters. CRD support and graphical visualization of dependencies round out an awesome experience.

(Screenshot: Octant web interface)

Summary

We have presented in this article different tools that will help you with the important task of managing or inspecting your Kubernetes cluster. Each has its own characteristics and focuses on a different way of presenting the information (CLI, GUI, and web), so you can always find the one that works best for your situation and preferences.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Event-Driven Autoscaling in Kubernetes with KEDA (Beyond CPU and Memory Metrics)


KEDA provides a rich environment to scale your applications beyond the traditional HPA approach based on CPU and memory

Autoscaling is one of the great things about cloud-native environments and helps us make an optimized use of our resources. Kubernetes provides many options to do that, one of them being the Horizontal Pod Autoscaler (HPA) approach.

The HPA is the way Kubernetes detects whether it needs to scale any of the pods, and it is based on metrics such as CPU or memory usage.

https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/
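For reference, a classic CPU-based HPA definition looks roughly like the sketch below (the Deployment name and thresholds are illustrative assumptions):

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web-app                  # hypothetical Deployment to scale
  minReplicas: 1
  maxReplicas: 5
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 80   # scale out when average CPU goes above 80%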

Sometimes those metrics are not enough to decide whether the number of replicas we have available is sufficient. Other metrics can provide a better perspective, such as the number of requests or the number of pending events.

Kubernetes Event-Driven Autoscaling (KEDA)

Here is where KEDA comes to help. KEDA stands for Kubernetes Event-Driven Autoscaling and provides a more flexible approach to scale our pods inside a Kubernetes cluster.

It is based on scalers that implement different sources to measure the number of requests or events we receive from messaging systems such as Apache Kafka, AWS Kinesis, or Azure Event Hubs, and from other systems such as InfluxDB or Prometheus.

KEDA works as it is shown in the picture below:

(Diagram: how KEDA works)

We have a ScaledObject that links our external event source (e.g., Apache Kafka, Prometheus) with the Kubernetes Deployment we would like to scale, and we register it in the Kubernetes cluster.

KEDA will monitor the external source and, based on the metrics gathered, will instruct the Horizontal Pod Autoscaler to scale the workload as defined.

Testing the Approach with a Use-Case

So, now that we know how that works, we will do some tests to see it live. We are going to show how we can quickly scale one of our applications using this technology. And to do that, the first thing we need to do is to define our scenario.

In our case, the scenario will be a simple cloud-native application developed with Flogo that exposes a REST service.

The first step is to deploy KEDA in our Kubernetes cluster, and there are several options to do that: Helm charts, an Operator, or YAML files. In this case, we are going to use the Helm chart approach.

So, we are going to type the following commands to add the helm repository and update the charts available, and then deploy KEDA as part of our cluster configuration:

helm repo add kedacore https://kedacore.github.io/charts
helm repo update
helm install keda kedacore/keda

After running these commands, KEDA is deployed in our K8s cluster, and typing kubectl get all will show a situation similar to this one:

pod/keda-operator-66db4bc7bb-nttpz 2/2 Running 1 10m
pod/keda-operator-metrics-apiserver-5945c57f94-dhxth 2/2 Running 1 10m

Now we are going to deploy our application. As already mentioned, we are going to use our Flogo application, and the flow will be as simple as this one:

Flogo application listening to the requests
  • The application exposes a REST service using /hello as the resource.
  • Received requests are printed to the standard output, and a message is returned to the requester.

Once we have our application deployed on our Kubernetes cluster, we need to create a ScaledObject that is responsible for managing the scalability of that component:

ScaleObject configuration for the application

We use Prometheus as a trigger, and because of that, we need to configure where our Prometheus server is hosted and what query we would like to do to manage the scalability of our component.

In our sample, we will use flogo_flow_execution_count, the metric that counts the number of requests received by this component, and when its rate goes above 100, a new replica will be launched. A sketch of that ScaledObject configuration is shown below.
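This is roughly what that ScaledObject looks like (a sketch: the Deployment name, Prometheus address, and exact query are assumptions; the metric name and the threshold of 100 come from the description above):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: flogo-rest-scaledobject
spec:
  scaleTargetRef:
    name: flogo-rest-app                     # hypothetical Deployment name
  minReplicaCount: 1
  maxReplicaCount: 5
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus-server.monitoring.svc.cluster.local:9090
        metricName: flogo_flow_execution_count
        query: sum(rate(flogo_flow_execution_count[1m]))
        threshold: "100"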

After hitting the service with a load test, we can see that as soon as the service reaches the threshold, a new replica is launched to start handling requests, as expected.

Autoscaling being done using Prometheus metrics.

All of the code and resources are hosted in the GitHub repository shown below:



Summary

This post has shown that we have almost unlimited options when deciding on the scalability of our workloads. We can use standard metrics like CPU and memory, but if we need to go beyond that, we can use different external sources of information to trigger the autoscaling.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.