Multi-Stage Dockerfiles Explained: Reduce Docker Image Size the Right Way


A multi-stage Dockerfile is the pattern you can use to keep your Docker image at an optimized size. We have already covered the importance of keeping the size of your Docker image to a minimum and the tools you can use, such as dive, to understand the size of each of your layers. But today we are going to follow a different approach: a multi-stage build for our Docker containers.

What is a Multi-Stage Dockerfile Pattern?

The multi-stage Dockerfile is based on the principle that the same Dockerfile can have several FROM statements, and each FROM statement starts a new stage of the build.

Multi-Stage Dockerfile Pattern

Why Does the Multi-Stage Build Pattern Help Reduce the Size of Container Images?

The main reason the multi-stage build pattern helps reduce the size of container images is that you can copy any artifact, or set of artifacts, from one stage to another. And that is the most important point. Why? Because everything you do not copy is discarded, so you are not carrying all these unneeded components from layer to layer and inflating the size of the final Docker image.

How Do You Define a Multi-Stage Dockerfile?

First, you need a Dockerfile with more than one FROM. As mentioned, each FROM indicates the start of one stage of the multi-stage Dockerfile. To differentiate or reference them, you can name each stage by adding the AS clause to the FROM instruction, as shown below:

 FROM eclipse-temurin:11-jre-alpine AS builder

As a best practice, you can also add a stage label with the same name you provided before, but that is not required. So, in a nutshell, a multi-stage Dockerfile will look something like this:

FROM eclipse-temurin:11-jre-alpine AS builder
LABEL stage=builder
COPY . /
RUN apk add --no-cache unzip zip && zip -qq -d /resources/bwce-runtime/bwce-runtime-2.7.2.zip "tibco.home/tibcojre64/*"
RUN unzip -qq /resources/bwce-runtime/bwce*.zip -d /tmp && rm -rf /resources/bwce-runtime/bwce*.zip 2> /dev/null


FROM  eclipse-temurin:11-jre-alpine 
RUN addgroup -S bwcegroup && adduser -S bwce -G bwcegroup

How do you copy resources from one stage to another?

This is the other important part here. Once we have defined all the stages we need, and each is doing its part of the job, we need to move data from one stage to the next. So, how can we do that?

The answer is by using the COPY command. COPY is the same command you use to copy data from your local storage into the container image, so you need a way to indicate that this time you are not copying from your local storage but from another stage, and this is where the --from argument comes in. Its value is the name of the stage that we learned to declare in the previous section. So a complete COPY command will look like the snippet shown below:

 COPY --from=builder /resources/ /resources/
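Putting the pieces together, a minimal sketch of the full multi-stage Dockerfile, reusing the same stages and paths shown above (your own paths and base images may differ), could look like this:

# Stage 1: prepare the runtime artifacts
FROM eclipse-temurin:11-jre-alpine AS builder
LABEL stage=builder
COPY . /
RUN apk add --no-cache unzip zip && zip -qq -d /resources/bwce-runtime/bwce-runtime-2.7.2.zip "tibco.home/tibcojre64/*"
RUN unzip -qq /resources/bwce-runtime/bwce*.zip -d /tmp && rm -rf /resources/bwce-runtime/bwce*.zip 2> /dev/null

# Stage 2: final image that only keeps what was explicitly copied
FROM eclipse-temurin:11-jre-alpine
RUN addgroup -S bwcegroup && adduser -S bwce -G bwcegroup
COPY --from=builder /resources/ /resources/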

What Improvement Can You Get?

That is the essential part, and it will depend on how your Dockerfiles and images are built, but the primary factor to consider is the number of layers your current image has. The more layers you have, the more you can probably save on the size of the final container image with a multi-stage Dockerfile.

The main reason is that each layer duplicates part of the data, and you will almost certainly not need all of a layer’s data in the next one. Using the approach described in this article, you get a way to optimize that.

Where can I read more about this?

If you want to read more, you should know that the multi-stage Dockerfile is documented as one of the best practices on the official Docker website, and there is a great article about it by Alex Ellis that you can read here.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Inject Secrets into Kubernetes Pods Using HashiCorp Vault (Agent Injector Guide)


Introduction

This article covers how to inject secrets into Pods using Hashicorp Vault. In previous articles, we covered how to install Hashicorp Vault in Kubernetes, how to configure and create secrets in Hashicorp Vault, and how tools such as TIBCO BW can retrieve them. Today, we are going to go one step further.

The reason injecting secrets into pods is so important is that it keeps the application inside the pod unaware of any communication with Hashicorp Vault. For the application, the secret is just a regular file located in a specific path inside the container; it doesn’t need to care whether this file came from a Hashicorp secret or from a totally different source.

This injection approach fits the polyglot nature of the Kubernetes ecosystem because it removes any responsibility from the underlying application, in the same way that other injector-based approaches such as Istio do.

But let’s explain how this approach to injecting secrets into Pods using Hashicorp Vault works. As part of the installation, alongside the vault server we installed (or several servers if you did a distributed installation), we saw another pod under the name vault-agent-injector, as you can see in the picture below:

Inject Secrets In Pods: Vault Injector Pod

This agent is responsible for listening to the new deployments you create and, based on the annotations each deployment has, it launches a sidecar alongside your application, sends it the configuration needed to connect to the vault, downloads the required secrets, and mounts them as files inside your pod, as shown in the picture below:

To do that, we need to perform several configuration steps, which we are going to cover in the upcoming sections of the article.

Enabling Kubernetes Authentication in Hashicorp

The first thing we need to do at this stage is to enable the Kubernetes Authentication in Hashicorp. This method allows clients to authenticate with a Kubernetes Service Account Token. We do that with the following command:

 vault auth enable kubernetes

Vault accepts a service account token from any client in the Kubernetes cluster. During authentication, Vault verifies that the service account token is valid by querying the Kubernetes token review endpoint. Now, we need to configure this authentication method, providing the location of the Kubernetes API; to do that, we run the following command:

 vault write auth/kubernetes/config \
    kubernetes_host="https://$KUBERNETES_PORT_443_TCP_ADDR:443"

Defining a Kubernetes Service Account and a Policy

Now, we will create a Kubernetes Service Account that our pods will run as, and this service account will be allowed to retrieve the secret we generated in the previous post.

To do that, we will start with the creation of the service account by running this command from outside the pod:

kubectl create sa internal-app

This will create a new service account under the name of internal-app, and now we are going to generate a policy inside the Hashicorp Vault server by using this command inside the vault server pod:

 vault policy write internal-app - <<EOF
path "internal/data/database/config" {
  capabilities = ["read"]
}
EOF

And now, we associate this policy with the service account by running this command, also inside the vault server pod:

  vault write auth/kubernetes/role/internal-app \
    bound_service_account_names=internal-app \
    bound_service_account_namespaces=default \
    policies=internal-app \
    ttl=24h

And that’s pretty much all the configuration we need on the Vault side to be able to inject secrets into pods using Hashicorp Vault. Now, we need to configure our application accordingly by making the following modifications:

  • Set the serviceAccountName of the deployment to the one we created previously: internal-app
  • Add the specific annotations that inject the vault secrets and configure them.

Let’s start with the first point. We need to add the serviceAccountName to our Kubernetes Manifest YAML file as shown below:

Inject Secrets In Pods: Service Account Name definition
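A minimal sketch of that part of the manifest, assuming an illustrative deployment called my-app (only serviceAccountName is the relevant piece here):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app                           # illustrative name
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      serviceAccountName: internal-app   # the service account created earlier
      containers:
        - name: my-app
          image: my-app:latest           # illustrative image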

And regarding the second point, we solve it by adding several annotations to our deployment, as shown below:

Inject Secrets In Pods: Annotations

The annotations used to inject secrets in pods are the following ones (a combined snippet is shown after the list):

  • vault.hashicorp.com/agent-inject: ‘true’: This tells the vault injector that we want to inject the sidecar agent and the Vault configuration into this deployment. It is required before any further configuration can take effect.
  • vault.hashicorp.com/role: internal-app: This is the vault role we are going to use when requesting secrets from the vault, so that we only access the secrets allowed by the policy we created in the previous section.
  • vault.hashicorp.com/agent-inject-secret-secret-database-config.txt: internal/data/database/config: There will be one annotation like this per secret we plan to add, and it is composed of three parts:
    • vault.hashicorp.com/agent-inject-secret-: this part is fixed.
    • secret-database-config.txt: this part is the filename that is created under /vault/secrets inside our pod.
    • internal/data/database/config: this is the path of our secret inside the vault that is linked to that file.
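Putting the three annotations together, a hedged sketch of how they might look in the pod template of the Deployment manifest (only the annotations come from this article; the surrounding structure is the standard Deployment layout):

spec:
  template:
    metadata:
      annotations:
        vault.hashicorp.com/agent-inject: 'true'
        vault.hashicorp.com/role: 'internal-app'
        vault.hashicorp.com/agent-inject-secret-secret-database-config.txt: 'internal/data/database/config'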

And that’s it! If we deploy our application now, we will see the following things:

  • Our application has been launched with three containers instead of one, because two of them are Hashicorp Vault related, as you can see in the picture below:
Inject Secrets into Kubernetes Pods Using HashiCorp Vault (Agent Injector Guide)
  • vault-agent-init is the init container that establishes the connection to the vault server before any other container starts, and performs the first download and injection of secrets into the pod based on the configuration provided.
  • vault-agent is the container that keeps running as a watcher to detect any modification of the related secrets and update them.
Inject Secrets into Kubernetes Pods Using HashiCorp Vault (Agent Injector Guide)

And now, if we go to the main container, we will see in the /vault/secrets path that the secret has finally been injected as expected:

Inject Secrets into Kubernetes Pods Using HashiCorp Vault (Agent Injector Guide)

And this is how easily, and without any knowledge about the underlying app, we can inject secrets into pods using Hashicorp Vault.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

OpenLens vs Lens: Key Differences, Licensing Changes, and Which One to Use

OpenLens vs Lens: A New Battle Starting in January 2023

Introduction

We have already talked about Lens several times in different articles, but today I am bringing up OpenLens, because after the release of Lens 6 in late July a lot of questions have arisen, especially regarding its licensing change and its relationship with the OpenLens project. I thought it could be interesting to bring some of this data together in one place for anyone who is confused, so I will try to explain and answer the main questions you may have at the moment.

What is OpenLens?

OpenLens is the open-source project behind the code that supports the main functionality of Lens, the software that helps you manage and run your Kubernetes clusters. It is available on GitHub here (https://github.com/lensapp/lens), and it is fully open source and distributed under the MIT License. In its own words, this is the definition:

This repository ("OpenLens") is where Team Lens develops the Lens IDE product together with the community. It is backed by a number of Kubernetes and cloud-native ecosystem pioneers. This source code is available to everyone under the MIT license

OpenLens vs Lens?

So the main question you may have at the moment is: what is the difference between Lens and OpenLens? The main difference is that Lens is built on top of OpenLens, including some additional software and libraries with different licenses. It is developed by the Mirantis team (the same company that owns Docker Enterprise) and it is distributed under a traditional EULA.

Is Lens going to be private?

We need to start by saying that since the beginning Lens has been released under a traditional EULA, so on that front there is not much difference: we can say that OpenLens is open source, but Lens is freeware, or at least it was freeware until this point. But on July 28th we had the release of Lens 6, where the differences between the projects started to grow.

As commented in the Mirantis blog post, a lot of changes and new capabilities have been included, but on top of that the vision has also been revealed. As the Mirantis team says, they don’t want to stop at the level Lens is at today for managing Kubernetes clusters: they want to go beyond that, providing a web version of Lens to simplify access even more, extending its reach beyond Kubernetes, and so on.

So, you have to admit that this is a very compelling and, at the same time, very ambitious vision, and that’s why they are also making some changes to the license and model, which we are going to talk about below.

Is Lens still free?

We already commented that Lens was always released under a traditional EULA, so it was not open source like other projects such as its core, OpenLens, but it was free to use. With the release on July 28th, this is changing a bit to support their new vision.

They are releasing a new subscription model depending on how you use the tool, and the approach is very similar to the one they took with Docker Desktop, which, if you remember, we covered in an article too.

  • Lens Personal subscriptions are for personal use, education, and startups (less than $10 million in annual revenue or funding). They are free of charge.
  • Lens Pro subscriptions are required for professional use in larger businesses. The pricing is $19.90 per user/month or $199 per user/year.

The new license applies as of the release of Lens 6 on July 28th, but they have provided a grace period until January 2023 so you can adapt to the new model.

Should I stop using Lens now?

This is, as always, up to you, but things are going to stay the same until January 2023, and at that point you need to formalize your situation with Lens and Mirantis. If you qualify for a Lens Personal license because you are working for a startup or on open source, you can continue to use it without any problem. If that’s not the case, it is up to your company to decide whether the additional features they are providing now, and the vision for the future, justify the investment required for the Lens Pro license.

You will always have the option to switch from Lens to OpenLens. It will not be 100% the same, but the core functionality and approach at this moment will continue to be the same, and the project will certainly remain very active. And as Mirantis already confirmed in the same blog post: “There are no changes to OpenLens licensing or any other upstream open source projects used by Lens Desktop.” So you should not expect the same situation to happen if you are switching to OpenLens or already using it.

How can I install OpenLens?

Installation of OpenLens is a little bit tricky because you need to generate your own build from the source, but to ease that path there are several awesome people doing that on their GitHub repositories, such as Muhammed Kalkan, who provides a repo with the latest versions built only from open-source components for the major platforms (Windows, macOS (Intel and Apple Silicon), or Linux), available here:

What Features Am I Losing if I Switch to OpenLens?

For sure there will be some features that you will lose if you switch from Lens to OpenLens, namely the ones provided by the licensed pieces of software. Here we include a non-exhaustive list based on our experience using both products:

  • Account Synchronization: All the capabilities of having all your Kubernetes clusters under your Lens account and kept in sync will not be available on OpenLens. You will rely on the content of the kubeconfig file.
  • Spaces: The option to share your configuration between different users that belong to the same team is not available on OpenLens.
  • Scan Image: One of the new capabilities of Lens 6 is the option to scan the images of the containers deployed on the cluster, but this is not available on OpenLens.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Hadolint Explained: Dockerfile Best Practices Using CLI, Docker, and VS Code


Introduction

Hadolint is an open-source tool that helps you ensure, in an automated way, that all the Dockerfiles you create follow the available Dockerfile best practices. Hadolint, as the name already suggests, is a linter and, because of that, it can also teach you all these best practices as you write Dockerfiles yourself. We already mentioned it when talking about optimizing container image size, but today we are going to cover it more in-depth.

Hadolint is a small tool written in Haskell that parses the Dockerfile into an AST and applies rules on top of that AST. It stands on the shoulders of ShellCheck to lint the Bash code inside RUN instructions, as shown in the picture below:

Hadolint Explained: Dockerfile Best Practices Using CLI, Docker, and VS Code

There are several ways to run the tool, depending on what you try to achieve, and we will talk a little bit about the different options.

Running it as a standalone tool

The first way is to run it as a completely standalone tool that you can download from here, and then run the following command:

 hadolint <Dockerfile path>

It will run against the file and show any issues found, as you can see in the picture below:

Hadolint execution

For each issue found, it shows the line where the problem is detected, the code of the Dockerfile best-practice check being performed (DL3020 in this case), the severity of the check (error, warning, info, and so on), and a description of the issue.
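As an illustration, a Dockerfile like the following would typically trigger the DL3020 check, which recommends using COPY instead of ADD for plain files and folders:

FROM eclipse-temurin:11-jre-alpine
# hadolint would flag the next line with DL3020: use COPY instead of ADD for files and folders
ADD config/ /app/config/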

To see all the rules that are executed, you can check them in the GitHub Wiki; all of them are based on the Dockerfile best practices published by Docker on its official web page here.

For each of them, you will find a specific wiki page with all the information you need about the issue, why it is something that should be changed, and how it should be changed, as you can see in the picture below:

Hadolint GitHub Wiki page

Ignore Rules Capability

You can ignore some rules if you don’t want them to be applied, either because they produce false positives or because the checks are not aligned with the Dockerfile best practices used in your organization. To do that, you can include an --ignore parameter for each rule to skip:

 hadolint --ignore DL3003 --ignore DL3006 <Dockerfile>
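If you prefer not to repeat those flags on every run, hadolint also supports a configuration file (typically a .hadolint.yaml in the project root) where the same rules can be listed; a minimal sketch:

# .hadolint.yaml - rules listed here are skipped on every run
ignored:
  - DL3003
  - DL3006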

Running it as Docker Container

The tool is also available as a Docker container in the following repos:

docker pull hadolint/hadolint
# OR
docker pull ghcr.io/hadolint/hadolint

This makes it easy to introduce into your Continuous Integration and Continuous Deployment pipelines, or simply to use in your local environment if you prefer not to install software locally.
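For example, a typical invocation pipes the Dockerfile into the container so nothing needs to be installed locally:

docker run --rm -i hadolint/hadolint < Dockerfile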

Running it inside VS Code

Like many linters, it is essential to have it close to your development environment, and this time is no different. We would like to have the Dockerfile best practices right in the editor while we are typing, for two main reasons:

  • The sooner you see the issue, the faster you will fix it, so the code will always have better quality.
  • As soon as you know about the issue, you will not make it again in newer developments.

You will find Hadolint in the Extensions Marketplace, and you can install it from there:

Hadolint VS Code Extension


Once you have that done, each time you open a Dockerfile it will be validated against all these Dockerfile best practices, and the issues detected will be shown in the Problems view, as you can see in the picture below:

Hadolint: VS Code Extension Execution

And those issues will be re-evaluated as soon as you modify and save the Dockerfile, so you will always see a live view of the problems detected against the Dockerfile best practices.

TIBCO BusinessWorks HashiCorp Vault Integration: Secure Secrets in 3 Steps


Introduction

This article aims to show the TIBCO BW Hashicorp Vault configuration needed to integrate your TIBCO BW application with the secrets stored in Hashicorp Vault, mainly for the externalization and management of passwords and credential resources.

This article is part of my comprehensive TIBCO Integration Platform Guide where you can find more patterns and best practices for TIBCO integration platforms.

As you probably know, in a TIBCO BW application the configuration is stored in properties at different levels (Module or Application properties). You can read more about them here. The primary purpose of these properties is to provide flexibility in the application configuration.

These properties can be of different types, such as String, Integer, Long, Double, Boolean, and DateTime, among other technical resources inside TIBCO BW, as shown in the picture below:

TIBCO BW Hashicorp Vault Configuration: BW Property Types

The TIBCO BW Hashicorp Vault integration will affect only the properties of Password type (at least up to BW version 2.7.2/6.8.1). The reason is that those properties hold the kind of data that is sensitive and needs to be secured. Other configuration can be managed through standard Kubernetes components such as ConfigMaps.

BW Application Definition

We are going to start with a straightforward application, as you can see in the picture below:

TIBCO BW Hashicorp Vault Configuration: Property sample

It is just a simple timer that is executed once and inserts the current time into a PostgreSQL database. We will use Hashicorp Vault to store the password of the database user needed to connect to it. The username and the connection string will reside in a ConfigMap.

We will skip the part of the configuration regarding the deployment of the TIBCO BW application containers and linking them to a ConfigMap; there is an article covering that in detail in case you need to follow it. Here we will focus only on the TIBCO BW Hashicorp Vault integration.

So we need to tell TIBCO BW that the password of the JDBC Shared Resource will be linked to the Hashicorp Vault configuration, and to do that, the first thing is to tie the password of the Shared Resource to a Module Property, as shown in the picture below:

TIBCO BW Hashicorp Vault Configuration: Password linked to Module Property

Now, we need to declare that this Module Property is linked to Hashicorp Vault, and we will do that in the Application Property view, selecting that this property is linked to a Credential Management solution, as shown in the picture below:

TIBCO BW Hashicorp Vault Configuration: Credential Management Configuration for Property

And it is now when we establish the TIBCO BW Hashicorp Vault relationship. We need to click on the green plus sign, and a modal window will ask for the credential management technology we are going to use and the data needed for each of them, as you can see in the following picture:

TIBCO BW Hashicorp Vault Configuration: Credential Management Configuration for Property

We will select Hashicorp Vault as the provider. Then we will need to provide three attributes that we already commented on in the previous article, when we started creating secrets in Hashicorp Vault:

  • Secret Name: this is the secret name path after the root path of the element.
  • Secret Key: This is the key inside the secret itself
  • Mount Path: This is the root path of the secret

To get more details about these three concepts, please look at our article about how to create secrets in Hashicorp Vault.

So with all this, we have pretty much everything we need to connect to Hashicorp Vault and grab the secret. From the TIBCO BW BusinessStudio side everything is done; we can generate the EAR file and deploy it into Kubernetes, because that is where the last part of our configuration happens.

Kubernetes Deployment

Until this moment, we have the following information already provided:

  • A BW process that has the logic to connect to the database and insert information
  • The link between the password property used to connect and the Hashicorp secret definition

So pretty much everything is there, but one piece is missing: how will the Kubernetes pod connect to Hashicorp Vault once it is deployed? Until this point, we didn’t provide the Hashicorp Vault server location or the authentication method to connect to it. This is the missing part of the TIBCO BW Hashicorp Vault integration, and it will be part of the Kubernetes Deployment YAML file.

We will do that using the following environment properties in this sample (a YAML sketch follows the list):

TIBCO BW Hashicorp Vault Configuration: Hashicorp Environment Variables
  • HASHICORP_VAULT_ADDR: This variable points to where the Hashicorp Vault server is located.
  • HASHICORP_VAULT_AUTH: This variable indicates which authentication option will be used. In our case, we will use the token one, as we did in the previous article.
  • HASHICORP_VAULT_KV_VERSION: This variable indicates which version of the KV storage engine we are using, and it will be 2 by default.
  • HASHICORP_VAULT_TOKEN: This is just the token value used to authenticate against the Hashicorp Vault server.
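A hedged sketch of how these variables could look in the container spec of the Deployment (all values are placeholders; in a real setup the token would normally come from a Kubernetes Secret rather than being hard-coded):

containers:
  - name: bwce-app                              # illustrative container name
    image: bwce-app:latest                      # illustrative image
    env:
      - name: HASHICORP_VAULT_ADDR
        value: "http://vault.default.svc:8200"  # placeholder: location of the Vault server
      - name: HASHICORP_VAULT_AUTH
        value: "TOKEN"                          # placeholder: token-based method; check the TIBCO docs for the exact value
      - name: HASHICORP_VAULT_KV_VERSION
        value: "2"                              # KV storage engine version, 2 by default
      - name: HASHICORP_VAULT_TOKEN
        valueFrom:
          secretKeyRef:
            name: vault-token                   # illustrative Kubernetes Secret holding the token
            key: token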

If you are using other authentication methods or just want to know more about those properties, please take a look at this documentation from TIBCO.

With all that added to the environment properties of our TIBCO BW application, we can run it, and we will get an output similar to this one, which shows that the TIBCO BW Hashicorp Vault integration has been done and the application was able to start without any issue:

TIBCO BW Hashicorp Vault Configuration: Running sample

Create Secrets in HashiCorp Vault: CLI and REST API Explained Step by Step


Introduction

Creating secrets in Hashicorp Vault is one of the most important and relevant things you can do once you have installed Hashicorp Vault in your environment, usually followed by retrieving those secrets from the components that need them. But in today’s article, we will focus on the first part so you can learn how easily you can create secrets in Hashicorp Vault.

In previous articles we commented on the importance of Hashicorp Vault and on the installation process, as you can read here. Hence, at this point we already have our vault fully initialized and unsealed, ready to start serving requests.

Create Secrets in Hashicorp Vault using Hashicorp Vault CLI Commands

All the commands we will run use a key component named the Hashicorp Vault CLI, and you will notice that because all of our commands start with vault. To be honest, we already started with it in the previous article; if you remember, we already ran some of these commands to initialize or unseal the vault, but from now on this will be our main component to interact with.

The first thing we need to do is log into the vault, and to do that we are going to use the root token that was provided to us when we initialized the vault; we are going to store this token in an environment variable so it will be easier to work with. All the commands we are going to run now are executed inside the vault server pod, as shown in the picture below:

Create Secrets in Hashicorp Vault: Detecting Vault Server Pod

Once we are inside of it, we are going to run the login command with the following syntax:

 vault login 

And we will get an output similar to this one:

Create Secrets in Hashicorp Vault: Login in Hashicorp Vault

If we do not provide the token in advance, the console will ask for the token to be typed afterward, and it will be automatically hidden, as you can see in the picture below:

Create Secrets in Hashicorp Vault: Login without Token provided

After this point, we are already logged into the vault, so we can start typing commands to create secrets in Hashicorp Vault. Let’s start with that process.

To start with our process for creating secrets in Hashicorp Vault, we first need to create, or, to be more accurate with the Hashicorp Vault syntax, enable a secrets path, which you can think of as the root path that all your secrets will be related to. If we are talking about having secrets for different applications, each path can be one of the applications, but the organization of secrets can be different depending on the context. We will cover that in much more detail in the following articles.

To enable the secret path to start the creation of secrets in Hashicorp Vault, we will type the following command:

 vault secrets enable -path=internal kv-v2

That will enable a secrets store of type kv-v2 (a key-value store in its version 2), and the path will be “internal,” so everything we create after that will be under this “internal” root path.

And now we’re going to create the secret in Hashicorp Vault, and as we are using a key-value store, the syntax is also related to that, because we are going to “put” a secret using the following command:

 vault kv put internal/database/config username="db-readonly-username" password="db-secret-password"

That will create, inside the internal path, a child path /database/config where it will store two keys:

  • username that will have the value db-readonly-username
  • password that will have the value db-secret-password

As you can see, it is quite easy to create new secrets in the Vault linked to the path, and if you want to retrieve their content, you can also do it using the Vault CLI, this time using the get command, as shown in the snippet below:

 vault kv get internal/database/config

And the output will be similar to the one shown below:

Create Secrets in Hashicorp Vault: Retrieving Secrets from the Vault

This will help you interact with your store’s content to retrieve, add, or update what you already have there. Once you have everything ready, we can move to the client side to configure it to gather all this data as part of its lifecycle workflow.
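As a small extra, if you only need a single value, for example in a script, the CLI can also return just one field; a sketch using the secret created above:

 vault kv get -field=password internal/database/config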

Create Secrets in Hashicorp Vault using REST API

The Hashicorp Vault CLI simplifies the interaction with the vault server, but all the interaction between the CLI and the server happens through a REST API that the server exposes and the CLI client consumes. The CLI provides a simplified syntax for the user and translates the parameters provided into REST requests to the server, but you can also send REST requests to the server directly. Please look at this article in the official documentation to get more details about the REST API.
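As an illustration, the same secret we created with the CLI could be read and written directly through the KV v2 REST API with something like the following (assuming the server is reachable at http://127.0.0.1:8200 and that VAULT_TOKEN holds a valid token):

# Read the secret created earlier
curl --header "X-Vault-Token: $VAULT_TOKEN" \
     http://127.0.0.1:8200/v1/internal/data/database/config

# Create or update it by wrapping the key-value pairs in a "data" object
curl --header "X-Vault-Token: $VAULT_TOKEN" \
     --request POST \
     --data '{"data": {"username": "db-readonly-username", "password": "db-secret-password"}}' \
     http://127.0.0.1:8200/v1/internal/data/database/config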

HashiCorp Vault Installation on Kubernetes Using Helm (Quick Start Guide)


Introduction

In this article, we are going to cover the Hashicorp Vault installation on Kubernetes. Hashicorp Vault has become one of the industry standards for managing secrets and sensitive data in production environments, covering both cloud-native and non-cloud-native deployments. But especially in Kubernetes this is a critical component: we have already commented that Kubernetes Secrets are not very secure by default, and HashiCorp Vault solves that problem.

Installation Methods

Hashicorp Vault provides many different installation methods that you can read about on their official page here; most of them still focus on traditional environments. In summary, these are the ones available:

  • Install from Package Manager
  • Install from pre-existing binary
  • Install it from the source
  • Helm for Kubernetes

As you can imagine, the path we will follow here is the Helm way. I guess you are already familiar with Helm, but if not, I have some articles focused on Helm Charts where you can find all the valuable information.

Helm Chart for Hashicorp Vault

For the sake of this article, we are going to do what is called a standalone Hashicorp Vault installation, so we are not going to create a production-ready, High-Availability (HA) architecture in this post, but something that can help you start playing with the tool and see how it can be integrated with others that belong to the same cloud-native environment. To get more information about deploying Hashicorp Vault in a production-ready setup, please look at the following link.

We first need to install the Helm chart in our local environment, but we need to be very careful about the Helm version we have. At the time of writing this article, the Hashicorp Vault installation requires Helm 3.7+, so you must first check the version you have installed.
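A quick way to check it is the following command, which prints just the client version string:

 helm version --short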

In case you’re running an older version, you will get the following error:

 Error: parse error at (vault/templates/_helpers.tpl:38): unclosed action

You can get more details on this GitHub issue.

At the time of writing this article, the latest version of Helm is 3.9, but this version causes an issue with AWS EKS, with this error:

 Error: Kubernetes cluster unreachable: exec plugin: invalid apiVersion "client.authentication.k8s.io/v1alpha1."
Failed installing **** with helm

You can get more details on this GitHub issue.

So, in that case, the best way to ensure there will not be a problem with the Hashicorp Vault installation is to downgrade to 3.8, and you will be able to deploy the Helm chart without any issue.

Hashicorp Vault Installation Process

To proceed with the Hashicorp Vault Installation, we need to run the following commands:

helm repo add hashicorp https://helm.releases.hashicorp.com
helm install vault hashicorp/vault

This will install two different components: a single vault server as part of a StatefulSet, and a vault-agent-injector to manage the injection of the vault configuration into the various components and deployments in the other namespaces.

To get the pods running, we need to initialize and unseal the vault before it is ready to use. To do that, we need to get inside the vault-server pod and execute the following command:

 vault operator init

This will generate several essential things:

  • It will generate the keys needed to unseal the vault so it can start being used. It will output a number of keys, in our sample 5, and you will need at least 3 of them to unseal the vault.
  • It will also generate a root token used to log into the CLI and interact with the server to read and write secrets.

After that, we will need to run the following command at least three times, providing a different unseal key each time:

 vault operator unseal

After that point, all components are Running and Ready, and we can conclude our Hashicorp Vault installation and start interacting with the vault to create your secrets.
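A quick way to verify that the vault is no longer sealed, assuming the default pod name vault-0 created by the chart, is:

 kubectl exec -it vault-0 -- vault status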

Hashicorp Vault Installation: All Components Ready

Helm Loops Explained: A Practical Helm Hack to Avoid Deployment Issues


Introduction



When working with complex Helm deployments, mastering loops is just one piece of the puzzle. For a comprehensive understanding of Helm from fundamentals to advanced patterns, check out our complete Helm Charts & Kubernetes Package Management Guide.

Helm Charts are becoming the de facto default solution when you want to package your Kubernetes deployment to be able to distribute it or quickly install it in your system.

Described several times as “the apt for Kubernetes” for its similarity with the long-standing package manager from GNU/Linux Debian-like distributions, it seems to continue to grow in popularity each month compared with other similar solutions even more tightly integrated into Kubernetes, such as Kustomize, as you can see in the Google Trends picture below:

Helm Loops: Helm Charts vs Kustomize

But creating these Helm charts is not as easy as it seems. If you have already done this work, you probably got stuck at some point or spent a lot of time trying to do some things. If this is the first time you are creating one, or you are trying to do something advanced, I hope all these tricks will help you on your journey. Today we are going to cover one of the most important ones: Helm loops.

Helm Loops Introduction

If you look at any Helm chart, for sure you will see a lot of conditional blocks. Pretty much everything is wrapped in an if/else structure based on the values.yml files you are creating. But things get a little bit tricky when we talk about loops. The great thing is that you have the option to execute a loop inside your Helm charts using the range primitive.

How to create a Helm Loop?

The usage of the range primitive is quite simple, as you only need to specify the element you want to iterate across, as shown in the snippet below:

{{- range .Values.pizzaToppings }}
- {{ . | title | quote }}
{{- end }}    

This is a pretty simple sample where the YAML will iterate over the values that you have assigned to the pizzaToppings structure in your values.yml.

There are some concepts to keep in mind in this situation:

  • You can easily access everything inside the structure you are looping across. So, if a pizza topping has additional fields, you can access them with something similar to this:
{{- range .Values.pizzaToppings }}
- {{ .ingredient.name | title | quote }}
{{- end }}

And this will access a structure similar to this one in your values.yml:

 pizzaToppings:
   - ingredient:
       name: Pineapple
       weight: 3

The good thing is that you can access the underlying attributes without replicating the whole parent hierarchy down to the looping structure, because inside the range section the scope has changed: . now refers to the root of each element we are iterating across.

How to access parent elements inside a Helm Loop?

In the previous section, we covered how easily we can access the inner attributes inside the loop structure thanks to the change of scope, but this also has a downside: if I want to access some element in the parent of my values.yml file, or somewhere outside the structure, how can I access it?

The good thing is that we also have a great answer to that, but to get there we need to understand a little bit about scopes in Helm.

As commented, . refers to the root element in the current scope. If you have never defined a range section or another primitive that switches the context, . always refers to the root of your values.yml. That is why, when you look at a Helm chart, you see all the structures written as .Values.x.y.z. But we have already seen that inside a range section this changes, so this is not a reliable way to reach the parent.

To solve that, we have the $ context, which always refers to the root of the values.yml no matter what the current scope is. That means that if I have the following values.yml:

base:
  type: slim
pizzaToppings:
  - ingredient:
      name: Pineapple
      weight: 3
  - ingredient:
      name: Apple
      weight: 3

And I want to refer to the base type inside the range section; similar to before, I can do it using the following snippet:

{{- range .Values.pizzaToppings }}
- {{ .ingredient.name | title | quote }} {{ $.Values.base.type }}
{{- end }}    

That will generate the following output:

 - "Pineapple" slim
 - "Apple" slim

So I hope this Helm chart trick will help you with the creation, modification, or improvement of your Helm charts in the future by using Helm loops without any further concern!

BanzaiCloud Logging Operator on Kubernetes: Log Aggregation in 5 Minutes


In the previous article, we described the capabilities the BanzaiCloud Logging Operator provides and its main features. So today we are going to see how we can implement it.

The first thing we need to do is install the operator itself, and for that we have a Helm chart at our disposal, so the only thing we will need to do is run the following commands:

 helm repo add banzaicloud-stable https://kubernetes-charts.banzaicloud.com
helm upgrade --install --wait --create-namespace --namespace logging logging-operator banzaicloud-stable/logging-operator

That will create a logging namespace (in case you didn’t have it yet), and it will deploy the operator components, as you can see in the picture below:

BanzaiCloud Logging Operator on Kubernetes: Log Aggregation in 5 Minutes
BanzaiCloud Logging Operator installed using HelmChart

So now we can start creating the resources we need using the CRDs that we commented on in the previous article. To do a quick recap, these are the ones we have at our disposal:

  • logging – The logging resource defines the logging infrastructure for your cluster that collects and transports your log messages. It also contains configurations for Fluentd and Fluent-bit.
  • output / clusteroutput – Defines an Output for a logging flow, where the log messages are sent. output will be namespaced based, and clusteroutput will be cluster based.
  • flow / clusterflow – Defines a logging flow using filters and outputs. The flow routes the selected log messages to the specified outputs. flow will be namespaced based, and clusterflows will be cluster based.

So, first of all, we are going to define our scenario. I don’t want to make it complex: I want all the logs that my workloads generate, no matter what namespace they are in, to be sent to a Grafana Loki instance that I have also installed on that Kubernetes cluster on a specific endpoint, using the Simple Scalable configuration for Grafana Loki.

So, let’s start with the components we need. First, we need a Logging object to define the logging infrastructure, and I will create it with the following command:

kubectl -n logging apply -f - <<"EOF"
apiVersion: logging.banzaicloud.io/v1beta1
kind: Logging
metadata:
  name: default-logging-simple
spec:
  fluentd: {}
  fluentbit: {}
  controlNamespace: logging
EOF

We will keep the default configuration for fluentd and fluent-bit just for the sake of the sample, and later on in upcoming articles, we can talk about a specific design, but that’s it.

Once the CRD is processed, the components will appear in your logging namespace. In my case, as I’m using a 3-node cluster, I will see 3 instances of fluent-bit deployed as a DaemonSet and a single instance of fluentd, as you can see in the picture below:

BanzaiCloud Logging Operator on Kubernetes: Log Aggregation in 5 Minutes
BanzaiCloud Logging Operator configuration after applying Logging CRD

So now we need to define the communication with Loki, and as I would like to use this for any namespace I may have on my cluster, I will use the ClusterOutput option instead of the normal, namespaced Output one. To do that, we will use the following command (please ensure that the endpoint is the right one; in our case this is loki-gateway.default, as it is running inside the Kubernetes cluster):

kubectl -n logging apply -f - <<"EOF"
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterOutput
metadata:
 name: loki-output
spec:
 loki:
   url: http://loki-gateway.default
   configure_kubernetes_labels: true
   buffer:
     timekey: 1m
     timekey_wait: 30s
     timekey_use_utc: true
EOF

And we pretty much have everything; we just need one flow to connect our Logging configuration to the ClusterOutput we just created. Again, we will go with the ClusterFlow because we would like to define this at the cluster level and not in a per-namespace fashion. So we will use the following command:

 kubectl -n logging  apply -f - <<"EOF"
apiVersion: logging.banzaicloud.io/v1beta1
kind: ClusterFlow
metadata:
  name: loki-flow
spec:
  filters:
    - tag_normaliser: {}
  match:
    - select: {}
  globalOutputRefs:
    - loki-output
EOF

And after some time for the configuration to reload (1-2 minutes or so), you will start to see in the Loki traces something like this:

BanzaiCloud Logging Operator on Kubernetes: Log Aggregation in 5 Minutes
Grafana showing the logs submitted by the BanzaiCloud Logging Operator

And that indicates that we are already receiving logs pushed from the different components, mainly the fluentd element we configured in this case. But I think it is better to see it graphically with Grafana:

BanzaiCloud Logging Operator on Kubernetes: Log Aggregation in 5 Minutes
Grafana showing the logs submitted by the BanzaiCloud Logging Operator

And that’s it! Changing our logging configuration is as simple as changing the CRD components we defined, applying matches and filters, or sending the logs to a new place. In a very straightforward way we have this completely managed.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Log Aggregation in Kubernetes Explained with BanzaiCloud Logging Operator


We have already talked about the importance of log aggregation in Kubernetes and why the change in the behavior of the components makes it a mandatory requirement for any new architecture we deploy today.

To solve that part, we have a lot of different stacks that you have probably heard about. For example, if we follow the traditional Elasticsearch path, we have the pure ELK stack: Elasticsearch, Logstash, and Kibana. This stack has since been extended with the different “Beats” (Filebeat, Packetbeat, …) that provide a light log forwarder to be added to the mix.

Also, you can swap Logstash for a CNCF component such as Fluentd, which you have probably heard about, and in that case we’re talking about an EFK stack following the same principle. And there is also the Grafana Labs view, using Promtail, Grafana Loki, and Grafana for dashboarding, following a different perspective.

Then you can switch and change any component for the one of your preference, but in the end, you will have three different kinds of components:

  • Forwarder: Component that listens to all the log inputs, mainly the stdout/stderr output from your containers, and pushes them to a central component.
  • Aggregator: Component that receives all the traces from the forwarders and applies rules to filter some of the events, and to format and enrich the ones received, before sending them to central storage.
  • Storage: Component that receives the final traces to be stored and retrieved by the different clients.

To simplify the management of all that in Kubernetes, we have a great Kubernetes operator named the BanzaiCloud Logging Operator that follows this approach in a declarative, policy-based manner. So let’s see how it works; to explain it better, I will use the central diagram from its website:

Log Aggregation in Kubernetes Explained with BanzaiCloud Logging Operator
BanzaiCloud Logging Operator Architecture

This operator uses the same technologies we were talking about. It covers mainly the first two steps, forwarding and aggregation, plus the configuration needed to send the logs to a central storage of your choice. To do that, it works with the following technologies, all of them part of the CNCF Landscape:

  • Fluent-bit acts as a forwarder deployed as a DaemonSet to collect all the logs you have configured.
  • Fluentd acts as an aggregator, defining the flows and rules of your choice to adapt the trace flow you are receiving and send it to the output of your choice.

And as this is a Kubernetes operator, it works in a declarative way. We will define a set of objects that describe our logging policies. We have the following components:

  • logging – The logging resource defines the logging infrastructure for your cluster that collects and transports your log messages. It also contains configurations for Fluentd and Fluent-bit.
  • output / clusteroutput – Defines an Output for a logging flow, where the log messages are sent. output will be namespaced based, and clusteroutput will be cluster based.
  • flow / clusterflow – Defines a logging flow using filters and outputs. The flow routes the selected log messages to the specified outputs. flow will be namespaced based, and clusterflows will be cluster based.

In the picture below, you can see how these objects interact to define your desired logging architecture:

Log Aggregation in Kubernetes Explained with BanzaiCloud Logging Operator
BanzaiCloud Logging Operator CRD Relationship

And apart from the policy mode, it also includes a lot of great features such as:

  • Namespace isolation
  • Native Kubernetes label selectors
  • Secure communication (TLS)
  • Configuration validation
  • Multiple flow support (multiply logs for different transformations)
  • Multiple output support (store the same logs in multiple storages: S3, GCS, ES, Loki, and more…)
  • Multiple logging system support (multiple Fluentd, Fluent Bit deployment on the same cluster)

In upcoming articles we will talk about how we can implement this, so you can see all the benefits that this CRD-based, policy-based approach can provide to your architecture.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.