Ephemeral Containers in Kubernetes Explained: A Powerful Debugging Tool

In the dynamic and ever-evolving world of container orchestration, Kubernetes continues to reign as the ultimate choice for managing, deploying, and scaling containerized applications. As Kubernetes evolves, so do its features and capabilities, and one such fascinating addition is the concept of Ephemeral Containers. In this blog post, we will delve into the world of Ephemeral Containers, understanding what they are, exploring their primary use cases, and learning how to implement them, all with guidance from the Kubernetes official documentation.

 What Are Ephemeral Containers?

Ephemeral Containers, introduced as an alpha feature in Kubernetes 1.16 and graduated to stable in Kubernetes 1.25, offer a powerful toolset for debugging, diagnosing, and troubleshooting issues within your Kubernetes pods without requiring you to alter your pod’s original configuration. Unlike regular containers that are part of the pod’s original definition, ephemeral containers are dynamically added to a running pod for a short-lived duration, providing you with a temporary environment to execute diagnostic tasks.

The good thing about ephemeral containers is that they let you bring all the tools required for the job (debugging, data recovery, or anything else) without baking that extra tooling into your base containers and increasing the security risk that comes with doing so.

 Main Use-Cases of Ephemeral Containers

  • Troubleshooting and Debugging: Ephemeral Containers shine brightest when troubleshooting and debugging. They allow you to inject a new container into a problematic pod to gather logs, examine files, run commands, or even install diagnostic tools on the fly. This is particularly valuable when encountering issues that are difficult to reproduce or diagnose in a static environment.
  • Log Collection and Analysis: When a pod encounters issues, inspecting its logs is often essential. Ephemeral Containers make this process seamless by enabling you to spin up a temporary container with log analysis tools, giving you instant access to log files and aiding in identifying the root cause of problems.
  • Data Recovery and Repair: Ephemeral Containers can also be used for data recovery and repair scenarios. Imagine a situation where a database pod faces corruption. With an ephemeral container, you can mount the pod’s storage volume, perform data recovery operations, and potentially repair the data without compromising the running pod.
  • Resource Monitoring and Analysis: Performance bottlenecks or resource constraints can sometimes affect a pod’s functionality. Ephemeral Containers allow you to analyze resource utilization, run diagnostics, and profile the pod’s environment, helping you optimize its performance.

Implementing Ephemeral Containers

Thanks to Kubernetes’ user-friendly approach, implementing ephemeral containers is straightforward. Kubernetes provides the kubectl debug command, which streamlines the process of attaching ephemeral containers to pods. This command allows you to specify the pod and namespace and even choose the debugging container image to be injected.

kubectl debug <pod-name> -n <namespace> --image=<debug-container-image>

You can go even further and, instead of adding the ephemeral container to the running pod, do the same on a copy of the pod, as you can see in the following command:

kubectl debug myapp -it --image=ubuntu --share-processes --copy-to=myapp-debug

Finally, once you have finished debugging, you can permanently remove the debug copy with a kubectl delete command, as shown below, and that’s it.
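
For instance, cleaning up the debug copy created in the previous example takes a single command:

kubectl delete pod myapp-debug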

It’s important to note that all these actions require direct access to the environment, and they temporarily create a drift from your “infrastructure-as-code” definition, since you are manipulating the runtime state. This approach is therefore harder to reconcile with GitOps practices or tools such as Rancher Fleet or ArgoCD.

Conclusion

Ephemeral Containers, a stable feature since the Kubernetes 1.25 release, offer impressive capabilities for debugging and diagnosing issues within your Kubernetes pods. By allowing you to dynamically inject temporary containers into running pods, they empower you to troubleshoot problems, collect logs, recover data, and optimize performance without disrupting your application’s core functionality. As Kubernetes continues to evolve, features like Ephemeral Containers demonstrate its commitment to providing developers with tools that simplify the management and maintenance of containerized applications. So, the next time you encounter a stubborn issue within your Kubernetes environment, remember that Ephemeral Containers might be the debugging superhero you need!

For more detailed information and usage examples, check out the Kubernetes official documentation on Ephemeral Containers. Happy debugging!


Istio Proxy DNS Explained: How DNS Capture Improves Service Mesh Traffic Control

Istio is a popular open-source service mesh that provides a range of powerful features for managing and securing microservices-based architectures. We have talked a lot about its capabilities and components, but today we will talk about how we can use Istio to help with the DNS resolution mechanism.

As you already know, in a typical Istio deployment each service is accompanied by a sidecar proxy, Envoy, which intercepts and manages the traffic between services. The Proxy DNS capability of Istio leverages this proxy to handle DNS resolution requests more intelligently and efficiently.

Traditionally, when a service within a microservices architecture needs to communicate with another service, it relies on DNS resolution to discover the IP address of the target service. However, traditional DNS resolution can be challenging to manage in complex and dynamic environments, such as those found in Kubernetes clusters. This is where the Proxy DNS capability of Istio comes into play.

 Istio Proxy DNS Capabilities

With Proxy DNS, Istio intercepts and controls DNS resolution requests from services and performs the resolution on their behalf. Instead of relying on external DNS servers, the sidecar proxies handle the DNS resolution within the service mesh. This enables Istio to provide several valuable benefits:

  • Service discovery and load balancing: Istio’s Proxy DNS allows for more advanced service discovery mechanisms. It can dynamically discover services and their corresponding IP addresses within the mesh and perform load balancing across instances of a particular service. This eliminates the need for individual services to manage DNS resolution and load balancing.
  • Security and observability: Istio gains visibility into the traffic between services by handling DNS resolution within the mesh. It can apply security policies, such as access control and traffic encryption, at the DNS level. Additionally, Istio can collect DNS-related telemetry data for monitoring and observability, providing insights into service-to-service communication patterns.
  • Traffic management and control: Proxy DNS enables Istio to implement advanced traffic management features, such as routing rules and fault injection, at the DNS resolution level. This allows for sophisticated traffic control mechanisms within the service mesh, enabling A/B testing, canary deployments, circuit breaking, and other traffic management strategies.

 Istio Proxy DNS Use-Cases

There are situations where you cannot, or don’t want to, rely on standard DNS resolution. Why is that? To start with, DNS is a great protocol, but it lacks some capabilities, such as locality awareness. If the same DNS name resolves to three IPs, they are returned in a round-robin fashion with no notion of location.

Or perhaps a name resolves to several IPs and you want to block some of them for a specific service; these are the kinds of things you can do with Istio Proxy DNS.

Istio Proxy DNS Enablement

You need to know that Istio Proxy DNS capabilities are not enabled by default, so you must enable them explicitly if you want to use them. The good thing is that you can enable them at different levels, from the whole mesh down to a single pod, so you can choose what is best for each case.

For example, if we want to enable it at the pod level, we need to add the following annotation so it is picked up by the Istio proxy configuration:

proxy.istio.io/config: |
  proxyMetadata:
    # Enable basic DNS proxying
    ISTIO_META_DNS_CAPTURE: "true"
    # Enable automatic address allocation, optional
    ISTIO_META_DNS_AUTO_ALLOCATE: "true"
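
For reference, this annotation goes under the pod template metadata of your workload; a minimal sketch (the deployment name, labels, and image below are placeholders) could look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-service
spec:
  selector:
    matchLabels:
      app: my-service
  template:
    metadata:
      labels:
        app: my-service
      annotations:
        proxy.istio.io/config: |
          proxyMetadata:
            # Enable DNS proxying only for this workload
            ISTIO_META_DNS_CAPTURE: "true"
            ISTIO_META_DNS_AUTO_ALLOCATE: "true"
    spec:
      containers:
      - name: my-service
        image: registry.example.com/my-service:1.0.0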

The same configuration can also be applied at the mesh level as part of the operator installation, as documented on the Istio official page.
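
At the mesh level, the equivalent setting (a sketch following the IstioOperator approach described in the Istio documentation) lives under meshConfig.defaultConfig:

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
spec:
  meshConfig:
    defaultConfig:
      proxyMetadata:
        # Enable DNS proxying for every sidecar in the mesh
        ISTIO_META_DNS_CAPTURE: "true"
        ISTIO_META_DNS_AUTO_ALLOCATE: "true"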

Conclusion

In summary, the Proxy DNS capability of Istio enhances the DNS resolution mechanism within the service mesh environment, providing advanced service discovery, load balancing, security, observability, and traffic management features. Istio centralizes and controls DNS resolution by leveraging the sidecar proxies, simplifying the management and optimization of service-to-service communication in complex microservices architectures.


Helm Multiple Instances Subcharts Explained: Reuse the Same Chart with Aliases

Using multiple instances of the same subchart as part of your main chart can sound strange at first. We have already covered Helm subcharts and dependencies on this blog, where the usual use case is the following:


I have a chart that needs another component, so I “import” it as a subchart. That lets me deploy the component and customize its values without creating another copy of the chart, which, as you can imagine, greatly simplifies chart management.


So far, so clear. But what are we talking about now? The use case is having the same subchart defined twice. So, imagine this scenario: instead of this:

# Chart.yaml
dependencies:
- name: nginx
  version: "1.2.3"
  repository: "https://example.com/charts"
- name: memcached
  version: "3.2.1"
  repository: "https://another.example.com/charts"

We end up with something like this:

# Chart.yaml
dependencies:
- name: nginx
  version: "1.2.3"
  repository: "https://example.com/charts"
- name: memcached
  alias: memcached-copy1
  version: "3.2.1"
  repository: "https://another.example.com/charts"
- name: memcached
  alias: memcached-copy2
  version: "3.2.1"
  repository: "https://another.example.com/charts"

So we have the option to define more than one “instance” of the same subchart. And I guess, at this point, you may start asking yourself: “What are the use cases where I could need this?”

That’s quite understandable: unless you need it, you never realize it’s there. The same happened to me. So let’s talk a bit about the possible use cases.

 Use-Cases For Multi Instance Helm Dependency

Imagine that you’re deploying a Helm chart for a set of microservices that belong to the same application, and each microservice shares the same technology base; that could be TIBCO BusinessWorks Container Edition or Golang microservices. Since all of them share the same base, they can use the same chart (“bwce-microservice” or “golang-microservice”), but each of them has its own configuration, for example:

  • Each of them will have its own image name, which differs from one to the other.
  • Each of them will have its own configuration, which differs from one to the other.
  • Each of them will have its own endpoints, which differ and probably even connect to different backends such as databases or external systems.

This approach lets us reuse the same technology Helm chart, “bwce”, and instantiate it several times, each with its own configuration, without creating anything “custom” and while keeping the maintainability benefits that the Helm dependency approach gives us.

 How can we implement this?

Now that we have a clear use case to support, the next step is how to make this a reality. And, to be honest, it is much simpler than you might think. Let’s start with the normal situation where we have a main chart, let’s call it the “program” chart, that includes the “bwce” chart as a dependency, as you can see here:

name: multi-bwce
description: Helm Chart to Deploy a TIBCO BusinessWorks Container Edition Application
apiVersion: v1
version: 0.2.0
icon: 
appVersion: 2.7.2

dependencies:
- name: bwce
  version: ~1.0.0
  repository: "file:///Users/avazquez/Data/Projects/DET/helm-charts/bwce"

Now we are going to move to a multi-instance approach where we need two different microservices, let’s call them serviceA and serviceB, and both of them will use the same bwce Helm chart.

So the first thing we will modify is the Chart.yaml as follows:

name: multi-bwce
description: Helm Chart to Deploy a TIBCO BusinessWorks Container Edition Application
apiVersion: v1
version: 0.2.0
icon: 
appVersion: 2.7.2

dependencies:
- name: bwce
  alias: serviceA
  version: ~0.2.0
  repository: "file:///Users/avazquez/Data/Projects/DET/helm-charts/bwce"
- name: bwce
  alias: serviceB
  version: ~0.2.0
  repository: "file:///Users/avazquez/Data/Projects/DET/helm-charts/bwce"

The important part here is how we declare the dependencies. As you can see, we keep the same “name”, but each entry has an additional field named “alias”, and this alias is what we will use later to identify the properties of each instance as required. With that, we already have our two instance definitions, serviceA and serviceB, and we can start using them in the values.yaml as follows:

# This is a YAML-formatted file.
# Declare variables to be passed into your templates.

serviceA:
  image: 
    imageName: 552846087011.dkr.ecr.eu-west-2.amazonaws.com/tibco/serviceA:2.5.2
    pullPolicy: Always
serviceB:  
  image: 
    imageName: 552846087011.dkr.ecr.eu-west-2.amazonaws.com/tibco/serviceB:2.5.2
    pullPolicy: Always
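
With the aliases in place, a minimal workflow (a sketch assuming the main chart lives in the current directory) could be:

# Download the bwce dependency (used by both aliases) into charts/
helm dependency update .

# Render everything locally to check that both instances are generated
helm template multi-bwce .

# Install the release containing both serviceA and serviceB
helm install multi-bwce .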
  

 Conclusion

The main benefit of this approach is that it extends the options for using a Helm chart with “complex” applications that require several instances of the same kind of component at the same time.

That doesn’t mean you need one huge Helm chart for your whole project, because that would go against the best practices of the containerization and microservices approach, but it gives you the option to define the levels of abstraction you want while keeping all the benefits from a management perspective.

Kubeconform Explained: Validate Kubernetes Manifests and Prevent API Errors

The Kubernetes API changes quite a lot: every new version adds new capabilities while deprecating older ones, so it is in constant evolution, as we have already discussed in previous articles, for example regarding Autoscaling v2 and Vertical Autoscaling.

Some of these changes involve a shift in the apiVersion of certain objects, and you have probably already suffered a v1alpha1 moving to v1beta1, or an object graduating to a final v1 while the previous version is deprecated. In the end, it is crucial to ensure that your manifests are in sync with the target version you’re deploying to, and some tools can help us with that, including Kubeconform.

What is Kubeconform?

Kubeconform is a powerful utility designed to assist in Kubernetes configuration management and validation. As Kubernetes continues to gain popularity as the go-to container orchestration platform, ensuring the correctness and consistency of configuration files becomes crucial. Kubeconform addresses this need by providing a comprehensive toolset to validate Kubernetes configuration files against predefined standards or custom rules.

Kubeconform supports multiple versions of Kubernetes, allowing you to validate configuration files against different API versions. This flexibility is beneficial when working with clusters running different Kubernetes versions or migrating applications across clusters with varying configurations.

Another great feature of Kubeconform is its ability to enforce best practices and standards across Kubernetes configurations. It allows you to define rules, such as enforcing proper labels, resource limits, or security policies, and then validates your configuration files against these rules. This helps catch potential issues early on and ensures that your deployments comply with established guidelines.

How to install Kubeconform?

Kubeconform can be installed from different sources: the most common option is the standard package manager for your environment, such as brew, apt, or similar, or you can simply grab the binaries from its GitHub releases page: https://github.com/yannh/kubeconform/releases.
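
For example, a minimal installation (the Homebrew formula and the release asset name shown here are the usual ones, but double-check them for your platform) could be:

# macOS, via Homebrew
brew install kubeconform

# Linux, grabbing a release binary from GitHub
curl -sL https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz | tar xz
sudo mv kubeconform /usr/local/bin/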


How to launch Kubeconform from the Command Line?

Kubeconform is shipped as a small binary intended to be executed from the CLI, and it tries to keep its interface minimal to ensure compatibility. It receives as an argument the file or folder containing the manifest files that you want to check.

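For instance, a quick run over a manifests/ folder (the folder name is just an example) with human-friendly output and a final summary could be:

kubeconform -summary -output pretty manifests/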

Kubeconform also accepts several options to adjust its behavior, such as the ones shown below:

-ignore-filename-pattern value
      regular expression specifying paths to ignore (can be specified multiple times)
-ignore-missing-schemas
      skip files with missing schemas instead of failing
-kubernetes-version string
      version of Kubernetes to validate against, e.g.: 1.18.0 (default “master”)
-output string
      output format – json, junit, pretty, tap, text (default “text”)
-reject string
      comma-separated list of kinds or GVKs to reject
-skip string
      comma-separated list of kinds or GVKs to ignore
-strict
      disallow additional properties not in schema or duplicated keys
-summary
      print a summary at the end (ignored for junit output)

Use-Cases of Kubeconform

There are different use cases where Kubeconform can play a useful role. One is Kubernetes upgrades: sometimes you need to ensure that your current manifests will still work on the new release the cluster will be upgraded to, and with this tool you can check that your YAML is compatible with the target version before the upgrade happens.
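
For instance, before upgrading a cluster, a check against the target version (here assumed to be 1.27.0) might look like this:

kubeconform -kubernetes-version 1.27.0 -strict -summary manifests/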

Another notable aspect of Kubeconform is its seamless integration into existing CI/CD pipelines. You can easily incorporate kubeconform as a step in your pipeline to automatically validate Kubernetes configuration files before deploying them. By doing so, you can catch configuration errors early in the development process, reduce the risk of deployment failures, and maintain high configuration consistency.
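
As a sketch of what that step could look like (a hypothetical GitHub Actions job; the release asset name and the manifests/ path are assumptions to adapt to your repository):

validate-manifests:
  runs-on: ubuntu-latest
  steps:
    - uses: actions/checkout@v4
    - name: Install kubeconform
      run: |
        curl -sL https://github.com/yannh/kubeconform/releases/latest/download/kubeconform-linux-amd64.tar.gz | tar xz
        sudo mv kubeconform /usr/local/bin/
    - name: Validate Kubernetes manifests
      run: kubeconform -strict -summary -output pretty manifests/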

In addition to its validation capabilities, kubeconform provides helpful feedback and suggestions for improving your Kubernetes configuration files. It highlights specific issues or deviations from the defined rules and offers guidance on addressing them. This simplifies the troubleshooting process and helps developers and administrators become more familiar with best practices and Kubernetes configuration standards.

Conclusion

Kubeconform is an invaluable utility for Kubernetes users who strive for reliable and consistent deployments. It empowers teams to maintain a high standard of configuration quality, reduces the likelihood of misconfigurations, and improves the overall stability and security of Kubernetes-based applications.


Istio Security Policies Explained: PeerAuthentication, RequestAuthentication, and AuthorizationPolicy

Istio Security Policies are crucial in securing microservices within a service mesh environment. We have already discussed Istio and the capabilities it can bring to your Kubernetes workloads, but today we’re going to look in more detail at the different objects and resources that help make our workloads more secure and enforce the communication between them: the PeerAuthentication, RequestAuthentication, and AuthorizationPolicy objects.

PeerAuthentication: Enforcing security on pod-to-pod communication

PeerAuthentication focuses on securing communication between services by enforcing mutual TLS (Transport Layer Security) authentication. It enables administrators to define authentication policies for workloads based on the source of the requests, such as specific namespaces or service accounts. Configuring PeerAuthentication ensures that only authenticated and authorized services can communicate, preventing unauthorized access and man-in-the-middle attacks. The behavior depends on the mode defined in the object: STRICT allows only mTLS communication, PERMISSIVE allows both mTLS and plain-text traffic, DISABLE turns mTLS off and keeps the traffic insecure, and UNSET inherits the setting from the parent configuration. This is a sample definition of the object:

apiVersion: security.istio.io/v1beta1
kind: PeerAuthentication
metadata:
  name: default
  namespace: foo
spec:
  mtls:
    mode: PERMISSIVE

RequestAuthentication: Defining authentication methods for Istio Workloads

RequestAuthentication, on the other hand, provides fine-grained control over the authentication of inbound requests. It allows administrators to specify rules and requirements for validating and authenticating incoming requests, primarily through JWT (JSON Web Token) validation. With RequestAuthentication, service owners can declare the authentication mechanisms expected for specific workloads, so that any credentials presented by clients are properly validated. Here you can see a sample of a RequestAuthentication object:

apiVersion: security.istio.io/v1beta1
kind: RequestAuthentication
metadata:
  name: jwt-auth-policy
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  jwtRules:
    - issuer: "issuer.example.com"
      jwksUri: "https://example.com/.well-known/jwks.json"

As mentioned, JWT validation is the most common approach, as JWT tokens have become the de-facto industry standard for validating incoming requests, together with the OAuth 2.0 authorization protocol. Here you define the rules a JWT needs to meet to be considered valid. But RequestAuthentication only describes the authentication methods supported by the workload; it doesn’t enforce them or provide any details regarding authorization.

That means that if you configure a workload to use JWT authentication, a request that carries a token will have that token validated: Istio checks that it is not expired and that it meets all the rules specified in the object definition. However, requests with no token at all are still allowed through, because you are only declaring what the workload supports, not enforcing it. To enforce it, we need the last object of this set: the AuthorizationPolicy.

AuthorizationPolicy: Fine-grained Authorization Policy Definition for Istio Policies

AuthorizationPolicy offers powerful access control capabilities to regulate traffic flow within the service mesh. It allows administrators to define rules and conditions based on attributes like source, destination, headers, and even request payload to determine whether a request should be allowed or denied. AuthorizationPolicy helps enforce fine-grained authorization rules, granting or denying access to specific resources or actions based on the defined policies. Only authorized clients with appropriate permissions can access specific endpoints or perform particular operations within the service mesh. Here you can see a sample of an Authorization Policy object:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: rbac-policy
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  rules:
    - from:
        - source:
            principals: ["user:user@example.com"]
      to:
        - operation:
            methods: ["GET"]

Here you can go into as much detail as you need. You can apply rules on the source of the request to ensure that only certain requests can go through (for example, requests carrying a JWT token, to use in combination with the RequestAuthentication object), but also rules on the target, such as a specific host, path, or method, or a combination of them. You can also define ALLOW rules or DENY rules (or even CUSTOM ones) and declare a set of them, and all of them are enforced as a whole. The evaluation is determined by the following rules, as stated in the Istio official documentation:

  • If there are any CUSTOM policies that match the request, evaluate and deny the request if the evaluation result is denied.
  • If there are any DENY policies that match the request, deny the request.
  • If there are no ALLOW policies for the workload, allow the request.
  • If any of the ALLOW policies match the request, allow the request.
  • Otherwise, deny the request.
Authorization Policy validation flow from: https://istio.io/latest/docs/concepts/security/

This gives you everything you could need to fully define all the security policies you require.
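
For instance, to actually enforce the JWT requirement described earlier (a sketch reusing the selector and namespace from the RequestAuthentication sample above), an ALLOW policy that only matches requests carrying a valid request principal will reject any traffic that arrives without a token:

apiVersion: security.istio.io/v1beta1
kind: AuthorizationPolicy
metadata:
  name: require-jwt
  namespace: my-namespace
spec:
  selector:
    matchLabels:
      app: my-app
  action: ALLOW
  rules:
  - from:
    - source:
        # Request principals are only populated when a valid JWT is presented
        requestPrincipals: ["*"]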

 Conclusion

In conclusion, Istio’s Security Policies provide robust mechanisms for enhancing the security of microservices within a service mesh environment. The PeerAuthentication, RequestAuthentication, and AuthorizationPolicy objects offer a comprehensive toolkit to enforce authentication and authorization controls, ensuring secure communication and access control within the service mesh. By leveraging these Istio Security Policies, organizations can strengthen the security posture of their microservices, safeguarding sensitive data and preventing unauthorized access or malicious activities within their service mesh environment.

Kubernetes Vertical Pod Autoscaling Explained: When and Why to Scale Vertically

Kubernetes introduced, as an alpha feature in the Kubernetes 1.27 release, the Vertical Pod Autoscaling capability, which gives Kubernetes workloads the option to scale using a “vertical” approach by adding more resources to an existing pod. This extends the autoscaling capabilities you already have at your disposal for your Kubernetes workloads, such as KEDA or the Horizontal Pod Autoscaler.

Vertical Scaling vs Horizontal Scaling

Vertical and horizontal scaling are two approaches used in scaling up the performance and capacity of computer systems, particularly in distributed systems and cloud computing. Vertical scaling, also known as scaling up or scaling vertically, involves adding more resources, such as processing power, memory, or storage, to a single instance or server. This means upgrading the existing compute components or migrating to a more powerful infrastructure. Vertical scaling is often straightforward to implement and requires minimal changes to the software architecture. It is commonly used when the system demands can be met by a single, more powerful infrastructure.

On the other hand, horizontal scaling, also called scaling out or scaling horizontally, involves adding more instances or servers to distribute the workload. Instead of upgrading a single instance, multiple instances are employed, each handling a portion of the workload. Horizontal scaling offers the advantage of increased redundancy and fault tolerance since multiple instances can share the load. Additionally, it provides the ability to handle larger workloads by simply adding more machines to the cluster. However, horizontal scaling often requires more complex software architectures, such as load balancing and distributed file systems, to efficiently distribute and manage the workload across the machines.

In summary, vertical scaling involves enhancing the capabilities of a single instance, while horizontal scaling involves distributing the workload across multiple instances. Vertical scaling is easier to implement but may have limitations in terms of the maximum resources available on a single machine. Horizontal scaling provides better scalability and fault tolerance but requires more complex software infrastructure. The choice between vertical and horizontal scaling depends on factors such as the specific requirements of the system, the expected workload, and the available resources.

Why Kubernetes Vertical AutoScaling?

This is an interesting topic, because we have been living in a world where the received wisdom was that it is always better to scale out (horizontal scaling) than to scale up (vertical scaling), and this was especially one of the mantras you heard in cloud-native development. That hasn’t changed: horizontal scaling provides many more benefits than vertical scaling, and it is well covered by the autoscaling capabilities or side projects such as KEDA. So why is Kubernetes including this feature, and why are we using this site to discuss it?

Because, as Kubernetes has become the de-facto choice for almost any deployment you do nowadays, the characteristics and capabilities of the workloads you need to handle have broadened, and that’s why you need different techniques to provide the best experience for each workload type.

How Does Kubernetes Vertical Autoscaling Work?

You will find all the documentation about this new feature in the official documentation; as mentioned, it is still in the alpha stage, so treat it as something to experiment with rather than something to use at the production level.

Vertical scaling works by letting you change the resources assigned to the pod (CPU and memory) without needing to restart the pod or change the manifest declaration, and that’s the clear benefit of this approach. As you know, until now, if you wanted to change the resources applied to a workload, you needed to update the manifest and restart the pod to apply the new values.

To use it, you need to specify a resizePolicy by adding a new section to the pod manifest, as you can see here:

apiVersion: v1
kind: Pod
metadata:
  name: qos-demo-5
  namespace: qos-example
spec:
  containers:
  - name: qos-demo-ctr-5
    image: nginx
    resizePolicy:
    - resourceName: cpu
      restartPolicy: NotRequired
    - resourceName: memory
      restartPolicy: RestartContainer
    resources:
      limits:
        memory: "200Mi"
        cpu: "700m"
      requests:
        memory: "200Mi"
        cpu: "700m"

In this case, we define, for each resource name, the policy we want to apply: if we change the assigned CPU, it won’t require a restart, but if we change the memory, it will require a container restart.

That means that if you’d like to change the assigned CPU, you can directly patch the pod, as you can see in the snippet below, and the assigned resources are updated:

 kubectl -n qos-example patch pod qos-demo-5 --patch '{"spec":{"containers":[{"name":"qos-demo-ctr-5", "resources":{"requests":{"cpu":"800m"}, "limits":{"cpu":"800m"}}}]}}'
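
To confirm that the resize was applied in place (same pod and namespace as above), you can inspect the container’s resources afterwards:

kubectl -n qos-example get pod qos-demo-5 -o jsonpath='{.spec.containers[0].resources}'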

When to Use Vertical Scaling: What Are the Target Scenarios?

It depends a lot on the use case, but also on the technology stack your workload uses, which of these capabilities apply. As a general rule, a CPU change is easy for any technology to adapt to, but a memory change can be more difficult, as in many stacks the memory allocation is defined at startup time.

This will help to update components whose requirements have changed over time, or when you’re testing new workloads with live load and don’t want to disrupt the application’s current processing, or simply for workloads that don’t support horizontal scaling because they are designed to run as a single replica.

 Conclusion

In conclusion, Kubernetes has introduced Vertical Pod Autoscaling, enabling Kubernetes vertical autoscaling of workloads by adding resources to existing pods. Kubernetes Vertical autoscaling allows for resource changes without restarting pods, providing flexibility in managing CPU and memory allocations.

Kubernetes Vertical autoscaling offers a valuable option for adapting to evolving workload needs. It complements horizontal scaling by providing flexibility without the need for complex software architectures. By combining vertical and horizontal scaling approaches, Kubernetes users can optimize their deployments based on specific workload characteristics and available resources.


Enable Sticky Sessions in Kubernetes Using Istio (Session Affinity Explained)

Istio allows you to configure sticky sessions, among other network features, for your Kubernetes workloads. As we have commented in several posts regarding Istio, Istio deploys a service mesh that provides a central control plane holding all the configuration for the network aspects of your Kubernetes workloads. This covers many different aspects of communication inside the container platform, such as security (transport security, authentication, and authorization) and, at the same time, network features such as routing and traffic distribution, which is the main topic of today’s article.

These routing capabilities are similar to what a traditional layer 7 load balancer can provide. When we talk about layer 7, we’re referring to the layers that compose the OSI model, where layer 7 is the application layer.

A Sticky Session or Session Affinity configuration is one of the most common features you can need to implement in this scenario. The use-case is the following one:


You have several instances of your workload, i.e., different pod replicas in a Kubernetes scenario, all of them behind the same Service. By default, the Service distributes requests in a round-robin fashion among the pod replicas in a Ready state (the ones Kubernetes considers able to receive requests), unless you define it differently.

But in some cases, mainly when you are dealing with a web application or any stateful application that handles the concept of a session, you may want the replica that processed the first request to also handle the rest of the requests during the lifetime of that session.

Of course, you could do that easily just by routing all traffic to a single replica, but in that case you lose other features such as traffic load balancing and HA. So this is usually implemented using Session Affinity or Sticky Session policies, which provide the best of both worlds: the same replica handles all the requests from a given user, while traffic is still distributed between different users.

How Does a Sticky Session Work?

The behavior behind this is relatively easy. Let’s see how it works.

First, the important thing is that you need “something” in your network requests that identifies all the requests belonging to the same session, so the routing component (a role played here by Istio) can determine which replica needs to handle those requests.

That “something” can differ depending on your configuration, but usually it is a cookie or an HTTP header sent in each request; that is how we know which replica should handle all the requests of that session.

How does Istio implement Sticky Session support?

When using Istio for this role, we can implement it with a DestinationRule, which allows us, among other capabilities, to define the traffic policy, i.e., how we want the traffic to be split. To implement the sticky session, we need to use the consistentHash feature, which ensures that all requests that compute to the same hash are sent to the same replica.

When we define the consistentHash feature, we can choose how this hash is created, in other words, which component is used to generate it. It can be one of the following options:

  • httpHeaderName: Uses an HTTP Header to do the traffic distribution
  • httpCookie: Uses an HTTP Cookie to do the traffic distribution
  • httpQueryParameterName: Uses a Query String to do the traffic Distribution.
  • maglev: Uses Google’s Maglev Load Balancer to do the determination. You can read more about Maglev in the article from Google.
  • ringHash: Uses a ring-based hashed approach to load balancing between the available pods.

So, as you can see, you have a lot of different options. Still, the first three are the most used to implement a sticky session, and usually the HTTP cookie (httpCookie) option is the preferred one, as it relies on the standard HTTP way of managing a session between clients and servers.
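
For comparison, a header-based variant (the x-user header name here is just an assumed value sent by the client) would only change the consistentHash block of the DestinationRule:

trafficPolicy:
  loadBalancer:
    consistentHash:
      httpHeaderName: x-user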

Sticky Session Implementation Sample using TIBCO BW

We will define a very simple TIBCO BW workload that implements a REST service, serving a GET reply with a hardcoded value. To simplify validation, the application logs the pod’s hostname so we can quickly see which replica is handling each request.


We deploy this in our Kubernetes cluster and expose it using a Kubernetes Service; in our case, the name of this Service will be test2-bwce-srv.

On top of that, we apply the Istio configuration, which requires three (3) Istio objects: the Gateway, the VirtualService, and the DestinationRule. As our focus is on the DestinationRule, we will keep the other two objects as simple as possible.

Gateway:

 apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: default-gw
spec:
  selector:
    istio: ingressgateway
  servers:
  - hosts:
    - '*'
    port:
      name: http
      number: 80
      protocol: HTTP

Virtual Service:

apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: test-vs
spec:
  gateways:
  - default-gw
  hosts:
  - test.com
  http:
  - match:
    - uri:
        prefix: /
    route:
    - destination:
        host: test2-bwce-srv
        port:
          number: 8080

And finally, the DestinationRule will use an httpCookie that we will name ISTIOID, as you can see in the snippet below:

apiVersion: networking.istio.io/v1beta1
kind: DestinationRule
metadata:
    name: default-sticky-dr
    namespace: default
spec:
    host: test2-bwce-srv.default.svc.cluster.local
    trafficPolicy:
      loadBalancer:
        consistentHash:
          httpCookie: 
            name: ISTIOID
            ttl: 60s

Now we can start our test: after launching the first request, we get a new cookie, generated by Istio itself, which is shown in the Postman response window.


This request has been handled by one of the available replicas of the service, as we can see in its log.


All subsequent requests from Postman include the cookie, and all of them are handled by the same pod.


Meanwhile, the other replica’s log remains empty, as all the requests have been routed to that specific pod.

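If you prefer to reproduce the test without Postman, a quick check with curl (assuming the ingress gateway address is stored in $INGRESS_HOST and using the test.com host from the VirtualService) could look like this:

# First request: store the ISTIOID cookie that Istio sets
curl -c cookies.txt -H "Host: test.com" http://$INGRESS_HOST/

# Subsequent requests: send the cookie back so they stick to the same replica
curl -b cookies.txt -H "Host: test.com" http://$INGRESS_HOST/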

Summary

In this article we covered the reason behind the need for sticky sessions in Kubernetes workloads and how we can achieve them using the capabilities of the Istio service mesh. I hope this helps you implement this configuration on your workloads, whether you need it today or in the future.


Kubernetes Autoscaling 1.26 Explained: HPA v2 Changes and Impact on KEDA

Introduction

Kubernetes Autoscaling has gone through a significant change: since the Kubernetes 1.26 release, all components should migrate their HorizontalPodAutoscaler objects from the older API versions to the new autoscaling/v2 API, which has been available since Kubernetes 1.23.

HorizontalPodAutoscaler is a crucial component in any workload deployed on a Kubernetes cluster, as the scalability of this solution is one of the great benefits and key features of this kind of environment.

A little bit of History

Kubernetes has provided an autoscaling solution since Kubernetes 1.3, a long time ago, back in 2016. The solution is based on a control loop that runs at a specific interval, which you can configure with the --horizontal-pod-autoscaler-sync-period parameter of the kube-controller-manager.

Once per interval, it fetches the metrics and evaluates them against the conditions defined in the HorizontalPodAutoscaler object. Initially, this was based on the compute resources used by the pod: memory and CPU.


This was an excellent feature, but with the passage of time and the growing adoption of Kubernetes, it turned out to be a little narrow for all the scenarios we need to handle, and this is where other great projects we have discussed here, such as KEDA, come into the picture to provide a much more flexible set of features.

Kubernetes Autoscaling Capabilities Introduced in v2

With the release of v2 of the autoscaling API objects, a range of capabilities has been included to improve the flexibility and options available. The most relevant ones are the following:

  • Scaling on custom metrics: With the new release, you can configure a HorizontalPodAutoscaler object to scale using custom metrics, that is, metrics other than the built-in CPU and memory usage, such as metrics exposed by your application or by other objects in the cluster. You can see a detailed walkthrough about using custom metrics in the official documentation.
  • Scaling on multiple metrics: With the new release, you also have the option to scale based on more than one metric. The HorizontalPodAutoscaler evaluates each scaling rule, proposes a new scale value for each of them, and takes the maximum as the final one.
  • Support for the Metrics API: With the new release, the HorizontalPodAutoscaler controller retrieves metrics from a series of registered APIs, such as metrics.k8s.io, custom.metrics.k8s.io, and external.metrics.k8s.io. For more information on the different metrics available, you can take a look at the design proposal.
  • Configurable scaling behavior: With the new release, there is a new behavior field that configures how the component behaves when scaling up or down. You can define different policies for scaling up and for scaling down, and limit how many replicas can be added or removed in a given period, which helps with the startup spikes of some components such as Java workloads. You can also define a stabilization window to avoid churn while the metric is still fluctuating. A combined sample is shown right after this list.
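
As a combined sample (the deployment name, custom metric, and thresholds are illustrative assumptions, and the custom metric requires a metrics adapter to be installed), an autoscaling/v2 object putting these capabilities together could look like this:

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 10
  metrics:
  # Built-in resource metric: target 70% average CPU utilization
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  # Custom per-pod metric served through custom.metrics.k8s.io
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleDown:
      # Wait for 5 minutes of stable metrics before removing replicas
      stabilizationWindowSeconds: 300
      policies:
      - type: Pods
        value: 1
        periodSeconds: 60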

Kubernetes Autoscaling v2 vs KEDA

We have seen all the new benefits that Autoscaling v2 provides, so I’m sure that most of you are asking the same question: Is Kubernetes Autoscaling v2 killing KEDA?

KEDA’s latest releases already generate the new objects under the autoscaling/v2 group, as KEDA relies on the native Kubernetes objects, and it simplifies part of the work you need to do when you want to use custom or external metrics, because it provides scalers for pretty much everything you could need now or in the future.

But, even with that, there are still features that KEDA provides that are not covered here, such as the scaling “from zero” and “to zero” capabilities that are very relevant for specific kinds of workloads and to get a very optimized use of resources. Still, it’s safe to say that with the new features included in the autoscaling/v2 release, the gap is now smaller. Depending on your needs, you can go with the out-of-the-box capabilities without including a new component in your architecture.


Helm Templates in Files Explained: Customize ConfigMaps and Secrets Content

Helm templates in files, such as ConfigMap or Secret content, is one of the most common requirements when you are creating a new Helm chart. As you already know, a Helm chart is how we package our application resources and YAML for Kubernetes into a single component that we can manage as one unit, easing maintenance and operations.


Helm Templates Overview

By default, the template process works with YAML files, allowing us to use some variables and some logic functions to customize and templatize our Kubernetes YAML resources to our needs.

So, in a nutshell, we can only have YAML files inside the templates folder of a chart. But sometimes we would like to apply the same templating process to ConfigMaps or Secrets, or more concretely to the content of those ConfigMaps, for example, properties files and so on.

Overview showing the files outside the templates folder that are usually required

As you can see, it is quite normal to have different files, such as JSON configuration files, properties files, or shell scripts, as part of your Helm chart, and most of the time you would like to make their content dynamic; that’s why using Helm templates in files is so important and is the main focus of this article.

Helm Helper Functions to Manage Files

By default, Helm provides a set of functions to manage files inside the Helm chart and simplify including them, for example as the content of a ConfigMap or Secret. Some of these functions are the following:

  • .Files.Glob: This function finds all the internal files that match a given pattern, such as in the following example:
    {{ range $path, $_ := .Files.Glob "**.yaml" }}
  • .Files.Get: This is the simplest option to get the content of a specific file whose full path inside your Helm chart you already know, such as in the following sample: {{ .Files.Get "config1.toml" | b64enc }}

You can even combine both functions to use together such as in the following sample:

 {{ range $path, $_ :=  .Files.Glob  "**.yaml" }}
      {{ $.Files.Get $path }}
{{ end }}

Then, once you have the files you want to use, you can combine that with some helper functions to easily introduce their content into a ConfigMap or a Secret, as explained below:

  • .AsConfig: Formats the file content as ConfigMap data, following the pattern file-name: file-content.
  • .AsSecrets: Similar to the previous one, but base64-encoding the data.

Here you can see a real example of using this approach in an actual helm chart situation:

apiVersion: v1
kind: Secret
metadata:
  name: zones-property
  namespace: {{ $.Release.Namespace }}
data: 
{{ ( $.Files.Glob "tml_zones_properties.json").AsSecrets | indent 2 }} 

You can find more information about that here. But this only lets us grab the file as-is and include it in a ConfigMap; it doesn’t allow any logic or substitution on the content as part of the process. So, if we want to modify the content, this is not a valid sample.

How To Use Helm Templates in Files Such as ConfigMaps or Secrets?

If we want to make modifications to the content, we need to use the following formula:

apiVersion: v1
kind: Secret
metadata:
  name: papi-property
  namespace: {{ $.Release.Namespace }}
data:
{{- range $path, $bytes := .Files.Glob "tml_papi_properties.json" }}
{{ base $path | indent 2 }}: {{ tpl ($.Files.Get $path) $ | b64enc }}
{{ end }}

What we are doing here is, first, iterating over the files that match the pattern using the .Files.Glob function explained before, in case we have more than one. Then we manually create the structure following the pattern file-name: file-content.

To do that, we use the base function to get just the filename from the full path (adding the proper indentation), and then .Files.Get to grab the file’s content and base64-encode it with the b64enc function because, in this case, we’re handling a Secret.

The trick here is adding the tpl function, which sends the file’s content through the template process; that is how any logic we need and the variables referenced from the .Values object are properly replaced, giving you all the power and flexibility of a Helm chart inside plain text files such as properties files, JSON files, and much more.
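
To make the effect of tpl concrete, here is a hypothetical tml_papi_properties.json placed in the chart root; the .Values keys it references are purely illustrative and get substituted during rendering:

{
  "environment": "{{ .Values.environment }}",
  "endpoint": "https://{{ .Values.papi.host }}:{{ .Values.papi.port }}/api"
}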

I hope this is as useful for you as it has been for me when creating new Helm charts! And look here for other tricks using loops or dependencies.

Nomad vs Kubernetes: Key Differences, Use Cases, and When Nomad Makes Sense

Nomad is the Hashicorp alternative to the now-typical pattern of using a Kubernetes-based platform as the only way to orchestrate your workloads efficiently. Nomad started back in 2015, but it has become much more relevant lately; after 95 releases, the current version at the time of writing is 1.4.1, as you can see in their GitHub profile.

Nomad approaches the traditional challenge of isolating the application lifecycle from the lifecycle of the infrastructure where that application resides. But instead of going all-in on container-based applications, it tries to solve the problem in a different way.

What are the main Nomad Features?

Based on its own definition, as you can read on their GitHub profile, they already highlight some of the points that differentiate it from the de-facto industry standard:

Nomad is an easy-to-use, flexible, and performant workload orchestrator that can deploy a mix of microservice, batch, containerized, and non-containerized applications

Easy-to-use: This is the first statement they include in their definition, because the Nomad approach is much simpler than alternatives such as Kubernetes: it follows a single-binary approach in which all the required capabilities run as a node agent, with its own “vocabulary” that you can read more about in the official documentation.

Flexibility: This is the other critical point: Nomad provides an intermediate layer between the application and the underlying infrastructure. It is not limited to containerized applications (although it supports that kind of deployment as well); it also allows you to run workloads in a traditional virtual machine style. One of the primary use cases highlighted is running standard Windows applications, which is tricky in Kubernetes deployments; even though Windows containers have existed for a long time, their adoption is not at the level you see in the Linux container world.

Hashicorp Integration: As part of the Hashicorp portfolio, it also includes seamless integration with other Hashicorp projects such as Hashicorp Vault, which we have covered in several articles, or Hashicorp Consul, which helps to provide additional capabilities in terms of security, configuration management, and communication between the different workloads.

Nomad vs Kubernetes: How Nomad Works?

As commented above, Nomad covers everything with a single-component approach. A nomad binary is an agent that can work in server mode or client mode, depending on the role of the machine executing it.

So Nomad is based on a Nomad cluster: a set of machines running the Nomad agent in server mode. Those servers are split by role into a leader and followers: the leader performs most of the cluster management, while the followers can create scheduling plans and submit them to the leader for approval and execution. This is represented in the picture below, from the Hashicorp Nomad official page:

Nomad simple architecture, from nomadproject.io

Once we have the cluster ready, we need to create our jobs. A job is a definition of the work we would like to execute on the Nomad cluster we set up previously, and a task is the smallest unit of work in Nomad. Here is where Nomad’s flexibility comes from, because each task is executed by a task driver, and different drivers can execute different kinds of workloads. Following the same approach, we can use the docker driver to run a container deployment or the exec driver to run a command directly on the underlying infrastructure. You can also create your own task drivers through a plugin mechanism that you can read more about here.

Jobs and tasks are defined using a text-based approach, but not with the usual YAML or JSON files: Nomad uses its own format, as you can see in the sketch below (click here to download a complete file from the GitHub Nomad Samples repo):
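
Since the original screenshot is not reproduced here, this is a rough, illustrative sketch of a minimal job definition (names, image, and sizes are placeholders, not the exact file from the samples repo):

# example.nomad - a minimal service job using the docker task driver
job "example" {
  datacenters = ["dc1"]
  type        = "service"

  group "web" {
    count = 2

    network {
      port "http" {
        to = 80
      }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:1.25"
        ports = ["http"]
      }

      resources {
        cpu    = 100
        memory = 128
      }
    }
  }
}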

 Is Nomad a Replacement for Kubernetes?

It is a complex question to answer, and even Hashicorp has documented different strategies. You can undoubtedly use Nomad to run container-based deployments instead of running them on Kubernetes. But at the same time, they also position the solution alongside Kubernetes, running some workloads on one solution and some on the other.

In the end, both try to solve the same challenges that traditional deployments face in terms of scalability, infrastructure sharing and optimization, agility, flexibility, and security.

Kubernetes focuses on different kinds of workloads, but everything follows the same deployment model (container-based) and adopts recent paradigms (service-based, microservice patterns, API-led, and so on) with a robust architecture that delivers excellent scalability, performance, and flexibility, and with industry adoption levels that have made it the de-facto choice for modern workload orchestration platforms.

But, at the same time, it also requires effort in management and transforming existing applications to new paradigms.

On the other hand, Nomad tries to address it differently, minimizing the changes existing applications need in order to take advantage of the platform’s benefits, and reducing the management overhead and complexity that a typical Kubernetes platform brings, depending on the situation.
