As Kubernetes clusters become an integral part of infrastructure, maintaining compliance with security and configuration policies is crucial. Kyverno, a policy engine designed for Kubernetes, can be integrated into your CI/CD pipelines to enforce configuration standards and automate policy checks. In this article, we’ll walk through integrating Kyverno CLI with GitHub Actions, providing a seamless workflow for validating Kubernetes manifests before they reach your cluster.
What is Kyverno CLI?
Kyverno is a Kubernetes-native policy management tool, enabling users to enforce best practices, security protocols, and compliance across clusters. Kyverno CLI is a command-line interface that lets you apply, test, and validate policies against YAML manifests locally or in CI/CD pipelines. By integrating Kyverno CLI with GitHub Actions, you can automate these policy checks, ensuring code quality and compliance before deploying resources to Kubernetes.
Benefits of Using Kyverno CLI in CI/CD Pipelines
Integrating Kyverno into your CI/CD workflow provides several advantages:
Automated Policy Validation: Detect policy violations early in the CI/CD pipeline, preventing misconfigured resources from deployment.
Enhanced Security Compliance: Kyverno enables checks for security best practices and compliance frameworks.
Faster Development: Early feedback on policy violations streamlines the process, allowing developers to fix issues promptly.
Setting Up Kyverno CLI in GitHub Actions
Step 1: Install Kyverno CLI
To use Kyverno in your pipeline, you need to install the Kyverno CLI in your GitHub Actions workflow. You can specify the Kyverno version required for your project or use the latest version.
Here’s a sample GitHub Actions YAML configuration to install Kyverno CLI:
name: CI Pipeline with Kyverno Policy Checks

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  kyverno-policy-check:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout Code
        uses: actions/checkout@v2
      - name: Install Kyverno CLI
        run: |
          curl -LO https://github.com/kyverno/kyverno/releases/download/v<version>/kyverno-cli-linux.tar.gz
          tar -xzf kyverno-cli-linux.tar.gz
          sudo mv kyverno /usr/local/bin/
Replace <version> with the Kyverno CLI version you wish to use. Note that substituting the literal word latest into the versioned path does not work; to always fetch the newest release, use GitHub's latest-release URL form instead (https://github.com/kyverno/kyverno/releases/latest/download/<asset-name>).
Step 2: Define Policies for Validation
Create a directory in your repository to store Kyverno policies. These policies define the standards that your Kubernetes resources should comply with. For example, create a directory structure as follows:
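A layout like the following works well (file names are illustrative; the validation step later in this article assumes the .github/policies directory):

```
.
├── .github
│   └── policies
│       ├── disallow-latest-tag.yaml
│       └── require-resource-limits.yaml
└── manifests
    ├── deployment.yaml
    └── service.yaml
```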
Each policy is defined in YAML format and can be customized to meet specific requirements. Below are examples of policies that might be used:
Disallow latest Tag in Images: Prevents the use of the latest tag to ensure version consistency.
Enforce CPU/Memory Limits: Ensures resource limits are set for containers, which can prevent resource abuse.
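For illustration, the disallow-latest-tag policy could be written as follows. This is a minimal sketch based on the widely used Kyverno sample policy of the same name:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: disallow-latest-tag
spec:
  validationFailureAction: Enforce
  rules:
    - name: require-image-tag
      match:
        any:
          - resources:
              kinds:
                - Pod
      validate:
        message: "Using the ':latest' image tag is not allowed."
        pattern:
          spec:
            containers:
              # reject any container whose image ends in :latest
              - image: "!*:latest"
```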
Step 3: Add a GitHub Actions Step to Validate Manifests
In this step, you’ll use Kyverno CLI to validate Kubernetes manifests against the policies defined in the .github/policies directory. If a manifest fails validation, the pipeline will halt, preventing non-compliant resources from being deployed.
Here’s the YAML configuration to validate manifests:
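A sketch of such a step is shown below, assuming policies live in .github/policies and manifests in manifests/ (kyverno apply takes a policy path and a --resource path):

```yaml
- name: Validate Kubernetes Manifests
  run: kyverno apply .github/policies/ --resource manifests/
```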
Replace manifests/ with the path to your Kubernetes manifests in the repository. This command applies all policies in .github/policies against each YAML file in the manifests directory, stopping the pipeline if any non-compliant configurations are detected.
Step 4: Handle Validation Results
To make the output of Kyverno CLI more readable, you can use additional GitHub Actions steps to format and handle the results. For instance, you might set up a conditional step to notify the team if any manifest is non-compliant:
- name: Check for Policy Violations
  if: failure()
  run: echo "Policy violation detected. Please review the failed validation."
Alternatively, you could configure notifications to alert your team through Slack, email, or other integrations whenever a policy violation is identified.
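As a sketch, a failure-only step posting to a Slack incoming webhook might look like this (the SLACK_WEBHOOK_URL secret is an assumption; substitute your own integration):

```yaml
- name: Notify Slack on Violation
  if: failure()
  run: |
    curl -X POST -H 'Content-Type: application/json' \
      --data '{"text":"Kyverno policy violation in ${{ github.repository }}"}' \
      "${{ secrets.SLACK_WEBHOOK_URL }}"
```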
Example: Validating a Kubernetes Manifest
Suppose you have a manifest defining a Kubernetes deployment as follows:
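For example (names illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: demo-app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: demo-app
  template:
    metadata:
      labels:
        app: demo-app
    spec:
      containers:
        - name: demo-app
          image: nginx:latest   # uses the disallowed 'latest' tag
```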
The policy disallow-latest-tag.yaml checks if any container image uses the latest tag and rejects it. When this manifest is processed, Kyverno CLI flags the image and halts the CI/CD pipeline with an error, preventing the deployment of this manifest until corrected.
Conclusion
Integrating Kyverno CLI into a GitHub Actions CI/CD pipeline offers a robust, automated solution for enforcing Kubernetes policies. With this setup, you can ensure Kubernetes resources are compliant with best practices and security standards before they reach production, enhancing the stability and security of your deployments.
Introduction
OpenShift, Red Hat’s Kubernetes platform, has its own way of exposing services to external clients. In vanilla Kubernetes, you would typically use an Ingress resource along with an ingress controller to route external traffic to services. OpenShift, however, introduced the concept of a Route and an integrated Router (built on HAProxy) early on, before Kubernetes Ingress even existed. Today, OpenShift supports both Routes and standard Ingress objects, which can sometimes lead to confusion about when to use each and how they relate.
This article explores how OpenShift handles Kubernetes Ingress resources, how they translate to Routes, the limitations of this approach, and guidance on when to use Ingress versus Routes.
OpenShift Routes and the Router: A Quick Overview
OpenShift Routes are OpenShift-specific resources designed to expose services externally. They are served by the OpenShift Router, an HAProxy-based proxy running inside the cluster. Routes support advanced features such as:
Multiple TLS termination modes (edge, re-encrypt, and passthrough)
Weighted backends for splitting traffic across services
Sticky sessions (cookie-based session persistence)
Wildcard hosts (e.g., *.example.com)
Custom timeouts and load-balancing behavior via annotations
Because Routes are OpenShift-native, the Router understands these features natively and can be configured accordingly. This tight integration enables powerful and flexible routing capabilities tailored to OpenShift environments.
Using Kubernetes Ingress in OpenShift (Default Behavior)
Starting with OpenShift Container Platform (OCP) 3.10, Kubernetes Ingress resources are supported. When you create an Ingress, OpenShift automatically translates it into an equivalent Route behind the scenes. This means you can use standard Kubernetes Ingress manifests, and OpenShift will handle exposing your services externally by creating Routes accordingly.
This automatic translation simplifies migration and supports basic use cases without requiring Route-specific manifests.
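For example, applying a plain Ingress such as the following (hostname and service name illustrative) causes OpenShift to create a matching Route automatically:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend
spec:
  rules:
    - host: frontend.apps.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 8080
```

The generated Route is visible via oc get routes; it is owned by the Ingress object, so it is kept in sync with it and cleaned up when the Ingress is deleted.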
Tuning Behavior with Annotations (Ingress ➝ Route)
When you use Ingress on OpenShift, only OpenShift-aware annotations are honored during the Ingress ➝ Route translation. Controller-specific annotations for other ingress controllers (e.g., nginx.ingress.kubernetes.io/*) are ignored by the OpenShift Router. The following annotations are commonly used and supported by the OpenShift router to tweak the generated Route:
| Purpose | Annotation | Typical Values | Effect on Generated Route |
|---|---|---|---|
| TLS termination | route.openshift.io/termination | edge · reencrypt · passthrough | Sets Route spec.tls.termination to the chosen mode. |
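As a sketch, an Ingress combining this annotation with standard OpenShift router annotations might look like the following (hostname and TLS secret name are illustrative; the haproxy.router.openshift.io/* annotations are the standard router annotations carried over to the generated Route):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: frontend
  annotations:
    route.openshift.io/termination: edge           # edge TLS on the Route
    haproxy.router.openshift.io/balance: leastconn # least connections balancing
    haproxy.router.openshift.io/timeout: 60s       # route timeout
    haproxy.router.openshift.io/hsts_header: max-age=31536000
spec:
  rules:
    - host: frontend.apps.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: frontend
                port:
                  number: 8080
  tls:
    - hosts:
        - frontend.apps.example.com
      secretName: frontend-tls
```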
This Ingress will be realized as a Route with edge TLS and an automatic HTTP→HTTPS redirect, using least connections balancing and a 60s route timeout. The HSTS header will be added by the router on HTTPS responses.
Limitations of Using Ingress to Generate Routes
While convenient, using Ingress to generate Routes has limitations:
Missing advanced features: Weighted backends and sticky sessions require Route-specific annotations and are not supported via Ingress.
TLS passthrough and re-encrypt modes: The standard Ingress spec only models edge-terminated HTTPS; on OpenShift, passthrough and re-encrypt are reachable only through the OpenShift-specific termination annotation or by creating Routes directly.
Ingress without host: An Ingress without a hostname will not create a Route; Routes require a host.
Wildcard hosts: Wildcard hosts (e.g., *.example.com) are only supported via Routes, not Ingress.
Annotation compatibility: Some OpenShift Route annotations do not have equivalents in Ingress, leading to configuration gaps.
Protocol support: Ingress supports only HTTP/HTTPS protocols, while Routes can handle non-HTTP protocols with passthrough TLS.
Config drift risk: Because Routes created from Ingress are managed by OpenShift, manual edits to the generated Route may be overwritten or cause inconsistencies.
These limitations mean that for advanced routing configurations or OpenShift-specific features, using Routes directly is preferable.
When to Use Ingress vs. When to Use Routes
Choosing between Ingress and Routes depends on your requirements:
Use Ingress if:
You want portability across Kubernetes platforms.
You have existing Ingress manifests and want to minimize changes.
Your application uses only basic HTTP or HTTPS routing.
You prefer platform-neutral manifests for CI/CD pipelines.
Use Routes if:
You need advanced routing features like weighted backends, sticky sessions, or multiple TLS termination modes.
Your deployment is OpenShift-specific and can leverage OpenShift-native features.
You require stability and full support for OpenShift routing capabilities.
You need to expose non-HTTP protocols or use TLS passthrough/re-encrypt modes.
You want to use wildcard hosts or custom annotations not supported by Ingress.
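For instance, a Route that splits traffic between two service versions, something a plain Ingress cannot express, might look like this (service names illustrative):

```yaml
apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: frontend
spec:
  host: frontend.apps.example.com
  to:
    kind: Service
    name: frontend-v1
    weight: 80           # 80% of traffic
  alternateBackends:
    - kind: Service
      name: frontend-v2
      weight: 20         # 20% of traffic
  tls:
    termination: edge
```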
In many cases, teams use a combination: Ingress for portability and Routes for advanced or OpenShift-specific needs.
Conclusion
On OpenShift, Kubernetes Ingress resources are automatically converted into Routes, enabling basic external service exposure with minimal effort. This allows users to leverage existing Kubernetes manifests and maintain portability. However, for advanced routing scenarios and to fully utilize OpenShift’s powerful Router features, using Routes directly is recommended.
Both Ingress and Routes coexist seamlessly on OpenShift, allowing you to choose the right tool for your application’s requirements.
Every Kubernetes cluster runs on Linux. But the distribution you choose for your nodes determines how much time you spend patching, hardening, debugging SSH sessions, and dealing with configuration drift across your fleet. General-purpose distributions like Ubuntu and Debian were designed to run anything: web servers, desktops, databases, and yes, Kubernetes. That flexibility is also their biggest liability when your only job is running containers.
Talos Linux takes a radically different approach. It strips away everything a Kubernetes node does not need: there is no shell, no SSH daemon, no package manager, and no way to log in interactively. The entire operating system is managed through an API, and every change is declarative. If that sounds extreme, it is. But it solves real problems that traditional distributions cannot address without layers of additional tooling.
This guide is a comprehensive deep dive into Talos Linux: what it is, how its architecture works, how it compares to alternatives like Flatcar and Bottlerocket, how to install and operate it, and when you should (and should not) use it. Whether you are evaluating Talos for a production fleet or a homelab, this is everything you need to make an informed decision.
What Is Talos Linux
Talos Linux is a minimal, immutable operating system designed exclusively to run Kubernetes. It is developed by Sidero Labs and distributed as a single system image that boots into a Kubernetes-ready state. There is no general-purpose userland. No bash shell. No ability to SSH into a node and run commands. Every aspect of machine configuration — from network settings to Kubernetes component flags — is expressed in a YAML document called the machine config and applied through an authenticated gRPC API.
The core design principles are:
Immutable — The root filesystem is read-only and mounted from a SquashFS image. You cannot install packages, modify system binaries, or alter the OS at runtime.
API-driven — All management happens through talosctl, a CLI that communicates with the Talos API over mutual TLS. There is no SSH and no interactive console.
Minimal — The OS ships only what Kubernetes needs: a Linux kernel, containerd, the kubelet, etcd (on control plane nodes), and the Talos machinery. The installed image is roughly 80 MB.
Declarative — The desired machine state is defined in a YAML config. Applying a new config converges the node to the desired state, similar to how Kubernetes reconciles workloads.
Secure by default — No shell access means no attack vector through compromised credentials. All API communication requires mutual TLS authentication. The attack surface is drastically smaller than any traditional distribution.
Talos supports bare metal, VMware vSphere, AWS, Azure, GCP, Hetzner, Equinix Metal, Oracle Cloud, and several other platforms. It also runs on single-board computers like Raspberry Pi and NVIDIA Jetson, making it viable for edge deployments. For a broader perspective on how immutable infrastructure fits into the Kubernetes ecosystem, see our Kubernetes security best practices guide.
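To give a feel for the machine config format, here is a heavily abridged excerpt (values illustrative; a real config generated by talosctl also carries certificates and join tokens):

```yaml
version: v1alpha1
machine:
  type: controlplane
  network:
    hostname: cp-1
    interfaces:
      - interface: eth0
        dhcp: true
  install:
    disk: /dev/sda       # target disk for the Talos installation
cluster:
  clusterName: my-cluster
  controlPlane:
    endpoint: https://10.0.0.10:6443
```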
Architecture Deep Dive
Understanding Talos at an architectural level is essential before deploying it. The design choices are unconventional compared to what most Linux administrators expect, and they explain both its strengths and its constraints.
The machined Daemon and API-Driven Management
At the heart of Talos is machined, a single PID-1 process that replaces systemd, init, and every other service manager. When a Talos node boots, machined starts, reads its machine configuration, and orchestrates the entire lifecycle: networking, disk setup, containerd, the kubelet, and etcd (on control plane nodes).
machined exposes a gRPC API on port 50000 (apid, the machine API; the maintenance service during initial provisioning also listens here) and port 50001 (trustd, the inter-node trust service). This is the only way to interact with the node. The talosctl CLI is the primary client, authenticating with mutual TLS certificates generated during cluster bootstrapping.
Key API operations include:
talosctl apply-config — Push a new or updated machine configuration.
talosctl upgrade — Trigger an in-place OS upgrade.
talosctl dmesg — Stream kernel messages in real time.
talosctl logs — Read logs from any Talos service (etcd, kubelet, containerd).
talosctl get — Inspect resource state (network interfaces, disks, services).
talosctl reset — Wipe a node and return it to maintenance mode.
This API-first model eliminates configuration drift by design. There is no way for an operator to SSH into a node, run an ad-hoc command, and leave the system in an undocumented state. Every change flows through the same declarative path.
System Partitions Layout
Talos partitions the disk into a well-defined layout that separates immutable system data from mutable state:
| Partition | Purpose | Mutable |
|---|---|---|
| EFI | EFI System Partition for UEFI boot | No |
| BIOS | BIOS boot partition (legacy boot) | No |
| BOOT | Contains the kernel and initramfs | No (replaced during upgrades) |
| META | Stores metadata like machine UUID and upgrade status | Limited |
| STATE | Holds the machine configuration and PKI material | Yes (managed by machined) |
| EPHEMERAL | Mounted at /var; stores containerd images, kubelet data, etcd data, and pod logs | Yes (wiped on reset) |
The STATE partition is critical: it persists the machine config and TLS certificates across reboots and upgrades. The EPHEMERAL partition holds everything that can be reconstructed — container images, pod volumes (emptyDir), and etcd data on control plane nodes. When you run talosctl reset, the EPHEMERAL partition is wiped, but STATE can optionally be preserved.
This layout means that an OS upgrade replaces the BOOT partition contents (kernel + initramfs) while leaving your machine configuration and Kubernetes state untouched. If an upgrade fails, Talos rolls back to the previous BOOT image automatically.
Boot Process and Kubernetes Bootstrapping
The Talos boot sequence is deterministic and fast, typically completing in under 60 seconds on modern hardware:
Firmware → Bootloader — UEFI or BIOS loads GRUB, which loads the Talos kernel and initramfs.
Kernel init → machined — The kernel starts machined as PID 1. There is no init system in between.
Machine config discovery — machined checks the STATE partition for an existing config. If none is found (first boot), it enters maintenance mode and listens on the maintenance API for a config to be applied.
Network configuration — Networking is brought up based on the machine config (DHCP or static).
Disk setup — Partitions are created or validated. The EPHEMERAL partition is formatted if missing.
containerd starts — The container runtime is launched.
etcd starts (control plane only) — etcd is started and joins the existing cluster, or waits for a bootstrap command.
kubelet starts — The kubelet registers the node with the Kubernetes API server.
The first control plane node requires a one-time bootstrap command (talosctl bootstrap) to initialize the etcd cluster and generate the Kubernetes control plane static pods. Subsequent control plane nodes join automatically.
Security Model: No SSH, Mutual TLS, API-Only
Talos Linux implements a zero-trust security model at the OS level. Every API request is authenticated using mutual TLS (mTLS). When you generate a cluster configuration with talosctl gen config, it produces a Certificate Authority (CA) that signs both the client (operator) and server (node) certificates.
The security implications are significant:
No shell access — There is no /bin/sh, no /bin/bash, no login capability. Even if an attacker gains network access to the node, there is no shell to exploit.
No SSH daemon — Port 22 is not open. There is no sshd binary on the system.
No package manager — You cannot install tools, backdoors, or persistence mechanisms on the host.
Read-only rootfs — Even with theoretical root access, the filesystem cannot be modified.
Mutual TLS everywhere — The Talos API, etcd communication, and inter-node trust all use mTLS. Certificates can be rotated without downtime.
This does not make Talos invulnerable — kernel exploits and container escape vulnerabilities still apply. But it eliminates the most common attack vectors in Kubernetes node compromise: SSH credential theft, unauthorized package installation, and persistent rootkits.
Talos Linux vs Alternatives: Comparison Table
Choosing a node OS depends on your operational model, cloud provider, and team experience. Here is how Talos Linux compares to the most common alternatives for Kubernetes node operating systems.
| Feature | Talos Linux | Ubuntu / Debian | Flatcar Container Linux | Bottlerocket (AWS) | RancherOS / k3OS |
|---|---|---|---|---|---|
| Mutability | Fully immutable rootfs | Fully mutable | Immutable rootfs, writable /etc | Immutable rootfs | Mostly immutable |
| SSH Access | None (no sshd) | Yes (default) | Yes (default) | Optional (admin container) | Yes |
| Shell Access | None | Full shell | Full shell | Limited (via admin container) | Full shell |
| Management Model | Declarative API (gRPC) | Imperative (apt, SSH) | Declarative (Ignition) + SSH | Declarative (TOML settings API) | cloud-init + SSH |
| Update Mechanism | A/B image swap with rollback | apt upgrade (in-place) | A/B image swap (Nebraska/FLUO) | A/B image swap | Image swap |
| Container Runtime | containerd | containerd or CRI-O | containerd (Docker optional) | containerd | Docker (RancherOS), containerd (k3OS) |
| Kubernetes Integration | Built-in (kubelet, etcd bundled) | Manual (kubeadm, etc.) | Manual (kubeadm, etc.) | EKS-optimized | Built-in (k3s bundled) |
| Cloud Support | AWS, Azure, GCP, Hetzner, bare metal, VMware, and more | All clouds | AWS, Azure, GCP, bare metal, VMware | AWS only | Limited |
| Image Size | ~80 MB | ~1-2 GB | ~300 MB | ~200 MB | ~150 MB |
| Config Drift | Impossible (API-only) | Common without tooling | Possible (SSH access) | Low (API + limited shell) | Possible |
Talos Linux vs Ubuntu / Debian
Ubuntu and Debian are the default choices for most Kubernetes deployments, especially when using kubeadm or managed installers. They work. But they carry everything a general-purpose OS includes: a package manager, a full shell, hundreds of system services, and thousands of binaries that your Kubernetes nodes never use.
The operational burden is real: you need to patch the OS independently from Kubernetes, harden SSH, configure unattended upgrades, manage user accounts, and run CIS benchmarks to verify compliance. With Talos, these concerns disappear because the attack surface simply does not exist. The trade-off is that you lose the ability to SSH in and debug problems the traditional way.
Talos Linux vs Flatcar Container Linux
Flatcar Container Linux (the successor to CoreOS Container Linux) is the closest philosophical match to Talos. Both use immutable root filesystems and image-based updates. However, Flatcar retains SSH access and a full shell, which means an operator can still log in and make ad-hoc changes. Flatcar uses Ignition for initial provisioning and systemd for service management.
The key difference is that Flatcar is a container-optimized general-purpose OS, while Talos is a Kubernetes-only OS. Flatcar can run arbitrary containers and system services. Talos runs only Kubernetes. If you need SSH as a safety net during your transition to immutable infrastructure, Flatcar is a pragmatic middle ground. If you want to enforce immutability with no escape hatches, Talos is the stronger choice.
Talos Linux vs Bottlerocket
Bottlerocket is AWS’s purpose-built container OS, designed for EKS and ECS. Like Talos, it has an immutable rootfs and an API-driven settings model. Unlike Talos, it provides an optional “admin container” that gives you a shell for debugging, and it is heavily optimized for the AWS ecosystem.
If you run exclusively on AWS with EKS, Bottlerocket is the path of least resistance. If you need a multi-cloud or bare-metal solution with integrated Kubernetes bootstrapping, Talos is significantly more flexible. Bottlerocket also does not bootstrap Kubernetes itself — it relies on EKS or an external installer.
Talos Linux vs RancherOS / k3OS
RancherOS and k3OS were early attempts at minimal container-focused Linux distributions. RancherOS ran the entire system as Docker containers. k3OS bundled k3s (lightweight Kubernetes) into the OS. Both projects have been deprecated or are in maintenance mode. Talos is the actively developed, production-grade successor to this category. If you are currently running k3OS, Talos is the natural migration path.
Installation and Cluster Bootstrap
Setting up a Talos cluster follows a consistent workflow regardless of the platform: generate configs, boot nodes, apply configs, bootstrap. Here is a step-by-step walkthrough.
Step 1: Install talosctl
Download the talosctl binary for your platform. On macOS with Homebrew:
brew install siderolabs/tap/talosctl
On Linux:
curl -sL https://talos.dev/install | sh
Step 2: Generate Machine Configurations
The talosctl gen config command generates a full set of machine configurations: one for control plane nodes, one for workers, and a talosconfig file containing the client credentials.
talosctl gen config my-cluster https://10.0.0.10:6443 \
--output-dir _out
This creates three files in the _out directory:
controlplane.yaml — Machine config for control plane nodes.
worker.yaml — Machine config for worker nodes.
talosconfig — Client configuration with the CA certificate and client key for mTLS authentication.
The endpoint URL (https://10.0.0.10:6443) should point to the Kubernetes API server address — either a load balancer VIP or the IP of your first control plane node.
Step 3: Boot Nodes with Talos
How you boot depends on the platform:
Bare metal — Write the Talos ISO or disk image to a USB drive or PXE boot. The node boots into maintenance mode, waiting for a config.
VMware — Deploy the OVA template, or use the ISO in a VM. Talos provides official OVA images.
AWS — Use the official Talos AMI. Launch EC2 instances with the AMI and pass the machine config as user-data.
Azure / GCP — Use the official images from Sidero Labs’ image factory. Pass the machine config through the platform’s metadata service.
Step 4: Apply Configuration and Bootstrap
Once nodes are booted and in maintenance mode, apply the machine configs:
# Configure talosctl to use the generated credentials
export TALOSCONFIG="_out/talosconfig"
# Apply config to the first control plane node
talosctl apply-config --insecure \
--nodes 10.0.0.10 \
--file _out/controlplane.yaml
# Apply config to worker nodes
talosctl apply-config --insecure \
--nodes 10.0.0.20 \
--file _out/worker.yaml
The --insecure flag is required for the initial config application because the node does not yet have TLS certificates. After the config is applied, all subsequent communication uses mTLS.
Now bootstrap the Kubernetes cluster from the first control plane node:
# Set the endpoint and node
talosctl config endpoint 10.0.0.10
talosctl config node 10.0.0.10
# Bootstrap etcd and the control plane
talosctl bootstrap
This command initializes etcd, generates the Kubernetes PKI, and starts the control plane static pods. Within a minute or two, the Kubernetes API server is available.
Step 5: Retrieve kubeconfig and Verify
# Get the kubeconfig
talosctl kubeconfig -n 10.0.0.10
# Verify the cluster
kubectl get nodes
kubectl get pods -A
Essential talosctl Commands
Once the cluster is running, these are the commands you will use daily:
# Check node health
talosctl health --nodes 10.0.0.10
# Stream kernel messages (equivalent to dmesg -w)
talosctl dmesg --nodes 10.0.0.10 --follow
# View service logs
talosctl logs kubelet --nodes 10.0.0.10
talosctl logs etcd --nodes 10.0.0.10
# List running services
talosctl services --nodes 10.0.0.10
# Get machine config (current running config)
talosctl get machineconfig --nodes 10.0.0.10
# Inspect resource state
talosctl get members --nodes 10.0.0.10
talosctl get addresses --nodes 10.0.0.10
Day-2 Operations
Installation is only the beginning. The real value of Talos emerges in day-2 operations: upgrades, config changes, and cluster maintenance. This is where the declarative, API-driven model pays dividends.
Upgrading Talos Linux
Talos upgrades are performed node by node through the API. The process downloads the new OS image, writes it to the inactive boot partition, and reboots the node into the new version. If the upgrade fails, the node automatically rolls back to the previous image.
# Upgrade a single node
talosctl upgrade --nodes 10.0.0.10 \
--image ghcr.io/siderolabs/installer:v1.9.0
# Upgrade with --preserve to keep the EPHEMERAL partition
talosctl upgrade --nodes 10.0.0.10 \
--image ghcr.io/siderolabs/installer:v1.9.0 \
--preserve
For production clusters, follow this sequence: upgrade control plane nodes one at a time, verify etcd health after each, then upgrade workers in a rolling fashion. The --preserve flag is important if you want to keep downloaded container images and avoid re-pulling everything after the reboot.
Upgrading Kubernetes Version
Kubernetes version upgrades are separate from Talos OS upgrades. You can run a newer version of Kubernetes on an older Talos release (within compatibility bounds). The upgrade is triggered through talosctl:
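The command is talosctl upgrade-k8s; for example (target version illustrative):

```shell
# Upgrade Kubernetes control plane components and kubelets across the cluster
talosctl upgrade-k8s --nodes 10.0.0.10 --to 1.32.0
```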
This command orchestrates the upgrade of all control plane components (kube-apiserver, kube-controller-manager, kube-scheduler, kube-proxy) and then rolls the kubelet version across all nodes. It respects PodDisruptionBudgets and cordons/drains nodes before upgrading.
Customizing Machine Config with Patches
As your cluster evolves, you will need to modify machine configurations — adding a registry mirror, changing kubelet flags, or configuring network bonding. Talos supports config patches that overlay changes onto the base config without replacing the entire file.
Patches can also be applied at generation time with talosctl gen config --config-patch, which is ideal for encoding environment-specific overrides into your GitOps pipeline.
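As a sketch, a patch adding a registry mirror (mirror endpoint illustrative) can be kept as a small YAML file and applied over the existing config with talosctl apply-config ... --config-patch @patch.yaml:

```yaml
# patch.yaml: merged over the base machine config
machine:
  registries:
    mirrors:
      docker.io:
        endpoints:
          - https://registry-mirror.internal:5000
```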
etcd Management
Talos manages etcd as a first-class service, not as a manually deployed component. Common etcd operations are available through talosctl:
# Check etcd member list
talosctl etcd members --nodes 10.0.0.10
# Take an etcd snapshot (backup)
talosctl etcd snapshot db.snapshot --nodes 10.0.0.10
# Remove a failed etcd member by hostname
talosctl etcd remove-member <member-hostname> --nodes 10.0.0.10
# Ask a node to give up etcd leadership (e.g., before maintenance)
talosctl etcd forfeit-leadership --nodes 10.0.0.10
Regular etcd snapshots are non-negotiable for any production cluster. Automate this with a CronJob that calls the Talos API or runs talosctl etcd snapshot from an external host.
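A sketch of such an automation as an in-cluster CronJob follows. The image tag, secret, and PVC names are assumptions; the job needs a mounted talosconfig with permission to call the etcd API:

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: etcd-snapshot
  namespace: kube-system
spec:
  schedule: "0 2 * * *"   # daily at 02:00
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: snapshot
              image: ghcr.io/siderolabs/talosctl:v1.9.0   # assumed image tag
              args:
                - etcd
                - snapshot
                - /snapshots/db.snapshot
                - --talosconfig=/var/run/talos/talosconfig
                - --nodes=10.0.0.10
              volumeMounts:
                - name: talosconfig
                  mountPath: /var/run/talos
                - name: snapshots
                  mountPath: /snapshots
          volumes:
            - name: talosconfig
              secret:
                secretName: talosconfig        # assumed secret holding client creds
            - name: snapshots
              persistentVolumeClaim:
                claimName: etcd-snapshots      # assumed PVC for snapshot storage
```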
Limitations and When NOT to Use Talos Linux
Talos is not the right choice for every environment. Understanding its limitations is just as important as understanding its strengths.
No SSH Debugging
The most immediate pain point: when something goes wrong, you cannot SSH into the node and poke around. You are limited to what the Talos API exposes — logs, dmesg, service status, and resource state. For most Kubernetes issues, this is sufficient. But for low-level kernel or hardware debugging, you may need to boot the node from a different OS temporarily.
Talos does offer a talosctl dashboard command that provides a real-time TUI (text UI) showing CPU, memory, network, and service status. Combined with talosctl logs and talosctl dmesg, you can troubleshoot most problems. But the learning curve is real, especially for teams accustomed to reaching for htop and journalctl.
Learning Curve for Traditional Sysadmins
If your team manages infrastructure through SSH, Ansible playbooks, and shell scripts, Talos requires a fundamental shift in operational practices. There is no way to "just install" a debugging tool on a node. Everything must be done through the API or through Kubernetes workloads (DaemonSets with host-level access). This shift is valuable in the long run, but it requires investment in training and new workflows.
Custom Kernel Modules
Talos ships a specific kernel build with a curated set of modules. If your workload requires a custom kernel module (GPU drivers, specific storage drivers, or out-of-tree network drivers), you need to build a custom Talos image using the Talos image factory or the imager tool. This is supported but adds operational complexity compared to distributions where you can simply apt install a kernel module package.
Sidero Labs provides an Image Factory service that lets you build custom Talos images with additional system extensions (like NVIDIA drivers, iSCSI tools, or ZFS support) through a web interface or API.
Workloads Requiring Host-Level Access
Some workloads expect to interact with the host OS directly: log collectors that read /var/log, monitoring agents that read /proc, or security tools that install kernel modules. Most of these work in Talos (containerd's runtime allows host path mounts), but some assume a traditional Linux userland that simply does not exist. Evaluate your specific stack before committing.
Real-World Use Cases
Homelab and Learning
Talos is an excellent choice for homelab Kubernetes clusters. It runs on Raspberry Pi 4/5, Intel NUCs, and old laptops. The entire OS fits in minimal storage, and the declarative config model means you can rebuild your cluster from scratch in minutes by reapplying your machine configs. Many homelab operators use Talos with ArgoCD or Flux for a fully GitOps-managed stack.
Edge and Retail
Edge deployments benefit from Talos's small footprint, immutable design, and remote management. A retail chain with 500 store locations running local Kubernetes clusters can manage every node through the Talos API without ever needing physical or SSH access. The A/B upgrade mechanism ensures that a bad update does not brick a remote device.
Production Multi-Cloud Clusters
Talos provides a consistent node OS across AWS, Azure, GCP, and bare metal. This is valuable for organizations that run Kubernetes on multiple providers and want a single operational model for node management. Instead of maintaining separate AMIs, Azure images, and GCP images with different toolchains, you maintain one set of Talos machine configs with platform-specific patches.
Security-Sensitive Environments
For regulated industries (finance, healthcare, government), Talos's security posture simplifies compliance. The absence of SSH, shell, and package management eliminates entire categories of CIS benchmark requirements. Audit teams appreciate that there is no way for a rogue operator to install unauthorized software on the node OS. The immutable image model also simplifies forensics: if the OS hash does not match the known-good image, the node has been tampered with.
Frequently Asked Questions
Can you SSH into Talos Linux?
No. Talos Linux does not include an SSH daemon, a shell, or any interactive login mechanism. All node management is performed through the Talos API using talosctl. This is a deliberate design decision to eliminate the attack surface associated with shell access and prevent configuration drift from ad-hoc changes.
Is Talos Linux free and open source?
Yes. Talos Linux is open source under the Mozilla Public License 2.0. It is developed by Sidero Labs, which also offers Omni — a commercial SaaS platform for managing Talos clusters at scale. The OS itself is fully free to use in production without restrictions.
How do you debug a Talos Linux node without shell access?
Talos provides several debugging tools through its API: talosctl dmesg for kernel messages, talosctl logs <service> for service logs, talosctl dashboard for a real-time system overview, and talosctl get for inspecting resource state (network, disks, services). For deeper debugging, you can run a privileged DaemonSet pod with nsenter to access the host namespace from within Kubernetes.
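The privileged DaemonSet approach mentioned above can be sketched as follows; the namespace, image, and labels are illustrative assumptions, not Talos requirements:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-debug
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: node-debug
  template:
    metadata:
      labels:
        app: node-debug
    spec:
      hostPID: true        # share the host PID namespace so PID 1 is reachable
      hostNetwork: true
      containers:
      - name: debug
        image: alpine:3.20
        command: ["sleep", "infinity"]
        securityContext:
          privileged: true
```

From a pod of this DaemonSet you can enter individual host namespaces, for example kubectl exec -it <pod> -- nsenter --target 1 --net sh. Note that because the Talos host has no shell or userland binaries, you should enter specific namespaces (--net, --pid) while keeping the container's own filesystem, rather than using --mount.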
Can Talos Linux run workloads other than Kubernetes?
No. Talos Linux is purpose-built exclusively for Kubernetes. It does not support running arbitrary containers, system services, or applications outside of the Kubernetes workload model. If you need to run non-Kubernetes workloads on the same host, consider Flatcar Container Linux or a traditional distribution.
What happens if a Talos upgrade fails?
Talos uses an A/B partition scheme for upgrades. The new image is written to the inactive boot partition, and the node reboots into it. If the new image fails to boot successfully (the health check does not pass within the configured timeout), the bootloader automatically reverts to the previous working image on the next reboot. This makes upgrades inherently safe and reversible without manual intervention.
Helm has long been the standard for managing Kubernetes applications using packaged charts, bringing a level of reproducibility and automation to the deployment process. However, some operational tasks, such as renaming a release or migrating objects between charts, have traditionally required cumbersome workarounds. With the introduction of the --take-ownership flag in Helm v3.17 (released in January 2025), a long-standing pain point is finally addressed—at least partially.
The take-ownership feature represents the continuing evolution of Helm. Learn about this and other cutting-edge capabilities in our Helm Charts Package Management Guide.
In this post, we will explore:
What the --take-ownership flag does
Why it was needed
The caveats and limitations
Real-world use cases where it helps
When not to use it
Understanding Helm Release Ownership and Object Management
When Helm installs or upgrades a chart, it injects metadata—labels and annotations—into every managed Kubernetes object. These include:
The app.kubernetes.io/managed-by: Helm label.
The meta.helm.sh/release-name annotation.
The meta.helm.sh/release-namespace annotation.
This metadata serves an important role: Helm uses it to track and manage the resources associated with each release. As a safeguard, Helm does not allow one release to modify objects owned by another; when you try, you will see an error like the one below:
Error: Unable to continue with install: Service "provisioner-agent" in namespace "test-my-ns" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "dp-core-infrastructure11": current value is "dp-core-infrastructure"
While this protects users from accidental overwrites, it creates limitations for advanced use cases.
Why --take-ownership Was Needed
Let’s say you want to:
Rename an existing Helm release from api-v1 to api.
Move a ConfigMap or Service from one chart to another.
Rebuild state during GitOps reconciliation when previous Helm metadata has drifted.
Previously, your only option was to:
Uninstall the existing release.
Reinstall under the new name.
This approach introduces downtime, and in production systems, that’s often not acceptable.
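With --take-ownership (Helm v3.17+), the rename can instead be done in place. A minimal sketch, assuming an existing release api-v1 and a chart directory ./mychart (both hypothetical names):

```shell
# Install the chart under the new name; --take-ownership lets Helm
# claim objects whose metadata still points at the old release
helm install api ./mychart -n production --take-ownership

# The old release's records remain in Helm's store. Be careful:
# uninstalling the old release may still delete the shared objects,
# so verify ownership before any cleanup.
helm list -n production
```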
✅ 2. Chart Refactoring
You’re refactoring a large chart into smaller, modular ones and need to reassign certain Service or Secret objects. The --take-ownership flag allows the new release to take control of those objects without deleting or recreating them.
✅ 3. GitOps Drift Reconciliation
If objects were deployed out-of-band or their metadata changed unintentionally, GitOps tooling using Helm can recover without manual intervention using --take-ownership.
Best Practices and Recommendations
Use this flag intentionally, and document where it’s applied.
If possible, remove the previous release after migration to avoid confusion.
Monitor Helm’s behavior closely when managing shared objects.
For non-Helm-managed resources, continue to use kubectl annotate or kubectl label to manually align metadata.
Conclusion
The --take-ownership flag is a welcome addition to Helm’s CLI arsenal. While not a universal solution, it smooths over many of the rough edges developers and SREs face during release evolution and GitOps adoption.
It brings a subtle but powerful improvement—especially in complex environments where resource ownership isn’t static.
Stay updated with Helm releases, and consider this flag your new ally in advanced release engineering.
Frequently Asked Questions
What does the Helm --take-ownership flag do?
The --take-ownership flag allows Helm to bypass ownership validation and claim control of Kubernetes resources that belong to another release. It updates the meta.helm.sh/release-name annotation to associate objects with the current release, enabling zero-downtime release renames and chart migrations.
When should I use Helm take ownership?
Use --take-ownership when renaming releases without downtime, migrating objects between charts, or fixing GitOps drift. It’s ideal for production environments where uninstall/reinstall cycles aren’t acceptable. Always document usage and clean up previous releases afterward.
What are the limitations of Helm take ownership?
The flag doesn’t clean up references from previous releases or protect against future uninstalls of the original release. It only works with Helm-managed resources, not completely unmanaged Kubernetes objects. Manual cleanup of old releases is still required.
Is Helm take ownership safe for production use?
Yes, but use it intentionally and carefully. The flag bypasses Helm’s safety checks, so ensure you understand the ownership implications. Test in staging first, document all usage, and monitor for conflicts. Remove old releases after successful migration to avoid confusion.
Which Helm version introduced the take ownership flag?
The --take-ownership flag was introduced in Helm v3.17, released in January 2025. This feature addresses long-standing pain points with release renaming and chart migrations that previously required downtime-inducing uninstall/reinstall cycles.
Kyverno offers a robust, declarative approach to enforcing security and compliance standards within Kubernetes clusters by allowing users to define and enforce custom policies. For an in-depth look at Kyverno’s functionality, including core concepts and benefits, see my detailed article here. In this guide, we’ll focus on extending Kyverno policies, providing a structured walkthrough of its data model, and illustrating use cases to make the most of Kyverno in a Kubernetes environment.
Understanding the Kyverno Policy Data Model
Kyverno policies consist of several components that define how the policy should behave, which resources it should affect, and the specific rules that apply. Let’s dive into the main parts of the Kyverno policy model:
Policy Definition: This is the root configuration where you define the policy’s metadata, including name, type, and scope. Policies can be created at the namespace level for specific areas or as cluster-wide rules to enforce uniform standards across the entire Kubernetes cluster.
Rules: Policies are made up of rules that dictate what conditions Kyverno should enforce. Each rule can include logic for validation, mutation, or generation based on your needs.
Match and Exclude Blocks: These sections allow fine-grained control over which resources the policy applies to. You can specify resources by their kinds (e.g., Pods, Deployments), namespaces, labels, and even specific names. This flexibility is crucial for creating targeted policies that impact only the resources you want to manage.
Match block: Defines the conditions under which the rule applies to specific resources.
Exclude block: Used to explicitly omit resources that match certain conditions, ensuring that unaffected resources are not inadvertently included.
Validation, Mutation, and Generation Actions: Each rule can take different types of actions:
Validation: Ensures resources meet specific criteria and blocks deployment if they don’t.
Mutation: Adjusts resource configurations to align with predefined standards, which is useful for auto-remediation.
Generation: Creates or manages additional resources based on existing resource configurations.
Example: Restricting Container Image Sources to Docker Hub
A common security requirement is to limit container images to trusted registries. The example below demonstrates a policy that only permits images from Docker Hub.
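A sketch of such a ClusterPolicy; the policy name, rule name, and failure action are illustrative choices:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: restrict-image-registries
spec:
  validationFailureAction: Enforce
  rules:
  - name: allow-docker-hub-only
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "Container images must come from Docker Hub (docker.io)."
      pattern:
        spec:
          containers:
          # Only images whose reference starts with docker.io/ are allowed
          - image: "docker.io/*"
```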
This policy targets all Pod resources in the cluster and enforces a validation rule that restricts the image source to docker.io. If a Pod uses an image outside Docker Hub, Kyverno denies its deployment, reinforcing secure sourcing practices.
Practical Use-Cases for Kyverno Policies
Kyverno policies can handle a variety of Kubernetes management tasks through validation, mutation, and generation. Let’s explore examples for each type to illustrate Kyverno’s versatility:
1. Validation Policies
Validation policies in Kyverno ensure that resources comply with specific configurations or security standards, stopping any non-compliant resources from deploying.
Use-Case: Enforcing Resource Limits for Containers
This example prevents deployments that lack resource limits, ensuring all Pods specify CPU and memory constraints.
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: enforce-resource-limits
spec:
  rules:
  - name: require-resource-limits
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "Resource limits (CPU and memory) are required for all containers."
      pattern:
        spec:
          containers:
          - resources:
              limits:
                cpu: "?*"
                memory: "?*"
By enforcing resource limits, this policy helps prevent resource contention in the cluster, fostering stable and predictable performance.
2. Mutation Policies
Mutation policies allow Kyverno to automatically adjust configurations in resources to meet compliance requirements. This approach is beneficial for consistent configurations without manual intervention.
Use-Case: Adding Default Labels to Pods
This policy adds a default label, environment: production, to all new Pods that lack this label, ensuring that resources align with organization-wide labeling standards.
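A sketch of such a mutation policy, using Kyverno's +() "add if not present" anchor; the policy and rule names are illustrative:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: add-default-environment-label
spec:
  rules:
  - name: add-environment-label
    match:
      resources:
        kinds:
        - Pod
    mutate:
      patchStrategicMerge:
        metadata:
          labels:
            # +() adds the label only when it is not already set
            +(environment): production
```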
This mutation policy is an example of how Kyverno can standardize resource configurations at scale by dynamically adding missing information, reducing human error and ensuring labeling consistency.
3. Generation Policies
Generation policies in Kyverno are used to create or update related resources, enhancing Kubernetes automation by responding to specific configurations or needs in real-time.
Use-Case: Automatically Creating a ConfigMap for Each New Namespace
This example policy generates a ConfigMap in every new namespace, setting default configuration values for all resources in that namespace.
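A sketch of such a generation policy; the names and the default setting are illustrative, and the outer data field holds the body of the generated resource:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: generate-default-configmap
spec:
  rules:
  - name: create-default-configmap
    match:
      resources:
        kinds:
        - Namespace
    generate:
      apiVersion: v1
      kind: ConfigMap
      name: default-config
      # place the ConfigMap inside the namespace that triggered the rule
      namespace: "{{request.object.metadata.name}}"
      synchronize: true
      data:
        data:
          default-setting: "enabled"
```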
This generation policy is triggered whenever a new namespace is created, automatically provisioning a ConfigMap with default settings. This approach is especially useful in multi-tenant environments, ensuring new namespaces have essential configurations in place.
Conclusion
Extending Kyverno policies enables Kubernetes administrators to establish and enforce tailored security and operational practices within their clusters. By leveraging Kyverno’s capabilities in validation, mutation, and generation, you can automate compliance, streamline operations, and reinforce security standards seamlessly.
In the Kubernetes ecosystem, security and governance are key aspects that need continuous attention. While Kubernetes offers some out-of-the-box (OOTB) security features such as Pod Security Admission (PSA), these might not be sufficient for complex environments with varying compliance requirements. This is where Kyverno comes into play, providing a powerful yet flexible solution for managing and enforcing policies across your cluster.
In this post, we will explore the key differences between Kyverno and PSA, explain how Kyverno can be used in different use cases, and show you how to install and deploy policies with it. Although custom policy creation will be covered in a separate post, we will reference some pre-built policies you can use right away.
What is Pod Security Admission (PSA)?
Kubernetes introduced Pod Security Admission (PSA) as a replacement for the now deprecated PodSecurityPolicy (PSP). PSA focuses on enforcing three predefined levels of security: Privileged, Baseline, and Restricted. These levels control what pods are allowed to run in a namespace based on their security context configurations.
Privileged: Minimal restrictions, allowing privileged containers and host access.
Baseline: Applies standard restrictions, disallowing privileged containers and limiting host access.
Restricted: The strictest level, ensuring secure defaults and enforcing best practices for running containers.
While PSA is effective for basic security requirements, it lacks flexibility when enforcing fine-grained or custom policies. We have a full article covering this topic that you can read here.
Kyverno vs. PSA: Key Differences
Kyverno extends beyond the capabilities of PSA by offering more granular control and flexibility. Here’s how it compares:
Policy Types: While PSA focuses solely on security, Kyverno allows the creation of policies for validation, mutation, and generation of resources. This means you can modify or generate new resources, not just enforce security rules.
Customizability: Kyverno supports custom policies that can enforce your organization’s compliance requirements. You can write policies that govern specific resource types, such as ensuring that all deployments have certain labels or that container images come from a trusted registry.
Policy as Code: Kyverno policies are written in YAML, allowing for easy integration with CI/CD pipelines and GitOps workflows. This makes policy management declarative and version-controlled, which is not the case with PSA.
Audit and Reporting: With Kyverno, you can generate detailed audit logs and reports on policy violations, giving administrators a clear view of how policies are enforced and where violations occur. PSA lacks this built-in reporting capability.
Enforcement and Mutation: While PSA primarily enforces restrictions on pods, Kyverno allows not only validation of configurations but also modification of resources (mutation) when required. This adds an additional layer of flexibility, such as automatically adding annotations or labels.
When to Use Kyverno Over PSA
While PSA might be sufficient for simpler environments, Kyverno becomes a valuable tool in scenarios requiring:
Custom Compliance Rules: For example, enforcing that all containers use a specific base image or restricting specific container capabilities across different environments.
CI/CD Integrations: Kyverno can integrate into your CI/CD pipelines, ensuring that resources comply with organizational policies before they are deployed.
Complex Governance: When managing large clusters with multiple teams, Kyverno’s policy hierarchy and scope allow for finer control over who can deploy what and how resources are configured.
If your organization needs a more robust and flexible security solution, Kyverno is a better fit compared to PSA’s more generic approach.
Installing Kyverno
To start using Kyverno, you’ll need to install it in your Kubernetes cluster. This is a straightforward process using Helm, which makes it easy to manage and update.
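A typical Helm-based install, sketched from the standard Kyverno chart repository; the release name and namespace are conventional choices, not requirements:

```shell
# Add the Kyverno chart repository and refresh the index
helm repo add kyverno https://kyverno.github.io/kyverno/
helm repo update

# Install Kyverno into its own namespace
helm install kyverno kyverno/kyverno --namespace kyverno --create-namespace
```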
After installation, Kyverno will begin enforcing policies across your cluster, but you’ll need to deploy some policies to get started.
Deploying Policies with Kyverno
Kyverno policies are written in YAML, just like Kubernetes resources, which makes them easy to read and manage. You can find several ready-to-use policies from the Kyverno Policy Library, or create your own to match your requirements.
Here is an example of a simple validation policy that ensures all pods use trusted container images from a specific registry:
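A sketch of that policy, using the myregistry.com registry from the description; the policy and rule names are illustrative:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: trusted-registry
spec:
  validationFailureAction: Enforce
  rules:
  - name: validate-registry
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "Images must be pulled from myregistry.com."
      pattern:
        spec:
          containers:
          - image: "myregistry.com/*"
```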
This policy will automatically block the deployment of any pod that uses an image from a registry other than myregistry.com.
Applying the Policy
To apply the above policy, save it to a YAML file (e.g., trusted-registry-policy.yaml) and run the following command:
kubectl apply -f trusted-registry-policy.yaml
Once applied, Kyverno will enforce this policy across your cluster.
Viewing Kyverno Policy Reports
Kyverno generates detailed reports on policy violations, which are useful for audits and tracking policy compliance. To check the reports, you can use the following commands:
List all Kyverno policy reports:
kubectl get clusterpolicyreport
Describe a specific policy report to get more details:
kubectl describe clusterpolicyreport <report-name>
These reports can be integrated into your monitoring tools to trigger alerts when critical violations occur.
Conclusion
Kyverno offers a flexible and powerful way to enforce policies in Kubernetes, making it an essential tool for organizations that need more than the basic capabilities provided by PSA. Whether you need to ensure compliance with internal security standards, automate resource modifications, or integrate policies into CI/CD pipelines, Kyverno’s extensive feature set makes it a go-to choice for Kubernetes governance.
For now, start with the out-of-the-box policies available in Kyverno’s library. In future posts, we’ll dive deeper into creating custom policies tailored to your specific needs.
In Kubernetes, security is a key concern, especially as containers and microservices grow in complexity. One of the essential features of Kubernetes for policy enforcement is Pod Security Admission (PSA), which replaces the deprecated Pod Security Policies (PSP). PSA provides a more straightforward and flexible approach to enforce security policies, helping administrators safeguard clusters by ensuring that only compliant pods are allowed to run.
This article will guide you through PSA, the available Pod Security Standards, how to configure them, and how to apply security policies to specific namespaces using labels.
What is Pod Security Admission (PSA)?
PSA is a built-in admission controller, enabled by default since Kubernetes v1.23, that replaces Pod Security Policies (PSPs). PSPs had a steep learning curve and could become cumbersome when scaling security policies across various environments. PSA simplifies this process by applying the Kubernetes Pod Security Standards at predefined security levels, without needing custom logic for each policy.
With PSA, cluster administrators can restrict the permissions of pods by using labels that correspond to specific Pod Security Standards. PSA operates at the namespace level, enabling better granularity in controlling security policies for different workloads.
Pod Security Standards
Kubernetes provides three key Pod Security Standards in the PSA framework:
Privileged: No restrictions; permits all features and is the least restrictive mode. This is not recommended for production workloads but can be used in controlled environments or for workloads requiring elevated permissions.
Baseline: Provides a good balance between usability and security, restricting the most dangerous aspects of pod privileges while allowing common configurations. It is suitable for most applications that don’t need special permissions.
Restricted: The most stringent level of security. This level is intended for workloads that require the highest level of isolation and control, such as multi-tenant clusters or workloads exposed to the internet.
Each standard includes specific rules to limit pod privileges, such as disallowing privileged containers, restricting access to the host network, and preventing changes to certain security contexts.
Setting Up Pod Security Admission (PSA)
To enable PSA, you need to label your namespaces based on the security level you want to enforce. The label format is as follows:
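The label keys follow the pattern pod-security.kubernetes.io/<MODE>: <LEVEL>, where MODE is one of enforce, warn, or audit, and LEVEL is privileged, baseline, or restricted. A namespace manifest matching the setup described next (the namespace name is a placeholder):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-namespace
  labels:
    # reject pods that violate the baseline standard
    pod-security.kubernetes.io/enforce: baseline
    # warn on (but allow) restricted-level violations
    pod-security.kubernetes.io/warn: restricted
    # record restricted-level violations in the audit log
    pod-security.kubernetes.io/audit: restricted
```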
This setup enforces the baseline standard while issuing warnings and logging violations for restricted-level rules.
Example: Configuring Pod Security in a Namespace
Let’s walk through an example of configuring baseline security for the dev namespace. First, you need to apply the PSA labels:
kubectl create namespace dev
kubectl label --overwrite ns dev pod-security.kubernetes.io/enforce=baseline
Now, any pod deployed in the dev namespace will be checked against the baseline security standard. If a pod violates the baseline policy (for instance, by attempting to run a privileged container), it will be blocked from starting.
You can also combine warn and audit modes to track violations without blocking pods:
kubectl label --overwrite ns dev pod-security.kubernetes.io/enforce=baseline pod-security.kubernetes.io/warn=restricted pod-security.kubernetes.io/audit=privileged
In this case, PSA will allow pods to run if they meet the baseline policy, but it will issue warnings for restricted-level violations and log any privileged-level violations.
Applying Policies by Default
One of the strengths of PSA is its simplicity in applying policies at the namespace level, but administrators might wonder whether a default policy can be applied to new namespaces automatically. Kubernetes does support cluster-wide defaults for the PodSecurity admission controller through an AdmissionConfiguration file passed to the API server via --admission-control-config-file, but this requires control-plane access and is typically not available on managed Kubernetes offerings. Where that is not an option, you can use admission webhooks or policy engines such as OPA Gatekeeper or Kyverno to enforce default labels on new namespaces.
Conclusion
Pod Security Admission (PSA) simplifies policy enforcement in Kubernetes clusters, making it easier to ensure compliance with security standards across different environments. By configuring Pod Security Standards at the namespace level and using labels, administrators can control the security level of workloads with ease. The flexibility of PSA allows for efficient security management without the complexity associated with the older Pod Security Policies (PSPs).
Managing Kubernetes resources effectively can sometimes feel overwhelming, but Helm, the Kubernetes package manager, offers several commands and flags that make the process smoother and more intuitive. In this article, we’ll dive into some lesser-known Helm commands and flags, explaining their uses, benefits, and practical examples.
These advanced commands are essential for mastering Helm in production. For the complete toolkit including fundamentals, testing, and deployment patterns, visit our Helm package management guide.
1. helm get values: Retrieving Deployed Chart Values
The helm get values command is essential when you need to see the configuration values of a deployed Helm chart. This is particularly useful when you have a chart deployed but lack access to its original configuration file. With this command, you can achieve an “Infrastructure as Code” approach by capturing the current state of your deployment.
Usage:
helm get values <release-name> [flags]
Example:
To get the values of a deployed chart named my-release:
helm get values my-release --namespace my-namespace
This command outputs the current values used for the deployment, which is valuable for documentation, replicating the environment, or modifying deployments.
2. Understanding helm upgrade Flags: --reset-values, --reuse-values, and --reset-then-reuse-values
The helm upgrade command is typically used to upgrade or modify an existing Helm release. However, the behavior of this command can be finely tuned using several flags: --reset-values, --reuse-values, and --reset-then-reuse-values (available since Helm v3.14).
--reset-values: Ignores the previous values and uses only the values provided in the current command. Use this flag when you want to override the existing configuration entirely.
Example Scenario: You are deploying a new version of your application, and you want to ensure that no old values are retained.
--reuse-values: Reuses the previous release’s values and merges them with any new values provided. This flag is useful when you want to keep most of the old configuration but apply a few tweaks.
Example Scenario: You need to add a new environment variable to an existing deployment without affecting the other settings.
--reset-then-reuse-values: A combination of the two. It resets to the chart’s default values and then re-applies the previous release’s values on top, allowing you to start from a clean slate while retaining specific configurations.
Example Scenario: Useful in complex environments where you want to ensure the chart is using the original default settings but retain some custom values.
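Hypothetical invocations sketching the three flags; the release and chart names, values file, and settings are placeholders:

```shell
# Discard the previous values entirely; use only what is supplied now
helm upgrade my-release ./my-chart --reset-values -f values-v2.yaml

# Keep the previous release's values, tweaking a single setting
helm upgrade my-release ./my-chart --reuse-values --set image.tag=v2

# Start from the chart's defaults, then re-apply the previous values
helm upgrade my-release ./my-chart --reset-then-reuse-values
```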
3. helm lint: Ensuring Chart Quality in CI/CD Pipelines
The helm lint command checks Helm charts for syntax errors, best practices, and other potential issues. This is especially useful when integrating Helm into a CI/CD pipeline, as it ensures your charts are reliable and adhere to best practices before deployment.
Usage:
helm lint <chart-path> [flags]
<chart-path>: Path to the Helm chart you want to validate.
Example:
helm lint ./my-chart/
This command scans the my-chart directory for issues like missing fields, incorrect YAML structure, or deprecated usage. If you’re automating deployments, adding helm lint to your CI/CD pipeline ensures that syntax or structural issues are caught before the build or deployment stages. You can learn more about Helm testing in the linked article.
4. helm rollback: Reverting to a Previous Release
The helm rollback command allows you to revert a release to a previous version. This can be incredibly useful in case of a failed upgrade or deployment, as it provides a way to quickly restore a known good state.
Usage:
helm rollback <release-name> [revision] [flags]
[revision]: The revision number to which you want to roll back. If omitted, Helm will roll back to the previous release by default.
Example:
To roll back a release named my-release to its previous version:
helm rollback my-release
To roll back to a specific revision, say revision 3:
helm rollback my-release 3
This command can be a lifesaver when a recent change breaks your application, allowing you to quickly restore service continuity while investigating the issue.
5. helm verify: Validating a Chart Before Use
The helm verify command checks the integrity and validity of a chart before it is deployed by validating the packaged chart against its signed provenance (.prov) file, ensuring the package has not been tampered with or corrupted. It’s particularly useful when you are pulling charts from external repositories or using charts shared across multiple teams, and it requires that the chart was signed when packaged (for example with helm package --sign).
Usage:
helm verify <chart-path>
Example:
To verify a downloaded chart named my-chart:
helm verify ./my-chart.tgz
If the chart passes the verification, Helm will output a success message. If it fails, you’ll see details of the issues, which could range from missing files to checksum mismatches.
Conclusion
Leveraging these advanced Helm commands and flags can significantly enhance your Kubernetes management capabilities. Whether you are retrieving existing deployment configurations, fine-tuning your Helm upgrades, or ensuring the quality of your charts in a CI/CD pipeline, these tricks help you maintain a robust and efficient Kubernetes environment.
Istio has become an essential tool for managing HTTP traffic within Kubernetes clusters, offering advanced features such as Canary Deployments, mTLS, and end-to-end visibility. However, some tasks, like exposing a TCP port using the Istio IngressGateway, can be challenging if you’ve never done it before. This article will guide you through the process of exposing TCP ports with Istio Ingress Gateway, complete with real-world examples and practical use cases.
Understanding the Context
Istio is often used to manage HTTP traffic in Kubernetes, providing powerful capabilities such as traffic management, security, and observability. The Istio IngressGateway serves as the entry point for external traffic into the Kubernetes cluster, typically handling HTTP and HTTPS traffic. However, Istio also supports TCP traffic, which is necessary for use cases like exposing databases or other non-HTTP services running in the cluster to external consumers.
Exposing a TCP port through Istio involves configuring the IngressGateway to handle TCP traffic and route it to the appropriate service. This setup is particularly useful in scenarios where you need to expose services like TIBCO EMS or Kubernetes-based databases to other internal or external applications.
Steps to Expose a TCP Port with Istio IngressGateway
1. Modify the Istio IngressGateway Service:
Before configuring the Gateway, you must ensure that the Istio IngressGateway service is configured to listen on the new TCP port. This step is crucial if you’re using a NodePort service, as this port needs to be opened on the Load Balancer.
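A sketch of the pieces involved, assuming a TIBCO EMS server on its default TCP port 7222; all resource names, the backend host, and the namespace are illustrative:

```yaml
# 1. Add the TCP port to the istio-ingressgateway Service (fragment)
apiVersion: v1
kind: Service
metadata:
  name: istio-ingressgateway
  namespace: istio-system
spec:
  ports:
  - name: tcp-ems
    port: 7222
    targetPort: 7222
    protocol: TCP
---
# 2. Gateway listening on the TCP port
apiVersion: networking.istio.io/v1beta1
kind: Gateway
metadata:
  name: ems-gateway
spec:
  selector:
    istio: ingressgateway
  servers:
  - port:
      number: 7222
      name: tcp-ems
      protocol: TCP
    hosts:
    - "*"
---
# 3. VirtualService routing the TCP traffic to the backend service
apiVersion: networking.istio.io/v1beta1
kind: VirtualService
metadata:
  name: ems-vs
spec:
  hosts:
  - "*"
  gateways:
  - ems-gateway
  tcp:
  - match:
    - port: 7222
    route:
    - destination:
        host: ems-server.ems-namespace.svc.cluster.local
        port:
          number: 7222
```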
After applying these configurations, the Istio IngressGateway will expose the TCP port to external traffic.
Practical Use Cases
Exposing TIBCO EMS Server: One common scenario is exposing a TIBCO EMS (Enterprise Message Service) server running within a Kubernetes cluster to other internal applications or external consumers. By configuring the Istio IngressGateway to handle TCP traffic, you can securely expose EMS’s TCP port, allowing it to communicate with services outside the Kubernetes environment.
Exposing Databases: Another use case is exposing a database running within Kubernetes to external services or different clusters. By exposing the database’s TCP port through the Istio IngressGateway, you enable other applications to interact with it, regardless of their location.
Exposing a Custom TCP-Based Service: Suppose you have a custom application running within Kubernetes that communicates over TCP, such as a game server or a custom TCP-based API service. You can use the Istio IngressGateway to expose this service to external users, making it accessible from outside the cluster.
Conclusion
Exposing TCP ports using the Istio IngressGateway can be a powerful technique for managing non-HTTP traffic in your Kubernetes cluster. With the steps outlined in this article, you can confidently expose services like TIBCO EMS, databases, or custom TCP-based applications to external consumers, enhancing the flexibility and connectivity of your applications.
Kubernetes ConfigMaps are a powerful tool for managing configuration data separately from application code. However, they can sometimes lead to issues during deployment, particularly when a ConfigMap referenced in a Pod specification is missing, causing the application to fail to start. This is a common scenario that can lead to a CreateContainerConfigError and halt your deployment pipeline.
Understanding the Problem
When a ConfigMap is referenced in a Pod’s specification, Kubernetes expects the ConfigMap to be present. If it is not, Kubernetes will not start the Pod, leading to a failed deployment. This can be problematic in situations where certain configuration data is optional or environment-specific, such as proxy settings that are only necessary in certain environments.
Making ConfigMap Values Optional
Kubernetes provides a way to define ConfigMap items as optional, allowing your application to start even if the ConfigMap is not present. This can be particularly useful for environment variables that only need to be set under certain conditions.
Here’s a basic example of how to make a ConfigMap optional:
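A sketch of a Pod spec using the names referenced below (example-configmap, optional-key); the image, command, and variable name are illustrative:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod
spec:
  containers:
  - name: app
    image: busybox
    command: ["sh", "-c", "env; sleep 3600"]
    env:
    - name: OPTIONAL_VALUE
      valueFrom:
        configMapKeyRef:
          name: example-configmap
          key: optional-key
          # the Pod starts even if the ConfigMap or key is absent
          optional: true
```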
name: example-configmap refers to the ConfigMap that might or might not be present.
optional: true ensures that the Pod will still start even if example-configmap or the optional-key within it is missing.
Practical Use Case: Proxy Configuration
A common use case for optional ConfigMap values is setting environment variables for proxy configuration. In many enterprise environments, proxy settings are only required in certain deployment environments (e.g., staging, production) but not in others (e.g., local development).
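A container env fragment sketching this, assuming a proxy-config ConfigMap with http-proxy and https-proxy keys (the key names are assumptions):

```yaml
env:
- name: HTTP_PROXY
  valueFrom:
    configMapKeyRef:
      name: proxy-config
      key: http-proxy
      optional: true
- name: HTTPS_PROXY
  valueFrom:
    configMapKeyRef:
      name: proxy-config
      key: https-proxy
      optional: true
```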
In this setup, if the proxy-config ConfigMap is missing, the application will still start, simply without the proxy settings.
Sample Application
Let’s walk through a simple example to demonstrate this concept. We will create a deployment for an application that uses optional configuration values.
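A sketch of such a deployment together with the optional ConfigMap, following the names described below (app-config, greeting); the image and command are illustrative:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  greeting: "Hello, World!"
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: greeting-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: greeting-app
  template:
    metadata:
      labels:
        app: greeting-app
    spec:
      containers:
      - name: app
        image: busybox
        # echoes an empty line if GREETING is unset
        command: ["sh", "-c", "echo \"$GREETING\"; sleep 3600"]
        env:
        - name: GREETING
          valueFrom:
            configMapKeyRef:
              name: app-config
              key: greeting
              optional: true
```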
Deploy the application using kubectl apply -f <your-deployment-file>.yaml.
If the app-config ConfigMap is present, the Pod will output “Hello, World!”.
If the ConfigMap is missing, the Pod will start, but no greeting will be echoed.
Conclusion
Optional ConfigMap values are a simple yet effective way to make your Kubernetes deployments more resilient and adaptable to different environments. By marking ConfigMap keys as optional, you can prevent deployment failures and allow your applications to handle missing configuration gracefully.