Kubernetes Archives - Alexandre Vazquez

Building a Kubernetes Migration Framework: Lessons from Ingress-NGINX

2026-02-16 by Alexandre Vazquez

The recent announcement regarding the deprecation of the Ingress-NGINX controller sent a ripple through the Kubernetes community. For many organizations, it’s the first major deprecation of a foundational, widely-adopted ecosystem component. While the immediate reaction is often tactical—”What do we replace it with?”—the more valuable long-term question is strategic: “How do we systematically manage this and future migrations?”

This event isn’t an anomaly; it’s a precedent. As Kubernetes matures, core add-ons, APIs, and patterns will evolve or sunset. Platform engineering teams need a repeatable, low-risk framework for navigating these changes. Drawing from the Ingress-NGINX transition and established deployment management principles, we can abstract a robust Kubernetes Migration Framework applicable to any major component, from service meshes to CSI drivers.

Why Ad-Hoc Migrations Fail in Production

Attempting a “big bang” replacement or a series of manual, one-off changes is a recipe for extended downtime, configuration drift, and undetected regression. Production Kubernetes environments are complex systems with deep dependencies:

Interdependent Workloads: Multiple applications often share the same ingress controller, relying on specific annotations, custom snippets, or behavioral quirks.
Automation and GitOps Dependencies: Helm charts, Kustomize overlays, and ArgoCD/Flux manifests are tightly coupled to the existing component’s API and schema.
Observability and Security Integration: Monitoring dashboards, logging parsers, and security policies are tuned for the current implementation.
Knowledge Silos: Tribal knowledge about workarounds and specific configurations isn’t documented.

A structured framework mitigates these risks by enforcing discipline, creating clear validation gates, and ensuring the capability to roll back at any point.

The Four-Phase Kubernetes Migration Framework

This framework decomposes the migration into four distinct phases: Assessment, Parallel Run, Cutover, and Decommission. Each phase has defined inputs, activities, and exit criteria.

Phase 1: Deep Assessment & Dependency Mapping

Before writing a single line of new configuration, understand the full scope. The goal is to move from “we use Ingress-NGINX” to a precise inventory of how it’s used.

Inventory All Ingress Resources: Use kubectl get ingress --all-namespaces as a starting point, but go deeper.
Analyze Annotation Usage: Script an analysis to catalog every annotation in use (e.g., nginx.ingress.kubernetes.io/rewrite-target, nginx.ingress.kubernetes.io/configuration-snippet). This reveals functional dependencies.
Map to Backend Services: For each Ingress, identify the backend Services and Namespaces. This highlights critical applications and potential blast radius.
Review Customizations: Document any custom ConfigMaps for main NGINX configuration, custom template patches, or modifications to the controller deployment itself.
Evaluate Alternatives: Based on the inventory, evaluate candidate replacements (e.g., Gateway API with a compatible implementation, another Ingress controller like Emissary-ingress or Traefik). The Google Cloud migration framework provides a useful decision tree for ingress-specific migrations.

The output of this phase is a migration manifesto: a concrete list of what needs to be converted, grouped by complexity and criticality.

Phase 2: Phased Rollout & Parallel Run

This is the core of a low-risk migration. Instead of replacing, you run the new and old systems in parallel, shifting traffic gradually. For ingress, this often means installing the new controller alongside the old one.

Dual Installation: Deploy the new ingress controller in the same cluster, configured with a distinct ingress class (e.g., ingressClassName: gateway vs. nginx).

Create Canary Ingress Resources: For a low-risk application, create a parallel Ingress or Gateway resource pointing to the new controller. Use techniques like managed deployments with canary patterns to control exposure.

# Example: A new Gateway API HTTPRoute for a canary service
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-canary
spec:
  parentRefs:
  - name: company-gateway
  rules:
  - backendRefs:
    - name: app-service
      port: 8080
      weight: 10 # Start with 10% of traffic

Validate Equivalency: Use traffic mirroring (if supported) or direct synthetic testing against both ingress paths. Compare logs, response headers, latency, and error rates.
Iterate and Expand: Gradually increase traffic weight or add more applications to the new stack, group by group, based on the assessment from Phase 1.

This phase relies heavily on your observability stack. Dashboards comparing error rates, latency (p50, p99), and throughput between the old and new paths are essential.

Phase 3: Validation & Automated Cutover

The cutover is not a manual event. It’s the final step in a validation process.

Define Validation Tests: Create a suite of tests that must pass before full cutover. This includes:
- Smoke tests for all critical user journeys.
- Load tests to verify performance under expected traffic patterns.
- Security scan validation (e.g., no unintended ports open).
- Compliance checks (e.g., specific headers are present).
Automate the Switch: For each application, the cutover is ultimately a change in its Ingress or Gateway resource. This should be done via your GitOps pipeline. Update the source manifests (e.g., change the ingressClassName), merge, and let automation apply it. This ensures the state is declarative and recorded.
Maintain Rollback Capacity: The old system must remain operational and routable (with reduced capacity) during this phase. The GitOps rollback is simply reverting the manifest change.

Phase 4: Observability & Decommission

Once all traffic is successfully migrated and validated over a sustained period (e.g., 72 hours), you can decommission the old component.

Monitor Aggressively: Keep a close watch on all key metrics for at least one full business cycle (a week).
Remove Old Resources: Delete the old controller’s Deployment, Service, ConfigMaps, and CRDs (if no longer needed).
Clean Up Auxiliary Artifacts: Remove old RBAC bindings, service accounts, and any custom monitoring alerts or dashboards specific to the old component.
Document Lessons Learned: Update runbooks and architecture diagrams. Note any surprises, gaps in the process, or validation tests that were particularly valuable.

Key Principles for a Resilient Framework

Beyond the phases, these principles should guide your framework’s design:

Always Maintain Rollback Capability: Every step should be reversible with minimal disruption. This is a core tenet of managing Kubernetes deployments.
Leverage GitOps for State Management: All desired state changes (Ingress resources, controller deployments) must flow through version-controlled manifests. This provides an audit trail, consistency, and the simplest rollback mechanism (git revert).
Validate with Production Traffic Patterns: Synthetic tests are insufficient. Use canary weights and traffic mirroring to validate with real user traffic in a controlled manner.
Communicate Transparently: Platform teams should maintain a clear migration status page for internal stakeholders, showing which applications have been migrated, which are in progress, and the overall timeline.

Conclusion: Building a Migration-Capable Platform

The deprecation of Ingress-NGINX is a wake-up call. The next major change is a matter of “when,” not “if.” By investing in a structured migration framework now, platform teams transform a potential crisis into a manageable, repeatable operational procedure.

This framework—Assess, Run in Parallel, Validate, and Decommission—abstracts the specific lessons from the ingress migration into a generic pattern. It can be applied to migrating from PodSecurityPolicies to Pod Security Standards, from a deprecated CSI driver, or from one service mesh to another. The tools (GitOps, canary deployments, observability) are already in your stack. The value is in stitching them together into a disciplined process that ensures platform evolution doesn’t compromise platform stability.

Start by documenting this framework as a runbook template. Then, apply it to your next significant component update, even a minor one, to refine the process. When the next major deprecation announcement lands in your inbox, you’ll be ready.

Kubernetes Dashboard Alternatives in 2026: Best Web UI Options After Official Retirement

2026-01-26 by Alexandre Vazquez

The Kubernetes Dashboard, once a staple tool for cluster visualization and management, has been officially archived and is no longer maintained. For many teams who relied on its straightforward web interface to monitor pods, deployments, and services, this retirement marks the end of an era. But it also signals something important: the Kubernetes ecosystem has evolved far beyond what the original dashboard was designed to handle.

Today’s Kubernetes environments are multi-cluster by default, driven by GitOps principles, guarded by strict RBAC policies, and operated by platform teams serving dozens or hundreds of developers. The operating model has simply outgrown the traditional dashboard’s capabilities.

So what comes next? If you’ve been using Kubernetes Dashboard and need to migrate to something more capable, or if you’re simply curious about modern alternatives, this guide will walk you through the best options available in 2026.

Why Kubernetes Dashboard Was Retired

The Kubernetes Dashboard served its purpose well in the early days of Kubernetes adoption. It provided a simple, browser-based interface for viewing cluster resources without needing to master kubectl commands. But as Kubernetes matured, several limitations became apparent:

Single-cluster focus: Most organizations now manage multiple clusters across different environments, but the dashboard was designed for viewing one cluster at a time
Limited RBAC capabilities: Modern platform teams need fine-grained access controls at the cluster, namespace, and workload levels
No GitOps integration: Contemporary workflows rely on declarative configuration and continuous deployment pipelines
Minimal observability: Beyond basic resource listing, the dashboard lacked advanced monitoring, alerting, and troubleshooting features
Security concerns: The dashboard’s architecture required careful configuration to avoid exposing cluster access

The community recognized these constraints, and the official recommendation now points toward Headlamp as the successor. But Headlamp isn’t the only option worth considering.

Top Kubernetes Dashboard Alternatives for 2026

1. Headlamp: The Official Successor

Headlamp is now the official recommendation from the Kubernetes SIG UI group. It’s a CNCF Sandbox project developed by Kinvolk (now part of Microsoft) that brings a modern approach to cluster visualization.

Key Features:

Clean, intuitive interface built with modern web technologies
Extensive plugin system for customization
Works both as an in-cluster deployment and desktop application
Uses your existing kubeconfig file for authentication
OpenID Connect (OIDC) support for enterprise SSO
Read and write operations based on RBAC permissions

Installation Options:

# Using Helm
helm repo add headlamp https://kubernetes-sigs.github.io/headlamp/
helm install my-headlamp headlamp/headlamp --namespace kube-system

# As Minikube addon
minikube addons enable headlamp
minikube service headlamp -n headlamp

Headlamp excels at providing a familiar dashboard experience while being extensible enough to grow with your needs. The plugin architecture means you can customize it for your specific workflows without waiting for upstream changes.

Best for: Teams transitioning from Kubernetes Dashboard who want a similar experience with modern features and official backing.

2. Portainer: Enterprise Multi-Cluster Management

Portainer has evolved from a Docker management tool into a comprehensive Kubernetes platform. It’s particularly strong when you need to manage multiple clusters from a single interface. We already covered in detail Portainer so you can also take a look

Key Features:

Multi-cluster management dashboard
Enterprise-grade RBAC with fine-grained access controls
Visual workload deployment and scaling
GitOps integration support
Comprehensive audit logging
Support for both Kubernetes and Docker environments

Best for: Organizations managing multiple clusters across different environments who need enterprise RBAC and centralized control.

3. Skooner (formerly K8Dash): Lightweight and Fast

Skooner keeps things simple. If you appreciated the straightforward nature of the original Kubernetes Dashboard, Skooner delivers a similar philosophy with a cleaner, faster interface.

Key Features:

Fast, real-time updates
Clean and minimal interface
Easy installation with minimal configuration
Real-time metrics visualization
Built-in OIDC authentication

Best for: Teams that want a simple, no-frills dashboard without complex features or steep learning curves.

4. Devtron: Complete DevOps Platform

Devtron goes beyond simple cluster visualization to provide an entire application delivery platform built on Kubernetes.

Key Features:

Multi-cluster application deployment
Built-in CI/CD pipelines
Advanced security scanning and compliance
Application-centric view rather than resource-centric
Support for seven different SSO providers
Chart store for Helm deployments

Best for: Platform teams building internal developer platforms who need comprehensive deployment pipelines alongside cluster management.

5. KubeSphere: Full-Stack Container Platform

KubeSphere positions itself as a distributed operating system for cloud-native applications, using Kubernetes as its kernel.

Key Features:

Multi-tenant architecture
Integrated DevOps workflows
Service mesh integration (Istio)
Multi-cluster federation
Observability and monitoring built-in
Plug-and-play architecture for third-party integrations

Best for: Organizations building comprehensive container platforms who want an opinionated, batteries-included experience.

6. Rancher: Battle-Tested Enterprise Platform

Rancher from SUSE has been in the Kubernetes management space for years and offers one of the most mature platforms available.

Key Features:

Manage any Kubernetes cluster (EKS, GKE, AKS, on-premises)
Centralized authentication and RBAC
Built-in monitoring with Prometheus and Grafana
Application catalog with Helm charts
Policy management and security scanning

Best for: Enterprise organizations managing heterogeneous Kubernetes environments across multiple cloud providers.

7. Octant: Developer-Focused Cluster Exploration

Octant (originally developed by VMware) takes a developer-centric approach to cluster visualization with a focus on understanding application architecture.

Key Features:

Plugin-based extensibility
Resource relationship visualization
Port forwarding directly from the UI
Log streaming
Context-aware resource inspection

Best for: Application developers who need to understand how their applications run on Kubernetes without being cluster administrators.

Desktop and CLI Alternatives Worth Considering

While this article focuses on web-based dashboards, it’s worth noting that not everyone needs a browser interface. Some of the most powerful Kubernetes management tools work as desktop applications or terminal UIs.

If you’re considering client-side tools, you might find these articles on my blog helpful:

Choosing The Right Kubernetes IDE: FreeLens vs OpenLens vs Lens – A comprehensive comparison of the Lens ecosystem and which variant makes sense in 2026
Discover Your Perfect Tool For Managing Kubernetes – An overview of different management approaches including K9s, a powerful terminal UI

These client tools offer advantages that web dashboards can’t match: offline access, better performance, and tighter integration with your local development workflow. FreeLens, in particular, has emerged as the lowest-risk choice for most organizations looking for a desktop Kubernetes IDE.

Choosing the Right Alternative for Your Team

With so many options available, how do you choose? Here’s a decision framework:

Choose Headlamp if:

You want the officially recommended path forward
You need a lightweight dashboard similar to what you had before
Plugin extensibility is important for future customization
You prefer CNCF-backed open source projects

Choose Portainer if:

You manage multiple Kubernetes clusters
Enterprise RBAC is a critical requirement
You also work with Docker environments
Visual deployment tools would benefit your team

Choose Skooner if:

You want the simplest possible alternative
Your needs are straightforward: view and manage resources
You don’t need advanced features or multi-cluster support

Choose Devtron or KubeSphere if:

You’re building an internal developer platform
You need integrated CI/CD pipelines
Application-centric workflows matter more than resource-centric views

Choose Rancher if:

You’re managing enterprise-scale, multi-cloud Kubernetes
You need battle-tested stability and vendor support
Policy management and compliance are critical

Consider desktop tools like FreeLens if:

You work primarily from a local development environment
You need offline access to cluster information
You prefer richer desktop application experiences

Migration Considerations

If you’re actively using Kubernetes Dashboard today, here’s what to think about when migrating:

Authentication method: Most modern alternatives support OIDC/SSO, but verify your specific identity provider is supported
RBAC policies: Review your existing ClusterRole and RoleBinding configurations to ensure they translate properly
Custom workflows: If you’ve built automation around Dashboard URLs or specific features, you’ll need to adapt these
User training: Even similar-looking alternatives have different UIs and workflows; budget time for team training
Ingress configuration: If you expose your dashboard externally, you’ll need to reconfigure ingress rules

The Future of Kubernetes UI Management

The retirement of Kubernetes Dashboard isn’t a step backward—it’s recognition that the ecosystem has matured. Modern platforms need to handle multi-cluster management, GitOps workflows, comprehensive observability, and sophisticated RBAC out of the box.

The alternatives listed here represent different philosophies about what a Kubernetes interface should be:

Minimalist dashboards (Headlamp, Skooner) that stay close to the original vision
Enterprise platforms (Portainer, Rancher) that centralize multi-cluster management
Developer platforms (Devtron, KubeSphere) that integrate the entire application lifecycle
Desktop experiences (FreeLens, OpenLens) that bring IDE-like capabilities

The right choice depends on your team’s size, your infrastructure complexity, and whether you’re managing platforms or building applications. For most teams migrating from Kubernetes Dashboard, starting with Headlamp makes sense—it’s officially recommended, actively maintained, and provides a familiar experience. From there, you can evaluate whether you need to scale up to more comprehensive platforms.

Whatever you choose, the good news is that the Kubernetes ecosystem in 2026 offers more sophisticated, capable, and secure dashboard alternatives than ever before.

Frequently Asked Questions (FAQ)

Is Kubernetes Dashboard officially deprecated or just unmaintained?

The Kubernetes Dashboard has been officially archived by the Kubernetes project and is no longer actively maintained. While it may still run in existing clusters, it no longer receives security updates, bug fixes, or new features, making it unsuitable for production use in modern environments.

What is the official replacement for Kubernetes Dashboard?

Headlamp is the officially recommended successor by the Kubernetes SIG UI group. It provides a modern web interface, supports plugins, integrates with existing kubeconfig files, and aligns with current Kubernetes security and RBAC best practices.

Is Headlamp production-ready for enterprise environments?

Yes. Headlamp supports OIDC authentication, fine-grained RBAC, and can run either in-cluster or as a desktop application. While still evolving, it is actively maintained and suitable for many production use cases, especially when combined with proper access controls.

Are there lightweight alternatives similar to the old Kubernetes Dashboard?

Yes. Skooner is a lightweight, fast alternative that closely mirrors the simplicity of the original Kubernetes Dashboard while offering a cleaner UI and modern authentication options like OIDC.

Do I still need a web-based dashboard to manage Kubernetes?

Not necessarily. Many teams prefer desktop or CLI-based tools such as FreeLens, OpenLens, or K9s. These tools often provide better performance, offline access, and deeper integration with developer workflows compared to browser-based dashboards.

Is it safe to expose Kubernetes dashboards over the internet?

Exposing any Kubernetes dashboard publicly requires extreme caution. If external access is necessary, always use:
Strong authentication (OIDC / SSO)
Strict RBAC policies
Network restrictions (VPN, IP allowlists)
TLS termination and hardened ingress rules
In many cases, dashboards should only be accessible from internal networks.

Can these dashboards replace kubectl?

No. Dashboards are complementary tools, not replacements for kubectl. While they simplify visualization and some management tasks, advanced operations, automation, and troubleshooting still rely heavily on CLI tools and GitOps workflows.

What should I consider before migrating away from Kubernetes Dashboard?

Before migrating, review:
Authentication and identity provider compatibility
Existing RBAC roles and permissions
Multi-cluster requirements
GitOps and CI/CD integrations
Training needs for platform teams and developers
Starting with Headlamp is often the lowest-risk migration path

Which Kubernetes dashboard is best for developers rather than platform teams?

Tools like Octant and Devtron are more developer-focused. They emphasize application-centric views, resource relationships, and deployment workflows, making them ideal for developers who want insight without managing cluster infrastructure directly.

Which Kubernetes dashboard is best for multi-cluster management?

For multi-cluster environments, Portainer, Rancher, and KubeSphere are strong options. These platforms are designed to manage multiple clusters from a single control plane and offer enterprise-grade RBAC, auditing, and centralized authentication.

MinIO Maintenance Mode Explained: Impact on Community Users, OEMs, and S3 Alternatives

2026-02-022025-12-31 by Alexandre Vazquez

Background: MinIO and the Maintenance Mode announcement

MinIO has long been one of the most popular self-hosted S3-compatible object storage solutions, especially in Kubernetes and on‑premise environments. Its simplicity, performance, and API compatibility made it a common default choice for backups, artifacts, logs, and internal object storage.

In late 2025, MinIO marked its upstream repository as Maintenance Mode and clarified that the Community Edition would be distributed source-only, without official pre-built binaries or container images. This move triggered renewed discussion across the industry about sustainability, governance, and the risks of relying on a single-vendor-controlled “open core” storage layer.

A detailed industry analysis of this shift, including its broader ecosystem impact, can be found in this InfoQ article

—

What exactly changed?

1. Maintenance Mode

Maintenance Mode means:

No new features
No roadmap-driven improvements
Limited fixes, typically only for critical issues
No active review of community pull requests

As highlighted by InfoQ, this effectively freezes MinIO Community as a stable but stagnant codebase, pushing innovation and evolution exclusively toward the commercial offerings.

2. Source-only distribution

Official binaries and container images are no longer published for the Community Edition. Users must:

Build MinIO from source
Maintain their own container images
Handle signing, scanning, and provenance themselves

This aligns with a broader industry pattern noted by InfoQ: infrastructure projects increasingly shifting operational burden back to users unless they adopt paid tiers.

—

Direct implications for Community users

Security and patching

With no active upstream development:

Vulnerability response times may increase
Users must monitor security advisories independently
Regulated environments may find Community harder to justify

InfoQ emphasizes that this does not make MinIO insecure by default, but it changes the shared-responsibility model significantly.

Operational overhead

Teams now need to:

Pin commits or tags explicitly
Build and test their own releases
Maintain CI pipelines for a core storage dependency

This is a non-trivial cost for what was previously perceived as a “drop‑in” component.

Support and roadmap

The strategic message is clear: active development, roadmap influence, and predictable maintenance live behind the commercial subscription.

—

Impact on OEM and embedded use cases

The InfoQ analysis draws an important distinction between API consumers and technology embedders.

Using MinIO as an external S3 service

If your application simply consumes an S3 endpoint:

The impact is moderate
Migration is largely operational
Application code usually remains unchanged

Embedding or redistributing MinIO

If your product:

Ships MinIO internally
Builds gateways or features on MinIO internals
Depends on MinIO-specific operational tooling

Then the impact is high:

You inherit maintenance and security responsibility
Long-term internal forking becomes likely
Licensing (AGPL) implications must be reassessed carefully

For OEM vendors, this often forces a strategic re-evaluation rather than a tactical upgrade.

—

Forks and community reactions

At the time of writing:

Several community forks focus on preserving the MinIO Console / UI experience
No widely adopted, full replacement fork of the MinIO server exists
Community discussion, as summarized by InfoQ, reflects caution rather than rapid consolidation

The absence of a strong server-side fork suggests that most organizations are choosing migration over replacement-by-fork.

—

Fully open-source alternatives to MinIO

InfoQ highlights that the industry response is not about finding a single “new MinIO”, but about selecting storage systems whose governance and maintenance models better match long-term needs.

Ceph RGW

Best for: Enterprise-grade, highly available environments
Strengths: Mature ecosystem, large community, strong governance
Trade-offs: Operational complexity

SeaweedFS

Best for: Teams seeking simplicity and permissive licensing
Strengths: Apache-2.0 license, active development, integrated S3 API
Trade-offs: Partial S3 compatibility for advanced edge cases

Garage

Best for: Self-hosted and geo-distributed systems
Strengths: Resilience-first design, active open-source development
Trade-offs: AGPL license considerations

Zenko / CloudServer

Best for: Multi-cloud and Scality-aligned architectures
Strengths: Open-source S3 API implementation
Trade-offs: Different architectural assumptions than MinIO

—

Recommended strategies by scenario

If you need to reduce risk immediately

Freeze your current MinIO version
Build, scan, and sign your own images
Define and rehearse a migration path

If you operate Kubernetes on-prem with HA requirements

Ceph RGW is often the most future-proof option

If licensing flexibility is critical

Start evaluation with SeaweedFS

If operational UX matters

Shift toward automation-first workflows
Treat UI forks as secondary tooling, not core infrastructure

—

Conclusion

MinIO’s shift of the Community Edition into Maintenance Mode is less about short-term breakage and more about long-term sustainability and control.

As the InfoQ analysis makes clear, the real risk is not technical incompatibility but governance misalignment. Organizations that treat object storage as critical infrastructure should favor solutions with transparent roadmaps, active communities, and predictable maintenance models.

For many teams, this moment serves as a natural inflection point: either commit to self-maintaining MinIO, move to a commercially supported path, or migrate to a fully open-source alternative designed for the long run.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

Helm 4.0 Features, Breaking Changes & Migration Guide 2025

2026-01-262025-11-23 by Alexandre Vazquez

Helm is one of the main utilities within the Kubernetes ecosystem, and therefore the release of a new major version, such as Helm 4.0, is something to consider because it is undoubtedly something that will need to be analyzed, evaluated, and managed in the coming months.

Helm 4.0 represents a major milestone in Kubernetes package management. For a complete understanding of Helm from basics to advanced features, explore our .

Due to this, we will see many comments and articles around this topic, so we will try to shed some light.

Helm 4.0 Key Features and Improvements

According to the project itself in its announcement, Helm 4 introduces three major blocks of changes: new plugin system, better integration with Kubernetes ** and internal modernization of SDK and performance**.

New Plugin System (includes WebAssembly)

The plugin system has been completely redesigned, with a special focus on security through the introduction of a new WebAssembly runtime that, while optional, is recommended as it runs in a “sandbox” mode that offers limits and guarantees from a security perspective.

In any case, there is no need to worry excessively, as the “classic” plugins continue to work, but the message is clear: for security and extensibility, the direction is Wasm.

Server-Side Apply and Better Integration with Other Controllers

From this version, Helm 4 supports Server-Side Apply (SSA) through the --server-side flag, which has already become stable since Kubernetes version v1.22 and allows updates on objects to be handled server-side to avoid conflicts between different controllers managing the same resources.

It also incorporates integration with kstatus to ensure the state of a component in a more reliable way than what currently happens with the use of the --wait parameter.

Other Additional Improvements

Additionally, there is another list of improvements that, while of lesser scope, are important qualitative leaps, such as the following:

Installation by digest in OCI registries: (helm install myapp oci://...@sha256:<digest>)
Multi-document values: you can pass multiple YAML values in a single multi-doc file, facilitating complex environments/overlays.
New --set-json argument that allows for easily passing complex structures compared to the current solution using the --set parameter

Why a Major (v4) and Not Another Minor of 3.x?

As explained in the official release post, there were features that the team could not introduce in v3 without breaking public SDK APIs and internal architecture:

Strong change in the plugin system (WebAssembly, new types, deep integration with the core).
Restructuring of Go packages and establishment of a stable SDK at helm.sh/helm/v4, code-incompatible with v3.
Introduction and future evolution of Charts v3, which require the SDK to support multiple versions of chart APIs.

With all this, continuing in the 3.x branch would have violated SemVer: the major number change is basically “paying” the accumulated technical debt to be able to move forward.

Additionally, a new evolution of the charts is expected in the future, moving from v2 to a future v3 that is not yet fully defined, and currently, v2 charts run correctly in this new version.

Is Helm 4.0 Migration Required?

The short answer is: yes. And possibly the long answer is: yes, and quickly. In the official Helm 4 announcement, they specify the support schedule for Helm 3:

Helm 3 bug fixes until July 8, 2026.
Helm 3 security fixes until November 11, 2026.
No new features will be backported to Helm 3 during this period; only Kubernetes client libraries will be updated to support new K8s versions.

Practical translation:

Organizations have approximately 1 year to plan a smooth Helm 4.0 migration with continued bug support for Helm 3.
After November 2026, continuing to use Helm 3 will become increasingly risky from a security and compatibility standpoint.

Best Practices for Migration

To carry out the migration, it is important to remember that it is perfectly possible and feasible to have both versions installed on the same machine or agent, so a “gradual” migration can be done to ensure that the end of support for version v3 is reached with everything migrated correctly, and for that, the following steps are recommended:

Conduct an analysis of all Helm commands and usage from the perspective of integration pipelines, upgrade scripts, or even the import of Helm client libraries in Helm-based developments.
Especially carefully review all uses of --post-renderer, helm registry login, --atomic, --force.
After the analysis, start testing Helm 4 first in non-production environments, reusing the same charts and values, reverting to Helm 3 if a problem is detected until it is resolved.
If you have critical plugins, explicitly test them with Helm 4 before making the global change.

What are the main new features in Helm 4.0?

Helm 4.0 introduces three major improvements: a redesigned plugin system with WebAssembly support for enhanced security, Server-Side Apply (SSA) integration for better conflict resolution, and internal SDK modernization for improved performance. Additional features include OCI digest installation and multi-document values support.

When does Helm 3 support end?

Helm 3 bug fixes end July 8, 2026 and security fixes end November 11, 2026. No new features will be backported to Helm 3. Organizations should plan migration to Helm 4.0 before November 2026 to avoid security and compatibility risks.

Are Helm 3 charts compatible with Helm 4.0?

Yes, Helm Chart API v2 charts work correctly with Helm 4.0. However, the Go SDK has breaking changes, so applications using Helm libraries need code updates. The CLI commands remain largely compatible for most use cases.

Can I run Helm 3 and Helm 4 simultaneously?

Yes, both versions can be installed on the same machine, enabling gradual migration strategies. This allows teams to test Helm 4.0 in non-production environments while maintaining Helm 3 for critical workloads during the transition period.

What should I test before migrating to Helm 4.0?

Focus on testing critical plugins, post-renderers, and specific flags like --atomic, --force, and helm registry login. Test all charts and values in non-production environments first, and review any custom integrations using Helm SDK libraries.

What is Server-Side Apply in Helm 4.0?

Server-Side Apply (SSA) is enabled with the --server-side flag and handles resource updates on the Kubernetes API server side. This prevents conflicts between different controllers managing the same resources and has been stable since Kubernetes v1.22.

Resolving Kubernetes Ingress Issues: Limitations and Gateway Insights

2026-01-262025-11-23 by Alexandre Vazquez

Introduction

Ingresses have been, since the early versions of Kubernetes, the most common way to expose applications to the outside. Although their initial design was simple and elegant, the success of Kubernetes and the growing complexity of use cases have turned Ingress into a problematic piece: limited, inconsistent between vendors, and difficult to govern in enterprise environments.

In this article, we analyze why Ingresses have become a constant source of friction, how different Ingress Controllers have influenced this situation, and why more and more organizations are considering alternatives like Gateway API.

What Ingresses are and why they were designed this way

The Ingress ecosystem revolves around two main resources:

🏷️ IngressClass

Defines which controller will manage the associated Ingresses. Its scope is cluster-wide, so it is usually managed by the platform team.

🌐 Ingress

It is the resource that developers use to expose a service. It allows defining routes, domains, TLS certificates, and little more.

Its specification is minimal by design, which allowed for rapid adoption, but also laid the foundation for current problems.

The problem: a standard too simple for complex needs

As Kubernetes became an enterprise standard, users wanted to replicate advanced configurations of traditional proxies: rewrites, timeouts, custom headers, CORS, etc.
But Ingress did not provide native support for all this.

Vendors reacted… and chaos was born.

Annotations vs CRDs: two incompatible paths

Different Ingress Controllers have taken very different paths to add advanced capabilities:

📝 Annotations (NGINX, HAProxy…)

Advantages:

Flexible and easy to use
Directly in the Ingress resource

Disadvantages:

Hundreds of proprietary annotations
Fragmented documentation
Non-portable configurations between vendors

📦 Custom CRDs (Traefik, Kong…)

Advantages:

More structured and powerful
Better validation and control

Disadvantages:

Adds new non-standard objects
Requires installation and management
Less interoperability

Result?
Infrastructures deeply coupled to a vendor, complicating migrations, audits, and automation.

The complexity for development teams

The design of Ingress implies two very different responsibilities:

Platform: defines IngressClass
Application: defines Ingress

But the reality is that the developer ends up making decisions that should be the responsibility of the platform area:

Certificates
Security policies
Rewrite rules
CORS
Timeouts
Corporate naming practices

This causes:

Inconsistent configurations
Bottlenecks in reviews
Constant dependency between teams
Lack of effective standardization

In large companies, where security and governance are critical, this is especially problematic.

NGINX Ingress: the decommissioning that reignited the debate

The recent decommissioning of the NGINX Ingress Controller has highlighted the fragility of the ecosystem:

Thousands of clusters depend on it
Multiple projects use its annotations
Migrating involves rewriting entire configurations

This has reignited the conversation about the need for a real standard… and there appears Gateway API.

Gateway API: a promising alternative (but not perfect)

Gateway API was born to solve many of the limitations of Ingress:

Clear separation of responsibilities (infrastructure vs application)
Standardized extensibility
More types of routes (HTTPRoute, TCPRoute…)
Greater expressiveness without relying on proprietary annotations

But it also brings challenges:

Requires gradual adoption
Not all vendors implement the same
Migration is not trivial

Even so, it is shaping up to be the future of traffic management in Kubernetes.

Conclusion

Ingresses have been fundamental to the success of Kubernetes, but their own simplicity has led them to become a bottleneck. The lack of interoperability, differences between vendors, and complex governance in enterprise environments make it clear that it is time to adopt more mature models.

Gateway API is not perfect, but it moves in the right direction.
Organizations that want future stability should start planning their transition.

Kubernetes Node Affinity Explained: Scheduling Rules, Trade-offs & Best Practices

2026-01-262025-11-03 by Alexandre Vazquez

What is Kubernetes Node Affinity? Benefits and Core Concepts

Kubernetes node affinity is an essential scheduling feature that allows you to control pod placement based on node labels and properties. By using node affinity rules, you can specify constraints on which nodes pods can be scheduled, enabling you to optimize resource allocation and enhance performance.

Node affinity works by allowing you to define rules for pod scheduling based on node labels. When defining node affinity rules, you have two options: required and preferred rules. Required rules ensure that pods are scheduled only on nodes that satisfy the defined criteria. If no suitable node is available, the pod remains unscheduled. On the other hand, preferred rules provide a soft constraint and attempt to schedule pods on nodes that match the specified criteria. However, if no such node is available, the pod can still be scheduled on other nodes.

Node affinity rules are an “expanded” option of the simply way by using node selectors. Node selectors are a simple form of node affinity that allows you to assign labels to nodes and match those labels with selectors defined in the pod specification. By specifying a node selector, you can ensure that pods are scheduled only on nodes with matching labels. Node selectors are useful for basic affinity requirements but lack the flexibility and fine-grained control provided by more advanced affinity options.

Node Affinity Trade-offs: Required vs Preferred Rules and Failure Scenarios

But this awesome capability has some trade-offs that you need to take in consideration because nothing comes with a price that you need to be aware of, so, let’s go to the important question, what is the worst case scenario of using any of those options?

Consider a stateful workload, like a distributed database (e.g., etcd or ZooKeeper), deployed with three replicas for consensus and fault tolerance. So you decide to define a set of nodes for this workload and use node affinity rules to ensure the pods are scheduled to those nodes. And, you need to think: should I use the preferred mode or the requiredMode?

Let’s say that you go with the required option and you define it like this, what happen if one of your nodes goes down? The pod will be try to be rescheduled again and unless there are another node “with same label” to that, it cannot be deployed? If you additional defined a pod anti-affinity rule to ensure each of the replicas is in a different host to ensure that in case that one node is going down you lose only a single replica, you’re losing the option to rescheudle the workload even if you have another nodes without the label available. So, you’re not in a so reliable option.

Ok, so you go with the preferred to ensure that you workload is for sure scheduled even if it is in another node, and in that case you can end up on the situation that those nodes are scheduled on other nodes keeping those nodes with the proper label without the workload that they should have, making the situation strange and more difficult to administer because you cannot ensure your workloads is on the nodes that you expected to be.

Additional to that, if the nodes has even taints to ensure other workloads cannot be placed there, you can end up in a situation that the “labeled-pods” are scheduled on non-labeled nodes, and the non-labeled pods cannot use the nodes because they’re tainted and can be not be able to use the un-labeled ones if there are not enough resources. So you’re generating an impact on the other workloasd and potentially affecting the schedulling of the other workloads.

Preparing for Unexpected Outages with Node Affinity

So, as you can see, each decision has some disadvatanges that you need to take in consdieration before defining those rules, because if you don’t, you will figure it out when this happen on an production enviornment probably as a result of some unexpected outage, because we all know that in the meantime that nothing bad happens everything works as expected, but the potential of these solutions and its reason to be used is exactly to provide the tools and the options to be prepared when bad things happens.

So, next time that you need to define a node affinity rule try to think about the disadvantages of each of the option and try to select that one that works best for you and mitigate the problems that it can bring to the table of your production environment.

Frequently Asked Questions

What is the difference between nodeSelector and node affinity in Kubernetes?

nodeSelector is a simple field that requires a node to have all specified labels. Node affinity is a more expressive API that supports complex operators like In, NotIn, and Exists, and distinguishes between hard (requiredDuringScheduling...) and soft (preferredDuringScheduling...) constraints. Use nodeSelector for basic needs; use node affinity for advanced scheduling logic.

When should I use required vs preferred node affinity rules?

Use required rules for strict placement needs, like licensing constraints or specific hardware (e.g., GPU nodes). Use preferred rules for optimization, like trying to place pods on nodes in the same availability zone for lower latency. Be aware that required rules can prevent scheduling during node failures, while preferred rules may not guarantee optimal placement.

What are the risks of using required node affinity?

The primary risk is scheduling failure. If no node matches the required rules (e.g., due to a failure or label mismatch), the pod will remain Pending. This can lead to application downtime, especially if combined with Pod Anti-Affinity, which further restricts eligible nodes. Always ensure you have enough labeled nodes to handle failures.

How does node affinity interact with taints and tolerations?

They work sequentially. First, the scheduler filters nodes based on node affinity/selector rules. Then, from the filtered nodes, it checks taints and tolerations. A pod will only be scheduled on a node that satisfies both its affinity/selector requirements and for which the pod has a matching toleration for all the node’s taints.

What are best practices for defining node affinity labels?

Use clear, descriptive label keys (e.g., node.kubernetes.io/instance-type, topology.kubernetes.io/zone). Prefer built-in labels where possible. Document the purpose of custom labels. Combine node affinity with pod anti-affinity carefully to avoid over-constraining the scheduler. Test scenarios with node failures.

Integrate Kyverno CLI into CI/CD Pipelines with GitHub Actions for Kubernetes Policy Checks

2026-01-262025-11-03 by Alexandre Vazquez

Introduction

As Kubernetes clusters become an integral part of infrastructure, maintaining compliance with security and configuration policies is crucial. Kyverno, a policy engine designed for Kubernetes, can be integrated into your CI/CD pipelines to enforce configuration standards and automate policy checks. In this article, we’ll walk through integrating Kyverno CLI with GitHub Actions, providing a seamless workflow for validating Kubernetes manifests before they reach your cluster.

What is Kyverno CLI?

Kyverno is a Kubernetes-native policy management tool, enabling users to enforce best practices, security protocols, and compliance across clusters. Kyverno CLI is a command-line interface that lets you apply, test, and validate policies against YAML manifests locally or in CI/CD pipelines. By integrating Kyverno CLI with GitHub Actions, you can automate these policy checks, ensuring code quality and compliance before deploying resources to Kubernetes.

Benefits of Using Kyverno CLI in CI/CD Pipelines

Integrating Kyverno into your CI/CD workflow provides several advantages:

Automated Policy Validation: Detect policy violations early in the CI/CD pipeline, preventing misconfigured resources from deployment.
Enhanced Security Compliance: Kyverno enables checks for security best practices and compliance frameworks.
Faster Development: Early feedback on policy violations streamlines the process, allowing developers to fix issues promptly.

Setting Up Kyverno CLI in GitHub Actions

Step 1: Install Kyverno CLI

To use Kyverno in your pipeline, you need to install the Kyverno CLI in your GitHub Actions workflow. You can specify the Kyverno version required for your project or use the latest version.

Here’s a sample GitHub Actions YAML configuration to install Kyverno CLI:

name: CI Pipeline with Kyverno Policy Checks

on:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  kyverno-policy-check:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Code
        uses: actions/checkout@v2

      - name: Install Kyverno CLI
        run: |
          curl -LO https://github.com/kyverno/kyverno/releases/download/v<version>/kyverno-cli-linux.tar.gz
          tar -xzf kyverno-cli-linux.tar.gz
          sudo mv kyverno /usr/local/bin/

Replace <version> with the version of Kyverno CLI you wish to use. Alternatively, you can replace it with latest to always fetch the latest release.

Step 2: Define Policies for Validation

Create a directory in your repository to store Kyverno policies. These policies define the standards that your Kubernetes resources should comply with. For example, create a directory structure as follows:

.
└── .github
    └── policies
        ├── disallow-latest-tag.yaml
        └── require-requests-limits.yaml

Each policy is defined in YAML format and can be customized to meet specific requirements. Below are examples of policies that might be used:

Disallow latest Tag in Images: Prevents the use of the latest tag to ensure version consistency.
Enforce CPU/Memory Limits: Ensures resource limits are set for containers, which can prevent resource abuse.

Step 3: Add a GitHub Actions Step to Validate Manifests

In this step, you’ll use Kyverno CLI to validate Kubernetes manifests against the policies defined in the .github/policies directory. If a manifest fails validation, the pipeline will halt, preventing non-compliant resources from being deployed.

Here’s the YAML configuration to validate manifests:

- name: Validate Kubernetes Manifests
  run: |
    kyverno apply .github/policies -r manifests/

Replace manifests/ with the path to your Kubernetes manifests in the repository. This command applies all policies in .github/policies against each YAML file in the manifests directory, stopping the pipeline if any non-compliant configurations are detected.

Step 4: Handle Validation Results

To make the output of Kyverno CLI more readable, you can use additional GitHub Actions steps to format and handle the results. For instance, you might set up a conditional step to notify the team if any manifest is non-compliant:

- name: Check for Policy Violations
  if: failure()
  run: echo "Policy violation detected. Please review the failed validation."

Alternatively, you could configure notifications to alert your team through Slack, email, or other integrations whenever a policy violation is identified.

—

Example: Validating a Kubernetes Manifest

Suppose you have a manifest defining a Kubernetes deployment as follows:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
        - name: nginx
          image: nginx:latest  # Should trigger a violation

The policy disallow-latest-tag.yaml checks if any container image uses the latest tag and rejects it. When this manifest is processed, Kyverno CLI flags the image and halts the CI/CD pipeline with an error, preventing the deployment of this manifest until corrected.

Conclusion

Integrating Kyverno CLI into a GitHub Actions CI/CD pipeline offers a robust, automated solution for enforcing Kubernetes policies. With this setup, you can ensure Kubernetes resources are compliant with best practices and security standards before they reach production, enhancing the stability and security of your deployments.

Kubernetes Ingress on OpenShift: Routes Explained and When to Use Them

2026-01-262025-09-22 by Alexandre Vazquez

Introduction
OpenShift, Red Hat’s Kubernetes platform, has its own way of exposing services to external clients. In vanilla Kubernetes, you would typically use an Ingress resource along with an ingress controller to route external traffic to services. OpenShift, however, introduced the concept of a Route and an integrated Router (built on HAProxy) early on, before Kubernetes Ingress even existed. Today, OpenShift supports both Routes and standard Ingress objects, which can sometimes lead to confusion about when to use each and how they relate.

This article explores how OpenShift handles Kubernetes Ingress resources, how they translate to Routes, the limitations of this approach, and guidance on when to use Ingress versus Routes.

OpenShift Routes and the Router: A Quick Overview

OpenShift Routes are OpenShift-specific resources designed to expose services externally. They are served by the OpenShift Router, which is an HAProxy-based proxy running inside the cluster. Routes support advanced features such as:

Weighted backends for traffic splitting
Sticky sessions (session affinity)
Multiple TLS termination modes (edge, passthrough, re-encrypt)
Wildcard subdomains
Custom certificates and SNI
Path-based routing

Because Routes are OpenShift-native, the Router understands these features natively and can be configured accordingly. This tight integration enables powerful and flexible routing capabilities tailored to OpenShift environments.

Using Kubernetes Ingress in OpenShift (Default Behavior)

Starting with OpenShift Container Platform (OCP) 3.10, Kubernetes Ingress resources are supported. When you create an Ingress, OpenShift automatically translates it into an equivalent Route behind the scenes. This means you can use standard Kubernetes Ingress manifests, and OpenShift will handle exposing your services externally by creating Routes accordingly.

Example: Kubernetes Ingress and Resulting Route

Here is a simple Ingress manifest:

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  rules:
  - host: www.example.com
    http:
      paths:
      - path: /testpath
        pathType: Prefix
        backend:
          service:
            name: test-service
            port:
              number: 80

OpenShift will create a Route similar to:

apiVersion: route.openshift.io/v1
kind: Route
metadata:
  name: example-route
spec:
  host: www.example.com
  path: /testpath
  to:
    kind: Service
    name: test-service
    weight: 100
  port:
    targetPort: 80
  tls:
    termination: edge

This automatic translation simplifies migration and supports basic use cases without requiring Route-specific manifests.

Tuning Behavior with Annotations (Ingress ➝ Route)

When you use Ingress on OpenShift, only OpenShift-aware annotations are honored during the Ingress ➝ Route translation. Controller-specific annotations for other ingress controllers (e.g., nginx.ingress.kubernetes.io/*) are ignored by the OpenShift Router. The following annotations are commonly used and supported by the OpenShift router to tweak the generated Route:

Purpose	Annotation	Typical Values	Effect on Generated Route
TLS termination	`route.openshift.io/termination`	`edge` · `reencrypt` · `passthrough`	Sets Route `spec.tls.termination` to the chosen mode.
HTTP→HTTPS redirect (edge)	`route.openshift.io/insecureEdgeTerminationPolicy`	`Redirect` · `Allow` · `None`	Controls `spec.tls.insecureEdgeTerminationPolicy` (commonly `Redirect`).
Backend load-balancing	`haproxy.router.openshift.io/balance`	`roundrobin` · `leastconn` · `source`	Sets HAProxy balancing algorithm for the Route.
Per-route timeout	`haproxy.router.openshift.io/timeout`	duration like `60s`, `5m`	Configures HAProxy timeout for requests on that Route.
HSTS header	`haproxy.router.openshift.io/hsts_header`	e.g. `max-age=31536000;includeSubDomains;preload`	Injects HSTS header on responses (edge/re-encrypt).

Note: Advanced features like weighted backends/canary or wildcard hosts are not expressible via standard Ingress. Use a Route directly for those.

Example: Ingress with OpenShift router annotations

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress-https
  annotations:
    route.openshift.io/termination: edge
    route.openshift.io/insecureEdgeTerminationPolicy: Redirect
    haproxy.router.openshift.io/balance: leastconn
    haproxy.router.openshift.io/timeout: 60s
    haproxy.router.openshift.io/hsts_header: max-age=31536000;includeSubDomains;preload
spec:
  rules:
  - host: www.example.com
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: test-service
            port:
              number: 80

This Ingress will be realized as a Route with edge TLS and an automatic HTTP→HTTPS redirect, using least connections balancing and a 60s route timeout. The HSTS header will be added by the router on HTTPS responses.

Limitations of Using Ingress to Generate Routes
While convenient, using Ingress to generate Routes has limitations:

Missing advanced features: Weighted backends and sticky sessions require Route-specific annotations and are not supported via Ingress.
TLS passthrough and re-encrypt modes: These require OpenShift-specific annotations on Routes and are not supported through standard Ingress.
Ingress without host: An Ingress without a hostname will not create a Route; Routes require a host.
Wildcard hosts: Wildcard hosts (e.g., *.example.com) are only supported via Routes, not Ingress.
Annotation compatibility: Some OpenShift Route annotations do not have equivalents in Ingress, leading to configuration gaps.
Protocol support: Ingress supports only HTTP/HTTPS protocols, while Routes can handle non-HTTP protocols with passthrough TLS.
Config drift risk: Because Routes created from Ingress are managed by OpenShift, manual edits to the generated Route may be overwritten or cause inconsistencies.

These limitations mean that for advanced routing configurations or OpenShift-specific features, using Routes directly is preferable.

When to Use Ingress vs. When to Use Routes
Choosing between Ingress and Routes depends on your requirements:

Use Ingress if:
You want portability across Kubernetes platforms.
You have existing Ingress manifests and want to minimize changes.
Your application uses only basic HTTP or HTTPS routing.
You prefer platform-neutral manifests for CI/CD pipelines.
Use Routes if:
You need advanced routing features like weighted backends, sticky sessions, or multiple TLS termination modes.
Your deployment is OpenShift-specific and can leverage OpenShift-native features.
You require stability and full support for OpenShift routing capabilities.
You need to expose non-HTTP protocols or use TLS passthrough/re-encrypt modes.
You want to use wildcard hosts or custom annotations not supported by Ingress.

In many cases, teams use a combination: Ingress for portability and Routes for advanced or OpenShift-specific needs.

Conclusion

On OpenShift, Kubernetes Ingress resources are automatically converted into Routes, enabling basic external service exposure with minimal effort. This allows users to leverage existing Kubernetes manifests and maintain portability. However, for advanced routing scenarios and to fully utilize OpenShift’s powerful Router features, using Routes directly is recommended.

Both Ingress and Routes coexist seamlessly on OpenShift, allowing you to choose the right tool for your application’s requirements.

Talos: A Modern Kubernetes-Optimized Linux Distribution

2026-01-262025-07-13 by Alexandre Vazquez

Introduction

Managing a Kubernetes cluster can quickly become overwhelming, particularly when your operating system adds unnecessary complexity. Enter Talos Linux—a groundbreaking, container-optimized, immutable OS explicitly designed for Kubernetes environments. It’s API-driven, completely secure, and strips away traditional management methods, including SSH and package managers.

Talos Linux revolutionizes node management by drastically simplifying operations and enhancing security. In this deep dive, we’ll explore why Talos is capturing attention, its core architecture, and the practical implications for Kubernetes teams.

What is Talos Linux?

Talos Linux is a specialized open-source Linux distribution meticulously crafted to run Kubernetes securely and efficiently. Unlike general-purpose operating systems, Talos discards all irrelevant features and focuses exclusively on Kubernetes, ensuring:

Immutable Design: Changes are handled through atomic upgrades rather than manual interventions.
API-Driven Management: Administrators use talosctl, a CLI that interacts securely with nodes through a gRPC API.
Security by Default: No SSH access, comprehensive kernel hardening, TPM integration, disk encryption, and secure boot features.
Minimal and Predictable: Talos minimizes resource usage and reduces operational overhead by eliminating unnecessary services and processes.

Maintainers and Backing

Talos is maintained by Sidero Labs, renowned for their expertise in Kubernetes tooling and bare-metal provisioning. The active, open-source community of cloud-native engineers and SREs continuously contribute to its growth and evolution.

Talos Architecture Deep Dive

Talos Linux employs a radical design that prioritizes security, simplicity, and performance:

API-Only Interaction: There is no traditional shell access, eliminating many common vulnerabilities associated with SSH.
Atomic Upgrades: System updates are atomic—new versions boot directly into a stable, validated state.
Resource Efficiency: Talos’s stripped-down design reduces its footprint significantly, ensuring optimum resource utilization and faster startup times.
Enhanced Security Measures: It incorporates kernel-level protections, secure boot, disk encryption, and TPM-based security, aligning with stringent compliance requirements.

Kubernetes Distribution based on Talos Linux

Sidero Labs also offers a complete Kubernetes distribution built directly upon Talos Linux, known as “Talos Kubernetes.” This streamlined distribution combines the benefits of Talos Linux with pre-configured Kubernetes components, making it easier and faster to deploy highly secure, production-ready Kubernetes clusters. This simplifies cluster management further by reducing the overhead and complexity typically associated with installing and maintaining Kubernetes separately.

Real-World Use-Cases

Talos shines particularly well in scenarios demanding heightened security, predictability, and streamlined operations:

Security-Conscious Clusters: Zero-trust architectures greatly benefit from Talos’s immutable and restricted-access model.
Edge Computing and IoT: Its minimal resource consumption and robust management via API make it ideal for edge deployments, where remote management is essential.
CI/CD and GitOps Pipelines: The declarative configuration, compatible with YAML and GitOps methodologies, enables automated and reproducible Kubernetes environments.

How to Download and Try Talos Linux

Talos Linux is easy to test and evaluate. You can download it directly from the official Talos GitHub releases. Sidero Labs provides comprehensive documentation and straightforward quick-start guides for deploying Talos Linux on various platforms, including bare-metal servers, virtual machines, and cloud environments such as AWS, Azure, and GCP. For a quick test-drive, running it within a local virtual machine or container is a convenient option.

Talos Compared to Traditional OS Choices

Talos presents distinct advantages compared to more familiar options like Ubuntu, CoreOS, or Flatcar:

Feature	Talos	Ubuntu	Flatcar
SSH Access	❌	✅	✅
Package Manager	❌	✅ (apt)	✅ (rpm)
Kubernetes Native	✅ Built-in	❌	✅ (via tools)
Security Defaults	🔒 High	Moderate	High
Immutable OS	✅	❌	✅
Resource Efficiency	✅ High	Moderate	High
API-driven Management	✅	❌	Limited

What You Cannot Do with Talos Linux

Talos Linux’s specialized design intentionally restricts certain traditional operating system functionalities. Notably:

No SSH Access: Direct shell access to nodes is disabled. All interactions must occur through talosctl.
No Package Managers: Traditional tools like apt, yum, or similar are absent; changes are done through immutable updates.
No Additional Applications: It doesn’t support running additional, non-Kubernetes services or workloads directly on Talos nodes.

These restrictions enforce best practices, significantly enhance security, and ensure a predictable, consistent operational environment.

Conclusion

Talos Linux represents a substantial shift in Kubernetes node management—secure, lean, and entirely Kubernetes-focused. For organizations prioritizing security, compliance, operational simplicity, and efficiency, Talos provides a robust and future-ready foundation.

If your Kubernetes strategy values minimalism, security, and simplicity, Talos Linux offers compelling reasons to consider adoption.

What is Talos Linux?

Talos Linux is a minimal, immutable Linux distribution designed specifically to run Kubernetes. It offers a declarative API for management and focuses on security and consistency.

What are the main advantages of Talos Linux?

Talos Linux provides an immutable system with atomic updates, removes traditional SSH access, and maintains a minimal attack surface, making it ideal for production Kubernetes environments.

How do I install Talos Linux?

To install Talos Linux, download the appropriate image, prepare a machine configuration file, and boot your node from the image. Then use the Talos CLI to bootstrap your Kubernetes cluster.

How does Talos Linux differ from other distributions like CoreOS?

Unlike CoreOS, Talos Linux focuses exclusively on Kubernetes, offering an immutable OS managed entirely via API. CoreOS has been discontinued, whereas Talos Linux is actively maintained and supported.

Is Talos Linux suitable for production use?

Yes. Talos Linux is optimized for running Kubernetes in production, provides advanced security features, and has both active community and commercial support options.

—

References
– Talos Documentation
– Sidero Labs
– Talos GitHub Repository

Helm v3.17 Take Ownership Flag: Fix Release Conflicts

2026-01-272025-06-12 by Alexandre Vazquez

Helm has long been the standard for managing Kubernetes applications using packaged charts, bringing a level of reproducibility and automation to the deployment process. However, some operational tasks, such as renaming a release or migrating objects between charts, have traditionally required cumbersome workarounds. With the introduction of the --take-ownership flag in Helm v3.17 (released in January 2025), a long-standing pain point is finally addressed—at least partially.

The take-ownership feature represents the continuing evolution of Helm. Learn about this and other cutting-edge capabilities in our Helm Charts Package Management Guide

In this post, we will explore:

What the --take-ownership flag does
Why it was needed
The caveats and limitations
Real-world use cases where it helps
When not to use it

Understanding Helm Release Ownership and Object Management

When Helm installs or upgrades a chart, it injects metadata—labels and annotations—into every managed Kubernetes object. These include:

app.kubernetes.io/managed-by: Helm
meta.helm.sh/release-name: my-release
meta.helm.sh/release-namespace: default

This metadata serves an important role: Helm uses it to track and manage resources associated with each release. As a safeguard, Helm does not allow another release to modify objects it does not own and when you trying that you will see messages like the one below:

Error: Unable to continue with install: Service "provisioner-agent" in namespace "test-my-ns" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "dp-core-infrastructure11": current value is "dp-core-infrastructure"

While this protects users from accidental overwrites, it creates limitations for advanced use cases.

Why `--take-ownership` Was Needed

Let’s say you want to:

Rename an existing Helm release from api-v1 to api.
Move a ConfigMap or Service from one chart to another.
Rebuild state during GitOps reconciliation when previous Helm metadata has drifted.

Previously, your only option was to:

Uninstall the existing release.
Reinstall under the new name.

This approach introduces downtime, and in production systems, that’s often not acceptable.

What the Flag Does

helm upgrade my-release ./my-chart --take-ownership

When this flag is passed, Helm will:

Skip the ownership validation for existing objects.
Override the labels and annotations to associate the object with the current release.

In practice, this allows you to claim ownership of resources that previously belonged to another release, enabling seamless handovers.

⚠️ What It Doesn’t Do

This flag does not:

Clean up references from the previous release.
Protect you from future uninstalls of the original release (which might still remove shared resources).
Allow you to adopt completely unmanaged Kubernetes resources (those not initially created by Helm).

In short, it’s a mechanism for bypassing Helm’s ownership checks, not a full lifecycle manager.

Real-World Helm Take Ownership Use Cases

Let’s go through common scenarios where this feature is useful.

✅ 1. Renaming a Release Without Downtime

Before:

helm uninstall old-name
helm install new-name ./chart

Now:

helm upgrade new-name ./chart --take-ownership

✅ 2. Migrating Objects Between Charts

You’re refactoring a large chart into smaller, modular ones and need to reassign certain Service or Secret objects.

This flag allows the new release to take control of the object without deleting or recreating it.

✅ 3. GitOps Drift Reconciliation

If objects were deployed out-of-band or their metadata changed unintentionally, GitOps tooling using Helm can recover without manual intervention using --take-ownership.

Best Practices and Recommendations

Use this flag intentionally, and document where it’s applied.
If possible, remove the previous release after migration to avoid confusion.
Monitor Helm’s behavior closely when managing shared objects.
For non-Helm-managed resources, continue to use kubectl annotate or kubectl label to manually align metadata.

Conclusion

The --take-ownership flag is a welcomed addition to Helm’s CLI arsenal. While not a universal solution, it smooths over many of the rough edges developers and SREs face during release evolution and GitOps adoption.

It brings a subtle but powerful improvement—especially in complex environments where resource ownership isn’t static.

Stay updated with Helm releases, and consider this flag your new ally in advanced release engineering.

Frequently Asked Questions

What does the Helm –take-ownership flag do?

The --take-ownership flag allows Helm to bypass ownership validation and claim control of Kubernetes resources that belong to another release. It updates the meta.helm.sh/release-name annotation to associate objects with the current release, enabling zero-downtime release renames and chart migrations.

When should I use Helm take ownership?

Use --take-ownership when renaming releases without downtime, migrating objects between charts, or fixing GitOps drift. It’s ideal for production environments where uninstall/reinstall cycles aren’t acceptable. Always document usage and clean up previous releases afterward.

What are the limitations of Helm take ownership?

The flag doesn’t clean up references from previous releases or protect against future uninstalls of the original release. It only works with Helm-managed resources, not completely unmanaged Kubernetes objects. Manual cleanup of old releases is still required.

Is Helm take ownership safe for production use?

Yes, but use it intentionally and carefully. The flag bypasses Helm’s safety checks, so ensure you understand the ownership implications. Test in staging first, document all usage, and monitor for conflicts. Remove old releases after successful migration to avoid confusion.

Which Helm version introduced the take ownership flag?

The --take-ownership flag was introduced in Helm v3.17, released in January 2025. This feature addresses long-standing pain points with release renaming and chart migrations that previously required downtime-inducing uninstall/reinstall cycles.