Helm Chart Testing in Production: Layers, Tools, and a Minimum CI Pipeline


When a Helm chart fails in production, the impact is immediate and visible. A misconfigured ServiceAccount, a typo in a ConfigMap key, or an untested conditional in templates can trigger incidents that cascade through your entire deployment pipeline. The irony is that most teams invest heavily in testing application code while treating Helm charts as “just configuration.”

Chart testing is fundamental for production-quality Helm deployments. For comprehensive coverage of testing along with all other Helm best practices, visit our complete Helm guide.

Helm charts are infrastructure code. They define how your applications run, scale, and integrate with the cluster. Treating them with less rigor than your application logic is a risk most production environments cannot afford.

The Real Cost of Untested Charts

In late 2024, a medium-sized SaaS company experienced a 4-hour outage because a chart update introduced a breaking change in RBAC permissions. The chart had been tested locally with helm install --dry-run, but the dry-run validation doesn’t interact with the API server’s RBAC layer. The deployment succeeded syntactically but failed operationally.

The incident revealed three gaps in their workflow:

  1. No schema validation against the target Kubernetes version
  2. No integration tests in a live cluster
  3. No policy enforcement for security baselines

These gaps are common. According to a 2024 CNCF survey on GitOps practices, fewer than 40% of organizations systematically test Helm charts before production deployment.

The problem is not a lack of tools—it’s understanding which layer each tool addresses.

Testing Layers: What Each Level Validates

Helm chart testing is not a single operation. It requires validation at multiple layers, each catching different classes of errors.

Layer 1: Syntax and Structure Validation

What it catches: Malformed YAML, invalid chart structure, missing required fields

Tools:

  • helm lint: Built-in, minimal validation following Helm best practices
  • yamllint: Strict YAML formatting rules

Example failure caught:

# Invalid indentation breaks the chart
resources:
  limits:
      cpu: "500m"
    memory: "512Mi"  # Incorrect indentation

Limitation: Does not validate whether the rendered manifests are valid Kubernetes objects.

Layer 2: Schema Validation

What it catches: Manifests that would be rejected by the Kubernetes API

Primary tool: kubeconform

Kubeconform is the actively maintained successor to the deprecated kubeval. It validates against OpenAPI schemas for specific Kubernetes versions and can include custom CRDs.

Project Profile:

  • Maintenance: Active, community-driven
  • Strengths: CRD support, multi-version validation, fast execution
  • Why it matters: helm lint validates chart structure, but not whether rendered manifests match Kubernetes schemas

Example failure caught:

apiVersion: apps/v1
kind: Deployment
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
      - name: app
        image: nginx:latest
# Missing required field: spec.selector

Configuration example:

helm template my-chart . | kubeconform \
  -kubernetes-version 1.30.0 \
  -schema-location default \
  -schema-location 'https://raw.githubusercontent.com/datreeio/CRDs-catalog/main/{{.Group}}/{{.ResourceKind}}_{{.ResourceAPIVersion}}.json' \
  -summary

Example CI integration:

#!/bin/bash
set -e

KUBE_VERSION="1.30.0"

echo "Rendering chart..."
helm template my-release ./charts/my-chart > manifests.yaml

echo "Validating against Kubernetes $KUBE_VERSION..."
kubeconform \
  -kubernetes-version "$KUBE_VERSION" \
  -schema-location default \
  -summary \
  -output json \
  manifests.yaml | jq -e '.summary.invalid == 0'

Alternative: kubectl apply --dry-run=server (requires cluster access, validates against the actual API server)
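A minimal sketch of that alternative, reusing the hypothetical chart path from the CI example above (it requires a kubeconfig pointing at a representative cluster):

helm template my-release ./charts/my-chart | kubectl apply --dry-run=server -f -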

Layer 3: Unit Testing

What it catches: Logic errors in templates, incorrect conditionals, wrong value interpolation

Unit tests validate that given a set of input values, the chart produces the expected manifests. This is where template logic is verified before reaching a cluster.

Primary tool: helm-unittest

helm-unittest is the most widely adopted unit testing framework for Helm charts.

Project Profile:

  • GitHub: 3.3k+ stars, ~100 contributors
  • Maintenance: Active (releases every 2-3 months)
  • Primary maintainer: community maintainers (originally a personal project, now hosted under the helm-unittest GitHub organization)
  • Commercial backing: None
  • Bus factor risk: Medium-High (no institutional backing, but consistent community engagement)

Strengths:

  • Fast execution (no cluster required)
  • Familiar test syntax (similar to Jest/Mocha)
  • Snapshot testing support
  • Good documentation

Limitations:

  • Doesn’t validate runtime behavior
  • Cannot test interactions with admission controllers
  • No validation against actual Kubernetes API

Example test scenario:

# tests/deployment_test.yaml
suite: test deployment
templates:
  - deployment.yaml
tests:
  - it: should set resource limits when provided
    set:
      resources.limits.cpu: "1000m"
      resources.limits.memory: "1Gi"
    asserts:
      - equal:
          path: spec.template.spec.containers[0].resources.limits.cpu
          value: "1000m"
      - equal:
          path: spec.template.spec.containers[0].resources.limits.memory
          value: "1Gi"

  - it: should not create HPA when autoscaling disabled
    set:
      autoscaling.enabled: false
    template: hpa.yaml
    asserts:
      - hasDocuments:
          count: 0
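To run a suite like this, the plugin is installed once and then invoked per chart. A minimal sketch, assuming the tests live in the chart's tests/ directory (the plugin's default location):

helm plugin install https://github.com/helm-unittest/helm-unittest.git
helm unittest ./charts/my-chart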

Alternative: Terratest (Helm module)

Terratest is a Go-based testing framework from Gruntwork that includes first-class Helm support. Unlike helm-unittest, Terratest deploys charts to real clusters and allows programmatic assertions in Go.

Example Terratest test:

package test

import (
    "testing"
    "time"

    "github.com/gruntwork-io/terratest/modules/helm"
    "github.com/gruntwork-io/terratest/modules/k8s"
    metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)

func TestHelmChartDeployment(t *testing.T) {
    kubectlOptions := k8s.NewKubectlOptions("", "", "default")
    options := &helm.Options{
        KubectlOptions: kubectlOptions,
        SetValues: map[string]string{
            "replicaCount": "3",
        },
    }

    // Clean up the release when the test finishes.
    defer helm.Delete(t, options, "my-release", true)
    helm.Install(t, options, "../charts/my-chart", "my-release")

    // Wait until the chart's pods are actually created.
    k8s.WaitUntilNumPodsCreated(t, kubectlOptions, metav1.ListOptions{
        LabelSelector: "app=my-app",
    }, 3, 30, 10*time.Second)
}

When to use Terratest vs helm-unittest:

  • Use helm-unittest for fast, template-focused validation in CI
  • Use Terratest when you need full integration testing with Go flexibility

Layer 4: Integration Testing

What it catches: Runtime failures, resource conflicts, actual Kubernetes behavior

Integration tests deploy the chart to a real (or ephemeral) cluster and verify it works end-to-end.

Primary tool: chart-testing (ct)

chart-testing is the official Helm project for testing charts in live clusters.

Project Profile:

  • Ownership: Official Helm project (CNCF)
  • Maintainers: Helm team (contributors from Microsoft, IBM, Google)
  • Governance: CNCF-backed with public roadmap
  • LTS: Aligned with Helm release cycle
  • Bus factor risk: Low (institutional backing from CNCF provides strong long-term guarantees)

Strengths:

  • De facto standard for public Helm charts
  • Built-in upgrade testing (validates migrations)
  • Detects which charts changed in a PR (efficient for monorepos)
  • Integration with GitHub Actions via official action

Limitations:

  • Requires a live Kubernetes cluster
  • Initial setup more complex than unit testing
  • Does not include security scanning

What ct validates:

  • Chart installs successfully
  • Upgrades work without breaking state
  • Linting passes
  • Version constraints are respected

Example ct configuration:

# ct.yaml
target-branch: main
chart-dirs:
  - charts
chart-repos:
  - bitnami=https://charts.bitnami.com/bitnami
helm-extra-args: --timeout 600s
check-version-increment: true

Typical GitHub Actions workflow:

name: Lint and Test Charts

on: pull_request

jobs:
  lint-test:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
        uses: actions/checkout@v3
        with:
          fetch-depth: 0

      - name: Set up Helm
        uses: azure/setup-helm@v3

      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.11'

      - name: Set up chart-testing
        uses: helm/chart-testing-action@v2

      - name: Run chart-testing (lint)
        run: ct lint --config ct.yaml

      - name: Create kind cluster
        uses: helm/kind-action@v1

      - name: Run chart-testing (install)
        run: ct install --config ct.yaml

When ct is essential:

  • Public chart repositories (expected by community)
  • Charts with complex upgrade paths
  • Multi-chart repositories with CI optimization needs

Layer 5: Security and Policy Validation

What it catches: Security misconfigurations, policy violations, compliance issues

This layer prevents deploying charts that pass functional tests but violate organizational security baselines or contain vulnerabilities.

Policy Enforcement: Conftest (Open Policy Agent)

Conftest is the CLI interface to Open Policy Agent for policy-as-code validation.

Project Profile:

  • Parent: Open Policy Agent (CNCF Graduated Project)
  • Governance: Strong CNCF backing, multi-vendor support
  • Production adoption: Netflix, Pinterest, Goldman Sachs
  • Bus factor risk: Low (graduated CNCF project with multi-vendor backing)

Strengths:

  • Policies written in Rego (reusable, composable)
  • Works with any YAML/JSON input (not Helm-specific)
  • Can enforce organizational standards programmatically
  • Integration with admission controllers (Gatekeeper)

Limitations:

  • Rego has a learning curve
  • Does not replace functional testing

Example Conftest policy:

# policy/security.rego
package main

import future.keywords.contains
import future.keywords.if
import future.keywords.in

deny contains msg if {
  input.kind == "Deployment"
  some container in input.spec.template.spec.containers
  not container.resources.limits.memory
  msg := sprintf("Container '%s' must define memory limits", [container.name])
}

deny contains msg if {
  input.kind == "Deployment"
  some container in input.spec.template.spec.containers
  not container.resources.limits.cpu
  msg := sprintf("Container '%s' must define CPU limits", [container.name])
}

Running the validation:

helm template my-chart . | conftest test -p policy/ -

Alternative: Kyverno

Kyverno offers policy enforcement using native Kubernetes manifests instead of Rego. Policies are written in YAML and can validate, mutate, or generate resources.

Example Kyverno policy:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-resource-limits
spec:
  validationFailureAction: Enforce
  rules:
  - name: check-container-limits
    match:
      resources:
        kinds:
        - Pod
    validate:
      message: "All containers must have CPU and memory limits"
      pattern:
        spec:
          containers:
          - resources:
              limits:
                memory: "?*"
                cpu: "?*"

Conftest vs Kyverno:

  • Conftest: Policies run in CI, flexible for any YAML
  • Kyverno: Runtime enforcement in-cluster, Kubernetes-native

Both can coexist: Conftest in CI for early feedback, Kyverno in cluster for runtime enforcement.
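For the CI half of that split, Kyverno policies can be evaluated against rendered manifests with the Kyverno CLI before anything reaches the cluster. A minimal sketch, assuming the policy above is saved as require-resource-limits.yaml:

helm template my-chart . > manifests.yaml
kyverno apply require-resource-limits.yaml --resource manifests.yaml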

Vulnerability Scanning: Trivy

Trivy by Aqua Security provides comprehensive security scanning for Helm charts.

Project Profile:

  • Maintainer: Aqua Security (commercial backing with open-source core)
  • Scope: Vulnerability scanning + misconfiguration detection
  • Helm integration: Helm charts are scanned with the trivy config misconfiguration scanner
  • Bus factor risk: Low (commercial backing + strong open-source adoption)

What Trivy scans in Helm charts:

  1. Vulnerabilities in referenced container images
  2. Misconfigurations (similar to Conftest but pre-built rules)
  3. Secrets accidentally committed in templates

Example scan:

trivy config ./charts/my-chart --severity HIGH,CRITICAL --exit-code 1

Sample output:

myapp/templates/deployment.yaml (helm)
====================================

Tests: 12 (SUCCESSES: 10, FAILURES: 2)
Failures: 2 (HIGH: 1, CRITICAL: 1)

HIGH: Container 'app' of Deployment 'myapp' should set 'securityContext.runAsNonRoot' to true
════════════════════════════════════════════════════════════════════════════════════════════════
Ensure containers run as non-root users

See https://kubernetes.io/docs/concepts/security/pod-security-standards/
────────────────────────────────────────────────────────────────────────────────────────────────
 myapp/templates/deployment.yaml:42

Commercial support:
Aqua Security offers Trivy Enterprise with advanced features (centralized scanning, compliance reporting). For most teams, the open-source version is sufficient.

Other Security Tools

Polaris (Fairwinds)

Polaris scores charts based on security and reliability best practices. Unlike enforcement tools, it provides a health score and actionable recommendations.

Use case: Dashboard for chart quality across a platform
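A quick way to score a rendered chart from the command line, sketched with the Polaris CLI (flag names reflect recent Polaris releases; verify them against polaris audit --help for your installed version):

helm template my-release ./charts/my-chart > manifests.yaml
polaris audit --audit-path manifests.yaml --format pretty --set-exit-code-below-score 80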

Checkov (Bridgecrew/Palo Alto)

Similar to Trivy but with a broader IaC focus (Terraform, CloudFormation, Kubernetes, Helm). Pre-built policies for compliance frameworks (CIS, PCI-DSS).

When to use Checkov:

  • Multi-IaC environment (not just Helm)
  • Compliance-driven validation requirements
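A minimal sketch of scanning a chart with Checkov, assuming the same chart directory used elsewhere in this article:

checkov -d ./charts/my-chart --framework helm --compact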

Enterprise Selection Criteria

Bus Factor and Long-Term Viability

For production infrastructure, tool sustainability matters as much as features. Community support channels like Helm CNCF Slack (#helm-users, #helm-dev) and CNCF TAG Security provide valuable insights into which projects have active maintainer communities.

Questions to ask:

  • Is the project backed by a foundation (CNCF, Linux Foundation)?
  • Are multiple companies contributing?
  • Is the project used in production by recognizable organizations?
  • Is there a public roadmap?

Risk Classification:

Tool            Governance       Bus factor risk   Notes
chart-testing   CNCF             Low               Helm official project
Conftest/OPA    CNCF Graduated   Low               Multi-vendor backing
Trivy           Aqua Security    Low               Commercial backing + OSS
kubeconform     Community        Medium            Active, but single maintainer
helm-unittest   Community        Medium-High       No institutional backing
Polaris         Fairwinds        Medium            Company-sponsored OSS

Kubernetes Version Compatibility

Tools must explicitly support the Kubernetes versions you run in production.

Red flags:

  • No documented compatibility matrix
  • Hard-coded dependencies on old K8s versions
  • No testing against multiple K8s versions in CI

Example compatibility check:

# Does the tool support your K8s version?
kubeconform --help | grep -A5 "kubernetes-version"

For tools like ct, always verify they test against a matrix of Kubernetes versions in their own CI.

Commercial Support Options

When commercial support matters:

  • Regulatory compliance requirements (SOC2, HIPAA, etc.)
  • Limited internal expertise
  • SLA-driven operations

Available options:

  • Trivy: Aqua Security offers Trivy Enterprise
  • OPA/Conftest: Styra provides OPA Enterprise
  • Terratest: Gruntwork offers consulting and premium modules

Most teams don’t need commercial support for chart testing specifically, but it’s valuable in regulated industries where audits require vendor SLAs.

Security Scanner Integration

For enterprise pipelines, chart testing tools should integrate cleanly with:

  • SIEM/SOAR platforms
  • CI/CD notification systems
  • Security dashboards (e.g., Grafana, Datadog)

Required features:

  • Structured output formats (JSON, SARIF)
  • Exit codes for CI failure
  • Support for custom policies
  • Webhook or API for event streaming

Example: Integrating Trivy with SIEM

# .github/workflows/security.yaml
- name: Run Trivy scan
  run: trivy config ./charts --format json --output trivy-results.json

- name: Send to SIEM
  run: |
    curl -X POST https://siem.company.com/api/events \
      -H "Content-Type: application/json" \
      -d @trivy-results.json

Testing Pipeline Architecture

A production-grade Helm chart pipeline combines multiple layers, ordered so that the cheapest checks run (and fail) first.
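As a minimal sketch, assuming the chart lives at charts/my-chart and reusing the ct.yaml from earlier, the stages chain together like this:

#!/bin/bash
set -euo pipefail

CHART="charts/my-chart"
KUBE_VERSION="1.30.0"

# Layer 1: syntax and structure
helm lint "$CHART"
yamllint "$CHART"

# Layer 2: schema validation against the target Kubernetes version
helm template my-release "$CHART" | kubeconform -kubernetes-version "$KUBE_VERSION" -summary

# Layer 3: unit tests (fast, no cluster required)
helm unittest "$CHART"

# Layer 5: security baseline (fails the build on HIGH/CRITICAL findings)
trivy config "$CHART" --exit-code 1 --severity CRITICAL,HIGH

# Layer 4: integration test against an ephemeral cluster (e.g. kind), runs last
ct install --config ct.yaml --charts "$CHART"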

Pipeline efficiency principles:

  1. Fail fast: syntax and schema errors should never reach integration tests
  2. Parallel execution where possible (unit tests + security scans)
  3. Cache ephemeral cluster images to reduce setup time
  4. Skip unchanged charts (ct built-in change detection)

Decision Matrix: When to Use What

Scenario 1: Small Team / Early-Stage Startup

Requirements: Minimal overhead, fast iteration, reasonable safety

Recommended Stack:

Linting:      helm lint + yamllint
Validation:   kubeconform
Security:     trivy config

Optional: helm-unittest (if template logic becomes complex)

Rationale: A low-overhead baseline that catches roughly 80% of issues without operational complexity.

Scenario 2: Enterprise with Compliance Requirements

Requirements: Auditable, comprehensive validation, commercial support available

Recommended Stack:

Linting:      helm lint + yamllint
Validation:   kubeconform
Unit Tests:   helm-unittest
Security:     Trivy Enterprise + Conftest (custom policies)
Integration:  chart-testing (ct)
Runtime:      Kyverno (admission control)

Optional: Terratest for complex upgrade scenarios

Rationale: Multi-layer defense with both pre-deployment and runtime enforcement. Commercial support available for security components.

Scenario 3: Multi-Tenant Internal Platform

Requirements: Prevent bad charts from affecting other tenants, enforce standards at scale

Recommended Stack:

CI Pipeline:
  • helm lint → kubeconform → helm-unittest → ct
  • Conftest (enforce resource quotas, namespaces, network policies)
  • Trivy (block critical vulnerabilities)

Runtime:
  • Kyverno or Gatekeeper (enforce policies at admission)
  • ResourceQuotas per namespace
  • NetworkPolicies by default

Additional tooling:

  • Polaris dashboard for chart quality scoring
  • Custom admission webhooks for platform-specific rules

Rationale: Multi-tenant environments cannot tolerate “soft” validation. Runtime enforcement is mandatory.

Scenario 4: Open Source Public Charts

Requirements: Community trust, transparent testing, broad compatibility

Recommended Stack:

Must-have:
  • chart-testing (expected standard)
  • Public CI (GitHub Actions with full logs)
  • Test against multiple K8s versions

Nice-to-have:
  • helm-unittest with high coverage
  • Automated changelog generation
  • Example values for common scenarios

Rationale: Public charts are judged by testing transparency. Missing ct is a red flag for potential users.

The Minimum Viable Testing Stack

For any environment deploying Helm charts to production, this is the baseline:

Layer 1: Pre-Commit (Developer Laptop)

helm lint charts/my-chart
yamllint charts/my-chart

Layer 2: CI Pipeline (Automated on PR)

# Fast validation
helm template my-chart ./charts/my-chart | kubeconform \
  -kubernetes-version 1.30.0 \
  -summary

# Security baseline
trivy config ./charts/my-chart --exit-code 1 --severity CRITICAL,HIGH

Layer 3: Pre-Production (Staging Environment)

# Integration test with real cluster
ct install --config ct.yaml --charts charts/my-chart

Time investment:

  • Initial setup: 4-8 hours
  • Per-PR overhead: 3-5 minutes
  • Maintenance: ~1 hour/month

ROI calculation:

Average production incident caused by untested chart:

  • Detection: 15 minutes
  • Triage: 30 minutes
  • Rollback: 20 minutes
  • Post-mortem: 1 hour
  • Total: ~2 hours of engineering time

If chart testing prevents even one incident per quarter, it pays for itself in the first month.

Common Anti-Patterns to Avoid

Anti-Pattern 1: Only using --dry-run

helm install --dry-run validates syntax but skips:

  • Admission controller logic
  • RBAC validation
  • Actual resource creation

Better: Combine dry-run with kubeconform and at least one integration test.

Anti-Pattern 2: Testing only in production-like clusters

“We test in staging, which is identical to production.”

Problem: Staging clusters rarely match production exactly (node counts, storage classes, network policies). Integration tests should run in isolated, ephemeral environments.

Anti-Pattern 3: Security scanning without enforcement

Running trivy config on your charts without failing the build on critical findings is theater.

Better: Set --exit-code 1 and enforce in CI.

Anti-Pattern 4: Ignoring upgrade paths

Most chart failures happen during upgrades, not initial installs. Chart-testing addresses this with ct install --upgrade.
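A minimal invocation, reusing the ct.yaml from earlier; it installs the chart's previous revision and then upgrades in place to the version in the PR (when the version bump is non-breaking per SemVer):

ct install --config ct.yaml --upgrade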

Conclusion: Testing is Infrastructure Maturity

The gap between teams that test Helm charts and those that don’t is not about tooling availability—it’s about treating infrastructure code with the same discipline as application code.

The cost of testing is measured in minutes per PR. The cost of not testing is measured in hours of production incidents, eroded trust in automation, and teams reverting to manual deployments because “Helm is too risky.”

The testing stack you choose matters less than the fact that you have one. Start with the minimal viable stack (lint + schema + security), run it consistently, and expand as your charts become more complex.

By implementing a structured testing pipeline, you catch 95% of chart issues before they reach production. The remaining 5% are edge cases that require production observability, not more testing layers.

Helm chart testing is not about achieving perfection—it’s about eliminating the preventable failures that undermine confidence in your deployment pipeline.

Kubeconform Explained: Validate Kubernetes Manifests and Prevent API Errors


The Kubernetes API changes quite a lot: every new version adds new capabilities while deprecating old ones, so it is in constant evolution, as we have already covered in previous articles, for example regarding Autoscaling v2 and Vertical Autoscaling.

Some of these changes involve a shift in the apiVersion of certain objects, and you have probably already suffered a v1alpha1 moving to v1beta1, or to a final v1 that deprecates the previous one. So, in the end, it is crucial to ensure that your manifests are in sync with the target version you are deploying to, and some tools can help us with that, including Kubeconform.

What is Kubeconform?

Kubeconform is a powerful utility designed to assist in Kubernetes configuration management and validation. As Kubernetes continues to gain popularity as the go-to container orchestration platform, ensuring the correctness and consistency of configuration files becomes crucial. Kubeconform addresses this need by providing a comprehensive toolset to validate Kubernetes configuration files against predefined standards or custom rules.

Kubeconform supports multiple versions of Kubernetes, allowing you to validate configuration files against different API versions. This flexibility is beneficial when working with clusters running different Kubernetes versions or when migrating applications across clusters with varying configurations.

Another great feature of Kubeconform is its ability to enforce best practices and standards across Kubernetes configurations. It allows you to define rules, such as enforcing proper labels, resource limits, or security policies, and then validates your configuration files against these rules. This helps catch potential issues early on and ensures that your deployments comply with established guidelines.

How to install Kubeconform?

Kubeconform can be installed from different sources: the usual package managers for your environment, such as brew, apt, or similar, or by downloading the binaries from its GitHub releases page: https://github.com/yannh/kubeconform/releases.
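For example, two common installation paths (which package manager applies depends on your platform):

brew install kubeconform
go install github.com/yannh/kubeconform/cmd/kubeconform@latest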


How to launch Kubeconform from the Command Line?

Kubeconform ships as a small binary meant to be run from the command line, and it keeps its interface minimal to ensure compatibility. It takes as arguments the files or folders containing the manifests you want to check.

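For example (the paths here are placeholders for your own manifests):

kubeconform deployment.yaml service.yaml
kubeconform -summary ./manifests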

Then you have several options to do other things, such as the ones shown below:

-ignore-filename-pattern value
      regular expression specifying paths to ignore (can be specified multiple times)
-ignore-missing-schemas
      skip files with missing schemas instead of failing
-kubernetes-version string
      version of Kubernetes to validate against, e.g.: 1.18.0 (default "master")
-output string
      output format - json, junit, pretty, tap, text (default "text")
-reject string
      comma-separated list of kinds or GVKs to reject
-skip string
      comma-separated list of kinds or GVKs to ignore
-strict
      disallow additional properties not in schema or duplicated keys
-summary
      print a summary at the end (ignored for junit output)

Use Cases of Kubeconform

There are different use cases where Kubeconform can play a useful role. One is Kubernetes upgrades: sometimes you need to ensure that your current manifests will still work in the new release the cluster will be upgraded to, and with this tool we can pull the YAML directly from the environment and validate that it is still compatible with the target version.
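As a sketch of that upgrade check, assuming a Deployment named my-app and a planned upgrade to Kubernetes 1.30, you can pull the live manifest and validate it against the target version:

kubectl get deployment my-app -o yaml | kubeconform -kubernetes-version 1.30.0 -summary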

Another notable aspect of Kubeconform is its seamless integration into existing CI/CD pipelines. You can easily incorporate kubeconform as a step in your pipeline to automatically validate Kubernetes configuration files before deploying them. By doing so, you can catch configuration errors early in the development process, reduce the risk of deployment failures, and maintain high configuration consistency.

In addition to its validation capabilities, kubeconform provides helpful feedback and suggestions for improving your Kubernetes configuration files. It highlights specific issues or deviations from the defined rules and offers guidance on addressing them. This simplifies the troubleshooting process and helps developers and administrators become more familiar with best practices and Kubernetes configuration standards.

Conclusion

Kubeconform is an invaluable utility for Kubernetes users who strive for reliable and consistent deployments. It empowers teams to maintain a high standard of configuration quality, reduces the likelihood of misconfigurations, and improves the overall stability and security of Kubernetes-based applications.

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.

DevSecOps vs DevOps: Key Differences Explained by Answering 3 Core Questions


DevSecOps is a concept you have probably heard extensively over the last few months, usually mentioned alongside the traditional idea of DevOps. At some point this probably makes you wonder about a DevSecOps vs DevOps comparison: what are the main differences between them, or are they the same concept? With other ideas also appearing, such as Platform Engineering or Site Reliability Engineering, some confusion is building up in the field that I would like to clarify in this article.

What is DevSecOps?

DevSecOps is an extension of the DevOps concept and methodology. Now it is not just a joint effort between Development and Operations practices, but a joint effort among Development, Operations, and Security.

Diagram by GeekFlare: A DevSecOps Introduction (https://geekflare.com/devsecops-introduction/)

It implies introducing security policies, practices, and tools to ensure that the DevOps cycle provides security throughout the process. We have already commented on including security components to provide a more secure deployment process, and we even have specific articles about these tools, such as scanners, Docker registries, etc.

Why is DevSecOps important?

DevSecOps, or to be more explicit, including security practices as part of the DevOps process, is critical because we are moving to hybrid and cloud architectures where we incorporate new design, deployment, and development patterns such as containers, microservices, and so on.

This situation means that we are moving from hundreds of applications in the most complex cases to thousands of applications, and from dozens of servers to thousands of containers, each of them with different base images and third-party libraries that can be obsolete, contain a security hole, or be affected by newly disclosed vulnerabilities, as we have seen in the past with the Spring Framework or the Log4j library, to cite some of the most substantial recent global security issues companies have dealt with.

So even the most extensive security team cannot keep pace, checking manually or with a set of scripts, with all the new security challenges if we don’t include them as part of the overall process of developing and deploying the components. This is where the concept of shift-left security usually comes in, and we already covered that in this article you can read here.

DevSecOps vs DevOps: Is DevSecOps just updated DevOps?

So, based on the above definition, you might think: “Ok, so when somebody talks about DevOps, they are not thinking about security.” This is not true.

In the same way, when we talk about DevOps, we do not explicitly name every detailed step, such as software quality assurance, unit testing, and so on. As happens with many extensions in this industry, the original, global, or generic concept includes what its more specific offshoots make explicit as well.

So, in the end, DevOps and DevSecOps are the same thing, especially today, when companies and organizations are moving to cloud or hybrid environments where security is critical and non-negotiable. Hence, every task we do, from developing software to accessing any service, needs to be done with security in mind. Still, I use both terms in different scenarios: I will say DevSecOps when I want to explicitly highlight the security aspect because of the audience, the context, or the topic we are discussing.

In any generic context, though, DevOps will still include the security checks, because if it does not, it is simply incomplete.

Summary

So, in the end, when somebody speaks today about DevOps, it implicitly includes the security aspect, so there is no real difference between the two concepts. But you will see, and find it helpful to use, the specific term DevSecOps when you want to highlight or differentiate this part of the process.

How To Improve Your Kubernetes Workload Development Productivity


Telepresence is a way to reduce the time between writing your lines of code and having a cloud-native workload running.

Photo by Joey Kyber on Unsplash

We all know how cloud-native workloads and Kubernetes have changed how we do things. Many benefits come with containerization and orchestration platforms such as Kubernetes, and we have discussed them at length: scalability, self-healing, auto-discovery, resilience, and so on.

But some challenges have also appeared, most of them on the operational side, and a lot of projects are focused on tackling those; what we usually forget about is what Ambassador Labs has defined as the “inner dev cycle.”

The “inner dev cycle” is the productive workflow that each developer follows when working on a new application, service, or component. This iterative flow is where we code, test what we’ve coded, and fix what is not working or improve what we already have.

This flow has existed since the beginning of time; it doesn’t matter if you were coding in C with the standard library or COBOL in the early 1980s, or writing Node.js with the latest frameworks and libraries at your disposal.

We have seen movements towards making this inner cycle more effective, especially in front-end development, where many options let you see your latest code change just by saving the file. But with the move to container-based platforms, for the first time this flow makes devs less productive.

The main reason is that the number of tasks a dev needs to do has increased. Imagine this set of steps that we need to perform:

  • Build the app
  • Build the container image
  • Deploy the container image in Kubernetes

These actions are not as fast as testing your changes locally, making devs less productive than before, which is what the “telepresence” project is trying to solve.

Telepresence is a CNCF project that has recently attracted a lot of attention because it is included out of the box in the latest releases of Docker Desktop. In its own words, this is the definition of the Telepresence project:

Telepresence is an open-source tool that lets developers code and test microservices locally against a remote Kubernetes cluster. Telepresence facilitates more efficient development workflows while relieving the need to worry about other service dependencies.

Ok, so how can we get started? Let’s dive in together. The first thing we need to do is install Telepresence in our Kubernetes cluster.

Note: You can also install Telepresence in your cluster using Helm, following these steps:

helm repo add datawire  https://app.getambassador.io
helm repo update
kubectl create namespace ambassador
helm install traffic-manager --namespace ambassador datawire/telepresence

Now I will create a simple container that hosts a Golang application exposing a simple REST service. To make it easier, I will follow the tutorial available below, and you can do it as well.

Once we have our golang application ready, we are going to generate the container from it, using the following Dockerfile:

FROM golang:latest

RUN apt-get update
RUN apt-get upgrade -y

ENV GOBIN /go/bin

WORKDIR /app

COPY *.go ./
RUN go env -w GO111MODULE=off
RUN go get .
RUN go build -o /go-rest
EXPOSE 8080
CMD [ "/go-rest" ]

Then, once we have the app, we are going to deploy it to the Kubernetes cluster and run it as a Deployment, as you can see below:

kubectl create deployment rest-service --image=quay.io/alexandrev/go-test  --port=8080
kubectl expose deploy/rest-service

Once we have that, it is time to start using Telepresence. We connect to the cluster with the command telepresence connect, which will print output confirming the connection.


Then we list the endpoints available for interception with the command telepresence list, and we will see the rest-service that we exposed before.


Now we will run the actual intercept, but before that, we are going to set things up so we can connect it to Visual Studio Code. We will generate a launch.json file in Visual Studio Code with the following content:

{
    // Use IntelliSense to learn about possible attributes.
    // Hover to view descriptions of existing attributes.
    // For more information, visit: https://go.microsoft.com/fwlink/?linkid=830387
    "version": "0.2.0",
    "configurations": [
        {
            "name": "Launch with env file",
            "type": "go",
            "request": "launch",
            "mode": "debug",
            "program": "${workspaceFolder}",
            "envFile": "${workspaceFolder}/go-debug.env"
        }
    ]
}

The interesting part here is the envFile argument, which points to a not-yet-existing file go-debug.env in the workspace folder, so we need to make sure that we generate that file when we create the intercept. We will use the following command:

telepresence intercept rest-service --port 8080:8080 --env-file /Users/avazquez/Data/Projects/GitHub/rest-golang/go-debug.env

And now we can start our debug session in Visual Studio Code and add a breakpoint on some of the lines.


So now, if we hit the pod in Kubernetes, we will see the breakpoint being reached as if we were in a local debugging session.


That means that we can inspect variables and everything, change the code, or do whatever we need to speed up our development!

📚 Want to dive deeper into Kubernetes? This article is part of our comprehensive Kubernetes Architecture Patterns guide, where you’ll find all fundamental and advanced concepts explained step by step.