Kubernetes gives you a lot of rope. Enough to build something remarkable, and more than enough to hang yourself with. The default configuration is optimized for getting things running, not for keeping them secure. And by the time teams realize this - usually after a security review, an audit, or an incident - they’re already carrying significant security debt.
This article doesn’t rehash Kubernetes documentation. It focuses on the decisions that actually determine your security posture: where the real attack surface is, which defaults bite you in production, and how to build layered defenses without grinding your developers to a halt.
1. The API Server: Your Cluster’s Most Exposed Surface
Every action in a Kubernetes cluster flows through the API server. That makes it the highest-value target for any attacker and the most critical component to harden. Treat it like a perimeter, not an internal service.
Authentication and the Anonymous Access Problem
Kubernetes ships with anonymous access partially enabled by default - a sensible choice for local development that becomes a liability in production. In permissive configurations, unauthenticated requests can reach discovery endpoints, exposing cluster version and API group information. That’s enough to fingerprint your environment and target known CVEs (Common Vulnerabilities and Exposures - publicly disclosed security flaws).
In production, anonymous access should be restricted exclusively to health-check paths: /healthz, /readyz, and /livez. Everything else requires a valid identity.
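Recent Kubernetes releases can express this restriction directly in the structured authentication configuration rather than toggling --anonymous-auth globally. A sketch, assuming a cluster version recent enough to support anonymous-auth conditions (the API version may differ by release):

```yaml
# Passed to kube-apiserver via --authentication-config.
apiVersion: apiserver.config.k8s.io/v1beta1
kind: AuthenticationConfiguration
anonymous:
  enabled: true
  conditions:        # anonymous requests are permitted ONLY on these paths
    - path: /healthz
    - path: /readyz
    - path: /livez
```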
More critically, move away from static long-lived service account tokens entirely. Bound ServiceAccount tokens - scoped to specific pods and nodes with short expiry windows - significantly reduce the blast radius of a token leak. A static token exfiltrated from a pod can be used from anywhere, indefinitely. A bound token is tied to the issuing pod’s lifecycle and cannot be replayed from outside the cluster context.
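A projected volume is how a workload requests such a token; the audience and expiry below are illustrative values, not recommendations:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  serviceAccountName: app-sa                # placeholder service account
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      volumeMounts:
        - name: bound-token
          mountPath: /var/run/secrets/tokens
          readOnly: true
  volumes:
    - name: bound-token
      projected:
        sources:
          - serviceAccountToken:
              path: token
              audience: vault          # token is valid only for this audience
              expirationSeconds: 600   # short-lived; the kubelet rotates it
```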
etcd: Hardening the State Store
If etcd is compromised, the cluster is compromised - full stop. Everything lives in etcd: workload definitions, service account tokens, and secrets. It should be unreachable from anything except the API server, protected by mutual TLS, and encrypted at rest.
The encryption decision matters more than teams typically acknowledge. Static key providers (like aescbc or secretbox) are simple to configure but require manual key rotation, which rarely happens on a good schedule. Integrating an external KMS plugin pushes key lifecycle management to a dedicated system, adds rotation automation, and produces audit trails. The trade-off is latency on secret reads - acceptable for most clusters, but worth benchmarking before you commit.
| Encryption Provider | Key Source | Security Level | Operational Complexity | Performance Trade-off |
| --- | --- | --- | --- | --- |
| aescbc | Static file | Low | Low - but manual rotation | Negligible |
| secretbox | Static file | Low | Low - but manual rotation | Negligible |
| KMS v1 (external) | External KMS plugin | High | High | Moderate latency |
| KMS v2 (external) | External KMS plugin | Highest | Moderate (optimized) | Low at scale |
KMS v2 is worth the migration effort if you're managing secrets at any real scale.
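A minimal encryption configuration for the KMS v2 path might look like the following; the plugin name and socket path depend entirely on which KMS plugin you deploy:

```yaml
# Referenced by kube-apiserver via --encryption-provider-config.
apiVersion: apiserver.config.k8s.io/v1
kind: EncryptionConfiguration
resources:
  - resources:
      - secrets
    providers:
      - kms:
          apiVersion: v2
          name: example-kms-plugin                # plugin-specific name
          endpoint: unix:///var/run/kms/kms.sock  # plugin-specific socket
          timeout: 3s
      - identity: {}   # keeps pre-existing plaintext entries readable during migration
```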
2. RBAC: Where Security Debt Accumulates Fastest
RBAC is the right model for Kubernetes authorization. It’s also the most reliable source of unintended privilege in every large cluster I’ve seen. The issue isn’t that RBAC is poorly designed - it’s that it’s easy to over-provision and hard to audit after the fact.
The Escalation Paths Nobody Talks About
Three verbs deserve special attention:
- bind: Allows a principal to create RoleBindings and ClusterRoleBindings, including to roles they don’t currently hold. This means a user with bind permissions can grant themselves any role in the cluster. It’s a privilege escalation path disguised as an administrative convenience.
- escalate: Allows modifying a Role or ClusterRole to include permissions the modifier doesn’t have. The API server normally prevents this; escalate bypasses that check explicitly.
- impersonate: Allows acting as another user, group, or service account. An attacker who can impersonate system:masters has full, unrestricted cluster access.
There’s a less-discussed escalation path involving the CertificateSigningRequest (CSR) API. A principal with create rights on CSRs and update rights on certificatesigningrequests/approval for the kube-apiserver-client signer can issue their own client certificates with arbitrary common names - including system:masters, which bypasses all RBAC checks.
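In RBAC terms, the dangerous combination looks deceptively mundane. This is the shape to flag in audits (the role name is illustrative):

```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: csr-escalation-example   # illustrative name
rules:
  - apiGroups: ["certificates.k8s.io"]
    resources: ["certificatesigningrequests"]
    verbs: ["create"]
  - apiGroups: ["certificates.k8s.io"]
    resources: ["certificatesigningrequests/approval"]
    verbs: ["update"]
  - apiGroups: ["certificates.k8s.io"]
    resources: ["signers"]
    resourceNames: ["kubernetes.io/kube-apiserver-client"]
    verbs: ["approve"]
```

Approval requires both the update verb on the approval subresource and the approve verb on the specific signer, which is why the combination, not any single rule, is what matters.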
The nodes/proxy GET Escalation Gap
A principal with nodes/proxy GET permission can initiate exec sessions into any pod on that node - not just read metrics. This turns “read-only telemetry access” into remote code execution across the cluster.
Many monitoring setups require GET access to nodes/proxy to scrape kubelet metrics. What engineers miss is that pod exec operations begin as HTTP GET requests via WebSocket upgrade. The API server sees both operations identically from a path-permission perspective. The Kubernetes security team has categorized this as intended behavior, which means the only mitigation is avoiding nodes/proxy access in favor of push-based metrics pipelines where possible.
Auditing Shadow Admins
A “shadow admin” is any principal that accumulates the effective power of cluster-admin through a combination of individually reasonable-looking permissions. Manual inspection of RBAC manifests won’t surface these. You need automated tooling (rakkess, rbac-lookup, or similar open-source tools) to map role bindings to their effective permission sets and identify dangerous combinations.
| High-Risk Permission | Resource | Why It’s Dangerous |
| --- | --- | --- |
| create, update | rolebindings | Lets a principal grant themselves or others arbitrary roles |
| patch | nodes | Enables label manipulation to attract sensitive workloads |
| get, list, watch | secrets | Direct access to credentials for every system in the cluster |
| create | pods/exec | Remote code execution in any container the principal can reach |
| create + approval rights | certificatesigningrequests | Can issue client certs with system:masters CN, bypassing all RBAC |
Audit RBAC quarterly at minimum. In fast-moving environments, unused bindings and orphaned roles accumulate faster than anyone tracks.
3. Workload Identity: Getting Off Static Credentials
Static, long-lived credentials embedded in Kubernetes Secrets are the single most common root cause of post-compromise lateral movement. The solution is to treat workload identity the same way you treat human identity: short-lived, scoped, and auditable.
OIDC-Based Workload Identity
The OIDC federation pattern is now the standard approach for connecting Kubernetes workloads to external systems without static credentials. The cluster issues a short-lived JWT to each pod via its service account token. That token is presented to the target system (a cloud IAM, a secret store, a CI/CD tool), which validates it against the cluster’s OIDC discovery endpoint and issues scoped access.
Solving the “Secret Zero” Problem with ESO
The External Secrets Operator (ESO) is the practical standard for synchronizing secrets from an external store into Kubernetes. It solves the bootstrap problem cleanly: the pod presents its Kubernetes service account token to the external secret store, which validates the identity via the TokenReview API and returns a scoped, short-lived access credential. ESO’s advantage over sidecar-based injection approaches is that applications stay Kubernetes-native - they read secrets from mounted volumes or environment variables as always. The rotation, auditing, and lifecycle management live in the secret store, not in every application’s startup logic.
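A representative ExternalSecret, with the store name, remote path, and keys as placeholders for your environment:

```yaml
apiVersion: external-secrets.io/v1beta1
kind: ExternalSecret
metadata:
  name: app-db-credentials
spec:
  refreshInterval: 1h            # how often ESO re-syncs from the store
  secretStoreRef:
    name: vault-backend          # a (Cluster)SecretStore that authenticates
    kind: ClusterSecretStore     # via the workload's service account token
  target:
    name: app-db-credentials     # the Kubernetes Secret ESO creates and manages
  data:
    - secretKey: password
      remoteRef:
        key: prod/app/db         # path in the external store
        property: password
```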
| Secrets Pattern | Mechanism | Security Posture | Key Trade-off |
| --- | --- | --- | --- |
| Sealed Secrets | Asymmetric encryption, committed to Git | Static - no automatic rotation | Simple to adopt, weak on rotation |
| External Secrets Operator | Syncs from external store, OIDC auth | High - short-lived, rotatable | Sync delay; external store dependency |
| CSI Secret Store Driver | Mounts secrets as files at pod start | High - secrets never hit etcd | IO overhead; driver maintenance |
| Sidecar Injector | Sidecar fetches credentials at startup | High - ephemeral tokens | Startup latency; sidecar sprawl |
One hard-learned lesson from production: critical open-source security tooling can stall. ESO experienced a period of reduced maintenance activity due to maintainer burnout. When a project that manages secrets for thousands of production clusters has a single active maintainer, that’s a systemic risk. Evaluate the maintainer health of any security-critical open-source component before depending on it deeply.
4. Supply Chain Integrity: Trust the Build, Not Just the Image
A container image that passes a vulnerability scan isn’t necessarily trustworthy. Supply chain attacks - where an attacker substitutes a legitimate image with a backdoored one - are precisely the class of threat that vulnerability scanning doesn’t catch.
Image Signing with Sigstore / Cosign
Signing image digests (not tags) creates a cryptographic link between the build artifact and its provenance. Cosign (part of the Sigstore project) is the open-source standard for this. Signing the immutable digest ensures what you verified is what runs.
Attestations extend this further. A signed SLSA provenance record embeds metadata about how the image was built: which CI workflow ran, on which trusted runner, from which source commit. Admission controllers can then enforce that no image enters the cluster without a valid provenance attestation meeting your organization’s build standards.
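For concreteness, here is how an admission policy engine such as Kyverno can express a signature requirement - shown purely as an illustration; the registry glob and public key are placeholders:

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-signed-images
spec:
  validationFailureAction: Enforce
  rules:
    - name: verify-cosign-signature
      match:
        any:
          - resources:
              kinds:
                - Pod
      verifyImages:
        - imageReferences:
            - "registry.example.com/*"   # placeholder registry
          attestors:
            - entries:
                - keys:
                    publicKeys: |-
                      -----BEGIN PUBLIC KEY-----
                      <your cosign public key>
                      -----END PUBLIC KEY-----
```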
OPA-Based Admission Enforcement for Supply Chain
Enforcing supply chain requirements at admission time using an OPA-based policy engine is the right architecture. The challenge at scale is performance: image signature verification involves cryptographic operations that can push you toward the 10-second admission webhook timeout under load.
Two caching strategies make this tractable in large clusters:
- Verifier caching: Reuse initialized verifier instances (e.g., KMS-backed verifiers) across requests to avoid repeated initialization overhead.
- Result caching: Cache the verification outcome for a specific image digest for a short TTL. Once you’ve verified a digest, skip the full cryptographic call for subsequent pods pulling the same image during a rollout.
Concurrent verification across all containers in a Pod spec - using a worker pool with a shared error channel - ensures that a multi-container pod’s admission latency is bounded by the slowest single verification, not the sum of all of them.
Pod Security Standards: The Practical Rollout
The removal of PodSecurityPolicy (PSP) shifted the baseline security model to Pod Security Standards (PSS). The three tiers - Privileged, Baseline, and Restricted - are straightforward in principle but painful in practice.
To move beyond "Baseline" and successfully implement the Restricted profile, you must address the container-to-host boundary using two modern kernel features:
- User Namespaces (Stateless Isolation): Traditionally, the root user inside a container is still root on the host. User Namespaces map the container’s root UID to a non-privileged UID on the host. This neutralizes the majority of container escape techniques by ensuring an attacker who "breaks out" lands as a "nobody" user on the node.
- Recursive Read-Only (RRO) Mounts: While standard read-only mounts prevent writes to the top-level volume, they often leave subpaths vulnerable. RRO propagates the read-only constraint recursively through the entire mount tree, closing bypasses used in sophisticated malicious write operations.
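The user-namespace mapping described above is requested per pod. A sketch, assuming a cluster and container runtime recent enough to support the feature:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: isolated
spec:
  hostUsers: false   # container root maps to an unprivileged UID on the host
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
```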
The smart rollout path: Start namespaces in warn and audit mode. Warn surfaces violations to developers; audit writes them to the log. Once you’ve cataloged what breaks, apply User Namespaces to bridge the gap between "legacy root needs" and "production security requirements" before switching to enforce.
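In label form, that rollout stance on a namespace looks like:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: team-a   # placeholder namespace
  labels:
    pod-security.kubernetes.io/enforce: baseline   # currently enforced floor
    pod-security.kubernetes.io/warn: restricted    # surfaces violations to developers
    pod-security.kubernetes.io/audit: restricted   # records violations in audit logs
```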
5. Runtime Security: Defense After the Perimeter Fails
Preventative controls are necessary. They’re not sufficient. Runtime security is your detection and response layer for everything that static policy couldn’t catch.
Falco: Syscall Monitoring with Signal-to-Noise Discipline
Falco uses eBPF probes to monitor syscalls and evaluate them against a rules engine. The production reality: default Falco installations generate enormous alert volumes - upward of thousands of events per day in active clusters. Most teams that deploy Falco and don’t tune it end up ignoring the alerts entirely, which is worse than not running it.
Three things that actually move the needle on Falco signal quality:
- Macro-based suppression: Define allow-list macros for known-legitimate but suspicious-looking patterns (e.g., a Java application reading /proc for JVM tuning). Apply them surgically, not broadly.
- Process lineage rules: The highest-fidelity alerts come from watching parent-child process relationships, not just syscall types. A shell spawned by a web server process is almost always an indicator of command injection.
- evt.type filtering first: Place event type filters at the top of every rule so the Falco engine can short-circuit evaluation before running the full AST. This directly reduces CPU overhead at high event rates.
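Putting the three together, a rules-file sketch; the macro's exclusion is hypothetical and should be derived from your own traffic baseline, not copied:

```yaml
# Allow-list macro for one known-legitimate pattern (hypothetical example).
- macro: allowed_shell_spawn
  condition: (proc.pname = node and proc.cmdline startswith "sh -c healthcheck")

# Lineage rule with the event-type filter first so evaluation short-circuits.
- rule: Shell Spawned by Web Server
  desc: Interactive shell launched as a child of a serving process
  condition: >
    evt.type = execve and evt.dir = < and
    proc.name in (sh, bash, dash) and
    proc.pname in (nginx, httpd, node, java) and
    not allowed_shell_spawn
  output: "Shell under web server (parent=%proc.pname cmd=%proc.cmdline pod=%k8s.pod.name)"
  priority: WARNING
```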
Tetragon: Enforcement at the Kernel Level
Tetragon takes a different approach: instead of alerting in userspace after an event has occurred, it enforces policy directly in the kernel using eBPF. It can terminate a process with SIGKILL or block a kernel function call before the malicious action completes.
Tetragon’s policies are Kubernetes-aware natively: they can target pods by label selectors and namespaces. Its kernel-side filtering means lower overhead than userspace alert generation, typically under 1% CPU impact in production workloads. Falco and Tetragon are complementary rather than competing - broad detection coverage from Falco, targeted enforcement from Tetragon on your highest-risk workloads.
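A TracingPolicy sketch in that spirit, modeled on Tetragon's documented file-access pattern; the namespace, labels, and path are placeholders:

```yaml
apiVersion: cilium.io/v1alpha1
kind: TracingPolicyNamespaced
metadata:
  name: kill-on-shadow-read
  namespace: prod              # placeholder
spec:
  podSelector:
    matchLabels:
      app: payments            # placeholder label selector
  kprobes:
    - call: "security_file_permission"
      syscall: false
      args:
        - index: 0
          type: "file"
      selectors:
        - matchArgs:
            - index: 0
              operator: "Prefix"
              values:
                - "/etc/shadow"
          matchActions:
            - action: Sigkill   # terminate the offending process in-kernel
```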
| Security Mechanism | Performance Impact | Approx. Energy Overhead | Best Fit |
| --- | --- | --- | --- |
| WireGuard (node-to-node encryption) | High - throughput reduction | +25–30W per node | Sensitive inter-node traffic isolation |
| Istio/Linkerd mTLS (service mesh) | Moderate - added latency | +15–20W per node | L7 identity and mutual auth |
| eBPF observability (Falco/Tetragon) | Low - 1–2.5% CPU | Low | Runtime threat detection everywhere |
| Kubernetes NetworkPolicies (eBPF CNI) | Negligible | Low | L3/L4 workload micro-segmentation |
The decision to enable mTLS everywhere should be made with eyes open to the latency implications - the energy overhead is real, but secondary to the security value in standard cloud environments.
6. Network Micro-Segmentation: Closing the Lateral Movement Window
Kubernetes’ default network model is flat. Any pod can reach any other pod by default. NetworkPolicies change the default to deny-all and force applications to declare their communication requirements explicitly.
The Deny-All Baseline and Why It Always Breaks Something
Implementing a deny-all ingress and egress policy across all namespaces is the right baseline. The first time you do it in a production cluster, you will break something unexpected - usually DNS, internal metrics endpoints, or a telemetry pipeline that nobody documented. This is not a reason to avoid the policy. It’s a reason to do it carefully.
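The baseline pair of policies for a namespace, including the DNS egress exception that the first rollout almost always forgets (namespace name is a placeholder):

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: team-a
spec:
  podSelector: {}        # applies to every pod in the namespace
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: team-a
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```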
A phased approach that actually works:
- Observe first: Use flow logging (Cilium Hubble or similar) to capture a full picture of actual inter-pod traffic before writing any policies.
- Audit mode: Apply policies in dry-run mode. Review what would have been blocked against the observed traffic baseline. Resolve the gaps.
- Enforce gradually: Roll out namespace by namespace, starting with the lowest-blast-radius workloads.
CNI Performance: eBPF vs. iptables at Scale
| CNI Engine | L3/L4 Throughput (1k+ rules) | L7 Throughput | Scaling Behavior |
| --- | --- | --- | --- |
| Cilium (eBPF) | ~9 Gbps - consistent | ~94 Mbps - protocol parsing overhead | < 10% degradation at scale |
| Antrea (OVS) | ~1.2 Gbps | ~6.6 Gbps - efficient for HTTP | Degrades with rule growth |
| Standard iptables CNI | Low - linear rule scan | N/A | 60–70% throughput loss at scale |
Cilium is the right choice for L3/L4 policy enforcement at scale - the hash table lookup model doesn’t degrade as rule counts grow, which is the core failure mode of iptables. For HTTP-specific L7 filtering, Antrea’s OVS implementation has a measurably more efficient path.
Service Mesh: mTLS and the Shift to Ambient Mode
NetworkPolicies give you L3/L4 isolation. They don’t give you encryption or application-layer identity. For that, you need a service mesh. Istio and Linkerd both provide mutual TLS for all inter-service communication, authenticating workloads by cryptographic identity rather than IP address.
Istio’s Ambient Mode is the architecture worth evaluating for new deployments. It decouples L4 encryption (handled by a per-node ztunnel proxy) from L7 policy enforcement (handled by optional waypoint proxies per service). This eliminates the sidecar-per-pod model, which was the primary operational friction point with traditional mesh architectures. The trade-off is a newer implementation with less production mileage than sidecar mode.
7. Node and Kubelet Hardening: The Foundation
Cluster-level security controls are only as strong as the nodes they run on.
Kubelet Security Baseline
The Kubelet is the most powerful agent on the node. If its API is exposed, the cluster is effectively open.
- Disable anonymous auth: --anonymous-auth=false. This prevents unauthenticated requests from reaching the Kubelet API.
- Use Webhook authorization: --authorization-mode=Webhook ensures the Kubelet checks with the API server before executing commands, centralizing policy.
- Close the read-only port: --read-only-port=0 disables port 10255, which often leaks pod metadata to internal attackers.
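The flag-based settings above map onto the kubelet's configuration file, which is the preferred mechanism in current releases:

```yaml
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
authentication:
  anonymous:
    enabled: false     # equivalent of --anonymous-auth=false
  webhook:
    enabled: true
authorization:
  mode: Webhook        # delegate authz decisions to the API server
readOnlyPort: 0        # disable the unauthenticated port 10255
```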
Host OS and Surface Area Reduction
The node should be treated as a single-purpose appliance.
- Minimize the OS: Use container-optimized distributions (like Bottlerocket or Talos) that remove SSH, package managers, and unnecessary shells. This eliminates the tools an attacker needs to stay persistent after an escape.
- Kernel Self-Protection: Enable sysctl hardening (e.g. kernel.unprivileged_bpf_disabled=1) to prevent non-root users from loading eBPF programs that could be used for privilege escalation.
- Audit Logging: Ensure the host-level auditd is capturing changes to sensitive files like /etc/kubernetes/kubelet.conf and shipping them to a centralized SIEM.
8. DevSecOps: Making Security Stick
Security tooling that developers refuse to use is security theater. The hardest engineering problem in Kubernetes security isn’t choosing the right tools - it’s deploying them in ways that developers can work with.
Golden Paths, Not Golden Cages
Platform teams that get this right build secure defaults into the starting point, not onto an existing system. A “Golden Path” is a service template that bootstraps a new workload with all mandatory security configurations pre-wired: non-root security context, resource limits, network policy stubs, OTel instrumentation, and correct service account scoping. Developers get a working starting point; the platform team gets guaranteed baseline compliance.
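The pod-template fragment such a template might pre-wire; these values line up with the Restricted Pod Security Standard (the image and UID are placeholders):

```yaml
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 10001                 # placeholder non-root UID
    seccompProfile:
      type: RuntimeDefault
  containers:
    - name: app
      image: registry.example.com/app:1.0   # placeholder image
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities:
          drop: ["ALL"]
      resources:
        requests: {cpu: 100m, memory: 128Mi}
        limits: {memory: 256Mi}
```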
The alternative - documenting security requirements and expecting teams to implement them correctly under delivery pressure - doesn’t work at scale. It doesn’t work even at small scale, honestly. People copy from working examples, and if the working examples are insecure, that insecurity propagates everywhere.
GitOps and Security Debt Management
Security debt in Kubernetes almost always has the same shape: a cluster or namespace that started as temporary, never got properly hardened, and became load-bearing. Declarative GitOps - managing all cluster state through version-controlled manifests reconciled by tools like ArgoCD or Flux - doesn’t eliminate this problem, but it makes it visible. Drift, unauthorized changes, and upgrade discipline all become enforceable through the reconciliation loop.
The clusters most at risk are the ones running unsupported Kubernetes versions. Every minor version that falls out of support represents a growing backlog of unpatched CVEs. Automating version upgrades through GitOps pipelines, with staged rollouts and automated regression testing, is the only way to maintain upgrade discipline without dedicating disproportionate engineering time to it.
9. What a Real Cluster Compromise Looks Like
Here’s a pattern that reflects actual post-incident analysis from production Kubernetes environments:
- Initial access: An attacker exploits a vulnerability in a public-facing application - a deserialization flaw, an SSRF, a dependency with a known CVE that wasn’t patched. They gain code execution inside a container.
- Identity theft: The compromised container has a mounted service account token - often the default service account, often with permissions it doesn’t need. The attacker extracts the token from /var/run/secrets/kubernetes.io/serviceaccount/token.
- Enumeration: Using the stolen token, the attacker authenticates to the API server. They enumerate secrets, ConfigMaps, and other workloads accessible to that service account.
- Lateral movement: If the token has list/get on secrets, the attacker can retrieve credentials for databases, external APIs, and cloud provider IAM roles. If it has pod exec permissions, they can move into other workloads directly.
- Escalation: With sufficient permissions - or by finding a higher-privilege token in an accessible secret - they escalate toward cluster-admin or cloud account access.
Every step in this chain has a control that breaks it: vulnerability patching and image scanning at initial access; automounted token disabling and RBAC scoping at identity theft; network policies at lateral movement; runtime detection at enumeration. No single control stops everything. Defense-in-depth means every layer has to fail for the attack to succeed.
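At the identity-theft step specifically, the control is a one-line default, shown here on a hypothetical service account:

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-sa        # placeholder
  namespace: team-a   # placeholder
automountServiceAccountToken: false   # no token is mounted unless a pod opts in
---
apiVersion: v1
kind: Pod
metadata:
  name: needs-api
  namespace: team-a
spec:
  serviceAccountName: app-sa
  automountServiceAccountToken: true  # explicit opt-in for workloads that need the API
```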
Conclusion: Security as an Engineering Practice
Kubernetes security isn’t a configuration checklist. It’s an ongoing engineering practice that requires the same rigor, iteration, and ownership as any other part of your platform. The clusters that get compromised are rarely the ones that never thought about security - they’re the ones that thought about it once, implemented the basics, and stopped.
The actual work is continuous: RBAC audits that catch privilege creep before it becomes a problem, runtime detection tuned to reduce noise rather than ignored because of it, upgrade pipelines that keep pace with the release cycle rather than falling years behind.
The mental model to carry into your next cluster review: assume any workload can be compromised. What can the attacker reach? What can they do? What will alert, and who will respond? If you can answer those questions concretely, your security posture is sound. If the honest answer is “I’m not sure,” that’s where the work starts.






