Inside Kubernetes: The 2026 Architecture Breakdown

Subhendu Nayak

Architectural Foundations and the Modern Enterprise Paradigm

Kubernetes (K8s) was introduced by Google in 2014, building on its internal Borg cluster management system. In 2015, Kubernetes was contributed to the Cloud Native Computing Foundation, establishing a vendor-neutral ecosystem that now supports global infrastructure at organizations such as Netflix, Spotify, and Shopify.

The platform addresses a core operational challenge: managing large numbers of containers across distributed environments introduces significant complexity when handled manually. Kubernetes replaces this with a declarative, self-healing model. Engineers define the desired system state using YAML manifests, and Kubernetes continuously reconciles the actual state to match it—automating scheduling, failure recovery, scaling, and application rollouts.
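The declarative model can be illustrated with a minimal manifest (the name and image below are illustrative). The Deployment declares a desired state of three replicas; Kubernetes continuously reconciles toward it, recreating Pods that fail:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web                  # illustrative name
spec:
  replicas: 3                # desired state: three Pods at all times
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.27  # example image
          ports:
            - containerPort: 80
```

If a node hosting one of these Pods fails, the control plane notices the divergence from the declared state and schedules a replacement automatically.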

According to the CNCF’s 2025 Annual Survey, Kubernetes adoption continues to grow, with 82% of container users running it in production (up from 66% in 2023). Additionally, 66% of organizations deploying generative AI models rely on Kubernetes to manage inference workloads. The platform has also expanded to support heterogeneous environments: KubeVirt enables management of virtual machines alongside containers, while K3s provides a streamlined option for edge and resource-constrained deployments.

Understanding Kubernetes requires examining its two primary layers: the control plane (orchestration layer) and worker nodes (execution layer).

Control Plane Anatomy: The Orchestration Layer

A Kubernetes deployment is referred to as a cluster. The control plane maintains a global view of the cluster, makes scheduling decisions, and continuously drives the system toward its desired state.

Rather than being a single process, the control plane consists of multiple specialized components that operate together. These components are typically distributed across several nodes to ensure high availability and resilience.

Core Control Plane Components

| Component | Role |
| --- | --- |
| kube-apiserver | The primary entry point for all cluster operations. Handles authentication, RBAC authorization, schema validation, and admission control. Communicates over HTTPS and typically scales horizontally behind a load balancer. |
| etcd | A distributed key-value store that persists all cluster data, including specifications, state, secrets, and policies. Uses the Raft consensus algorithm for consistency. Loss of etcd data without backup results in complete cluster state loss. |
| kube-scheduler | Assigns unscheduled Pods to worker nodes using a two-phase process: filtering (eliminates unsuitable nodes) and scoring (ranks remaining nodes based on resource usage, affinity, and topology). |
| kube-controller-manager | Runs multiple reconciliation loops that continuously compare desired and actual states and take corrective actions. |
| cloud-controller-manager | Integrates the control plane with cloud provider APIs (e.g., AWS, Azure, GCP) to manage resources such as load balancers, networking, and storage. |

Controller Responsibilities

Within the controller manager, each controller is responsible for a specific domain:

| Controller | Responsibility |
| --- | --- |
| Node Controller | Detects node failures through missed heartbeats and reschedules affected Pods. |
| ReplicaSet Controller | Maintains the desired number of Pod replicas and replaces failed instances. |
| Endpoints Controller | Updates Service-to-Pod mappings (in modern clusters, primarily via EndpointSlice objects) as Pods are created or removed. |
| Service Account Controller | Creates default service accounts and API credentials within new namespaces. |

Admission Control and Policy Evolution

Admission control serves as the final gate within the API server before changes are persisted. It follows a strict two-phase pipeline:

  1. Mutating controllers modify incoming requests
  2. Validating controllers enforce rules and accept or reject requests

As of Kubernetes v1.36, this pipeline includes CEL-based policy mechanisms that reduce reliance on external webhook infrastructure:

| API | Status (v1.36) | Purpose |
| --- | --- | --- |
| ValidatingAdmissionPolicy | GA (since v1.30) | Enables in-process validation rules without requiring external webhook servers. |
| MutatingAdmissionPolicy | GA (since v1.36) | Allows resource mutation directly within the API server using CEL, reducing latency and operational overhead. |

Both mechanisms operate entirely within the kube-apiserver, offering lower latency, eliminating TLS management overhead, and improving reliability compared to external admission webhooks.
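A minimal sketch of an in-process CEL policy (the policy name, resource scope, and rule are illustrative). It rejects any Deployment created or updated with fewer than two replicas:

```yaml
apiVersion: admissionregistration.k8s.io/v1
kind: ValidatingAdmissionPolicy
metadata:
  name: require-replicas            # illustrative policy name
spec:
  matchConstraints:
    resourceRules:
      - apiGroups: ["apps"]
        apiVersions: ["v1"]
        operations: ["CREATE", "UPDATE"]
        resources: ["deployments"]
  validations:
    - expression: "object.spec.replicas >= 2"   # CEL, evaluated in-process
      message: "Deployments must run at least two replicas."
```

Note that a policy takes effect only once a corresponding ValidatingAdmissionPolicyBinding selects the namespaces or resources it applies to.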

API Extensibility: CRDs, Operators, and Admission Control

One of the most significant architectural strengths of Kubernetes is its extensible API. The platform is intentionally designed to allow users to define new resource types and embed domain-specific operational logic without modifying the core system.

Custom Resource Definitions (CRDs)

Custom Resource Definitions (CRDs) enable the creation of new resource types within the Kubernetes API. Once registered, these resources are treated as first-class objects, similar to built-in resources such as Pods or Services. Users can create and manage Custom Resources (CRs) using standard tools like kubectl and declarative YAML manifests.

The API server manages key aspects automatically, including storage, schema validation (via OpenAPI v3), role-based access control (RBAC), and versioning.

It is important to distinguish between:

  • CRD: the schema or definition
  • CR: an instance of that schema

This relationship is conceptually similar to a class and its object instance in programming.

Common examples include:

  • Certificate and ClusterIssuer in cert-manager
  • ServiceMonitor and PrometheusRule in the Prometheus Operator
  • VirtualService and DestinationRule in Istio
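As a sketch, the CRD/CR distinction looks like this in practice, using the `databases.example.com` example from the table below (the group, kind, and field names are illustrative):

```yaml
# CRD: registers the new resource type with the API server
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
  name: databases.example.com
spec:
  group: example.com
  names:
    kind: Database
    plural: databases
    singular: database
  scope: Namespaced
  versions:
    - name: v1
      served: true
      storage: true
      schema:
        openAPIV3Schema:          # validated by the API server
          type: object
          properties:
            spec:
              type: object
              properties:
                engine:
                  type: string
                replicas:
                  type: integer
---
# CR: an instance of the new type, manageable with kubectl like any built-in
apiVersion: example.com/v1
kind: Database
metadata:
  name: orders-db
spec:
  engine: postgres
  replicas: 3
```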

The Operator Pattern

A CRD defines structure but does not perform any actions. The Operator pattern extends this by pairing a CRD with a custom controller. This controller monitors changes to Custom Resources and runs a reconciliation loop to ensure that the actual system state matches the declared desired state.

This approach enables teams to encode operational knowledge directly into Kubernetes.

| Layer | Description | Example |
| --- | --- | --- |
| CRD | Schema registered with the API server | databases.example.com |
| Custom Resource (CR) | Instance of the schema | A specific Database object |
| Operator / Controller | Watches CRs and enforces desired state | Provisions and manages a PostgreSQL cluster |

Operators are commonly developed using frameworks such as Kubebuilder or the Operator SDK. At scale, they are often managed using Operator Lifecycle Manager (OLM), which supports installation, upgrades, and dependency management across clusters.

API Extensibility vs. Admission Control

CRDs extend what Kubernetes can manage, while admission control governs how API requests are processed.

As discussed earlier, ValidatingAdmissionPolicy and MutatingAdmissionPolicy are integral to this model. They allow organizations to enforce policies directly within the Kubernetes API, reducing reliance on external webhook-based systems and improving operational consistency.

Worker Node Mechanics: The Distributed Compute Engine

While the control plane is responsible for decision-making, worker nodes execute those decisions. Each worker node is a physical or virtual machine that runs application workloads in containers.

Every node includes three primary components that coordinate with the control plane to run Pods.

Core Node Components

| Component | Role |
| --- | --- |
| kubelet | The primary node agent. Monitors the API server for assigned Pods, translates specifications into runtime instructions via the Container Runtime Interface (CRI), and reports node and Pod health back to the control plane. |
| kube-proxy | Implements the Kubernetes Service abstraction by maintaining network routing rules (iptables or IPVS) to direct traffic from a Service’s virtual IP to the appropriate Pods. |
| Container Runtime | Executes containers. The kubelet interacts with it through the CRI, a standardized gRPC interface that decouples Kubernetes from specific runtime implementations. |

CRI-Compliant Runtimes

| Runtime | Profile |
| --- | --- |
| containerd | Industry standard; used by managed services such as AWS EKS, GKE, and AKS |
| CRI-O | Designed specifically for Kubernetes; widely used in Red Hat and OpenShift environments |
| gVisor / Kata Containers | Sandboxed runtimes that provide enhanced isolation for multi-tenant or security-sensitive workloads |

Docker Engine is no longer supported directly as a Kubernetes runtime: the dockershim integration was removed in v1.24, and modern deployments use containerd (which Docker itself relies on internally). From Kubernetes v1.26 onward, runtimes must support the v1 CRI API; otherwise, the kubelet will not register the node.

The Pod Abstraction and Granular Lifecycle Management

[Figure: Kubernetes Pod Lifecycle]

The Pod is the smallest deployable unit in Kubernetes. It represents a logical grouping of one or more containers that:

  • Are scheduled onto the same node
  • Share a network namespace (single IP address)
  • Can share storage volumes

In most production scenarios, a Pod contains a single primary application container.

Pod Lifecycle Phases

| Phase | Meaning |
| --- | --- |
| Pending | Accepted by the cluster but awaiting scheduling or image download |
| Running | Assigned to a node; at least one container is active |
| Succeeded | All containers completed successfully and will not restart |
| Failed | All containers have terminated, and at least one exited with an error or was terminated by the system |
| Unknown | Node communication failure prevents status determination |

Startup Sequence and Health Probes

Before application containers start, init containers execute sequentially to perform setup tasks such as configuration initialization or dependency checks.

Once initialization is complete, the kubelet activates three types of health probes:

| Probe | Failure Action |
| --- | --- |
| startupProbe | Holds off liveness and readiness checks until it succeeds; the container is restarted if it never passes within the configured threshold |
| livenessProbe | Restarts the container if it becomes unresponsive or enters a failed state |
| readinessProbe | Removes the Pod from Service load balancing when it cannot safely handle traffic |
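The three probes are configured per container. A minimal sketch (paths, ports, and timings are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: probe-demo               # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.27          # example image
      startupProbe:              # gates the other probes during slow startup
        httpGet:
          path: /healthz
          port: 80
        failureThreshold: 30     # allows up to 30 × 5s = 150s to initialize
        periodSeconds: 5
      livenessProbe:             # failing this restarts the container
        httpGet:
          path: /healthz
          port: 80
        periodSeconds: 10
      readinessProbe:            # failing this removes the Pod from endpoints
        httpGet:
          path: /ready
          port: 80
        periodSeconds: 5
```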

Quality of Service (QoS) Classes

Each Pod is assigned a QoS class at creation time based on its resource configuration. This classification determines eviction priority under resource pressure and does not change during the Pod’s lifetime.

| QoS Class | Condition | Eviction Priority |
| --- | --- | --- |
| Guaranteed | Requests equal limits for all containers | Lowest (last evicted) |
| Burstable | Requests defined, but not equal to limits | Medium |
| BestEffort | No resource requests or limits defined | Highest (first evicted) |
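For example, this Pod receives the Guaranteed class because every container's requests equal its limits (values are illustrative); dropping the `limits` block would demote it to Burstable, and omitting `resources` entirely would make it BestEffort:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: qos-demo           # illustrative name
spec:
  containers:
    - name: app
      image: nginx:1.27    # example image
      resources:
        requests:
          cpu: "500m"
          memory: "256Mi"
        limits:            # equal to requests => Guaranteed QoS
          cpu: "500m"
          memory: "256Mi"
```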

Graceful Termination

When a Pod is deleted, Kubernetes initiates a controlled shutdown process. It sends a SIGTERM signal to all containers and removes the Pod from Service endpoints to prevent new traffic. This allows in-flight requests to complete.

After the configured terminationGracePeriodSeconds (default: 30 seconds), any remaining processes receive a SIGKILL signal.
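Both the grace period and pre-shutdown behavior are configurable per Pod. A sketch (the sleep-based drain is a common but illustrative pattern):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: graceful-demo               # illustrative name
spec:
  terminationGracePeriodSeconds: 60 # extends the default 30s window
  containers:
    - name: app
      image: nginx:1.27             # example image
      lifecycle:
        preStop:                    # runs before SIGTERM is delivered
          exec:
            command: ["sh", "-c", "sleep 5"]  # brief pause to drain traffic
```

The preStop hook counts against the grace period, so the hook duration plus application shutdown time must fit within `terminationGracePeriodSeconds`.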

Controller Behavior and Operational Considerations

Deployment is the default choice for stateless workloads. It manages a ReplicaSet that enforces the desired number of replicas and orchestrates updates using rolling strategies. Parameters such as maxUnavailable and maxSurge allow teams to balance availability and rollout speed, making Deployments suitable for continuous delivery pipelines.

StatefulSet introduces guarantees that Deployments intentionally avoid. Pods are created, updated, and deleted in a defined order, and each Pod is assigned a stable network identity and persistent storage. These guarantees are essential for distributed systems such as databases, where instance identity and data locality directly affect correctness and performance.

DaemonSet operates at the infrastructure level. It ensures that a specific Pod runs on every node (or a subset defined by node selectors). This makes it the standard approach for deploying system-level services such as logging agents, monitoring collectors, networking components, and security tools.

Job and CronJob are designed for workloads with a defined lifecycle. A Job runs until a specified number of successful completions is reached, while a CronJob schedules Jobs at regular intervals. These abstractions are commonly used for data processing, backups, and maintenance tasks.

Selecting the appropriate controller is not simply a matter of preference—it directly impacts application availability, data integrity, rollout behavior, and recovery strategies.
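As one concrete case, a scheduled workload is expressed as a CronJob (the name, image, and arguments below are hypothetical):

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: nightly-backup             # illustrative name
spec:
  schedule: "0 2 * * *"            # every day at 02:00
  jobTemplate:
    spec:
      backoffLimit: 3              # retry failed Pods up to 3 times
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - name: backup
              image: backup-tool:1.0            # hypothetical image
              args: ["--target", "s3://backups"] # hypothetical flags
```

Each scheduled run creates a Job, which in turn creates Pods until the required number of completions is reached.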

Advanced Cluster Networking and the Gateway API Transition

Kubernetes networking is based on a foundational principle: every Pod receives a unique IP address, and all Pods can communicate with each other directly without network address translation (NAT). This flat network model simplifies service discovery and communication but requires a robust implementation layer.

This implementation is provided by a Container Network Interface (CNI) plugin. Kubernetes defines the networking model, but the CNI plugin enforces it in practice.

Common CNI solutions include Cilium, Calico, Flannel, and Weave. These differ in capabilities such as:

  • Support for eBPF-based packet processing
  • Native enforcement of NetworkPolicy
  • Encryption of pod-to-pod traffic
  • Observability and traffic visibility

As a result, selecting a CNI plugin is a significant architectural decision that influences security, performance, and operational visibility.

Services and Traffic Routing

Because Pods are short-lived and their IP addresses change frequently, Kubernetes introduces the Service abstraction to provide stable endpoints. A Service exposes a consistent virtual IP and routes traffic to a dynamic set of Pods selected via labels.

| Service Type | Behaviour |
| --- | --- |
| ClusterIP | Exposes the Service internally within the cluster |
| NodePort | Opens a port on each node and forwards traffic to the Service |
| LoadBalancer | Provisions an external load balancer via the cloud provider |
| ExternalName | Maps the Service to an external DNS name |
This abstraction decouples service consumers from the lifecycle of individual Pods, enabling seamless scaling and updates.
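A minimal Service sketch (name, label, and ports are illustrative). Any Pod carrying the `app: web` label becomes a backend automatically:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web              # illustrative name
spec:
  type: ClusterIP        # the default; internal virtual IP only
  selector:
    app: web             # routes to Pods carrying this label
  ports:
    - port: 80           # port exposed by the Service
      targetPort: 8080   # port the container actually listens on
```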

Gateway API and the Evolution Beyond Ingress

The traditional Ingress model centralized routing configuration in a single resource, often extended through controller-specific annotations. While functional, this approach introduced inconsistencies and limited extensibility.

The Gateway API represents a structured evolution of this model. It introduces role-based resource separation, aligning responsibilities across infrastructure, platform, and application teams.

| Resource | Owner | Purpose |
| --- | --- | --- |
| GatewayClass | Infrastructure administrator | Defines which controller implementation manages Gateways |
| Gateway | Cluster operator | Configures entry points, protocols, ports, and TLS settings |
| HTTPRoute | Application developer | Defines routing rules and maps requests to backend Services |

This separation improves governance and aligns naturally with Kubernetes RBAC, enabling teams to operate independently while maintaining clear boundaries.
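The role split is visible in the manifests themselves (the GatewayClass name, hostnames, and backend Service below are illustrative): the operator owns the Gateway, while the developer attaches routes to it independently.

```yaml
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: public-gateway            # owned by the cluster operator
spec:
  gatewayClassName: example-gc    # provided by the infrastructure team
  listeners:
    - name: http
      protocol: HTTP
      port: 80
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: web-route                 # owned by the application team
spec:
  parentRefs:
    - name: public-gateway        # attaches to the Gateway above
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /api
      backendRefs:
        - name: web               # backing Service (illustrative)
          port: 80
```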

Compared to Ingress, the Gateway API provides:

  • Stronger extensibility through a structured resource model
  • Reduced reliance on annotations and controller-specific behavior
  • Improved portability across implementations

Migration tooling such as ingress2gateway supports transition by converting existing configurations into Gateway API resources.

Persistent Storage, State Management, and Dynamic Provisioning

Containers are stateless by default. Any data written to a container’s local filesystem is lost when the container restarts or is rescheduled. For stateful workloads, Kubernetes provides a storage abstraction that separates application requirements from underlying infrastructure.

Storage Object Model

Kubernetes storage is built on three core abstractions:

| Object | Role |
| --- | --- |
| PersistentVolume (PV) | Represents a provisioned storage resource independent of Pod lifecycle |
| PersistentVolumeClaim (PVC) | Defines a request for storage with specific requirements (size, access mode, class) |
| StorageClass | Specifies how storage is dynamically provisioned, including the CSI driver and configuration parameters |

This model allows developers to request storage declaratively, without needing to manage infrastructure details directly.

Static vs. Dynamic Provisioning

Static provisioning requires administrators to manually create PersistentVolumes in advance. While this offers precise control, it does not scale efficiently in dynamic environments.

Dynamic provisioning addresses this limitation. When a PVC references a StorageClass, Kubernetes automatically provisions storage using the associated CSI driver, creates the corresponding PV, and binds it to the claim. This process eliminates manual intervention and supports scalable, on-demand storage allocation.
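A sketch of the dynamic path (the class name is illustrative; the provisioner shown is the AWS EBS CSI driver as an example). Creating the PVC triggers provisioning of a matching PV:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: fast-ssd                        # illustrative class name
provisioner: ebs.csi.aws.com            # example CSI driver
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer # provision only once a Pod is scheduled
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data
spec:
  storageClassName: fast-ssd
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 20Gi
```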

Container Storage Interface (CSI)

The Container Storage Interface (CSI) standardizes how Kubernetes interacts with external storage systems. Storage providers implement CSI drivers to integrate their platforms with Kubernetes.

This approach offers several advantages:

  • Decouples storage innovation from Kubernetes release cycles
  • Enables independent updates and feature delivery
  • Provides consistent lifecycle management across storage backends

CSI has replaced legacy in-tree volume plugins, making it the standard mechanism for storage integration.

Volume Access Modes and Reclaim Policies

Access modes define how volumes can be mounted:

| Access Mode | Behaviour |
| --- | --- |
| ReadWriteOnce (RWO) | Mounted read-write by a single node |
| ReadOnlyMany (ROX) | Mounted read-only by multiple nodes |
| ReadWriteMany (RWX) | Mounted read-write by multiple nodes |
| ReadWriteOncePod (RWOP) | Mounted read-write by a single Pod |

Reclaim policies define what happens to storage after a PVC is deleted:

  • Delete: Automatically removes the underlying storage
  • Retain: Preserves the storage for manual recovery or auditing

Volume Snapshots

Kubernetes supports point-in-time backups through the VolumeSnapshot API. These snapshots enable data recovery, cloning, and migration workflows.

Recent enhancements include support for group snapshots, which allow multiple volumes to be captured in a consistent state. This is particularly important for distributed applications that store related data across multiple volumes, ensuring data integrity during backup and restore operations. 
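A single-volume snapshot is a small manifest (the snapshot class depends on the installed CSI driver; names here are illustrative):

```yaml
apiVersion: snapshot.storage.k8s.io/v1
kind: VolumeSnapshot
metadata:
  name: data-snap                        # illustrative name
spec:
  volumeSnapshotClassName: csi-snapclass # provided by the CSI driver (assumed)
  source:
    persistentVolumeClaimName: data      # PVC to capture at this point in time
```

A new PVC can later reference this snapshot as its `dataSource` to restore or clone the volume.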

Zero-Trust Security, Cluster Hardening, and Supply Chain Defense

Kubernetes is designed for ease of adoption by default, not for secure-by-default operation. A newly provisioned cluster allows containers to run as root, permits unrestricted network communication between workloads, and automatically mounts API credentials into Pods. Production readiness requires systematically replacing these permissive defaults with explicit, enforceable controls across all layers.

A practical approach to cluster hardening can be understood through four primary enforcement layers:

The Four Enforcement Layers

| Layer | Mechanism | What It Controls |
| --- | --- | --- |
| Identity & Access | RBAC + OIDC | Who can perform which API operations on which resources |
| Workload Isolation | Pod Security Standards (PSS) | What containers are allowed to do at runtime |
| Network | NetworkPolicy | Which workloads can communicate and over which ports |
| Supply Chain | Image signing + admission control | Whether a container image is trusted before execution |

Identity and Access Control

Role-Based Access Control (RBAC) governs authorization within the Kubernetes API. The guiding principle is least privilege: each workload should use a dedicated ServiceAccount with only the permissions required for its function.

Cluster-wide administrative roles should be tightly restricted and never used as defaults for applications or users. Integrating external identity providers via OIDC further strengthens security by replacing static credentials with short-lived, automatically rotating tokens, improving traceability and reducing credential exposure risk.
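A least-privilege grant looks like this in practice (account, role, and namespace names are illustrative): the workload's ServiceAccount can read ConfigMaps in its own namespace and nothing else.

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: app-reader                # illustrative account name
  namespace: prod
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: configmap-reader
  namespace: prod
rules:
  - apiGroups: [""]               # core API group
    resources: ["configmaps"]
    verbs: ["get", "list", "watch"]  # read-only
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: app-reader-binding
  namespace: prod
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: configmap-reader
subjects:
  - kind: ServiceAccount
    name: app-reader
    namespace: prod
```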

Workload Isolation with Pod Security Standards

Pod Security Standards (PSS), introduced as the successor to PodSecurityPolicy, define baseline security expectations for workloads at the namespace level. Three profiles are available:

  • Privileged — unrestricted; intended for trusted system components
  • Baseline — prevents known privilege escalation paths while maintaining compatibility
  • Restricted — enforces modern security best practices (non-root execution, restricted capabilities, read-only filesystems)

For most production environments, the Restricted profile should be the default, with exceptions granted only where operationally necessary.
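PSS profiles are applied through namespace labels (the namespace name is illustrative). The `enforce` mode rejects non-compliant Pods, while `warn` and `audit` surface violations without blocking:

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: payments                                  # illustrative namespace
  labels:
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/warn: restricted
    pod-security.kubernetes.io/audit: restricted
```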

Network Segmentation

By default, Kubernetes clusters have no network segmentation—any Pod can communicate with any other Pod. This flat model increases risk if a workload is compromised.

NetworkPolicy introduces fine-grained control by defining allowed communication paths. A common best practice is to implement a default-deny policy at the namespace level, then explicitly allow only required traffic flows. This approach significantly reduces lateral movement within the cluster and strengthens overall isolation.
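The pattern can be sketched as a pair of policies (namespace, labels, and port are illustrative): the first denies all traffic in the namespace, the second re-opens one specific flow.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: prod                 # illustrative namespace
spec:
  podSelector: {}                 # applies to every Pod in the namespace
  policyTypes:
    - Ingress
    - Egress
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-web-to-api
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: api                    # policy target
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: web            # only web Pods may connect
      ports:
        - protocol: TCP
          port: 8080
```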

Supply Chain Security

Supply chain security focuses on controlling what enters the cluster before execution. This includes:

  • Scanning container images for vulnerabilities during CI pipelines
  • Cryptographically signing images to verify integrity and origin
  • Enforcing policies at admission time

Admission controls—such as ValidatingAdmissionPolicy or policy engines like OPA Gatekeeper—can ensure that only trusted, compliant images from approved registries are deployed.

Together, these controls establish a zero-trust posture, where every workload is verified before execution rather than assumed to be safe.

Multi-Cluster Architecture and Fleet Governance

Operating a single Kubernetes cluster is relatively well understood. The greater challenge lies in managing multiple clusters across regions, cloud providers, and environments. As of 2026, multi-cluster architectures have become standard for organizations operating at scale, particularly those with global workloads, regulatory requirements, or hybrid infrastructure strategies.

Why Multiple Clusters

A single cluster represents a shared failure domain. Issues such as control plane outages, configuration errors, or resource exhaustion can impact all workloads simultaneously.

Multiple clusters introduce clear isolation boundaries based on:

  • Geography (regional deployments)
  • Environment (development, staging, production)
  • Compliance requirements
  • Workload sensitivity

This segmentation reduces blast radius and improves resilience, availability, and governance.

Fleet Governance Challenges

As the number of clusters increases, operational complexity grows non-linearly. Key challenges include:

| Challenge | Consequence if Unaddressed |
| --- | --- |
| Configuration drift | Clusters diverge from intended state, creating inconsistencies |
| Fragmented RBAC | Inconsistent access control across clusters increases risk |
| Upgrade coordination | Version misalignment introduces compatibility and security issues |
| Cost visibility | Lack of centralized tracking complicates cost management and allocation |

GitOps as the Governance Foundation

GitOps has emerged as the standard model for managing multi-cluster environments. In this approach, Git repositories act as the single source of truth for all cluster configurations.

Tools such as Argo CD and Flux continuously synchronize declared configurations to clusters. Any deviation from the desired state is automatically detected and reconciled.

This model provides:

  • Full auditability of changes
  • Consistent configuration across environments
  • Improved reproducibility and rollback capabilities
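In Argo CD, for example, the link between a Git repository and a cluster is itself a declarative resource (the repository URL, path, and namespace below are illustrative):

```yaml
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: web
  namespace: argocd
spec:
  project: default
  source:
    repoURL: https://github.com/example/platform-config  # illustrative repo
    targetRevision: main
    path: apps/web             # manifests for this application
  destination:
    server: https://kubernetes.default.svc
    namespace: prod
  syncPolicy:
    automated:
      prune: true              # delete resources removed from Git
      selfHeal: true           # revert out-of-band changes to the cluster
```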

Workload Placement and Cluster Lifecycle Management

For distributing workloads across clusters, platforms such as Karmada and Open Cluster Management enable policy-driven scheduling and centralized governance using Kubernetes-native APIs.

Cluster lifecycle management is increasingly handled through Cluster API, which applies the same declarative and reconciliation-based model used for applications to the clusters themselves. This enables consistent provisioning, upgrades, and decommissioning across environments.

Tags
Container Orchestration, Cloud Native, Kubernetes architecture, control plane, worker nodes, Gateway API