
Security Considerations

Dploy runs user-selected workloads — whatever chart a DployTemplate points at — inside short-lived, per-tenant namespaces. That makes the runtime boundary a first-class security concern: a malicious or vulnerable environment shouldn’t be able to reach the node or another user’s environment.

Dploy already hardens the control plane: the API and operator are split across two service accounts so a compromised API can only create instance requests, never arbitrary workloads (see Architecture → The RBAC boundary).

What that split does not cover is the workload itself. Environments run as ordinary containers, which share the host kernel. A container escape (kernel exploit, misconfigured capability, runtime CVE) lands an attacker on the node — and from there, on every other tenant’s environment. For multi-tenant or untrusted workloads, that shared kernel is the weak point.

Kata Containers closes it by running each pod inside a lightweight VM with its own guest kernel, so the isolation boundary is the hypervisor rather than namespaces + cgroups.

| Threat | Standard runtime (runc) | Kata Containers |
| --- | --- | --- |
| Container → host kernel escape | Shared kernel, high blast radius | Guest kernel only; host kernel not exposed |
| Cross-tenant access after escape | Reachable | Confined to the pod's VM |
| Kernel CVE exploitation | Hits the host | Hits a throwaway guest |
| Hypervisor / hardware side-channels | n/a | Still possible; not a silver bullet |

  • Hardware virtualization on the worker nodes (/dev/kvm). On bare metal this is native; on cloud instances you need nested virtualization or a bare-metal instance type (e.g. GCP --enable-nested-virtualization, Azure VM sizes that support nested virtualization, AWS *.metal instances).

  • A CRI runtime that Kata integrates with: containerd or CRI-O. kata-deploy configures containerd automatically.

  • Verify KVM is present on a node:

    kubectl debug node/<node> -it --image=busybox -- ls -l /dev/kvm

kata-deploy is a DaemonSet that drops the Kata binaries onto each node, wires up the container runtime, and registers the RuntimeClass objects.

# RBAC for the installer
kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-rbac/base/kata-rbac.yaml
# The installer DaemonSet (stable variant)
kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/kata-deploy/base/kata-deploy-stable.yaml
# Wait for every node to finish installing
kubectl -n kube-system rollout status ds/kata-deploy
kubectl -n kube-system wait --timeout=10m --for=condition=Ready -l name=kata-deploy pod
# Register the RuntimeClasses (kata, kata-qemu, kata-clh, …)
kubectl apply -f https://raw.githubusercontent.com/kata-containers/kata-containers/main/tools/packaging/kata-deploy/runtimeclasses/kata-runtimeClasses.yaml

Confirm the RuntimeClasses and boot a test pod with its own kernel:

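kubectl get runtimeclass # expect kata-qemu, kata-clh, …
kubectl run kata-test --image=busybox --restart=Never \
  --overrides='{"apiVersion":"v1","spec":{"runtimeClassName":"kata-qemu"}}' -- uname -r
# Wait for the pod to finish so the log is complete
kubectl wait --for=jsonpath='{.status.phase}'=Succeeded pod/kata-test --timeout=120s
kubectl logs kata-test # guest kernel, different from `uname -r` on the node
kubectl delete pod kata-test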
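
The same smoke test can be kept as a manifest, which may be easier to rerun after node upgrades. This is a minimal sketch that assumes the kata-qemu RuntimeClass registered above exists; the pod name and image simply mirror the kubectl run command.

```yaml
# Declarative equivalent of the kubectl run smoke test.
apiVersion: v1
kind: Pod
metadata:
  name: kata-test
spec:
  runtimeClassName: kata-qemu   # must match a registered RuntimeClass
  restartPolicy: Never          # run once, then stay Completed for log inspection
  containers:
    - name: kata-test
      image: busybox
      command: ["uname", "-r"]  # prints the guest kernel version
```

Apply it with kubectl apply -f, read the result with kubectl logs kata-test, and compare it against uname -r on the node.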

The goal is for every environment pod to run under the Kata RuntimeClass. There are two ways to get there: set runtimeClassName through each chart's values, or enforce it centrally with a policy engine. Enforcement is recommended because it doesn't depend on each chart exposing a runtimeClassName value.

Dploy labels every workload namespace dploy.dev/managed=true. A policy engine can mutate all pods in those namespaces to use Kata, regardless of the chart. With Kyverno:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: dploy-kata-runtime
spec:
  rules:
    - name: set-kata-runtime
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaceSelector:
                matchLabels:
                  dploy.dev/managed: "true"
      mutate:
        patchStrategicMerge:
          spec:
            runtimeClassName: kata-qemu

This is robust: it covers any template, and the boundary is enforced centrally rather than trusted to each chart’s values.
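
A mutation can be paired with a validation rule as a backstop, so that a pod which somehow skips the mutation is rejected instead of silently running under runc. The following is a sketch, not part of Dploy: the policy name is hypothetical, and the spec-level validationFailureAction field is assumed to be accepted by your Kyverno version (newer releases prefer a per-rule failureAction).

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: dploy-require-kata-runtime   # hypothetical name
spec:
  validationFailureAction: Enforce   # reject, rather than only audit, non-matching pods
  background: true                   # also report on pods that already exist
  rules:
    - name: require-kata-runtime
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaceSelector:
                matchLabels:
                  dploy.dev/managed: "true"
      validate:
        message: "Pods in Dploy-managed namespaces must use runtimeClassName kata-qemu."
        pattern:
          spec:
            runtimeClassName: kata-qemu
```

Because mutating admission runs before validating admission, pods patched by the mutate rule pass this check; only pods that reach validation without the Kata RuntimeClass are denied.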

Verify a real environment is sandboxed:

NS=$(kubectl get dployinstance <name> -n dploy-system -o jsonpath='{.status.namespace}')
kubectl get pod -n "$NS" -o jsonpath='{.items[*].spec.runtimeClassName}' # kata-qemu
  • Overhead — each pod is a microVM: higher memory/CPU baseline and slower cold start than runc. Size templates accordingly and watch warm-pool sizing for pool-method templates.
  • Startup latency — VM boot adds seconds; relevant for on-demand environments where users wait on provisioning.
  • Feature limits — host-path mounts, some device passthrough, and certain GPU setups need extra configuration or aren’t supported. Validate your heaviest template under Kata before rollout.

Kata isolates the runtime; the layers below bound what an environment can consume, reach, and persist. They all target the per-environment namespace, which Dploy labels dploy.dev/managed=true — so propagate them automatically (see Auto-applying to managed namespaces) rather than hand-applying per namespace.

Figure: Defense in depth as concentric layers around a Dploy environment pod: outermost Admission (Pod Security Admission restricted plus Kyverno), then Network (default-deny NetworkPolicy), Compute (ResourceQuota and LimitRange), Storage (no hostPath, ephemeral limits, scoped StorageClass), Runtime (Kata Containers per-pod microVM), and at the center the environment Pod running a user-selected chart with runtimeClassName kata-qemu.

| Layer | Mechanism | Bounds |
| --- | --- | --- |
| Runtime | Kata RuntimeClass | Kernel-level escape |
| Compute | ResourceQuota + LimitRange | CPU/memory/pod/storage exhaustion |
| Network | NetworkPolicy (default-deny) | Lateral movement, metadata/control-plane access |
| Storage | PSA + ephemeral limits + scoped StorageClass | hostPath escape, disk-fill, data bleed |
| Admission | Pod Security Admission restricted + Kyverno | Privileged pods, host namespaces, drift |

Kata (above) is the runtime boundary. Pair it with a ResourceQuota and LimitRange so a single environment can’t exhaust the node — doubly important under Kata, where each pod carries microVM overhead.

apiVersion: v1
kind: ResourceQuota
metadata:
  name: env-quota
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 4Gi
    limits.cpu: "4"
    limits.memory: 8Gi
    pods: "10"
    requests.ephemeral-storage: 4Gi
    limits.ephemeral-storage: 8Gi
---
apiVersion: v1
kind: LimitRange
metadata:
  name: env-limits
spec:
  limits:
    - type: Container
      default: { cpu: 500m, memory: 512Mi, ephemeral-storage: 1Gi }
      defaultRequest: { cpu: 100m, memory: 128Mi, ephemeral-storage: 256Mi }
      max: { cpu: "2", memory: 4Gi }
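
The microVM overhead mentioned above is typically declared on the RuntimeClass itself, through the overhead.podFixed field. Kubernetes adds that overhead to the pod's requests for scheduling, and it is counted against a ResourceQuota as well, so the quota needs headroom for it. The manifest below only illustrates the shape of the object: the overhead values and the node selector label are assumptions, so inspect what kata-deploy actually registered with kubectl get runtimeclass kata-qemu -o yaml instead of applying this.

```yaml
# Illustrative only: shows where per-pod Kata overhead is declared.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-qemu
handler: kata-qemu
overhead:
  podFixed:            # added to every pod that uses this RuntimeClass
    cpu: 250m          # assumed value
    memory: 160Mi      # assumed value
scheduling:
  nodeSelector:
    katacontainers.io/kata-runtime: "true"   # assumed label set on Kata-capable nodes
```

With these example numbers, ten pods would consume 2.5 CPU and 1600Mi of memory in overhead alone, which already exceeds the requests.cpu: "2" quota above; either the pod count or the CPU quota would need adjusting.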

Default-deny ingress and egress, then allow only DNS and the ingress/gateway path. This stops a compromised environment from reaching other tenants, in-cluster services, or the cloud metadata endpoint.

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
---
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-and-ingress
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: traefik-system # your ingress/gateway ns
  egress:
    - to:
        - namespaceSelector: {}
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - { protocol: UDP, port: 53 }
        - { protocol: TCP, port: 53 }
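
If some templates need outbound internet access (for example to fetch packages), one possible extension is an egress rule that allows public addresses while carving out the link-local range, which contains the 169.254.169.254 metadata endpoint, and the private ranges. This is a sketch under assumptions: the policy name is hypothetical, and the excluded CIDRs assume your node, pod, and service networks sit inside RFC 1918 space. Add your cluster's own CIDRs if they do not.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-public-egress        # hypothetical name
spec:
  podSelector: {}
  policyTypes: [Egress]
  egress:
    - to:
        - ipBlock:
            cidr: 0.0.0.0/0
            except:
              - 169.254.0.0/16     # link-local, includes the cloud metadata endpoint
              - 10.0.0.0/8         # private ranges: other tenants, in-cluster services
              - 172.16.0.0/12
              - 192.168.0.0/16
      ports:
        - { protocol: TCP, port: 443 }   # HTTPS only
```

NetworkPolicies are additive, so this combines with the default-deny and DNS rules above instead of replacing them.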
  • No hostPath — blocked by Pod Security Admission restricted (below); it’s the most common path from a volume to the node filesystem.
  • Ephemeral-storage limits (in the LimitRange above) stop a disk-fill DoS on the node.
  • Scoped StorageClass with reclaimPolicy: Delete so an environment’s PVs are cleaned up at teardown and never re-bound by another tenant; enable encryption-at-rest at the storage backend.
  • Under Kata, volumes are surfaced inside the guest VM — never share a ReadWriteMany volume across environments, or you reintroduce a cross-tenant channel.
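
A scoped StorageClass along these lines could look like the sketch below. The class name and provisioner are placeholders, not values Dploy defines; substitute your CSI driver and set its encryption parameters according to that driver's documentation. The accompanying ResourceQuota uses the per-StorageClass quota keys to cap how much an environment can claim from the class.

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: dploy-env                      # hypothetical class name
provisioner: csi.example.com           # placeholder: replace with your CSI driver
reclaimPolicy: Delete                  # PVs are removed when the claim is deleted
volumeBindingMode: WaitForFirstConsumer
allowVolumeExpansion: false
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: env-storage-quota              # hypothetical name
spec:
  hard:
    # Caps apply only to claims that request the dploy-env class.
    dploy-env.storageclass.storage.k8s.io/requests.storage: 20Gi
    dploy-env.storageclass.storage.k8s.io/persistentvolumeclaims: "5"
```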

Enforce Pod Security Admission restricted on managed namespaces — it blocks privileged containers, host namespaces (hostNetwork/hostPID/hostIPC), hostPath, and running as root:

# labels to add on each managed namespace
pod-security.kubernetes.io/enforce: restricted
pod-security.kubernetes.io/enforce-version: latest
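
For orientation, this is where those labels sit on a namespace object. The namespace name is hypothetical, since Dploy creates namespaces itself. The warn and audit labels are an optional addition not required by the text above; they surface violations in client warnings and audit logs using the same restricted profile.

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: dploy-env-example              # hypothetical; Dploy generates real names
  labels:
    dploy.dev/managed: "true"
    pod-security.kubernetes.io/enforce: restricted
    pod-security.kubernetes.io/enforce-version: latest
    pod-security.kubernetes.io/warn: restricted    # optional: warn on kubectl apply
    pod-security.kubernetes.io/audit: restricted   # optional: annotate audit events
```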

Use Kyverno for what PSA can’t express — requiring the Kata runtimeClassName (above), mandating resource limits, or pinning image registries.
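
As a rough illustration of the last two cases, the policy below requires CPU and memory limits and restricts images to one registry. It is a sketch: the policy name and registry.example.com are placeholders, and the spec-level validationFailureAction field is assumed to be accepted by your Kyverno version.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: dploy-pod-guardrails           # hypothetical name
spec:
  validationFailureAction: Enforce
  background: true
  rules:
    - name: require-resource-limits
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaceSelector:
                matchLabels:
                  dploy.dev/managed: "true"
      validate:
        message: "CPU and memory limits are required."
        pattern:
          spec:
            containers:
              - resources:
                  limits:
                    cpu: "?*"          # any non-empty value
                    memory: "?*"
    - name: restrict-image-registries
      match:
        any:
          - resources:
              kinds: [Pod]
              namespaceSelector:
                matchLabels:
                  dploy.dev/managed: "true"
      validate:
        message: "Images must come from registry.example.com."
        pattern:
          spec:
            =(initContainers):         # checked only if init containers are present
              - image: "registry.example.com/*"
            containers:
              - image: "registry.example.com/*"
```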

ResourceQuota, LimitRange, and NetworkPolicy are namespaced, and Dploy creates namespaces on the fly. Generate them into every managed namespace with a Kyverno generate policy keyed on the dploy.dev/managed=true label — the same label the TLS and ExternalDNS flows rely on:

apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: dploy-namespace-hardening
spec:
  rules:
    # 1. label the namespace for restricted Pod Security
    - name: enforce-restricted-psa
      match:
        any:
          - resources:
              kinds: [Namespace]
              selector:
                matchLabels:
                  dploy.dev/managed: "true"
      mutate:
        patchStrategicMerge:
          metadata:
            labels:
              pod-security.kubernetes.io/enforce: restricted
    # 2. drop a default-deny NetworkPolicy into it
    - name: default-deny-netpol
      match:
        any:
          - resources:
              kinds: [Namespace]
              selector:
                matchLabels:
                  dploy.dev/managed: "true"
      generate:
        apiVersion: networking.k8s.io/v1
        kind: NetworkPolicy
        name: default-deny
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            podSelector: {}
            policyTypes: [Ingress, Egress]

Repeat the generate rule for the ResourceQuota/LimitRange and the allow-list NetworkPolicy.
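
For the quota and limits, the repeated rules might look like the sketch below, which reuses the env-quota and env-limits values from earlier on this page. The policy name is hypothetical (the rules could equally be appended to dploy-namespace-hardening), and it assumes Kyverno's service account has been granted permission to create ResourceQuota and LimitRange objects.

```yaml
apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: dploy-namespace-quotas         # hypothetical name
spec:
  rules:
    - name: generate-env-quota
      match:
        any:
          - resources:
              kinds: [Namespace]
              selector:
                matchLabels:
                  dploy.dev/managed: "true"
      generate:
        apiVersion: v1
        kind: ResourceQuota
        name: env-quota
        namespace: "{{request.object.metadata.name}}"
        synchronize: true              # re-create or revert the object if it drifts
        data:
          spec:
            hard:
              requests.cpu: "2"
              requests.memory: 4Gi
              limits.cpu: "4"
              limits.memory: 8Gi
              pods: "10"
    - name: generate-env-limits
      match:
        any:
          - resources:
              kinds: [Namespace]
              selector:
                matchLabels:
                  dploy.dev/managed: "true"
      generate:
        apiVersion: v1
        kind: LimitRange
        name: env-limits
        namespace: "{{request.object.metadata.name}}"
        synchronize: true
        data:
          spec:
            limits:
              - type: Container
                default: { cpu: 500m, memory: 512Mi }
                defaultRequest: { cpu: 100m, memory: 128Mi }
                max: { cpu: "2", memory: 4Gi }
```

The ephemeral-storage entries are left out here only to keep the sketch short; carry them over from the full manifests if you rely on them.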