Complete Kubernetes Cheatsheet
Targets Kubernetes 1.30+. Most teams now use a managed control plane (EKS, GKE, AKS) plus kubectl, helm, kustomize, and increasingly kind/k3d/Minikube locally.
Table of Contents
- Setup & Clusters
- kubectl Basics
- Contexts & Namespaces
- Pods
- Workloads (Deployment/StatefulSet/DaemonSet/Job/CronJob)
- Services & Networking
- Ingress / Gateway API
- ConfigMaps & Secrets
- Volumes & Storage
- RBAC & ServiceAccounts
- Autoscaling
- Probes, Resources, Limits
- Logs, Exec, Debug
- Helm
- Kustomize
- CRDs & Operators
- Security Hardening
- Troubleshooting
- Quick Reference
Setup & Clusters
# Install kubectl (macOS)
brew install kubectl helm kustomize kubectx stern k9s # the kubectx formula also installs kubens
# Linux
curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl"
sudo install kubectl /usr/local/bin/
# Verify
kubectl version --client
kubectl cluster-info
# Local clusters
kind create cluster --name dev # Kubernetes in Docker: nodes run as containers, fast
k3d cluster create dev # k3s in Docker
minikube start --driver=docker
# Cloud-managed (just snippets)
aws eks update-kubeconfig --name prod --region us-east-1
gcloud container clusters get-credentials prod --region us-central1
az aks get-credentials --resource-group rg --name prod
kubectl Basics
kubectl get <resource> # list
kubectl get pods -A # all namespaces
kubectl get pod,svc,deploy
kubectl get pods -o wide # node, IP
kubectl get pods -o yaml | less # full manifest
kubectl get pods -o jsonpath='{.items[*].metadata.name}'
kubectl get pods -l 'app=web,tier!=prod' # label selectors (quoted so the shell leaves "!" alone)
kubectl get pods --field-selector status.phase=Running
kubectl describe pod <name> # human-readable inspection
kubectl explain pod.spec.containers # field docs
kubectl apply -f manifest.yaml # declarative create/update
kubectl apply -k overlays/prod # kustomize
kubectl apply -R -f ./manifests/ # recursive
kubectl create deploy web --image=nginx # imperative
kubectl run dbg --image=alpine -it --rm -- sh # one-off pod
kubectl edit deploy web # edit live object in $EDITOR
kubectl patch deploy web --patch '{"spec":{"replicas":3}}'
kubectl scale deploy web --replicas=5
kubectl rollout restart deploy web
kubectl delete -f manifest.yaml
kubectl delete pod web-abc --force --grace-period=0
kubectl delete pods -l app=web
# Output formats
kubectl get pod web -o yaml | yq
kubectl get pod web -o json | jq
kubectl get pod web -o name # "pod/web"
Aliases worth setting:
alias k=kubectl
alias kgp='kubectl get pods'
alias kgs='kubectl get svc'
alias kgd='kubectl get deploy'
alias kdp='kubectl describe pod'
Contexts & Namespaces
kubectl config get-contexts
kubectl config current-context
kubectl config use-context prod
kubectl config set-context --current --namespace=team-a # set default ns
# kubectx / kubens (must-haves)
kubectx prod
kubens team-a
# Per-command override
kubectl get pods -n kube-system
kubectl get pods --context staging -n web
Pods
apiVersion: v1
kind: Pod
metadata:
  name: web
  labels: { app: web }
spec:
  serviceAccountName: web-sa
  securityContext:
    runAsNonRoot: true
    runAsUser: 1000
    fsGroup: 1000
    seccompProfile: { type: RuntimeDefault }
  containers:
    - name: app
      image: nginx:1.27
      ports: [{ containerPort: 80 }]
      env:
        - name: DATABASE_URL
          valueFrom: { secretKeyRef: { name: db, key: url } }
      resources:
        requests: { cpu: 100m, memory: 128Mi }
        limits: { cpu: 500m, memory: 256Mi }
      readinessProbe: { httpGet: { path: /healthz, port: 80 } }
      livenessProbe: { httpGet: { path: /healthz, port: 80 }, periodSeconds: 30 }
      securityContext:
        allowPrivilegeEscalation: false
        readOnlyRootFilesystem: true
        capabilities: { drop: [ALL] }
  volumes: []
kubectl get pods # see status: Running / Pending / CrashLoopBackOff
kubectl logs -f web # follow
kubectl logs --previous web # last crashed instance
kubectl logs web -c sidecar # specific container
kubectl exec -it web -- sh
kubectl cp web:/var/log/app.log ./app.log
kubectl port-forward pod/web 8080:80
Workloads
Deployment (stateless apps)
apiVersion: apps/v1
kind: Deployment
metadata: { name: web }
spec:
  replicas: 3
  strategy:
    type: RollingUpdate
    rollingUpdate: { maxSurge: 1, maxUnavailable: 0 }
  selector: { matchLabels: { app: web } }
  template:
    metadata: { labels: { app: web } }
    spec:
      containers:
        - name: app
          image: ghcr.io/me/app:1.2.3
          ports: [{ containerPort: 3000 }]
kubectl rollout status deploy/web
kubectl rollout history deploy/web
kubectl rollout undo deploy/web # to previous
kubectl rollout undo deploy/web --to-revision=3
kubectl set image deploy/web app=ghcr.io/me/app:1.2.4
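The rollingUpdate settings in the Deployment above bound how many pods exist during a rollout. A minimal sketch of that arithmetic, assuming maxSurge and maxUnavailable are given as absolute numbers rather than percentages; `rollout_bounds` is a hypothetical helper name, not a kubectl feature:

```shell
# rollout_bounds <replicas> <maxSurge> <maxUnavailable>
# Prints the upper bound on total pods and the lower bound on available pods
# while a RollingUpdate is in progress (absolute values only, no percentages).
rollout_bounds() {
  echo "max_pods=$(( $1 + $2 )) min_available=$(( $1 - $3 ))"
}

# Values from the Deployment above: replicas 3, maxSurge 1, maxUnavailable 0
rollout_bounds 3 1 0
```

With these values the rollout never drops below 3 available pods and briefly runs up to 4, so the cluster needs spare capacity for one extra pod.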
StatefulSet (stable IDs, persistent volumes)
apiVersion: apps/v1
kind: StatefulSet
metadata: { name: pg }
spec:
  serviceName: pg-headless
  replicas: 3
  selector: { matchLabels: { app: pg } }
  template:
    metadata: { labels: { app: pg } }
    spec:
      containers:
        - name: pg
          image: postgres:16
          volumeMounts: [{ name: data, mountPath: /var/lib/postgresql/data }]
  volumeClaimTemplates:
    - metadata: { name: data }
      spec:
        accessModes: [ReadWriteOnce]
        resources: { requests: { storage: 20Gi } }
        storageClassName: standard
DaemonSet (one pod per node — agents, log shippers)
apiVersion: apps/v1
kind: DaemonSet
metadata: { name: log-agent }
spec:
  selector: { matchLabels: { app: log-agent } }
  template:
    metadata: { labels: { app: log-agent } }
    spec:
      tolerations: [{ operator: Exists }] # run on tainted nodes too
      containers:
        - name: agent
          image: fluent/fluent-bit:3.0.7
Job & CronJob
apiVersion: batch/v1
kind: Job
metadata: { name: migrate }
spec:
  backoffLimit: 3
  template:
    spec:
      restartPolicy: OnFailure
      containers:
        - name: migrate
          image: ghcr.io/me/app:1.2.3
          command: ["./migrate"]
---
apiVersion: batch/v1
kind: CronJob
metadata: { name: cleanup }
spec:
  schedule: "0 3 * * *" # 03:00 daily
  successfulJobsHistoryLimit: 3
  jobTemplate:
    spec:
      template:
        spec:
          restartPolicy: OnFailure
          containers:
            - { name: c, image: alpine, command: ["sh","-c","./cleanup.sh"] }
Services & Networking
apiVersion: v1
kind: Service
metadata: { name: web }
spec:
  type: ClusterIP # ClusterIP | NodePort | LoadBalancer | ExternalName
  selector: { app: web }
  ports:
    - name: http
      port: 80
      targetPort: 3000
kubectl get svc # CLUSTER-IP, EXTERNAL-IP, PORTS
kubectl port-forward svc/web 8080:80
kubectl run dbg --rm -it --image=alpine -- sh
# in pod: nslookup web.default.svc.cluster.local
Service types:
- ClusterIP — internal only (default).
- NodePort — exposes the Service on every node's IP at a static port (default range 30000-32767).
- LoadBalancer — provisions cloud LB.
- ExternalName — DNS CNAME.
Headless service (clusterIP: None) → DNS returns all pod IPs (StatefulSet pattern).
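The DNS names behind these Services follow a fixed pattern, which is what the nslookup check above relies on. A small sketch, assuming the default cluster domain `cluster.local` (clusters can override it); `svc_fqdn` and `sts_pod_fqdn` are hypothetical helper names:

```shell
# Cluster domain is an assumption: cluster.local is the common default.
CLUSTER_DOMAIN="${CLUSTER_DOMAIN:-cluster.local}"

# svc_fqdn <service> <namespace>
svc_fqdn() {
  printf '%s.%s.svc.%s\n' "$1" "$2" "$CLUSTER_DOMAIN"
}

# sts_pod_fqdn <statefulset> <ordinal> <headless-service> <namespace>
# Per-pod name that a headless Service gives each StatefulSet replica.
sts_pod_fqdn() {
  printf '%s-%s.%s.%s.svc.%s\n' "$1" "$2" "$3" "$4" "$CLUSTER_DOMAIN"
}

svc_fqdn web default
sts_pod_fqdn pg 0 pg-headless default
```

The second call uses the `pg` StatefulSet and `pg-headless` serviceName from the Workloads section; replicas get ordinals 0, 1, 2.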
NetworkPolicy (zero-trust between pods)
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: web-allow }
spec:
  podSelector: { matchLabels: { app: web } }
  policyTypes: [Ingress, Egress]
  ingress:
    - from: [{ podSelector: { matchLabels: { app: gateway } } }]
      ports: [{ port: 3000 }]
  egress:
    - to:
        - namespaceSelector: { matchLabels: { kubernetes.io/metadata.name: kube-system } }
          podSelector: { matchLabels: { k8s-app: kube-dns } }
      ports: [{ port: 53, protocol: UDP }, { port: 53, protocol: TCP }] # DNS can fall back to TCP
Ingress / Gateway API
Ingress (older, still common)
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: web
  annotations:
    nginx.ingress.kubernetes.io/ssl-redirect: "true"
spec:
  ingressClassName: nginx
  tls:
    - hosts: [app.example.com]
      secretName: web-tls
  rules:
    - host: app.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend: { service: { name: web, port: { number: 80 } } }
Gateway API (modern replacement)
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata: { name: web }
spec:
  parentRefs: [{ name: gateway }]
  hostnames: [app.example.com]
  rules:
    - matches: [{ path: { type: PathPrefix, value: / } }]
      backendRefs:
        - { name: web, port: 80 }
Popular ingress / gateway controllers: ingress-nginx, traefik, contour, cilium, envoy-gateway, AWS ALB controller, GKE Gateway.
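The HTTPRoute above attaches to a Gateway named `gateway` that is not shown. A minimal sketch of what that parent object could look like, written to a file from the shell; the `gatewayClassName` value `example-class` is a placeholder assumption and must match whatever GatewayClass your controller installs:

```shell
# Write a minimal Gateway manifest for the HTTPRoute's parentRef.
# gatewayClassName is a placeholder: use the class your controller provides.
gateway_file=gateway.yaml
cat > "$gateway_file" <<'EOF'
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata: { name: gateway }
spec:
  gatewayClassName: example-class
  listeners:
    - name: http
      protocol: HTTP
      port: 80
EOF

# Apply once the Gateway API CRDs and a controller are installed:
# kubectl apply -f "$gateway_file"
```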
ConfigMaps & Secrets
kubectl create configmap app-config \
  --from-literal=LOG_LEVEL=info \
  --from-file=config.yaml
kubectl create secret generic db \
  --from-literal=password='s3cr3t'
kubectl create secret tls web-tls --cert=tls.crt --key=tls.key
kubectl create secret docker-registry ghcr --docker-server=ghcr.io \
  --docker-username=USER --docker-password="$TOKEN"
# Mount as env
env:
  - name: LOG_LEVEL
    valueFrom: { configMapKeyRef: { name: app-config, key: LOG_LEVEL } }
  - name: DB_PASS
    valueFrom: { secretKeyRef: { name: db, key: password } }
envFrom:
  - configMapRef: { name: app-config }
  - secretRef: { name: db }
# Mount as file
volumes:
  - name: cfg
    configMap: { name: app-config }
volumeMounts:
  - { name: cfg, mountPath: /etc/app }
Don't commit raw Secrets to git. Use:
- SealedSecrets (sealed-secrets-controller)
- SOPS + age
- External Secrets Operator (Vault / AWS Secrets Manager / GCP / Azure)
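The reason raw Secrets do not belong in git is that the `data` field is only base64-encoded, not encrypted. A quick demonstration using the example password from above (this assumes a `base64` that accepts `-d`, as GNU coreutils does):

```shell
# Encode the value the way a Secret's data field stores it (no trailing newline).
encoded=$(printf '%s' 's3cr3t' | base64)
echo "$encoded"

# Anyone who can read the manifest can reverse it just as easily.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"
```

The tools listed above encrypt the value before it reaches the repository, which plain base64 does not do.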
Volumes & Storage
# Persistent volume claim
apiVersion: v1
kind: PersistentVolumeClaim
metadata: { name: data }
spec:
  accessModes: [ReadWriteOnce] # RWO | ROX | RWX | RWOP
  resources: { requests: { storage: 10Gi } }
  storageClassName: standard
volumes:
  - name: data
    persistentVolumeClaim: { claimName: data }
  - name: scratch
    emptyDir: {} # ephemeral, per-pod
  - name: cfg
    configMap: { name: app-config }
  - name: secret
    secret: { secretName: db }
  - name: host
    hostPath: { path: /var/log } # ⚠️ host filesystem
volumeMounts:
  - { name: data, mountPath: /data }
kubectl get pv,pvc,sc
kubectl get storageclass
RBAC & ServiceAccounts
apiVersion: v1
kind: ServiceAccount
metadata: { name: web-sa }
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata: { name: pod-reader, namespace: default }
rules:
  - apiGroups: [""]
    resources: [pods, pods/log]
    verbs: [get, list, watch]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata: { name: web-sa-read }
subjects: [{ kind: ServiceAccount, name: web-sa, namespace: default }]
roleRef: { kind: Role, name: pod-reader, apiGroup: rbac.authorization.k8s.io }
ClusterRole / ClusterRoleBinding are cluster-wide equivalents.
kubectl auth can-i create deploy
kubectl auth can-i list pods --as system:serviceaccount:default:web-sa
kubectl auth whoami
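The `--as` value above follows the fixed username format Kubernetes uses for ServiceAccounts. A tiny sketch that builds it, so the namespace and name are not retyped by hand; `sa_user` is a hypothetical helper name:

```shell
# sa_user <namespace> <serviceaccount>
# Builds the username form that RBAC sees for a ServiceAccount.
sa_user() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

sa_user default web-sa

# Example use against a live cluster:
# kubectl auth can-i list pods --as "$(sa_user default web-sa)"
```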
Autoscaling
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata: { name: web }
spec:
  scaleTargetRef: { apiVersion: apps/v1, kind: Deployment, name: web }
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target: { type: Utilization, averageUtilization: 70 }
Other autoscalers:
- VPA (Vertical Pod Autoscaler) — adjusts requests/limits.
- Cluster Autoscaler / Karpenter — adds/removes nodes.
- KEDA — event-driven scaling (queues, Kafka, etc.).
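For the HorizontalPodAutoscaler above, the core scaling rule is desired = ceil(current replicas × current metric / target metric), clamped to the min/max bounds. A simplified sketch of that rule; it deliberately ignores the controller's tolerance band and stabilization window, and `hpa_desired` is a hypothetical helper name:

```shell
# hpa_desired <current_replicas> <current_utilization_pct> <target_pct> <min> <max>
# Simplified HPA rule: ceil(current * usage / target), clamped to [min, max].
# Ignores the real controller's tolerance and stabilization behaviour.
hpa_desired() {
  hpa_current=$1 hpa_usage=$2 hpa_target=$3 hpa_min=$4 hpa_max=$5
  # Integer ceiling without floating point.
  hpa_result=$(( (hpa_current * hpa_usage + hpa_target - 1) / hpa_target ))
  if [ "$hpa_result" -lt "$hpa_min" ]; then hpa_result=$hpa_min; fi
  if [ "$hpa_result" -gt "$hpa_max" ]; then hpa_result=$hpa_max; fi
  echo "$hpa_result"
}

# 4 replicas averaging 90% CPU against the 70% target, bounds 2..20
hpa_desired 4 90 70 2 20
```

With the manifest's values, 4 replicas at 90% utilization scale to 6, since 4 × 90 / 70 is about 5.14 and the result is rounded up.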
Probes, Resources, Limits
containers:
  - name: app
    resources:
      requests: { cpu: 100m, memory: 128Mi } # scheduler uses these
      limits: { cpu: 500m, memory: 512Mi } # enforced: CPU is throttled, memory overuse is OOMKilled
    startupProbe: # holds off the other probes until it succeeds
      httpGet: { path: /healthz, port: 80 }
      failureThreshold: 30
      periodSeconds: 5
    readinessProbe: # remove from Service if failing
      httpGet: { path: /ready, port: 80 }
      periodSeconds: 10
    livenessProbe: # restart container if failing
      httpGet: { path: /live, port: 80 }
      periodSeconds: 30
      initialDelaySeconds: 30
Limit / quota at the namespace level:
apiVersion: v1
kind: ResourceQuota
metadata: { name: team-a }
spec:
  hard:
    requests.cpu: "20"
    requests.memory: 40Gi
    pods: "100"
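Quantities such as `100m`, `128Mi`, and `40Gi` are easy to misread when comparing requests against a quota. A rough converter sketch, assuming only the suffixes used in this cheatsheet (`m` for CPU; `Ki`, `Mi`, `Gi` for memory); the function names are hypothetical and the full Kubernetes quantity grammar has more forms than this handles:

```shell
# mem_to_bytes <quantity>   handles Ki / Mi / Gi suffixes and plain byte counts
mem_to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *)   echo "$1" ;;
  esac
}

# cpu_to_millicores <quantity>   handles "500m" and whole cores such as "20"
cpu_to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  echo $(( $1 * 1000 )) ;;
  esac
}

mem_to_bytes 128Mi
mem_to_bytes 40Gi
cpu_to_millicores 100m
cpu_to_millicores 20
```

By this arithmetic the quota above (20 cores, 40Gi of requests) covers 200 pods at the 100m request used earlier, so the `pods: "100"` limit is reached first.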
Logs, Exec, Debug
kubectl logs -f deploy/web # picks ONE pod of the deployment
kubectl logs -f -l app=web --prefix # all pods matching a label
kubectl logs -f web-abc -c sidecar
kubectl logs --since=10m web
kubectl logs --tail=100 web
stern web # multi-pod tail (great tool)
stern -l app=web --since 5m
kubectl exec -it web -- sh
kubectl exec web -- curl localhost:8080/metrics
# Ephemeral debug container (k8s 1.25+)
kubectl debug -it web --image=nicolaka/netshoot --target=app
# Port forwarding
kubectl port-forward deploy/web 8080:80
kubectl port-forward svc/web 8080:80
# Top
kubectl top pod
kubectl top node
# Events (sorted by time)
kubectl get events --sort-by=.lastTimestamp
kubectl get events --watch
Helm
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
helm search repo postgres
helm install pg bitnami/postgresql \
  --version 15.0.0 \
  --namespace data --create-namespace \
  --set auth.postgresPassword=secret \
  -f values.yaml
helm upgrade --install pg bitnami/postgresql -f values.yaml
helm list -A
helm history pg
helm rollback pg 2
helm uninstall pg
# Render templates locally
helm template myrelease ./chart --values values.yaml | kubectl apply -f -
# Chart skeleton
helm create mychart
helm lint mychart
helm package mychart
Chart structure: Chart.yaml, values.yaml, templates/*.yaml (Go templates).
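To make that layout concrete, here is a hand-written minimal chart with one template. This is only a sketch: the chart name `demo-chart` and the `logLevel` value are made up for illustration, and `helm create` generates a much fuller skeleton.

```shell
# Build a minimal chart by hand: Chart.yaml, values.yaml, one template.
chart_dir=demo-chart   # hypothetical chart name
mkdir -p "$chart_dir/templates"

cat > "$chart_dir/Chart.yaml" <<'EOF'
apiVersion: v2
name: demo-chart
version: 0.1.0
EOF

cat > "$chart_dir/values.yaml" <<'EOF'
logLevel: info
EOF

# Go template: .Values comes from values.yaml, .Release from the install command.
cat > "$chart_dir/templates/configmap.yaml" <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: {{ .Release.Name }}-config
data:
  LOG_LEVEL: {{ .Values.logLevel | quote }}
EOF

# Render locally if helm is installed:
# helm template myrelease "$chart_dir" --set logLevel=warn
```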
Kustomize
# base/kustomization.yaml
resources:
  - deployment.yaml
  - service.yaml
labels: # replaces the deprecated commonLabels
  - pairs: { app: web }
    includeSelectors: true
images:
  - name: ghcr.io/me/app
    newTag: 1.2.3
# overlays/prod/kustomization.yaml
resources: [../../base]
namespace: prod
replicas:
  - { name: web, count: 5 }
patches:
  - path: hpa-patch.yaml
configMapGenerator:
  - name: app-config
    literals: [LOG_LEVEL=warn]
kubectl apply -k overlays/prod
kubectl kustomize overlays/prod | less # render only
CRDs & Operators
kubectl get crd
kubectl api-resources | grep cert-manager
kubectl explain certificate.spec
Common operators: cert-manager, External-DNS, ArgoCD / Flux (GitOps), Prometheus Operator, OpenTelemetry Operator, Strimzi (Kafka), CloudNativePG, Velero (backups), Crossplane, KEDA.
GitOps shape:
git repo → ArgoCD/Flux → kubectl apply -k overlays/prod → cluster
Security Hardening
- Run containers as non-root (runAsNonRoot: true, USER in Dockerfile).
- readOnlyRootFilesystem: true, mount /tmp as emptyDir.
- allowPrivilegeEscalation: false, drop: [ALL] caps.
- Use Pod Security Standards (restricted profile) via namespace labels:
  apiVersion: v1
  kind: Namespace
  metadata:
    name: team-a # example namespace
    labels:
      pod-security.kubernetes.io/enforce: restricted
      pod-security.kubernetes.io/enforce-version: latest
- Default-deny NetworkPolicies; explicitly allow needed flows.
- Don't grant cluster-admin to apps; use minimal Roles.
- Pin images by digest in prod: image: ghcr.io/me/app@sha256:abc...
- Sign images (cosign) + admission control (kyverno / OPA Gatekeeper / Sigstore policy controller).
- Scan with trivy, grype, or docker scout.
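The default-deny item in the list above can be as small as the policy below: an empty podSelector matches every pod in the namespace, and listing both policy types with no rules blocks all ingress and egress. A sketch that writes it to a file; the file name is arbitrary, and enforcement assumes a CNI plugin that implements NetworkPolicy:

```shell
# Write a namespace-wide default-deny policy.
# Empty podSelector selects all pods; no ingress/egress rules means nothing is allowed.
policy_file=default-deny.yaml
cat > "$policy_file" <<'EOF'
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata: { name: default-deny }
spec:
  podSelector: {}
  policyTypes: [Ingress, Egress]
EOF

# Apply per namespace, then add allow rules such as web-allow on top:
# kubectl apply -n team-a -f "$policy_file"
```

Because egress is denied too, pods lose DNS until a rule like the kube-dns egress entry in the NetworkPolicy section is added.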
Troubleshooting
# Pod stuck Pending
kubectl describe pod <name> # Events: usually scheduling / image pull
kubectl get events -n <ns> --sort-by=.lastTimestamp
# CrashLoopBackOff
kubectl logs <pod> --previous
kubectl describe pod <pod> # exit code, reason
# ImagePullBackOff
# - wrong tag / private registry without imagePullSecret
kubectl get pod <pod> -o jsonpath='{.spec.imagePullSecrets[*].name}'
# Service has no endpoints
kubectl get endpoints <svc> # empty → selector mismatch
kubectl get pods -l <service-selector> -o wide
# DNS broken
kubectl run dbg --rm -it --image=alpine -- sh
# inside: nslookup kubernetes.default
# OOMKilled
# - bump memory limits, check JVM heap / Node --max-old-space-size
# "Unable to connect to the server"
kubectl config current-context
kubectl config view --minify
# Node problems
kubectl get nodes -o wide
kubectl describe node <node>
kubectl cordon <node> # mark unschedulable
kubectl drain <node> --ignore-daemonsets --delete-emptydir-data
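When `kubectl describe pod` shows an exit code for a crashed container, values above 128 conventionally mean the process died from a signal (128 + signal number). A small decoder sketch for the common cases; `explain_exit` is a hypothetical helper, and the messages are interpretations rather than kubectl output:

```shell
# explain_exit <exit_code>
# Interprets a container exit code using the 128 + signal convention.
explain_exit() {
  exit_code=$1
  if [ "$exit_code" -gt 128 ]; then
    exit_sig=$(( exit_code - 128 ))
    case "$exit_sig" in
      9)  echo "killed by SIGKILL (9): often OOMKilled or a forced kill" ;;
      15) echo "terminated by SIGTERM (15): usually a normal shutdown request" ;;
      *)  echo "terminated by signal $exit_sig" ;;
    esac
  elif [ "$exit_code" -eq 0 ]; then
    echo "exited cleanly"
  else
    echo "application error (exit code $exit_code)"
  fi
}

explain_exit 137
explain_exit 143
explain_exit 1
```

Exit code 137 together with reason OOMKilled in the pod status points at the memory limit; 137 without that reason suggests something else sent SIGKILL.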
Quick Reference
# Cluster
kubectl cluster-info
kubectl get nodes -o wide
kubectl version
# Resources
kubectl get pods,svc,deploy,ing,pvc -A
kubectl describe <resource> <name>
kubectl explain <resource>.spec
kubectl get pod -o yaml
kubectl get pod <pod> -o jsonpath='{.spec.containers[*].image}'
# Apply / delete
kubectl apply -f file.yaml
kubectl apply -k overlays/prod
kubectl delete -f file.yaml
# Edit / scale / rollout
kubectl edit deploy web
kubectl scale deploy web --replicas=5
kubectl rollout restart deploy web
kubectl rollout undo deploy web
# Debug
kubectl logs -f <pod> [-c <container>] [--previous]
kubectl exec -it <pod> -- sh
kubectl debug -it <pod> --image=nicolaka/netshoot --target=<container>
kubectl port-forward svc/web 8080:80
kubectl top pod
kubectl top node
kubectl get events --sort-by=.lastTimestamp
# Context / namespace
kubectx prod
kubens team-a
kubectl config use-context prod
kubectl config set-context --current --namespace=team-a
# Helm
helm upgrade --install rel chart -f values.yaml
helm rollback rel <rev>
# Kustomize
kubectl apply -k overlays/prod
Tip: in 2025+, use kubectx/kubens + k9s + stern day-to-day; manage releases with Helm or Kustomize (often both via Argo CD); enforce Pod Security Standards: restricted and default-deny NetworkPolicies; pin images by digest in prod.