Kubernetes Resources - A Comprehensive Guide

Understanding when and how to use Deployments, StatefulSets, DaemonSets, Jobs, and CronJobs




Overview

Kubernetes provides several workload resource types, each built for a different operational pattern: long-running stateless services, stateful applications, per-node agents, and tasks that run to completion.

This guide explores the core workload resources that define how applications run in a Kubernetes cluster, highlighting their key characteristics, use cases, and configuration approaches.

graph TD
    A[Kubernetes Workload Resources] --> B[Deployment]
    A --> C[StatefulSet]
    A --> D[DaemonSet]
    A --> E[Job/CronJob]
    B --> F[ReplicaSet]
    F --> G[Pods]
    C --> G
    D --> G
    E --> G


Kubernetes Core Workload Resources

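The primary resource types we’ll examine in detail:

  • Deployments, together with the ReplicaSets and Pods they manage
  • StatefulSets and DaemonSets
  • Jobs and CronJobs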



Deployments, ReplicaSets, and Pods

Deployment Hierarchy

A Deployment manages ReplicaSets, which in turn manage Pods. This hierarchy provides layers of abstraction that enable powerful features like rolling updates, scaling, and self-healing.


1. Pods

Pods are the smallest deployable units in Kubernetes, serving as the basic building blocks.

apiVersion: v1
kind: Pod
metadata:
  name: nginx-pod
  labels:
    app: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.14.2
    ports:
    - containerPort: 80

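Key characteristics:

  • Hold one or more containers that share a network namespace (one IP address per Pod) and can share storage volumes
  • Are scheduled onto a single node as a unit
  • Are ephemeral: a Pod created directly, as in the manifest above, is not recreated if it is deleted or its node fails, which is why Pods are normally managed by a controller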


2. ReplicaSets

ReplicaSets ensure that a specified number of pod replicas are running at any given time.

apiVersion: apps/v1
kind: ReplicaSet
metadata:
  name: nginx-replicaset
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2

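Key characteristics:

  • Identify the Pods they own through the label selector (app: nginx in the manifest above)
  • Create or delete Pods until the number of matching Pods equals replicas
  • Have no update strategy of their own: changing the Pod template does not replace existing Pods, so ReplicaSets are normally created and managed through Deployments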


3. Deployments

Deployments are the recommended way to manage the creation and scaling of Pods. They provide declarative updates for Pods and ReplicaSets.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 1
      maxUnavailable: 0
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        resources:
          requests:
            cpu: 100m
            memory: 128Mi
          limits:
            cpu: 250m
            memory: 256Mi
        ports:
        - containerPort: 80
        livenessProbe:
          httpGet:
            path: /
            port: 80
          initialDelaySeconds: 3
          periodSeconds: 3
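
The manifest above defines only a liveness probe, which restarts a hung container. A readiness probe is what gates traffic and rollout progress: a Pod that is not ready is not counted as available and receives no Service traffic. A minimal sketch of a readiness probe for the same nginx container follows; the timing values are illustrative assumptions, not tuned recommendations.

```yaml
# Fragment of spec.template.spec.containers[0] in nginx-deployment.
# Timing values are illustrative assumptions.
readinessProbe:
  httpGet:
    path: /
    port: 80
  initialDelaySeconds: 2   # wait before the first check
  periodSeconds: 5         # check every 5 seconds
  failureThreshold: 3      # mark the Pod not ready after 3 consecutive failures
```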

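Key characteristics:

  • Manage ReplicaSets, which in turn manage Pods
  • Support declarative rolling updates and rollbacks; each change to the Pod template creates a new ReplicaSet revision
  • Scale horizontally by changing replicas
  • Suit stateless applications, where Pods are interchangeable and have no stable identity
  • In the manifest above, maxSurge: 1 with maxUnavailable: 0 means an update adds one new Pod at a time and removes an old Pod only after the new one is ready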


Rolling Updates with Deployments

Rolling updates replace Pods incrementally, so the application keeps serving traffic during an update as long as new Pods become ready before old ones are removed:

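# Update the image
kubectl set image deployment/nginx-deployment nginx=nginx:1.16.1

# Record why the change was made (the --record flag is deprecated)
kubectl annotate deployment/nginx-deployment kubernetes.io/change-cause="update nginx to 1.16.1"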

# Check the status of the rolling update
kubectl rollout status deployment/nginx-deployment

sequenceDiagram
    participant U as User
    participant A as API Server
    participant D as Deployment Controller
    participant RS as ReplicaSet Controller
    participant S as Scheduler
    participant N as Nodes
    U->>A: kubectl set image deployment/nginx
    A->>D: Update Deployment
    D->>A: Create new ReplicaSet
    D->>A: Scale up new ReplicaSet
    A->>RS: ReplicaSet Changes
    RS->>A: Create new Pods
    A->>S: Schedule Pods
    S->>N: Run Pods
    D->>A: Scale down old ReplicaSet
    A->>RS: ReplicaSet Changes
    RS->>A: Terminate old Pods

Rollbacks and History Management

# View rollout history
kubectl rollout history deployment/nginx-deployment

# Get details about a specific revision
kubectl rollout history deployment/nginx-deployment --revision=2

# Roll back to a previous revision
kubectl rollout undo deployment/nginx-deployment --to-revision=2

# Pause a rollout (for testing partial deployment)
kubectl rollout pause deployment/nginx-deployment

# Resume a rollout
kubectl rollout resume deployment/nginx-deployment
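
Rollbacks work by scaling an old ReplicaSet back up, so a revision is only available while its ReplicaSet is retained. The fragment below is a sketch of the two Deployment fields involved: revisionHistoryLimit, and the kubernetes.io/change-cause annotation that fills the CHANGE-CAUSE column of kubectl rollout history. The values shown are assumptions chosen for illustration.

```yaml
# Fragment of the nginx-deployment manifest; values are illustrative.
metadata:
  name: nginx-deployment
  annotations:
    # Shown in the CHANGE-CAUSE column of "kubectl rollout history"
    kubernetes.io/change-cause: "update nginx to 1.16.1"
spec:
  # Number of old ReplicaSets kept for rollback (Kubernetes defaults to 10)
  revisionHistoryLimit: 5
```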



StatefulSets and DaemonSets

These resource types address specific workload requirements that can’t be satisfied by Deployments alone.



1. DaemonSets

DaemonSets ensure that all (or some) nodes run a copy of a Pod, which makes them a good fit for node-level agents such as log collectors, monitoring daemons, and network plugins.


apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd-elasticsearch
  namespace: kube-system
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
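      tolerations:
      # Allow scheduling onto control-plane nodes. The "master" key is kept
      # for older clusters that still use that taint.
      - key: node-role.kubernetes.io/control-plane
        operator: Exists
        effect: NoSchedule
      - key: node-role.kubernetes.io/master
        operator: Exists
        effect: NoSchedule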
      containers:
      - name: fluentd-elasticsearch
        image: quay.io/fluentd_elasticsearch/fluentd:v2.5.2
        resources:
          limits:
            memory: 200Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
      volumes:
      - name: varlog
        hostPath:
          path: /var/log

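Key characteristics:

  • Run one Pod on every eligible node; Pods are added as nodes join the cluster and removed as nodes leave
  • Have no replicas field: the Pod count follows the node count
  • Can be limited to a subset of nodes with a node selector or node affinity
  • Need tolerations to run on tainted nodes, such as the control-plane tolerations in the manifest above

Example Use Cases:

  • Log collection agents, such as the Fluentd example above
  • Node monitoring agents
  • Cluster networking agents, such as calico-node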

DaemonSet Scheduling

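DaemonSet Pods are scheduled by the default scheduler. For each eligible node, the DaemonSet controller creates a Pod with a node affinity rule that pins it to that node. Placement is shaped by:

  • nodeSelector or node affinity in the Pod template, which limit the DaemonSet to matching nodes
  • Tolerations, which allow Pods onto tainted nodes; the controller also adds tolerations for conditions such as node.kubernetes.io/not-ready and node.kubernetes.io/unschedulable so that agents keep running on unhealthy or cordoned nodes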

# View DaemonSets running on the cluster
kubectl get daemonset -n kube-system

# Examine a specific DaemonSet (calico-node exists only on clusters that run Calico)
kubectl describe daemonset calico-node -n kube-system
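
To run a DaemonSet on only some nodes, add a nodeSelector to the Pod template. The sketch below assumes nodes have been labeled logging=enabled; that label is hypothetical and is not part of the manifest above. The updateStrategy values shown are the Kubernetes defaults, written out for clarity.

```yaml
# Fragment of the fluentd-elasticsearch DaemonSet spec.
# The label logging=enabled is a hypothetical example; apply it with
#   kubectl label node <node-name> logging=enabled
spec:
  template:
    spec:
      nodeSelector:
        logging: "enabled"   # only nodes carrying this label get a Pod
  updateStrategy:
    type: RollingUpdate      # default; replaces Pods node by node
    rollingUpdate:
      maxUnavailable: 1      # default; at most one node without the agent at a time
```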


2. StatefulSets

StatefulSets are specialized workload resources designed for stateful applications requiring stable network identities and persistent storage.


apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: postgres
spec:
  serviceName: "postgres"
  replicas: 3
  selector:
    matchLabels:
      app: postgres
  template:
    metadata:
      labels:
        app: postgres
    spec:
      containers:
      - name: postgres
        image: postgres:13
        ports:
        - containerPort: 5432
          name: postgredb
        env:
        - name: POSTGRES_PASSWORD
          valueFrom:
            secretKeyRef:
              name: postgres-secret
              key: password
        volumeMounts:
        - name: postgres-data
          mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
  - metadata:
      name: postgres-data
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "standard"
      resources:
        requests:
          storage: 10Gi
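
The serviceName field above refers to a headless Service, which must exist for the Pods to receive stable DNS names; the StatefulSet does not create it. A minimal sketch of that Service, assuming the default namespace and the labels used in the manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: postgres          # must match spec.serviceName in the StatefulSet
  labels:
    app: postgres
spec:
  clusterIP: None         # headless: DNS returns Pod addresses, no virtual IP
  selector:
    app: postgres
  ports:
  - name: postgredb
    port: 5432
```

With this Service in place, each Pod is reachable at a name of the form postgres-0.postgres.default.svc.cluster.local, assuming the default cluster domain cluster.local.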

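Key characteristics:

  • Stable, predictable Pod names with ordinal indexes (postgres-0, postgres-1, postgres-2 for the manifest above)
  • Stable network identity through a headless Service, named by serviceName
  • One PersistentVolumeClaim per Pod, created from volumeClaimTemplates and kept when the Pod is rescheduled
  • Ordered operations by default: Pods are created from the lowest ordinal up and terminated from the highest down

Example Use Cases:

  • Databases
  • Message queues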

StatefulSet Update Strategies

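  • RollingUpdate (default): updates Pods one at a time in reverse ordinal order, from the highest ordinal to the lowest
  • OnDelete: the controller replaces a Pod with the new template only after you delete it manually
  • Partition: a setting of the RollingUpdate strategy rather than a separate strategy; when rollingUpdate.partition is set, only Pods with an ordinal greater than or equal to the partition value are updated
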
# Scale a StatefulSet
kubectl scale statefulset postgres --replicas=5

# Get Persistent Volume Claims created by a StatefulSet
kubectl get pvc -l app=postgres
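
The partition setting is often used for staged rollouts. As a sketch, with the three-replica postgres StatefulSet above and partition: 2, a template change updates only postgres-2; postgres-0 and postgres-1 keep the old template until the partition value is lowered.

```yaml
# Fragment of the postgres StatefulSet spec.
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 2   # only Pods with ordinal >= 2 are updated
```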



Jobs and CronJobs

These resource types manage task execution and scheduling within the cluster.



1. Jobs

Jobs create one or more Pods and ensure that a specified number of them successfully terminate.

apiVersion: batch/v1
kind: Job
metadata:
  name: data-processor
spec:
  parallelism: 3
  completions: 5
  backoffLimit: 2
  activeDeadlineSeconds: 300
  template:
    spec:
      containers:
      - name: data-processor
        image: my-data-processor:v1
        command: ["python", "process_data.py"]
        resources:
          requests:
            memory: "128Mi"
            cpu: "100m"
          limits:
            memory: "256Mi"
            cpu: "200m"
      restartPolicy: OnFailure
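
The manifest above uses a fixed completion count: five successful Pods, at most three at a time. When each Pod should handle a distinct slice of the work, Indexed completion mode assigns every Pod an index from 0 to completions - 1 and exposes it in the JOB_COMPLETION_INDEX environment variable. A minimal sketch, using the busybox image and an echo command as stand-ins for a real worker; the Job name is hypothetical:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: indexed-processor        # hypothetical name
spec:
  completionMode: Indexed        # each Pod gets a unique completion index
  completions: 5                 # indexes 0 through 4
  parallelism: 3                 # at most three Pods run at once
  ttlSecondsAfterFinished: 600   # delete the Job 10 minutes after it finishes
  template:
    spec:
      restartPolicy: Never
      containers:
      - name: worker
        image: busybox:1.36
        # JOB_COMPLETION_INDEX is set by Kubernetes in Indexed mode
        command: ["sh", "-c", "echo processing shard $JOB_COMPLETION_INDEX"]
```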

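Key characteristics:

  • Run Pods to completion rather than keeping them running
  • completions sets how many Pods must finish successfully; parallelism caps how many run at once (5 and 3 in the manifest above)
  • backoffLimit bounds retries after failures, and activeDeadlineSeconds bounds the total run time of the Job
  • restartPolicy must be OnFailure or Never

Job Execution Patterns:

  • Single run: one Pod, with completions and parallelism left at their default of 1
  • Fixed completion count: completions set to N, with parallelism controlling how many Pods run concurrently
  • Work queue: parallelism set and completions left unset; Pods coordinate through an external queue, and the Job is complete once at least one Pod has succeeded and all Pods have terminated

Example Use Cases:

  • Database migrations
  • Batch data processing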


2. CronJobs

CronJobs create Jobs on a time-based schedule, executing recurring tasks at specified intervals.

apiVersion: batch/v1
kind: CronJob
metadata:
  name: database-backup
spec:
  schedule: "0 2 * * *"
  concurrencyPolicy: Forbid
  successfulJobsHistoryLimit: 3
  failedJobsHistoryLimit: 1
  startingDeadlineSeconds: 120
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: backup
            image: my-db-backup:v1
            command:
            - /bin/sh
            - -c
            - echo "Starting backup"; sleep 5; echo "Backup completed"
            env:
            - name: DB_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: db-creds
                  key: password
          restartPolicy: OnFailure

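Key characteristics:

  • Create a new Job from jobTemplate each time the schedule fires
  • Keep finished Jobs according to successfulJobsHistoryLimit and failedJobsHistoryLimit (3 and 1 in the manifest above)
  • startingDeadlineSeconds sets how late a missed run may still start; a run that misses the deadline is skipped
  • Interpret the schedule in the time zone of the kube-controller-manager unless spec.timeZone is set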

CronJob Schedule Format:

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6) (Sunday to Saturday)
# │ │ │ │ │
# * * * * *
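
By default the schedule is evaluated in the time zone of the kube-controller-manager. On clusters that support the spec.timeZone field (stable since Kubernetes v1.27), the zone can be pinned explicitly. A sketch for the database-backup CronJob above; the chosen zone is an assumption for illustration:

```yaml
# Fragment of the database-backup CronJob spec.
spec:
  timeZone: "Etc/UTC"     # IANA time zone name; illustrative choice
  schedule: "0 2 * * *"   # every day at 02:00 in the zone above
```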

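Concurrency Policies:

  • Allow (default): runs may overlap
  • Forbid: a new run is skipped while the previous one is still running
  • Replace: the running Job is cancelled and replaced by the new one

Example Use Cases:

  • Scheduled backups, such as the database-backup example above
  • Periodic reports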

# Check CronJob status
kubectl get cronjobs

# View the last time a Job was scheduled (the CronJob status does not report the next run time)
kubectl get cronjob database-backup -o jsonpath='{.status.lastScheduleTime}'

# Manually trigger a CronJob
kubectl create job --from=cronjob/database-backup backup-manual-trigger



Resource Comparison and Selection Guide

Choosing the Right Resource Type

Selecting the appropriate Kubernetes resource is crucial for your application's reliability, scalability, and maintainability. Consider your application's state requirements, scaling patterns, update strategies, and operational needs.

| 🔧 Feature | 🚀 Deployment | 📦 StatefulSet | 🛠️ DaemonSet | ⏳ Job/CronJob |
|---|---|---|---|---|
| Primary Use Case | Stateless applications | Stateful applications | Node-level operations | Batch/scheduled tasks |
| Scaling | Dynamic/Horizontal | Ordered, sequential | One per node | Parallelism control |
| Pod Identity | Ephemeral, random | Stable, predictable | Node-based | Ephemeral, random |
| Storage | Usually ephemeral | Persistent per Pod | Optional | Usually ephemeral |
| Update Strategy | Rolling update | Ordered, controlled | Rolling update | None (create a new Job) |
| Pod Termination | Any order | Ordered (high to low index) | Based on node removal | After completion |
| Network Identity | Service (load-balanced) | Headless service with DNS | Host network or standard | Optional |
| Example Workloads | Web servers, API services | Databases, message queues | Monitoring, logging agents | Batch processing, backups |
| Self-healing | Yes | Yes (maintains identity) | Yes (maintains node coverage) | Retries failed Pods up to backoffLimit |


Decision Flowchart

graph TD
    A[Start] --> B{Application Type?}
    B -->|Stateless| C[Deployment]
    B -->|Stateful| D[StatefulSet]
    B -->|System Service| E[DaemonSet]
    B -->|Batch/Task| F{Recurring?}
    F -->|Yes| G[CronJob]
    F -->|No| H[Job]
    C -->|"Examples: Web Servers, APIs"| I[Deployment + Service]
    D -->|"Examples: Databases, Message Queues"| J[StatefulSet + Headless Service]
    E -->|"Examples: Logging, Monitoring"| K[DaemonSet]
    G -->|"Examples: Scheduled Backups, Reports"| L[CronJob]
    H -->|"Examples: Migrations, Data Processing"| M[Job]


💡 Best Practices

General Resource Management

  • Define resource requests and limits to ensure proper scheduling and prevent resource contention
  • Use labels and annotations for better organization and integration with other tools
  • Set appropriate liveness and readiness probes to enhance reliability
  • Configure Pod Disruption Budgets (PDBs) for critical workloads
  • Use namespaces to organize and isolate resources
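
As a sketch of the Pod Disruption Budget recommendation, the manifest below protects the three-replica nginx-deployment from the earlier section during voluntary disruptions such as node drains. The name nginx-pdb and the minAvailable value are assumptions chosen for illustration.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: nginx-pdb          # hypothetical name
spec:
  minAvailable: 2          # keep at least 2 of the 3 replicas during drains
  selector:
    matchLabels:
      app: nginx           # matches the Pods of nginx-deployment
```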

Resource-Specific Recommendations

  • Deployments: Use the RollingUpdate strategy with appropriate maxSurge and maxUnavailable values
  • StatefulSets: Always use a headless service and configure proper volumeClaimTemplates
  • DaemonSets: Use tolerations to run on tainted nodes when necessary
  • Jobs: Set appropriate backoffLimit and activeDeadlineSeconds to handle failures
  • CronJobs: Choose the right concurrencyPolicy and set history limits to manage resource consumption


