Building a Kubernetes Cluster with Kubespray (2026 Edition)

A complete guide to deploying Kubernetes v1.34 with Kubespray v2.30.0, Cilium CNI, and insecure registry configuration




Overview

There are several ways to build a Kubernetes cluster, but if you want repeatable and automated installations in an on-premises environment, Kubespray is one of the most powerful options.

Kubespray is an Ansible-based provisioning tool that allows flexible installation of Kubernetes clusters — including HA configurations — across various environments (GCP, AWS, on-premises, etc.) by simply defining YAML-based inventory files.

This guide covers the following key areas:

- Kubespray setup and inventory configuration
- Cilium CNI and insecure (HTTP) registry configuration for containerd
- Cluster deployment and verification with kubectl
- Troubleshooting the Cilium /opt/cni/bin permission issue
- Adding and removing worker nodes with scale.yml and remove-node.yml
- Operational tips

System Configuration

All versions below are current as of April 7, 2026.


Environment Details

Component           Specification
Operating System    Ubuntu 24.04 LTS
Kubernetes Version  v1.34.3
Kubespray Version   v2.30.0
CNI Plugin          Cilium v1.19.1
Container Runtime   containerd v2.2.1
Cluster DNS         somaz-cluster.local


Node Specifications

Role                  Hostname        IP Address   CPU       Memory
Control Plane + etcd  k8s-control-01  10.10.10.17  4 cores   24GB
Worker                k8s-compute-01  10.10.10.18  12 cores  48GB
Worker                k8s-compute-02  10.10.10.19  12 cores  48GB
Worker                k8s-compute-03  10.10.10.22  12 cores  48GB

We’ll first build the cluster with k8s-compute-01 as the only worker, then add k8s-compute-02 and k8s-compute-03 later using scale.yml.


Prerequisites


SSH Key Setup

All nodes must be accessible via SSH without password authentication.

# Generate SSH key
ssh-keygen

# Update /etc/hosts for easier node access
sudo vi /etc/hosts
10.10.10.17 k8s-control-01
10.10.10.18 k8s-compute-01
10.10.10.19 k8s-compute-02
10.10.10.22 k8s-compute-03

# Copy SSH key to the initial nodes (later nodes get theirs in the scale-up step)
ssh-copy-id k8s-control-01
ssh-copy-id k8s-compute-01

# Verify SSH connections
ssh k8s-control-01
ssh k8s-compute-01
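
The per-node steps above can be driven from a single node list so nothing is missed as the cluster grows. A minimal sketch, with the node names taken from the table above; the ssh-copy-id call is commented out so the snippet is runnable anywhere:

```shell
# Keep the node list in one place and loop over it for per-node tasks
NODES="k8s-control-01 k8s-compute-01 k8s-compute-02 k8s-compute-03"

for node in $NODES; do
  echo "would copy key to $node"
  # ssh-copy-id "$node"   # uncomment to actually distribute the key
done
```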


Package Installation and Kubespray Setup

# Install Python 3.10 and required packages
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get -y update
sudo apt install -y python3.10 python3-pip git python3.10-venv

python3.10 --version

# Clone Kubespray repository
git clone https://github.com/kubernetes-sigs/kubespray.git
cd kubespray

# Checkout desired version
git checkout v2.30.0

# Copy sample inventory
cp -rfp inventory/sample inventory/somaz-cluster

# Create and activate virtual environment
python3.10 -m venv venv
source venv/bin/activate

# Install dependencies
pip install -U -r requirements.txt


Inventory Configuration


inventory.ini

inventory/somaz-cluster/inventory.ini

We’ll initially include only k8s-compute-01. k8s-compute-02 will be added later using scale.yml.

[kube_control_plane]
k8s-control-01 ansible_host=10.10.10.17 ip=10.10.10.17 etcd_member_name=etcd1

[etcd:children]
kube_control_plane

[kube_node]
k8s-compute-01
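
If you prefer YAML over INI, Ansible also accepts a hosts.yaml inventory. A sketch of the same initial inventory in that form (structure follows Ansible's YAML inventory format; verify the group names against your Kubespray checkout):

```yaml
all:
  hosts:
    k8s-control-01:
      ansible_host: 10.10.10.17
      ip: 10.10.10.17
      etcd_member_name: etcd1
    k8s-compute-01:
  children:
    kube_control_plane:
      hosts:
        k8s-control-01:
    etcd:
      children:
        kube_control_plane:
    kube_node:
      hosts:
        k8s-compute-01:
```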


k8s-cluster.yml

inventory/somaz-cluster/group_vars/k8s_cluster/k8s-cluster.yml

# CNI plugin configuration
kube_network_plugin: cilium

# Cluster domain
cluster_name: somaz-cluster.local

# When using Cilium, kube_owner must be set to root
# https://kubespray.io/#/docs/CNI/cilium?id=unprivileged-agent-configuration
kube_owner: root

Important: Cilium Permission Issue

If kube_owner: root is not set, Cilium will encounter a /opt/cni/bin permission error. See: Cilium Issue #23838


addons.yml

inventory/somaz-cluster/group_vars/k8s_cluster/addons.yml

# Enable Helm
helm_enabled: true

# Enable Metrics Server
metrics_server_enabled: true

# Enable Krew (kubectl plugin manager)
krew_enabled: true
krew_root_dir: "/usr/local/krew"
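
Other addons are toggled in the same file. The flag names below come from the Kubespray sample addons.yml; confirm they exist in your checked-out version before relying on them:

```yaml
# Left disabled here; enable as needed for your environment
ingress_nginx_enabled: false
cert_manager_enabled: false
metallb_enabled: false
local_path_provisioner_enabled: false
```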


containerd.yml — Insecure Registry Configuration

inventory/somaz-cluster/group_vars/all/containerd.yml

When using a private registry (e.g., Harbor) over HTTP, you need to add insecure registry settings to containerd. Defining this in the Kubespray inventory ensures it is automatically applied during node additions and upgrades.

# Harbor insecure registry configuration
containerd_registries_mirrors:
  - prefix: harbor.example.com
    mirrors:
      - host: http://harbor.example.com
        capabilities: ["pull", "resolve", "push"]
        skip_verify: true
        plain_http: true

# Harbor authentication
containerd_registry_auth:
  - registry: harbor.example.com
    username: admin
    password: your-password

Once applied, the file /etc/containerd/certs.d/harbor.example.com/hosts.toml is automatically generated on each node.
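
For reference, the generated file looks roughly like the following. This sketch writes an illustrative copy to a temp directory so you can compare it with the file on a node; the exact layout can vary between containerd and Kubespray versions:

```shell
# Write an illustrative hosts.toml (approximation of what Kubespray renders)
dir=$(mktemp -d)
cat > "$dir/hosts.toml" <<'EOF'
server = "http://harbor.example.com"

[host."http://harbor.example.com"]
  capabilities = ["pull", "resolve", "push"]
  skip_verify = true
EOF
cat "$dir/hosts.toml"
```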

Best Practices for Insecure Registry

Keep the registry settings in the inventory and verify what Kubespray rendered on each node:

# Check the generated hosts.toml on a node
ssh somaz@<node-ip> "cat /etc/containerd/certs.d/harbor.example.com/hosts.toml"

# Re-apply only the containerd configuration after changing the inventory
ansible-playbook -i inventory/somaz-cluster/inventory.ini cluster.yml --tags containerd -b --become-user=root


Cluster Deployment


Verify Ansible Connectivity

Before deployment, verify that all nodes are reachable via Ansible.

# Must run from ~/kubespray directory
ansible all -i inventory/somaz-cluster/inventory.ini -m ping


Run the Playbook

# Foreground execution
ansible-playbook -i inventory/somaz-cluster/inventory.ini cluster.yml --become

# Background execution (recommended for long-running deployments)
nohup ansible-playbook -i inventory/somaz-cluster/inventory.ini cluster.yml --become &

# Monitor logs
tail -f nohup.out


Configure kubectl

After deployment, set up kubectl on the Control Plane node.

# Copy kubeconfig
mkdir -p ~/.kube
sudo cp /etc/kubernetes/admin.conf ~/.kube/config
sudo chown $USER:$USER ~/.kube/config

# Set up completion and aliases
echo '# kubectl completion and alias' >> ~/.bashrc
echo 'source <(kubectl completion bash)' >> ~/.bashrc
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc


Verify Installation

k get nodes
NAME             STATUS   ROLES           AGE    VERSION
k8s-compute-01   Ready    <none>          2d4h   v1.34.3
k8s-control-01   Ready    control-plane   2d4h   v1.34.3

k version
Client Version: v1.34.3
Kustomize Version: v5.6.0
Server Version: v1.34.3

containerd --version
containerd github.com/containerd/containerd/v2 v2.2.1

# Check Cilium version
k get ds cilium -n kube-system -o=jsonpath='{.spec.template.spec.containers[0].image}'
quay.io/cilium/cilium:v1.19.1


Troubleshooting: Cilium Permission Error

When using Cilium as the CNI, permission issues with the /opt/cni/bin directory can cause errors.


Symptoms

Pods get stuck in Init:CrashLoopBackOff state, or Cilium agent logs show permission-related errors.


Fix 1: Manual Permission Change

# Run on each affected node
sudo chown -R root:root /opt/cni/bin
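
To check the current ownership before and after the change, the following sketch works on any node (the temp-dir fallback only exists to keep the snippet runnable off-cluster):

```shell
# Check who owns the CNI binary dir; on a Cilium node this should be root:root
dir=/opt/cni/bin
[ -d "$dir" ] || dir=$(mktemp -d)      # fallback so the check runs anywhere
owner=$(stat -c '%U:%G' "$dir")
echo "$dir is owned by $owner"
```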


Fix 2: Set kube_owner in the Inventory (Permanent)

inventory/somaz-cluster/group_vars/k8s_cluster/k8s-cluster.yml

# Default
# kube_owner: kube

# Change to root when using Cilium
kube_owner: root

Note

Starting from Kubespray v2.30.0, the official comment # Note: cilium needs to set kube_owner to root has been added. See: Cilium Issue #23838


Adding Worker Nodes (Scale Up)


Step 1. Preparation

# Add new node to /etc/hosts
sudo vi /etc/hosts
10.10.10.19 k8s-compute-02

# Copy SSH key
ssh-copy-id k8s-compute-02


Step 2. Update inventory.ini

[kube_control_plane]
k8s-control-01 ansible_host=10.10.10.17 ip=10.10.10.17 etcd_member_name=etcd1

[etcd:children]
kube_control_plane

[kube_node]
k8s-compute-01
k8s-compute-02    # New node added

[add_node]
k8s-compute-02    # Target for scale.yml


Step 3. Gather Facts (Critical!)

You must gather facts from existing nodes first. Skipping this step will cause an ipwrap error.

ansible-playbook -i inventory/somaz-cluster/inventory.ini playbooks/facts.yml --become

Warning

If you skip this step, you will encounter the following error:

AnsibleFilterError: Unrecognized type for ipwrap filter


Step 4. Run scale.yml

# Run scale.yml targeting only the new node
nohup ansible-playbook -i inventory/somaz-cluster/inventory.ini scale.yml --limit add_node --become &

# Monitor logs
tail -f nohup.out


Step 5. Verify and Clean Up

k get nodes
NAME             STATUS   ROLES           AGE    VERSION
k8s-compute-01   Ready    <none>          2d5h   v1.34.3
k8s-compute-02   Ready    <none>          1m     v1.34.3
k8s-control-01   Ready    control-plane   2d5h   v1.34.3

After verification, comment out or remove the [add_node] section:

# [add_node]
# k8s-compute-02  # Already added


Removing Worker Nodes (Scale Down)

cd ~/kubespray
source venv/bin/activate

ansible-playbook -i inventory/somaz-cluster/inventory.ini remove-node.yml -b --extra-vars='node=k8s-compute-02' --extra-vars reset_nodes=true

After completion, remove the node from inventory.ini.


Operational Tips


Inventory Local Backup

Keeping a local copy of the inventory stored on the Control Plane makes recovery easier if that server fails.

scp -r somaz@10.10.10.17:~/kubespray/inventory/somaz-cluster ~/my-project/kubespray/inventory-somaz-cluster
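
A timestamped variant of the same idea can keep multiple backup generations. The demo below operates on a temp copy so it is runnable anywhere; point src at the directory pulled by the scp above:

```shell
# Create a timestamped tarball of the inventory backup directory
src=$(mktemp -d)/inventory-somaz-cluster          # stand-in for the real path
mkdir -p "$src" && echo "[kube_node]" > "$src/inventory.ini"
ts=$(date +%Y%m%d-%H%M%S)
out=$(mktemp -d)/inventory-somaz-cluster-$ts.tgz
tar -czf "$out" -C "$(dirname "$src")" "$(basename "$src")"
tar -tzf "$out"                                   # list archive contents
```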


Version Check Script

Create a script for quick cluster status verification.

#!/bin/bash
echo "=== K8s Version ==="
kubectl get nodes -o wide

echo ""
echo "=== Kubespray Version ==="
cd ~/kubespray && git describe --tags

echo ""
echo "=== Containerd Version ==="
containerd --version

echo ""
echo "=== Cilium Version ==="
kubectl get ds cilium -n kube-system -o=jsonpath='{.spec.template.spec.containers[0].image}'
echo ""

echo ""
echo "=== Certificate Expiration ==="
sudo kubeadm certs check-expiration 2>/dev/null | head -15


Important Notes Summary

Item               Description
venv               Always ensure Kubespray's Python venv is activated (source venv/bin/activate)
Path               All ansible-playbook commands must be run from the ~/kubespray directory
facts.yml          Must be run before adding nodes (prevents ipwrap error)
kube_owner         Must be set to root when using Cilium
Insecure Registry  Manage through inventory instead of direct server modification
etcd Backup        Always create a snapshot before upgrades
StatefulSet        Verify data before draining nodes
Replicas           Maintain at least 2 Deployment replicas for zero-downtime services
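
For the etcd backup item, here is a command sketch. The cert paths are Kubespray's usual defaults under /etc/ssl/etcd/ssl/, but verify the filenames on your etcd node before running; the snippet only prints the command rather than executing it:

```shell
# Build and print the snapshot command; run the printed line on the etcd node
SNAP=/var/backups/etcd-$(date +%F).db
CMD="etcdctl snapshot save $SNAP \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/member-$(hostname).pem \
  --key=/etc/ssl/etcd/ssl/member-$(hostname)-key.pem"
echo "ETCDCTL_API=3 sudo -E $CMD"
```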


Conclusion

Kubespray goes beyond simple installation automation to enable production-level Kubernetes cluster configuration. It’s particularly useful in the following scenarios:

- Repeatable, automated installations in on-premises environments
- HA clusters that must be rebuilt or scaled consistently
- Environments where node additions and upgrades should stay declarative in the inventory

Establishing the flow of Ansible-based automation → cluster construction → GitOps-based app deployment can elevate operational automation to the next level.

The next post will cover Kubernetes cluster upgrades using Kubespray.


