Installing Kubernetes with Kubespray and Adding Worker Nodes (2024 Version)
A comprehensive guide to setting up Kubernetes using Kubespray on GCP

Overview
Kubespray is a powerful tool that combines the flexibility of Ansible with the robustness of Kubernetes, enabling the deployment of production-ready Kubernetes clusters on various infrastructures. This guide provides detailed instructions for installing a Kubernetes cluster using Kubespray on Google Cloud Platform (GCP) and demonstrates how to add worker nodes to scale your cluster.
Kubespray is a composition of Ansible playbooks, inventory, provisioning tools, and domain knowledge for deploying a production-ready Kubernetes cluster. It allows for highly customizable deployments and supports multiple cloud providers, bare metal installations, and virtualized environments.
Key features include:
- Composable deployment (pick only the components you need)
- Multiple network plugin support (Calico, Flannel, Cilium, etc.)
- HA cluster setup
- Configurable addons
- Support for most popular Linux distributions
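For orientation, the pieces of a Kubespray checkout you will touch most often look roughly like this (a simplified view; the exact layout varies by release):
kubespray/
├── cluster.yml          # deploy a full cluster
├── scale.yml            # add worker nodes to an existing cluster
├── remove-node.yml      # remove nodes from the cluster
├── upgrade-cluster.yml  # upgrade the Kubernetes version
├── reset.yml            # tear the cluster back down
├── inventory/sample/    # sample inventory to copy and customize
└── roles/               # Ansible roles containing the deployment logic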
System Configuration
Environment Details
Component | Specification |
---|---|
Operating System | Ubuntu 20.04 LTS (Focal) |
Cloud Provider | Google Compute Engine (GCP) |
Kubernetes Version | v1.28.6 (deployed by Kubespray) |
CNI Plugin | Calico (default) |
Container Runtime | containerd |
Python Version | 3.10.13 |
Node Specifications
Control Plane Node
- Hostname: test-server
- IP: 10.77.101.62
- CPU: 2 cores
- Memory: 8096MB
- Role: Control Plane + etcd
Worker Nodes
- Node 1: test-server-agent (10.77.101.57, 2 CPU, 8096MB RAM)
- Node 2: test-server-agent2 (10.77.101.200, 2 CPU, 8096MB RAM)
- Role: Worker nodes running application workloads
Infrastructure Setup
Infrastructure as Code (IaC)
We use Terraform to provision our GCP infrastructure. Here are the key resources:
Control Plane Node Configuration
resource "google_compute_address" "test_server_ip" {
name = var.test_server_ip
}
resource "google_compute_instance" "test_server" {
name = var.test_server
machine_type = "n2-standard-2"
zone = "${var.region}-a"
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-2004-lts"
size = 10
}
}
network_interface {
network = var.shared_vpc
subnetwork = "${var.subnet_share}-mgmt-a"
access_config {
nat_ip = google_compute_address.test_server_ip.address
}
}
# Recommended metadata for Kubernetes nodes
metadata = {
"startup-script" = <<-EOF
#!/bin/bash
swapoff -a
sed -i '/swap/d' /etc/fstab
# Set system parameters for Kubernetes
cat > /etc/sysctl.d/99-kubernetes.conf <<EOF2
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF2
sysctl --system
EOF
}
}
Worker Node Configuration
resource "google_compute_address" "test_server_agent_ip" {
name = var.test_server_agent_ip
}
resource "google_compute_instance" "test_server_agent" {
name = var.test_server_agent
machine_type = "n2-standard-2"
zone = "${var.region}-a"
boot_disk {
initialize_params {
image = "ubuntu-os-cloud/ubuntu-2004-lts"
size = 10
}
}
network_interface {
network = var.shared_vpc
subnetwork = "${var.subnet_share}-mgmt-a"
access_config {
nat_ip = google_compute_address.test_server_agent_ip.address
}
}
# Same startup script as control plane for Kubernetes prerequisites
metadata = {
"startup-script" = <<-EOF
#!/bin/bash
swapoff -a
sed -i '/swap/d' /etc/fstab
# Set system parameters for Kubernetes
cat > /etc/sysctl.d/99-kubernetes.conf <<EOF2
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
EOF2
sysctl --system
EOF
}
}
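With the variables referenced above (var.region, var.shared_vpc, var.subnet_share, and the instance name/address variables) defined in your Terraform configuration, provisioning follows the usual workflow:
# Initialize the working directory and providers
terraform init
# Review the planned resources (two instances and two static IPs)
terraform plan
# Create the infrastructure
terraform apply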
Prerequisites
Before starting the Kubespray installation, you need to prepare your environment.
System Requirements
- Operating System: Ubuntu 20.04 or other supported Linux distributions
- SSH Access: Between the deployment host and all nodes
- Python: Version 3.9+ (we’ll use 3.10)
- Ansible: Will be installed by the setup process
- Git: For cloning the Kubespray repository
- sudo privileges: On all nodes
Node Preparation
SSH Key Setup
# Generate SSH key if needed
ssh-keygen -t rsa -b 4096 -C "kubespray-deployment"
# Copy SSH key to all nodes
ssh-copy-id somaz@10.77.101.62 # Control plane
ssh-copy-id somaz@10.77.101.57 # Worker 1
ssh-copy-id somaz@10.77.101.200 # Worker 2
# Update /etc/hosts for easier node access
cat << EOF | sudo tee -a /etc/hosts
10.77.101.62 test-server
10.77.101.57 test-server-agent
10.77.101.200 test-server-agent2
EOF
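Before handing control to Ansible, a quick loop over the hosts (using the somaz user and the hostnames added to /etc/hosts above) confirms that passwordless SSH works:
# Each command should print the remote hostname without prompting for a password
for host in test-server test-server-agent test-server-agent2; do
  ssh -o BatchMode=yes somaz@"$host" hostname
done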
Package Installation
# Update package lists
sudo apt-get update
# Install Python 3.10
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt-get -y update
sudo apt install -y python3.10 python3-pip git python3.10-venv
# Verify Python version
python3.10 --version # Should show Python 3.10.13
System Configuration
# Disable swap (required for Kubernetes)
sudo swapoff -a
sudo sed -i '/swap/d' /etc/fstab
# Load required kernel modules
cat << EOF | sudo tee /etc/modules-load.d/k8s.conf
overlay
br_netfilter
EOF
sudo modprobe overlay
sudo modprobe br_netfilter
# Configure kernel parameters
cat << EOF | sudo tee /etc/sysctl.d/k8s.conf
net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1
EOF
sudo sysctl --system
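To confirm the modules and kernel parameters are active, check:
# overlay and br_netfilter should both be listed
lsmod | grep -E 'overlay|br_netfilter'
# Both values should print 1
sysctl net.bridge.bridge-nf-call-iptables net.ipv4.ip_forward
# Swap should report 0B in use
free -h | grep -i swap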
Kubespray Deployment Process
1. Clone Repository and Setup Environment
# Clone the Kubespray repository
git clone https://github.com/kubernetes-sigs/kubespray.git
# Setup virtual environment
VENVDIR=kubespray-venv
KUBESPRAYDIR=kubespray
python3.10 -m venv $VENVDIR
source $VENVDIR/bin/activate
cd $KUBESPRAYDIR
# Install dependencies
pip install -U -r requirements.txt
# Check Ansible version
ansible --version
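Cloning without a tag leaves you on the master branch; for reproducible installs it is usually safer to pin a release branch or tag. The branch below is only an example — pick whichever release ships the Kubernetes version you want:
# See which releases are available
git branch -r | grep release-
git tag --list 'v2.*' | tail -5
# Example: pin to a release branch (adjust to your target Kubernetes version)
git checkout release-2.24
# Re-install the Python dependencies matching that release
pip install -U -r requirements.txt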
2. Prepare Ansible Inventory
# Copy sample inventory
cp -rfp inventory/sample inventory/somaz-cluster
# Update inventory with nodes
declare -a IPS=(10.77.101.62 10.77.101.57)
CONFIG_FILE=inventory/somaz-cluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
3. Configure Inventory
The inventory builder writes a basic YAML inventory (hosts.yaml); this guide keeps the equivalent configuration in INI format (inventory.ini), which is the file passed to the playbooks below. Adjust it for our setup:
# inventory/somaz-cluster/inventory.ini
[all]
test-server ansible_host=10.77.101.62 ip=10.77.101.62
test-server-agent ansible_host=10.77.101.57 ip=10.77.101.57
# Control plane node(s)
[kube_control_plane]
test-server
# etcd cluster member(s)
[etcd]
test-server
# Kubernetes worker node(s)
[kube_node]
test-server-agent
# Calico route reflectors (leave empty unless you use dedicated route reflector nodes)
[calico_rr]

# All groups with assigned roles
[k8s_cluster:children]
kube_control_plane
kube_node
calico_rr
Advanced Inventory Configuration (Optional)
You can further customize your deployment by editing these additional files:
# Edit group variables for all nodes
vi inventory/somaz-cluster/group_vars/all/all.yml
# Customize Kubernetes-specific parameters
vi inventory/somaz-cluster/group_vars/k8s_cluster/k8s-cluster.yml
# Configure network plugin options
vi inventory/somaz-cluster/group_vars/k8s_cluster/k8s-net-*.yml
Common customizations:
- kube_version: Specify the Kubernetes version (default is 1.28.6 as of this writing)
- kube_network_plugin: Choose the network plugin (calico, flannel, cilium, etc.)
- etcd_deployment_type: etcd deployment method (host- or container-based)
- container_manager: Choose containerd or docker (containerd is recommended)
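As a minimal example, the overrides for this guide's setup would look like the following in k8s-cluster.yml (these are standard Kubespray variables; the values mirror the defaults used here):
# inventory/somaz-cluster/group_vars/k8s_cluster/k8s-cluster.yml (excerpt)
kube_version: v1.28.6
kube_network_plugin: calico
container_manager: containerd
etcd_deployment_type: host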
4. Verify Ansible Connectivity
# Test connection to all nodes
ansible all -i inventory/somaz-cluster/inventory.ini -m ping
# Optional: Update apt cache on all nodes
ansible all -i inventory/somaz-cluster/inventory.ini -m apt -a 'update_cache=yes' --become
5. Run Playbook
Now we’re ready to deploy the Kubernetes cluster using Kubespray’s Ansible playbooks.
# Deploy the cluster (this will take 15-30 minutes)
ansible-playbook -i inventory/somaz-cluster/inventory.ini cluster.yml --become
The cluster deployment can take 15-30 minutes depending on your internet connection speed and server performance. During the deployment process, Ansible will:
- Install container runtime (containerd by default)
- Deploy etcd cluster
- Install Kubernetes components (kubeadm, kubelet, kubectl)
- Initialize the Kubernetes control plane
- Join worker nodes to the cluster
- Deploy network plugins and add-ons
Be patient and monitor the output for any errors. Most issues can be resolved by looking at the Ansible error messages.
6. Configure kubectl
After successful deployment, you need to configure kubectl to interact with your new cluster.
# Create kubectl config directory
mkdir -p ~/.kube
# Copy admin configuration
sudo cp /etc/kubernetes/admin.conf ~/.kube/config
sudo chown $(id -u):$(id -g) ~/.kube/config
# Setup kubectl autocomplete (optional but recommended)
echo '# kubectl completion and alias' >> ~/.bashrc
echo 'source <(kubectl completion bash)' >> ~/.bashrc
echo 'alias k=kubectl' >> ~/.bashrc
echo 'complete -F __start_kubectl k' >> ~/.bashrc
source ~/.bashrc
# Test kubectl
kubectl get nodes
Adding Worker Nodes
One of the key advantages of Kubernetes is its scalability. Let’s add another worker node to our cluster.
1. Update Inventory
First, we need to update our Ansible inventory to include the new worker node.
# Add new node to IPS array
declare -a IPS=(10.77.101.62 10.77.101.57 10.77.101.200)
CONFIG_FILE=inventory/somaz-cluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
2. Modify inventory.ini
After running the inventory builder, verify that the new node has been added to the correct groups.
# inventory/somaz-cluster/inventory.ini
[all]
test-server ansible_host=10.77.101.62 ip=10.77.101.62
test-server-agent ansible_host=10.77.101.57 ip=10.77.101.57
test-server-agent2 ansible_host=10.77.101.200 ip=10.77.101.200
[kube_control_plane]
test-server
[etcd]
test-server
[kube_node]
test-server-agent
# Make sure the new node is in the kube_node group
test-server-agent2
3. Run Scale Playbook
Kubespray includes a dedicated scale.yml playbook for adding new worker nodes without affecting the existing cluster.
# Add new nodes to the cluster
ansible-playbook -i inventory/somaz-cluster/inventory.ini scale.yml --become
The scale.yml playbook operates only on the new nodes, which makes it much faster than running the full cluster.yml playbook. It installs the necessary Kubernetes components on the new node and joins it to the existing cluster.
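To be extra safe, the Kubespray documentation also recommends limiting the run to the node(s) being added so existing nodes are not touched at all:
# Restrict the scale run to the new worker only
ansible-playbook -i inventory/somaz-cluster/inventory.ini scale.yml --become --limit=test-server-agent2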
Scaling Down (Removing Nodes)
If you need to remove a node from the cluster, Kubespray also provides a removal playbook:
# To remove nodes, first update your inventory to reflect the desired state
# Then run the remove-node playbook
ansible-playbook -i inventory/somaz-cluster/inventory.ini remove-node.yml -e node=test-server-agent2 --become
Verification and Monitoring
Verify Cluster Status
# Check nodes
kubectl get nodes -o wide
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
test-server Ready control-plane 21m v1.28.6 10.77.101.62 <none> Ubuntu 20.04.6 LTS 5.15.0-1053-gcp containerd://1.7.1
test-server-agent Ready <none> 20m v1.28.6 10.77.101.57 <none> Ubuntu 20.04.6 LTS 5.15.0-1053-gcp containerd://1.7.1
test-server-agent2 Ready <none> 65s v1.28.6 10.77.101.200 <none> Ubuntu 20.04.6 LTS 5.15.0-1053-gcp containerd://1.7.1
# Check system namespace
kubectl get po -n kube-system
NAME READY STATUS RESTARTS AGE
calico-kube-controllers-76475c5546-vb58b 1/1 Running 0 20m
calico-node-hvx95 1/1 Running 0 20m
calico-node-lq4tg 1/1 Running 0 20m
calico-node-vftnr 1/1 Running 0 1m
coredns-77f7cc69db-ctvk6 1/1 Running 0 19m
coredns-77f7cc69db-h4bbx 1/1 Running 0 19m
dns-autoscaler-5b576d9b75-pvbwj 1/1 Running 0 19m
kube-apiserver-test-server 1/1 Running 0 21m
kube-controller-manager-test-server 1/1 Running 0 21m
kube-proxy-5n5tq 1/1 Running 0 20m
kube-proxy-lx25t 1/1 Running 0 1m
kube-proxy-s6x8h 1/1 Running 0 20m
kube-scheduler-test-server 1/1 Running 0 21m
kubernetes-dashboard-787dd78ffd-jl8bd 1/1 Running 0 19m
metrics-server-67df99fc7d-p8nzk 1/1 Running 0 19m
nginx-proxy-test-server-agent 1/1 Running 0 20m
nginx-proxy-test-server-agent2 1/1 Running 0 1m
nodelocaldns-fwl5r 1/1 Running 0 19m
nodelocaldns-hkvk2 1/1 Running 0 1m
nodelocaldns-szfj6 1/1 Running 0 19m
Monitoring Cluster Components
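Since metrics-server and the Kubernetes dashboard are deployed in this cluster (see the pod listing above), a few kubectl commands give a quick view of cluster health:
# Control plane endpoints
kubectl cluster-info
# Node and pod resource usage (served by metrics-server)
kubectl top nodes
kubectl top pods -n kube-system
# Recent cluster events, oldest first
kubectl get events -A --sort-by=.metadata.creationTimestamp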
Cluster Maintenance
Upgrading the Cluster
Kubespray makes it easy to upgrade your Kubernetes cluster to newer versions.
# Update Kubespray repository
cd $KUBESPRAYDIR
git fetch --all
git checkout <desired_version_tag>
# Update dependencies
pip install -U -r requirements.txt
# Update inventory parameters for the new version
# Edit inventory/somaz-cluster/group_vars/k8s_cluster/k8s-cluster.yml
# Set kube_version to the desired version
# Run the upgrade playbook
ansible-playbook -i inventory/somaz-cluster/inventory.ini upgrade-cluster.yml --become
Backup and Restore
It’s crucial to back up your etcd data regularly:
# Kubespray has no dedicated backup playbook at the time of writing; take an
# etcd snapshot directly on the etcd host (test-server) instead.
# The certificate paths below assume Kubespray's default host-deployed etcd
# layout under /etc/ssl/etcd/ssl.
sudo ETCDCTL_API=3 etcdctl snapshot save /tmp/etcd-backup.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/etc/ssl/etcd/ssl/ca.pem \
  --cert=/etc/ssl/etcd/ssl/admin-test-server.pem \
  --key=/etc/ssl/etcd/ssl/admin-test-server-key.pem
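You can sanity-check the snapshot afterwards; etcdctl snapshot status reports its hash, revision, and key count:
# Verify the snapshot taken above
sudo ETCDCTL_API=3 etcdctl snapshot status /tmp/etcd-backup.db --write-out=table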
Troubleshooting
Common Issues and Solutions
Issue | Cause | Solution |
---|---|---|
SSH connection failures | SSH keys not properly set up | Verify SSH keys with ssh-copy-id and test connections manually |
Python dependencies errors | Incompatible Python version or packages | Use the recommended Python version and ensure the virtual environment is activated |
Network plugin failures | Network configuration issues | Check node connectivity and firewall rules; verify pods in kube-system namespace |
Node NotReady status | kubelet not running properly | Check systemctl status kubelet and kubelet logs |
etcd cluster issues | etcd member communication problems | Verify etcd health with etcdctl endpoint health and check etcd logs |
Kubespray Debug Tips
# Run Ansible in verbose mode for detailed output
ansible-playbook -i inventory/somaz-cluster/inventory.ini cluster.yml --become -vvv
# Check logs on nodes
ansible all -i inventory/somaz-cluster/inventory.ini -m shell -a "journalctl -xeu kubelet" --become
# Reset the cluster to start fresh
ansible-playbook -i inventory/somaz-cluster/inventory.ini reset.yml --become
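When a node reports NotReady, describing it and checking the runtime and kubelet on that host usually narrows down the cause quickly (the hostname and IP below are the second worker from this guide):
# Conditions and recent events for the problem node
kubectl describe node test-server-agent2
# Inspect the container runtime and kubelet directly on that node
ssh somaz@10.77.101.200 'systemctl status containerd kubelet --no-pager'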
Advanced Configuration
Customizing the Deployment
Kubespray offers extensive customization options through the Ansible inventory. Here are some common customizations:
High Availability Setup
# For HA setup, add multiple control plane nodes in inventory
# Then in group_vars/all/all.yml
loadbalancer_apiserver:
  address: <VIP address>
  port: 6443
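A sketch of an HA inventory might look like this (the hostnames are illustrative; use an odd number of etcd members for quorum):
# Example HA layout in inventory.ini (hostnames are placeholders)
[kube_control_plane]
master1
master2
master3

[etcd]
master1
master2
master3

[kube_node]
worker1
worker2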
Custom Network Configuration
# In group_vars/k8s_cluster/k8s-net-calico.yml (for Calico)
calico_ipip_mode: "Always"
calico_vxlan_mode: "Never"
calico_network_backend: "bird"
# Pod CIDR customization
kube_pods_subnet: 10.233.64.0/18
# Service CIDR customization
kube_service_addresses: 10.233.0.0/18
Add-on Configuration
# In group_vars/k8s_cluster/addons.yml
dashboard_enabled: true
metrics_server_enabled: true
ingress_nginx_enabled: true
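Addon changes only take effect when the playbooks run again; re-running cluster.yml against the same inventory is the simplest way to apply them:
# Re-apply the cluster configuration, including addons
ansible-playbook -i inventory/somaz-cluster/inventory.ini cluster.yml --become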
Conclusion
You have successfully deployed a Kubernetes cluster using Kubespray and added a worker node to scale your infrastructure. This flexible deployment method allows you to create a production-ready Kubernetes environment on various infrastructures, including cloud providers like GCP and on-premises environments.
Kubespray strikes a balance between the simplicity of kubeadm and the flexibility of full custom deployments, making it an excellent choice for teams that need a customizable yet standardized Kubernetes setup.
Now that your Kubernetes cluster is up and running, consider:
- Setting up persistent storage with CSI drivers
- Implementing proper networking with an Ingress Controller
- Configuring monitoring with Prometheus and Grafana
- Establishing proper backup procedures for the cluster
- Setting up CI/CD pipelines for your applications