MetalLB - Load Balancer for Bare Metal Kubernetes Clusters

A comprehensive guide to implementing MetalLB in bare metal environments


Overview

In cloud environments (AWS, GCP, Azure), Kubernetes can leverage built-in cloud load balancers when you create a Service of type LoadBalancer. However, in bare metal environments, these cloud resources aren’t available, leaving the LoadBalancer service in a perpetual “pending” state.

MetalLB solves this problem by providing a network load balancer implementation for Kubernetes clusters that don’t run on a cloud provider, effectively enabling the LoadBalancer service type in any environment.

The LoadBalancer Problem

When you create a LoadBalancer service in Kubernetes, the cluster typically requests an external load balancer from the cloud provider. In bare metal environments, this request can't be fulfilled by default, so the service's EXTERNAL-IP remains stuck in a perpetual pending state:

$ kubectl get svc
NAME          TYPE           CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes    ClusterIP      10.96.0.1        <none>        443/TCP        10h
myapp         LoadBalancer   10.106.247.153   <pending>     80:31122/TCP   10m

MetalLB addresses this limitation by providing its own load balancer implementation that assigns real external IPs to these services.


graph TD
    A[Client] -->|Access Service| B{Cloud Environment?}
    B -->|Yes| C[Cloud Load Balancer]
    B -->|No| D[MetalLB]
    C -->|External IP| E[Kubernetes Service]
    D -->|External IP| E
    E -->|Routes Traffic| F[Pods]
    style A fill:#f9f9f9,stroke:#333,stroke-width:1px
    style B fill:#f5f5f5,stroke:#333,stroke-width:1px
    style C fill:#e1f5fe,stroke:#333,stroke-width:1px
    style D fill:#fff9c4,stroke:#333,stroke-width:1px,color:#d84315
    style E fill:#e8f5e9,stroke:#333,stroke-width:1px
    style F fill:#e8f5e9,stroke:#333,stroke-width:1px


MetalLB Architecture

MetalLB consists of two main components that work together to provide load balancing functionality:

  1. Controller: A deployment that assigns IP addresses to LoadBalancer services
  2. Speaker: A DaemonSet that advertises the services using either Layer 2 (ARP/NDP) or BGP

graph TD
    A[MetalLB Controller] -->|Watches| B[Kubernetes API]
    B -->|Service Events| A
    A -->|Allocates IPs from| C[Address Pools]
    A -->|Configures| D[MetalLB Speaker]
    D -->|Runs on| E[Each Node]
    D -->|Advertises Routes via| F{Mode?}
    F -->|Layer 2| G[ARP/NDP]
    F -->|BGP| H[BGP Protocol]
    style A fill:#bbdefb,stroke:#333,stroke-width:1px
    style B fill:#c8e6c9,stroke:#333,stroke-width:1px
    style C fill:#ffecb3,stroke:#333,stroke-width:1px
    style D fill:#bbdefb,stroke:#333,stroke-width:1px
    style E fill:#e1bee7,stroke:#333,stroke-width:1px
    style F fill:#f5f5f5,stroke:#333,stroke-width:1px
    style G fill:#ffccbc,stroke:#333,stroke-width:1px
    style H fill:#b3e5fc,stroke:#333,stroke-width:1px
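
After installation you can verify both components (a quick sanity check; exact resource names can differ slightly between the Helm chart and the manifest install):

# The controller runs as a Deployment; the speaker runs as a DaemonSet on every node
kubectl get deployments,daemonsets -n metallb-system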

How It Works

  1. When a LoadBalancer service is created, the MetalLB controller assigns an IP address from a configured address pool
  2. The speaker pods (running on each node) advertise the IP address using either Layer 2 or BGP mode
  3. External traffic can now reach the service using the assigned external IP
  4. Once traffic reaches a node, standard Kubernetes service routing directs it to the appropriate pods
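
You can watch this sequence happen: the controller records the allocation on the service, and in Layer 2 mode the speaker attaches an event naming the announcing node (exact event wording varies by MetalLB version):

# Watch the external IP move from <pending> to an address from the pool
kubectl get svc myapp -w

# Inspect the service's events to see the allocation and the announcing node
kubectl describe svc myapp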


Operating Modes

MetalLB supports two operating modes, each with distinct characteristics and use cases:

Layer 2 Mode (ARP/NDP)

In Layer 2 mode, one speaker pod becomes the “leader” for each service and responds to ARP requests for the service’s IP address, making the node appear to be the machine that owns the service IP.

sequenceDiagram
    participant Client
    participant Router
    participant Node1 as Kubernetes Node 1 (Leader)
    participant Node2 as Kubernetes Node 2
    Note over Node1,Node2: MetalLB speakers elect a leader
    Client->>Router: Who has IP 192.168.1.240?
    Router->>Node1: Who has IP 192.168.1.240?
    Router->>Node2: Who has IP 192.168.1.240?
    Node1->>Router: I have 192.168.1.240 (MAC: AA:BB:CC:DD:EE:FF)
    Router->>Client: 192.168.1.240 is at AA:BB:CC:DD:EE:FF
    Client->>Node1: Traffic to 192.168.1.240

Key Features of Layer 2 Mode:
  • Simplicity: No special network equipment needed
  • Compatibility: Works with any network configuration
  • Automatic Failover: If the leader node fails, another node takes over
  • Session Persistence: All traffic for a service goes to a single node
  • Standard Protocols: Uses ARP/NDP, which virtually all network equipment understands
Limitations:
  • Single Node Traffic: All external traffic for a service flows through a single node, which can become a bottleneck
  • Disruptions During Failover: Brief connectivity disruption during leader failover
  • No Traffic Distribution: Cannot distribute load across multiple entry points

BGP Mode

In BGP mode, each speaker pod establishes BGP peering sessions with configured routers and advertises routes for service IPs.

sequenceDiagram
    participant Client
    participant Router as BGP Router
    participant Node1 as Kubernetes Node 1
    participant Node2 as Kubernetes Node 2
    Note over Node1,Node2: MetalLB speakers establish BGP sessions
    Node1->>Router: BGP: I can route to 192.168.1.240
    Node2->>Router: BGP: I can route to 192.168.1.240
    Note over Router: Router uses ECMP to distribute traffic
    Client->>Router: Traffic to 192.168.1.240
    Router->>Node1: Some traffic to 192.168.1.240
    Router->>Node2: Some traffic to 192.168.1.240

Key Features of BGP Mode:
  • True Load Balancing: Using ECMP (Equal-Cost Multi-Path), routers can distribute traffic across multiple nodes
  • Scalability: Better performance for high-traffic services
  • No Disruption on Node Failure: Traffic to remaining nodes continues uninterrupted
  • Advanced Routing: Supports traffic engineering via BGP communities and local preferences
  • Efficient Failover: Near-instant recovery from node failures
Requirements:
  • BGP-capable Router: Requires routers that support BGP and ideally ECMP
  • Network Expertise: More complex to set up and troubleshoot
  • Configuration: Requires BGP peering setup on both the MetalLB and router sides (a router-side sketch follows below)
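
For illustration, here is a minimal sketch of the router side of such a peering, assuming an FRR-based router and the ASNs used in this guide's examples (64500 for MetalLB, 64501 for the router); the neighbor address is a placeholder for a cluster node:

! /etc/frr/frr.conf on the router (sketch only; adapt to your platform)
router bgp 64501
 ! one neighbor statement per cluster node running a speaker
 neighbor 192.168.0.100 remote-as 64500
 address-family ipv4 unicast
  neighbor 192.168.0.100 activate
 exit-address-family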

Mode Comparison

| Feature | Layer 2 Mode | BGP Mode |
|---------|--------------|----------|
| Setup Complexity | Low - minimal configuration | Medium to High - requires BGP router configuration |
| Network Requirements | Standard Ethernet network | BGP-capable routers |
| Traffic Distribution | Single node handles all traffic for a service | Traffic can be distributed across multiple nodes |
| Failover Speed | Seconds (requires ARP cache refresh) | Milliseconds to seconds (BGP reconvergence) |
| Performance Ceiling | Limited by single node capacity | Can scale across multiple nodes |
| Protocol Used | ARP (IPv4) / NDP (IPv6) | Border Gateway Protocol |


Installation and Configuration

Prerequisites

For optimal operation when running kube-proxy in IPVS mode, enable strictARP. Without it, every node responds to ARP requests for the virtual IPs kube-proxy manages, which interferes with MetalLB's Layer 2 announcements.
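
One way to enable it, following the approach shown in MetalLB's documentation, is to patch the kube-proxy ConfigMap in place (preview the diff before applying):

# Preview the change (exits nonzero if a diff exists)
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl diff -f - -n kube-system

# Apply the change
kubectl get configmap kube-proxy -n kube-system -o yaml | \
sed -e "s/strictARP: false/strictARP: true/" | \
kubectl apply -f - -n kube-system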



Installation Methods

Method 1: Install with Helm

# Add MetalLB helm repo
helm repo add metallb https://metallb.github.io/metallb

# Update helm repositories
helm repo update

# Install MetalLB
helm install metallb metallb/metallb -n metallb-system --create-namespace

# Verify the installation
kubectl get pods -n metallb-system

Method 2: Install with Manifests

# Install MetalLB components
kubectl apply -f https://raw.githubusercontent.com/metallb/metallb/main/config/manifests/metallb-native.yaml

# Verify the installation
kubectl get pods -n metallb-system

Configuration

MetalLB requires two key configurations:

  1. IPAddressPool: Defines the IP ranges that MetalLB can assign
  2. L2Advertisement or BGPAdvertisement: Defines how these IPs are advertised

Layer 2 Configuration Example

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: first-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.240-192.168.1.250
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: layer2-config
  namespace: metallb-system
spec:
  ipAddressPools:
  - first-pool

BGP Configuration Example

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: bgp-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.0/24
---
apiVersion: metallb.io/v1beta1
kind: BGPPeer
metadata:
  name: router-peer
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: 192.168.0.1
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: bgp-advert
  namespace: metallb-system
spec:
  ipAddressPools:
  - bgp-pool

Complete Configuration Example

Here’s a complete example with address pools and both Layer 2 and BGP configuration:

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: production-public-ips
  namespace: metallb-system
spec:
  addresses:
  - 192.168.1.240-192.168.1.250
  - 192.168.10.0/24
  autoAssign: true
  avoidBuggyIPs: true
---
apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: layer2-prod
  namespace: metallb-system
spec:
  ipAddressPools:
  - production-public-ips
---
apiVersion: metallb.io/v1beta1
kind: BGPPeer
metadata:
  name: router1
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: 192.168.0.1
  holdTime: 120s
  keepaliveTime: 30s
  routerID: 192.168.0.100
  nodeSelectors:
  - matchLabels:
      kubernetes.io/hostname: worker-1
---
apiVersion: metallb.io/v1beta1
kind: BGPPeer
metadata:
  name: router2
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: 192.168.0.2
  holdTime: 120s
  keepaliveTime: 30s
  routerID: 192.168.0.100
  nodeSelectors:
  - matchLabels:
      kubernetes.io/hostname: worker-2
---
apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: bgp-prod
  namespace: metallb-system
spec:
  ipAddressPools:
  - production-public-ips
  communities:
  - 64500:100
  localPref: 100
  aggregationLength: 32

Applying and Verifying Configuration

# Apply configuration
kubectl apply -f metallb-config.yaml

# Verify IPAddressPool configuration
kubectl describe ipaddresspool -n metallb-system

# Verify L2Advertisement configuration
kubectl describe l2advertisement -n metallb-system

# For BGP mode, verify BGP configuration
kubectl describe bgppeer -n metallb-system
kubectl describe bgpadvertisement -n metallb-system


Testing MetalLB

Deploy a Test Application

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-test
spec:
  selector:
    matchLabels:
      app: nginx-test
  template:
    metadata:
      labels:
        app: nginx-test
    spec:
      containers:
      - name: nginx
        image: nginx:1.21
        ports:
        - name: http
          containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-test
spec:
  ports:
  - name: http
    port: 80
    protocol: TCP
    targetPort: 80
  selector:
    app: nginx-test
  type: LoadBalancer

Verify External IP Assignment

# Check if the service received an external IP
kubectl get svc nginx-test

# Expected output:
# NAME         TYPE           CLUSTER-IP      EXTERNAL-IP      PORT(S)        AGE
# nginx-test   LoadBalancer   10.43.162.209   192.168.1.240    80:31621/TCP   10s

# Test connectivity
curl http://192.168.1.240


Advanced Configuration

Address Pool Options

apiVersion: metallb.io/v1beta1
kind: IPAddressPool
metadata:
  name: advanced-pool
  namespace: metallb-system
spec:
  addresses:
  - 192.168.10.0/24
  - 192.168.20.10-192.168.20.50
  autoAssign: true  # Whether to automatically assign from this pool (default: true)
  avoidBuggyIPs: true  # Avoid .0 and .255 addresses which can cause problems

BGP Options

apiVersion: metallb.io/v1beta1
kind: BGPAdvertisement
metadata:
  name: advanced-bgp
  namespace: metallb-system
spec:
  ipAddressPools:
  - advanced-pool
  aggregationLength: 32  # Advertise individual /32 routes
  localPref: 100  # BGP local preference
  communities:  # BGP communities to attach
  - 65535:65282  # well-known no-advertise community
  - 64500:100    # Custom community

BFD (Bidirectional Forwarding Detection) for Faster Failover

For faster BGP failover, MetalLB supports BFD (available in newer versions when using the FRR backend):

apiVersion: metallb.io/v1beta1
kind: BFDProfile
metadata:
  name: fast-failover
  namespace: metallb-system
spec:
  receiveInterval: 300ms
  transmitInterval: 300ms
  detectMultiplier: 3
  echoInterval: 50ms
  echoMode: true
  passiveMode: false
---
apiVersion: metallb.io/v1beta1
kind: BGPPeer
metadata:
  name: router-with-bfd
  namespace: metallb-system
spec:
  myASN: 64500
  peerASN: 64501
  peerAddress: 192.168.0.1
  bfdProfile: fast-failover


Troubleshooting

Common Issues

LoadBalancer service stuck in pending
  • Check if MetalLB pods are running: kubectl get pods -n metallb-system
  • Verify address pool configuration: kubectl describe ipaddresspool -n metallb-system
  • Check for configuration errors in logs: kubectl logs -n metallb-system -l app=metallb,component=controller
  • Ensure the IP range in the pool is available in your network
Cannot reach service external IP
  • For Layer 2 mode, ensure the IP is in the same subnet as your nodes
  • Check speaker logs: kubectl logs -n metallb-system -l app=metallb,component=speaker
  • Verify network connectivity between client and Kubernetes nodes
  • Test if firewall rules are blocking traffic
BGP issues
  • Verify BGP peering is established: kubectl logs -n metallb-system -l app=metallb,component=speaker | grep "BGP session established"
  • Check BGPPeer configuration: kubectl describe bgppeer -n metallb-system
  • Confirm ASN numbers and peer addresses are correct
  • Verify router configuration matches MetalLB settings

Validating Configuration

# Check controller and speaker pod status
kubectl get pods -n metallb-system

# Check for events related to MetalLB
kubectl get events -n metallb-system

# Check controller logs
kubectl logs -n metallb-system -l app=metallb,component=controller

# Check speaker logs
kubectl logs -n metallb-system -l app=metallb,component=speaker

# View address assignments
kubectl get services --all-namespaces -o wide | grep LoadBalancer

Debugging Commands

# For Layer 2 mode, check ARP table on a client machine
arp -a

# For BGP mode, check BGP sessions on the router
# Example for a Cisco router:
# show ip bgp summary
# show ip bgp neighbors

# Test connectivity to assigned IPs
ping <external-ip>
curl http://<external-ip>

# Check which node is currently announcing a service's IP (Layer 2 mode)
kubectl describe svc <service-name> | grep -i announcing


Best Practices

Security Considerations

  1. Network Policies: Implement Kubernetes network policies to control traffic to your exposed services (a minimal example follows this list)
  2. Separate Network: Use a dedicated IP range for MetalLB that’s separate from your regular infrastructure
  3. RBAC: Ensure proper RBAC permissions for MetalLB components
  4. BGP Authentication: For BGP mode, implement MD5 authentication with your BGP peers when possible
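
As an illustration of the first point, a minimal NetworkPolicy admitting only HTTP traffic to the nginx-test pods used earlier might look like this (the selector and port are assumptions to adapt):

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-http-nginx-test
spec:
  podSelector:
    matchLabels:
      app: nginx-test
  policyTypes:
  - Ingress
  ingress:
  - ports:
    - protocol: TCP
      port: 80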

Performance Optimization

  1. Choose the Right Mode: Use BGP mode for high-traffic services that need true load balancing
  2. Node Selection: For Layer 2 mode, use nodeSelectors on the L2Advertisement resource to control which nodes may announce services (see the sketch after this list)
  3. Bandwidth Consideration: Monitor node network usage when using Layer 2 mode as traffic concentration can lead to bottlenecks
  4. Resource Allocation: Ensure MetalLB components have adequate resources, especially in large clusters
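
For the node-selection point, newer MetalLB versions support nodeSelectors on L2Advertisement; a sketch reusing first-pool from earlier (the hostname is a placeholder):

apiVersion: metallb.io/v1beta1
kind: L2Advertisement
metadata:
  name: layer2-pinned
  namespace: metallb-system
spec:
  ipAddressPools:
  - first-pool
  nodeSelectors:
  - matchLabels:
      kubernetes.io/hostname: worker-1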

Deployment Recommendations

  1. Use Helm: The Helm chart provides a more manageable way to install and upgrade MetalLB
  2. Multiple Address Pools: Define separate address pools for different environments or service types
  3. Start Small: Begin with Layer 2 mode for simplicity, then migrate to BGP if needed
  4. Regular Updates: Keep MetalLB updated to benefit from bug fixes and new features
  5. Monitoring: Implement monitoring for MetalLB components using the Prometheus metrics they expose (a quick check follows this list)
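
For the monitoring point, both components expose Prometheus metrics, by default on port 7472; a quick way to inspect them (the Deployment name assumes the Helm install shown earlier):

# Forward the controller's metrics port locally
kubectl port-forward -n metallb-system deploy/metallb-controller 7472:7472

# In another shell, inspect MetalLB-specific metrics
curl -s http://127.0.0.1:7472/metrics | grep metallb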


Conclusion

MetalLB provides a robust solution for implementing LoadBalancer services in bare metal Kubernetes environments. By understanding the differences between Layer 2 and BGP modes, you can choose the approach that best fits your infrastructure requirements and operational capabilities.

For small to medium deployments, Layer 2 mode offers simplicity and ease of setup. For larger, more complex environments where true load balancing is required, BGP mode provides the necessary flexibility and scalability.

Key Takeaways
  • MetalLB enables LoadBalancer services in bare metal Kubernetes clusters
  • Layer 2 mode is simpler but routes all service traffic through one node
  • BGP mode enables true load balancing but requires BGP-capable network equipment
  • Configuration involves defining address pools and advertisement methods
  • MetalLB integrates seamlessly with standard Kubernetes networking


