Installing Prometheus and Thanos with Helm
A step-by-step guide to setting up a Prometheus and Thanos monitoring stack

Installation Guide for Prometheus and Thanos
Introduction and Prerequisites
This guide follows our previous post about Prometheus and Thanos, focusing on the practical implementation aspects. Before proceeding with the installation, ensure you have the following prerequisites in place:
- Kubernetes Cluster: Running cluster with proper networking configuration
- Storage: Storage provisioner configured for persistent volumes
- Object Storage: Access to S3, GCS, MinIO, or other compatible object storage for Thanos
- Helm: Helm 3.x installed and configured
- kubectl: Properly configured with access to your cluster
- Namespace: A namespace for your monitoring stack (we'll use "monitoring" in this guide)
If you're setting up a multi-cluster configuration, ensure you have network connectivity between clusters and the proper DNS configuration in place.
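The prerequisites above can be checked with a short preflight script before you start. This is a minimal sketch, not part of the original setup: the helper name check_tool is hypothetical, and the cluster queries only run when both CLIs are found on PATH.

```shell
#!/usr/bin/env bash
# Preflight sketch: report which required CLI tools are on PATH.

# check_tool NAME -> exit status 0 if NAME resolves to a command on PATH.
check_tool() {
  command -v "$1" >/dev/null 2>&1
}

missing=0
for tool in kubectl helm; do
  if check_tool "$tool"; then
    echo "ok: $tool found"
  else
    echo "missing: $tool"
    missing=$((missing + 1))
  fi
done

# Only talk to the cluster when both tools are available.
if [ "$missing" -eq 0 ]; then
  helm version --short                   # should report a v3.x release
  kubectl get storageclass               # a provisioner must be listed
  kubectl get namespace monitoring \
    || echo "namespace 'monitoring' not found; create it or rely on --create-namespace"
fi
```

The script only reports problems and never exits early, so it is safe to paste into an interactive shell.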
Installing Prometheus
Available Installation Methods
Chart | Description |
---|---|
prometheus-community | Lightweight, customizable setups |
kube-prometheus-stack | Comprehensive monitoring with the Prometheus Operator |
Installing Prometheus CRDs
Before installing Prometheus with the operator-based approach, you need to install the required Custom Resource Definitions (CRDs). Run these commands to install the Prometheus Operator CRDs:
# Add the Prometheus community repository
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
# Install the CRDs (CRDs are cluster-scoped; only the Helm release record lives in the monitoring namespace)
helm install prometheus-operator-crds -n monitoring prometheus-community/prometheus-operator-crds
# Verify the installation
kubectl get crd | grep monitoring
Installing CRDs separately before the main chart ensures:
- Clean upgrades without CRD-related conflicts
- Better management of CRD lifecycle
- Prevention of accidental CRD deletion during chart uninstallation
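To go one step beyond the grep check, the sketch below lists which Prometheus Operator CRDs are still missing. The function name missing_crds and the list of expected kinds are assumptions made for illustration; adjust the list to the operator version you install.

```shell
#!/usr/bin/env bash
# Kinds expected under the monitoring.coreos.com API group (assumed list).
expected_crds="alertmanagers prometheuses prometheusrules servicemonitors podmonitors probes thanosrulers"

# missing_crds: reads `kubectl get crd -o name` output on stdin and prints
# every expected CRD that is absent from it.
missing_crds() {
  installed=$(cat)
  for kind in $expected_crds; do
    name="customresourcedefinition.apiextensions.k8s.io/${kind}.monitoring.coreos.com"
    printf '%s\n' "$installed" | grep -qxF "$name" || echo "${kind}.monitoring.coreos.com"
  done
}

# Run against the cluster only when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get crd -o name | missing_crds
fi
```

No output means every CRD in the list is installed.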
Option 1: Installing Prometheus Community Chart
Create a values file (values/somaz.yaml) with the following configuration:
server:
  name: server
  image:
    repository: quay.io/prometheus/prometheus
  persistentVolume:
    enabled: true
    accessModes:
      - ReadWriteOnce
    storageClass: "default"
  replicaCount: 1
  statefulSet:
    enabled: false
  service:
    enabled: true
    type: NodePort
alertmanager:
  enabled: true
  persistence:
    size: 2Gi
kube-state-metrics:
  enabled: true
prometheus-node-exporter:
  enabled: true
prometheus-pushgateway:
  enabled: true
  serviceAnnotations:
    prometheus.io/probe: pushgateway
Install Prometheus using these commands:
# Install Prometheus
helm install prometheus-community prometheus-community/prometheus -n monitoring -f ./values/somaz.yaml --create-namespace
# Upgrade if needed
helm upgrade prometheus-community prometheus-community/prometheus -n monitoring -f ./values/somaz.yaml
Option 2: Installing Kube-Prometheus-Stack
Create a values file (values/somaz.yaml) with this configuration for Thanos integration:
alertmanager:
  enabled: true
  config:
    global:
      resolve_timeout: 5m
  service:
    type: NodePort
grafana:
  enabled: false
prometheus:
  enabled: true
  thanosService:
    enabled: true
  thanosServiceMonitor:
    enabled: true
  thanosServiceExternal:
    enabled: true
    type: NodePort
  ingress:
    enabled: true
    ingressClassName: nginx
    hosts:
      - prometheus.somaz.link
  prometheusSpec:
    replicas: 1
    thanos:
      baseImage: quay.io/thanos/thanos
      version: v0.36.1
      objectStorageConfig:
        existingSecret:
          name: thanos-objstore
          key: objstore.yml
    storageSpec:
      volumeClaimTemplate:
        spec:
          storageClassName: default
          accessModes: ["ReadWriteOnce"]
          resources:
            requests:
              storage: 100Gi
    externalLabels:
      provider: somaz
      region: seoul
      cluster: mgmt
This configuration:
- Enables the Thanos sidecar for long-term storage integration
- References a secret (thanos-objstore) containing object storage configuration
- Adds external labels to identify metrics from this cluster
- Configures ingress for accessing Prometheus UI
- Sets up persistent storage for metrics data
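Install the Kube-Prometheus-Stack using the commands below. Because the values file references the thanos-objstore secret, create that secret in the monitoring namespace first; otherwise the Thanos sidecar container cannot start: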
# Verify the configuration (helm lint checks a local chart, so run this from a directory containing the chart source)
helm lint --values ./values/somaz.yaml
# Install the stack
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring -f ./values/somaz.yaml --create-namespace
# Upgrade if needed
helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack -n monitoring -f ./values/somaz.yaml
Thanos Object Storage Configuration
Creating the Object Storage Secret
Create a file named objstore.yml with your storage provider configuration:
type: s3
config:
  bucket: thanos
  endpoint: minio.storage.svc.cluster.local:9000
  access_key: minioadmin
  secret_key: minioadmin
  insecure: true
Thanos supports various object storage providers:
- AWS S3: For production environments in AWS
- Google Cloud Storage: For GCP deployments
- Azure Blob Storage: For Azure environments
- MinIO: For on-premises or testing (as shown in example)
- Others: Alibaba OSS, Tencent COS, OpenStack Swift, etc.
Adjust the configuration according to your chosen provider.
Create a Kubernetes secret with your object storage configuration:
# Create the secret in the monitoring namespace
kubectl create secret generic thanos-objstore -n monitoring --from-file=objstore.yml
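If you would rather not keep credentials hardcoded in a file, the configuration can be generated from environment variables. The following is a sketch under assumptions: the variable names (S3_BUCKET, S3_ENDPOINT, and so on) are hypothetical, and their defaults simply mirror the MinIO example above.

```shell
#!/usr/bin/env bash
# Hypothetical variable names; defaults mirror the MinIO example.
OBJSTORE_FILE="${OBJSTORE_FILE:-objstore.yml}"
S3_BUCKET="${S3_BUCKET:-thanos}"
S3_ENDPOINT="${S3_ENDPOINT:-minio.storage.svc.cluster.local:9000}"
S3_ACCESS_KEY="${S3_ACCESS_KEY:-minioadmin}"
S3_SECRET_KEY="${S3_SECRET_KEY:-minioadmin}"
S3_INSECURE="${S3_INSECURE:-true}"

# Keep the generated file readable by the current user only.
umask 077
cat > "$OBJSTORE_FILE" <<EOF
type: s3
config:
  bucket: ${S3_BUCKET}
  endpoint: ${S3_ENDPOINT}
  access_key: ${S3_ACCESS_KEY}
  secret_key: ${S3_SECRET_KEY}
  insecure: ${S3_INSECURE}
EOF

# Create or update the secret; the key name must stay objstore.yml
# because the values files reference it by that key.
if command -v kubectl >/dev/null 2>&1; then
  kubectl create secret generic thanos-objstore -n monitoring \
    --from-file=objstore.yml="$OBJSTORE_FILE" \
    --dry-run=client -o yaml | kubectl apply -f -
fi
```

Piping the client-side dry run into kubectl apply makes the step repeatable, whereas a plain kubectl create fails once the secret already exists. Values containing YAML special characters would need quoting inside the heredoc.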
Installing Thanos
Single-Cluster Thanos Configuration
Create a values file (values/somaz.yaml) for a single-cluster Thanos deployment:
global:
  defaultStorageClass: "default"
fullnameOverride: "thanos"
clusterDomain: somaz-cluster.local
existingObjstoreSecret: "thanos-objstore"
query:
  enabled: true
  logLevel: debug
  replicaLabel:
    - prometheus_replica
  stores:
    - dnssrv+_grpc._tcp.kube-prometheus-stack-thanos-discovery.monitoring.svc.somaz-cluster.local
  ingress:
    enabled: true
    hostname: thanos-query.somaz.link
    ingressClassName: nginx
queryFrontend:
  enabled: true
  ingress:
    enabled: true
    hostname: thanos.somaz.link
compactor:
  enabled: true
  retentionResolutionRaw: 60d
  retentionResolution5m: 60d
  retentionResolution1h: 1y
  persistence:
    enabled: true
    size: 10Gi
storegateway:
  enabled: true
  persistence:
    enabled: true
    size: 10Gi
This configuration sets up the following Thanos components:
- Query: Answers PromQL queries by fanning out to the Prometheus sidecars (recent data) and to object storage (historical data), then merging the results
- Query Frontend: Sits in front of Query and splits and caches range queries
- Compactor: Compacts and downsamples blocks in object storage and applies the retention settings
- Store Gateway: Serves historical metrics from object storage
The stores configuration connects to your Prometheus Thanos sidecar using DNS service discovery.
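The address in the stores entry follows a fixed pattern: the dnssrv+ prefix tells Thanos to resolve an SRV record, _grpc._tcp names the port and protocol, and the rest is the service, namespace, and cluster domain. The sketch below assembles that string; the function name thanos_store_addr is hypothetical and exists only to make the structure explicit.

```shell
#!/usr/bin/env bash
# thanos_store_addr SERVICE NAMESPACE [CLUSTER_DOMAIN]
# Prints a dnssrv+ store address for the gRPC port of a Kubernetes service.
# CLUSTER_DOMAIN defaults to cluster.local, the usual Kubernetes default.
thanos_store_addr() {
  printf 'dnssrv+_grpc._tcp.%s.%s.svc.%s\n' "$1" "$2" "${3:-cluster.local}"
}

# Reproduces the entry used in the values file above.
thanos_store_addr kube-prometheus-stack-thanos-discovery monitoring somaz-cluster.local
```

If your cluster uses the default domain, drop the third argument and also remove the clusterDomain override from the values file so both stay consistent.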
Multi-Cluster Thanos Configuration
For external clusters, create a values file (values/somaz-externalcluster.yaml):
query:
  enabled: true
  stores:
    - dnssrv+_grpc._tcp.kube-prometheus-stack-thanos-discovery.monitoring.svc.external-cluster.local
  ingress:
    grpc:
      enabled: true
      hostname: external-cluster-thanos-query.somaz.link
      ingressClassName: "nginx"
      annotations:
        nginx.ingress.kubernetes.io/backend-protocol: "GRPC"
Update the main Thanos Query configuration to include the external cluster:
query:
  stores:
    - dnssrv+_grpc._tcp.kube-prometheus-stack-thanos-discovery.monitoring.svc.somaz-cluster.local
    - dnssrv+_grpc._tcp.thanos-multicluster-query-grpc.monitoring.svc.somaz-cluster.local
In this setup:
- Each cluster runs Prometheus with a Thanos sidecar that uploads to shared object storage
- External clusters expose their Thanos Query endpoints via gRPC
- The primary cluster's Thanos Query connects to both local and external endpoints
- The primary Thanos Query deduplicates the results, and the Query Frontend serves them as a unified view
Installing Thanos Components
Install Thanos using the Bitnami Helm chart:
# Add the Bitnami repository
helm repo add bitnami https://charts.bitnami.com/bitnami
helm repo update
# Verify the configuration
helm lint --values ./values/somaz.yaml
# Install Thanos
helm install thanos bitnami/thanos -n monitoring -f ./values/somaz.yaml --create-namespace
# Upgrade if needed
helm upgrade thanos bitnami/thanos -n monitoring -f ./values/somaz.yaml
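The separate install and upgrade commands can be folded into one repeatable step with helm upgrade --install, which installs the release when it is absent and upgrades it otherwise. The wrapper below is a sketch: the function name deploy and the DRY_RUN variable are hypothetical, and by default it only prints the command so you can review it first.

```shell
#!/usr/bin/env bash
# deploy RELEASE CHART VALUES_FILE
# Builds one idempotent Helm command for the monitoring namespace.
# Prints it when DRY_RUN=1 (the default in this sketch), runs it otherwise.
deploy() {
  set -- helm upgrade --install "$1" "$2" -n monitoring -f "$3" --create-namespace
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

deploy thanos bitnami/thanos ./values/somaz.yaml
```

Run it with DRY_RUN=0 to apply the change. The same wrapper fits the other releases in this guide, for example deploy kube-prometheus-stack prometheus-community/kube-prometheus-stack ./values/somaz.yaml.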
Verification and Troubleshooting
Verifying the Installation
Verify that all pods are running correctly:
# Check Prometheus and related components
kubectl get pods -n monitoring -l app.kubernetes.io/name=prometheus
# Check Thanos components
kubectl get pods -n monitoring -l app.kubernetes.io/name=thanos
# Check service endpoints
kubectl get svc -n monitoring
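Scanning the pod list by eye gets tedious once the stack grows, so a small filter can surface only the pods that need attention. This is an illustrative sketch: the function name not_ready_pods is hypothetical, and it assumes the default column layout of kubectl get pods (NAME, READY, STATUS, ...).

```shell
#!/usr/bin/env bash
# not_ready_pods: reads `kubectl get pods --no-headers` output on stdin and
# prints pods whose STATUS is not Running or whose READY count is incomplete.
not_ready_pods() {
  awk '{ split($2, ready, "/"); if ($3 != "Running" || ready[1] != ready[2]) print $1 }'
}

# Query the cluster only when kubectl is available.
if command -v kubectl >/dev/null 2>&1; then
  kubectl get pods -n monitoring --no-headers | not_ready_pods
fi
```

Empty output means every pod in the namespace is Running with all containers ready. Pods from finished Jobs show as Completed and will be listed too, so treat those entries as informational.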
DNS and Connectivity Troubleshooting
Verify DNS resolution for service discovery:
# Check DNS resolution for Thanos discovery endpoints
kubectl run -it --rm --image=nicolaka/netshoot dns-test --restart=Never -- dig SRV _grpc._tcp.kube-prometheus-stack-thanos-discovery.monitoring.svc.somaz-cluster.local
# Verify endpoints are registered
kubectl get ep -n monitoring | grep thanos-discovery
Common issues and what to check:
- DNS Resolution Failures: Ensure CoreDNS is working correctly and service names are accurate
- Thanos Query Cannot Find Stores: Verify store endpoints and network connectivity
- Object Storage Access Issues: Check credentials and endpoint configuration
- No Metrics in Thanos Query: Verify Prometheus external labels and proper sidecar configuration
- Ingress Connectivity Problems: Check ingress controller logs and annotations
Accessing User Interfaces
Access the following UIs to verify your installation:
- Prometheus UI: http://prometheus.somaz.link
- Thanos Query UI: http://thanos-query.somaz.link
- Thanos Frontend UI: http://thanos.somaz.link
Next Steps and Advanced Topics
Upcoming Content
We'll explore additional monitoring capabilities including:
- Loki with Promtail: Logging solution that integrates with your monitoring stack
- Node Feature Discovery: Enhancing Kubernetes node capabilities detection
- Grafana Dashboards: Creating comprehensive visualization dashboards
Advanced Configuration Options
Consider these advanced configurations for production environments:
- Setting up high availability for Prometheus and Thanos components
- Configuring detailed alert rules and notification channels
- Implementing retention policies and downsampling strategies
- Adding custom exporters for application-specific metrics
- Integrating with existing notification systems (PagerDuty, Slack, etc.)
Key Points
- Installation Approach
  - kube-prometheus-stack for comprehensive monitoring with Operator
  - prometheus-community chart for lightweight, customizable setups
  - Bitnami Thanos chart for long-term storage and multi-cluster capabilities
- Key Components
  - Prometheus with Thanos sidecar for metrics collection
  - Object storage backend for long-term metrics retention
  - Thanos Query for unified metric access
  - Thanos Store Gateway for historical data access
- Multi-Cluster Setup
  - Shared object storage between clusters
  - Federated query across cluster boundaries
  - Centralized view of all metrics
  - Seamless scaling as your infrastructure grows