What are Prometheus and Thanos?

A comprehensive guide to Prometheus monitoring and Thanos scalability



Overview

Let’s explore Prometheus and Thanos: Prometheus collects, stores, and queries metrics, and Thanos scales that setup across clusters and long retention periods in cloud-native environments.


What is Prometheus?

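Prometheus is an open-source monitoring system, originally built at SoundCloud and now a graduated project of the Cloud Native Computing Foundation (CNCF). It collects time-series metrics by scraping HTTP endpoints exposed by systems and services, and stores them in a local time-series database.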

Key Features of Prometheus
  1. Multidimensional Data Model
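    • Stores data using metric names and key-value label pairs
    • Each unique combination of metric name and labels identifies one time series
  2. Powerful Query Language (PromQL)
    • Selects time series by metric name and label values
    • Aggregates and transforms the selected series (sums, rates, averages) across label dimensions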
  3. Independent Storage
    • Uses its own time-series database
    • No external database dependencies
  4. Service Discovery
    • Supports Kubernetes, Consul, and others
    • Automatic monitoring of dynamic environments
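
To make the multidimensional data model concrete, here is a minimal Python sketch, not Prometheus code. It models a series as a metric name plus a set of label pairs and filters series the way a PromQL selector with equality matchers would. The `Series`, `make_series`, and `select` names and the sample metrics are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Series:
    """One time series: a metric name plus a set of label pairs."""
    name: str
    labels: frozenset  # frozenset of (label, value) tuples


def make_series(name, **labels):
    """Build a Series from keyword arguments, e.g. method="GET"."""
    return Series(name, frozenset(labels.items()))


def select(series_list, name, **matchers):
    """Return series whose name matches and whose labels include every matcher.

    This mimics a selector such as http_requests_total{method="GET"},
    restricted to equality matchers.
    """
    wanted = set(matchers.items())
    return [s for s in series_list if s.name == name and wanted <= s.labels]


# Hypothetical series, for illustration only.
all_series = [
    make_series("http_requests_total", method="GET", status="200"),
    make_series("http_requests_total", method="POST", status="200"),
    make_series("http_requests_total", method="GET", status="500"),
    make_series("node_cpu_seconds_total", mode="idle"),
]

get_requests = select(all_series, "http_requests_total", method="GET")
```

Here `get_requests` holds the two GET series, one per status code. Adding a label value creates a new series rather than a new column, which is why label choices drive the number of series Prometheus has to store.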

Prometheus Components

Core Components
  1. Prometheus Server
    • Collects and stores metrics
    • Handles scraping and storage
    • Executes queries
  2. Alertmanager
    • Manages alerts
    • Handles notification routing
    • Supports multiple notification channels (email, Slack, PagerDuty)
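  3. Pushgateway
    • Accepts metrics pushed by short-lived jobs that may exit before Prometheus can scrape them
    • Exposes the pushed metrics so Prometheus can scrape them like any other target
  4. Exporters
    • Collect metrics from third-party services and expose them in the Prometheus format
    • Examples: Node Exporter, MySQL Exporter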
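
Whatever the component, the Prometheus server ends up scraping plain text over HTTP, conventionally from a `/metrics` path. The sketch below renders one metric family in that text exposition format using only the Python standard library. It is an illustration of the format, not a replacement for an official client library, and the `render_metric` helper and the `jobs_processed_total` metric are hypothetical.

```python
def escape_label_value(value):
    """Escape backslash, double quote, and newline, as the text format requires."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_metric(name, help_text, metric_type, samples):
    """Render one metric family in the Prometheus text exposition format.

    samples is a list of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{k}="{escape_label_value(v)}"' for k, v in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {value}")
        else:
            lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric, for illustration only.
body = render_metric(
    "jobs_processed_total",
    "Number of jobs processed.",
    "counter",
    [({"queue": "email"}, 42), ({"queue": "sms"}, 7)],
)
print(body)
```

An exporter is essentially a small HTTP server that returns a body like this on every scrape, translating a service's native statistics into metric names and labels.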


What is Thanos?

Thanos is an open-source project that extends Prometheus capabilities with long-term storage, high availability, and multi-cluster support.

Key Features of Thanos
  1. Long-term Storage
    • Supports various cloud storage backends (S3, GCS, Azure)
    • Efficient data archiving
  2. Global Query View
    • Unified view across multiple Prometheus instances
    • Centralized querying
  3. High Availability
    • Redundant Prometheus replicas scrape the same targets, and Thanos deduplicates their data at query time
    • No single point of failure
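
The high-availability idea can be sketched in a few lines of Python. Two Prometheus replicas scrape the same targets and differ only in a replica label, and the query layer merges them into one series. Thanos Query is told which label marks replicas through its `--query.replica-label` flag. The merge rule below (first replica with a sample at a timestamp wins) is a deliberate simplification and not the algorithm Thanos implements, and the `deduplicate` function and sample data are assumptions for illustration.

```python
def deduplicate(series_list, replica_label="replica"):
    """Merge series that differ only in the replica label.

    Each series is (labels_dict, samples), where samples is a list of
    (timestamp, value) pairs. Simplification: for each timestamp, the
    first replica that has a sample wins.
    """
    merged = {}
    for labels, samples in series_list:
        # Identity of the series once the replica label is ignored.
        key = frozenset((k, v) for k, v in labels.items() if k != replica_label)
        by_ts = merged.setdefault(key, {})
        for ts, value in samples:
            by_ts.setdefault(ts, value)
    return {key: sorted(by_ts.items()) for key, by_ts in merged.items()}


# Hypothetical data: replica "a" missed the scrape at t=30, replica "b" missed t=10.
replica_a = ({"__name__": "up", "job": "api", "replica": "a"}, [(10, 1.0), (20, 1.0)])
replica_b = ({"__name__": "up", "job": "api", "replica": "b"}, [(20, 1.0), (30, 1.0)])

result = deduplicate([replica_a, replica_b])
```

The merged series covers all three timestamps even though each replica has a gap, which is the practical benefit of running redundant Prometheus instances.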

Thanos Components

Core Components
  1. Sidecar
    • Runs alongside each Prometheus server and serves its recent data to the Querier
    • Uploads Prometheus data blocks to object storage
  2. Querier
    • Aggregates data from multiple sources
    • Provides unified query interface
  3. Store Gateway
    • Accesses historical data
    • Interfaces with object storage
  4. Compactor
    • Compacts and downsamples blocks in object storage
    • Applies data retention policies
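
As a rough illustration of the kind of optimization the Compactor performs, the sketch below downsamples raw samples into fixed windows and keeps a few aggregates per window, so that queries over long time ranges touch far fewer points. This is a simplified model, not the Thanos block format. The `downsample` function, the chosen aggregates, and the sample data are assumptions for illustration.

```python
from collections import defaultdict


def downsample(samples, resolution_seconds):
    """Group (timestamp, value) samples into fixed windows and keep aggregates."""
    windows = defaultdict(list)
    for ts, value in samples:
        # Align each sample to the start of its window.
        window_start = ts - (ts % resolution_seconds)
        windows[window_start].append(value)
    return [
        {
            "start": start,
            "count": len(values),
            "sum": sum(values),
            "min": min(values),
            "max": max(values),
        }
        for start, values in sorted(windows.items())
    ]


# Hypothetical raw samples: one per 60 seconds for 10 minutes, values 0.0 to 9.0.
raw = [(t, float(t // 60)) for t in range(0, 600, 60)]
five_minute = downsample(raw, resolution_seconds=300)
```

Ten raw samples collapse into two windows here. Keeping count, sum, min, and max rather than a single average lets later queries still compute averages and extremes over the downsampled data.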


Data Flow in Prometheus and Thanos

Data Flow Process
  1. Prometheus
    • Collects metrics from targets
    • Stores in local storage
  2. Thanos Sidecar
    • Reads from Prometheus storage
    • Uploads to object storage
  3. Object Storage
    • Long-term metric storage
    • Supports various providers (S3, GCS, etc.)
  4. Thanos Query
    • Handles user queries
    • Aggregates data from multiple sources
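
The query side of this flow can be sketched as a fan-out: the query layer asks only the sources whose time range overlaps the request, then merges what comes back. The Python below models that with in-memory objects. Real Thanos components talk over gRPC and exchange far richer data, so the `Source` class, `fan_out_query` function, and sample values are all assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Source:
    """A queryable store of samples covering [min_time, max_time]."""
    name: str
    min_time: int
    max_time: int
    samples: list  # list of (timestamp, value) pairs

    def overlaps(self, start, end):
        return self.min_time <= end and start <= self.max_time

    def fetch(self, start, end):
        return [(ts, v) for ts, v in self.samples if start <= ts <= end]


def fan_out_query(sources, start, end):
    """Query every overlapping source and merge samples by timestamp."""
    used = []
    merged = {}
    for source in sources:
        if not source.overlaps(start, end):
            continue  # Skip sources that cannot hold data for this range.
        used.append(source.name)
        for ts, value in source.fetch(start, end):
            merged.setdefault(ts, value)
    return used, sorted(merged.items())


# Hypothetical sources: the sidecar holds recent data, the store gateway older data.
sidecar = Source("sidecar", 100, 200, [(100, 3.0), (150, 4.0), (200, 5.0)])
store_gateway = Source("store-gateway", 0, 120, [(0, 1.0), (50, 2.0), (100, 3.0)])

used, points = fan_out_query([sidecar, store_gateway], start=40, end=160)
recent_used, recent_points = fan_out_query([sidecar, store_gateway], start=150, end=200)
```

The first query spans both sources and returns one continuous series. The second touches only the sidecar, because the store gateway holds nothing that recent.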

Data Flow Architecture in Prometheus and Thanos

graph TD;
  A[Prometheus] -->|Store scraped metrics| B[Local Storage];
  B -->|Read blocks| C[Thanos Sidecar];
  C -->|Upload blocks| G[Object Storage];
  F[Users] -->|Query| D[Thanos Query];
  D -->|Query real-time data| C;
  D -->|Query historical data| E[Thanos Store];
  E -->|Access long-term data| G;


Next Steps

In the next post, we’ll explore:


