ELK to EFK Migration: Why and How We Replaced Logstash with Fluentd

A practical guide to migrating from ELK Stack to EFK Stack on Kubernetes with Fluent Bit, Fluentd, and real-world log processing pipelines


Overview

The ELK Stack (Elasticsearch + Logstash + Kibana) has been the industry standard for log management. However, in Kubernetes environments, Logstash’s heavy JVM footprint and Filebeat’s limited flexibility led us to migrate to the EFK Stack — replacing Logstash with Fluentd for log processing and Filebeat with Fluent Bit for log collection.

This post documents our migration journey, including the architectural decisions, the Fluentd filter pipeline design, and lessons learned.


Why Migrate from ELK to EFK?

| Aspect | ELK (Before) | EFK (After) |
|---|---|---|
| Log Collector | Filebeat (Go, ~30MB) | Fluent Bit (C, ~5MB) |
| Log Processor | Logstash (JVM, ~1GB RAM) | Fluentd (Ruby, ~256MB RAM) |
| Storage | Elasticsearch | Elasticsearch (unchanged) |
| Visualization | Kibana | Kibana (unchanged) |
| Memory Usage | High (JVM overhead) | Low (native + lightweight Ruby) |
| Plugin Ecosystem | Logstash plugins (Ruby) | Fluentd plugins (700+) + Fluent Bit (CNCF) |
| K8s Native | Not designed for K8s | CNCF graduated project, built for cloud-native |

Key reasons for our migration:

1. Resource footprint: Logstash needed roughly 1GB of JVM heap per instance, while Fluentd runs in ~256MB and Fluent Bit in a few megabytes.
2. Cloud-native fit: Fluentd and Fluent Bit are CNCF projects built for Kubernetes, whereas Logstash and Filebeat were not designed with it in mind.
3. Plugin ecosystem: Fluentd's 700+ plugins covered everything we needed, including the Elasticsearch output.


Architecture: Before vs After

Before (ELK)

App Logs → Filebeat (DaemonSet) → Logstash (Deployment) → Elasticsearch → Kibana

After (EFK)

NFS Log Files
├── dev/game/*
├── dev/battle/*
├── staging/game/*
└── qa/game/*
      │
      ▼
Fluent Bit (Deployment)
  - Tail input (NFS logs)
  - JSON parsing
  - Metadata injection (env, app, component)
  - Forward output → port 24224
      │
      ▼
Fluentd (StatefulSet)
  - Forward input (port 24224)
  - 5-stage filter pipeline
  - Elasticsearch output (dynamic index)
      │
      ▼
Elasticsearch (StatefulSet)
  - Indices: dev-myapp-game, dev-myapp-battle, etc.
  - HTTPS with certificate auth
      │
      ▼
Kibana (Deployment)
  - Dashboard: kibana.example.com


Component Details


Fluent Bit — Lightweight Log Collector

Fluent Bit runs as a Deployment (not DaemonSet) because logs are stored on NFS, not on local nodes.

| Setting | Value |
|---|---|
| Image | cr.fluentbit.io/fluent/fluent-bit:3.1.4 |
| Kind | Deployment (1 replica) |
| Input | Tail (NFS-mounted log files) |
| Output | Forward → Fluentd:24224 |
| Resources | 100m~1000m CPU, 100Mi~2Gi RAM |
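
Because Fluent Bit tails NFS-mounted files rather than node-local paths, the log directory has to be wired into the pod explicitly. A minimal sketch of the relevant Helm values, assuming an existing NFS-backed PVC named nfs-app-logs (our actual deployment uses the custom PV/PVC templates described in the installation section):

extraVolumes:
  - name: app-logs
    persistentVolumeClaim:
      claimName: nfs-app-logs     # assumed PVC name backed by the NFS export
extraVolumeMounts:
  - name: app-logs
    mountPath: /fluent-bit/logs   # matches the Path prefix used by the tail input
    readOnly: true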

Input Configuration:

[INPUT]
    Name              tail
    Tag               myapp.dev.game
    Path              /fluent-bit/logs/myapp/dev/game/*
    Parser            myapp_json
    Refresh_Interval  5
    Rotate_Wait       30
    Skip_Long_Lines   On
    DB                /fluent-bit/db/dev-game.db

Metadata Injection (Filter):

[FILTER]
    Name   modify
    Match  myapp.dev.game
    Set    environment dev
    Set    app myapp
    Set    component game
    Set    label game

Fluent Bit adds metadata (environment, app, component) to each log entry before forwarding to Fluentd. This enables dynamic index routing in Elasticsearch.
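
The Elasticsearch output shown later routes on a log_source field, so the metadata also has to be combined into that routing key. One way to do it is an additional modify filter per tag (a hypothetical sketch; the value matches the index naming scheme dev-myapp-game):

[FILTER]
    Name   modify
    Match  myapp.dev.game
    Set    log_source dev-myapp-game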


Fluentd — Log Processor

Fluentd is the heavy lifter — it receives logs from Fluent Bit and runs them through a 5-stage filter pipeline before sending to Elasticsearch.

| Setting | Value |
|---|---|
| Image | fluent/fluentd-kubernetes-daemonset:v1.16 |
| Kind | StatefulSet (1 replica) |
| Input | Forward (port 24224) |
| Output | Elasticsearch (HTTPS, dynamic index) |
| Buffer | File-based, 16MB chunks, 10s flush, 4 threads |
| Resources | 200m~1000m CPU, 256Mi~1Gi RAM |


Fluentd Filter Pipeline (5 Stages)

This is the most critical part of the migration — replacing Logstash’s data processing with Fluentd filters.

Stage 1: JSON Parsing Fallback

If Fluent Bit's parser fails to parse the JSON, Fluentd attempts a second parse.

<filter myapp.**>
  @type parser
  key_name log
  reserve_data true
  remove_key_name_field true
  <parse>
    @type json
    time_key @timestamp
    time_format %Y-%m-%dT%H:%M:%S.%L%z
  </parse>
</filter>

Stage 2: Pino ↔ Winston Log Level Normalization

Our applications use both Pino (numeric levels) and Winston (string levels). Fluentd normalizes them.
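
The normalization itself is a small Ruby snippet (a sketch, assuming both loggers write a level field and Pino uses its default numeric levels 10–60):

# Pino: numeric level → Winston-style string; Winston string levels pass through unchanged
if record["level"].is_a?(Numeric)
  record["level"] = { 10 => "trace", 20 => "debug", 30 => "info",
                      40 => "warn", 50 => "error", 60 => "fatal" }.fetch(record["level"], record["level"].to_s)
end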

Stage 3: Timestamp Normalization

Unifies different timestamp formats to a single @timestamp field in Asia/Seoul timezone.

# Pino: Unix milliseconds → ISO8601
if record["time"].is_a?(Numeric)
  record["@timestamp"] = Time.at(record["time"] / 1000.0).localtime("+09:00").strftime(...)
end

# Winston: ISO8601 string → normalized
if record["timestamp"].is_a?(String)
  record["@timestamp"] = Time.parse(record["timestamp"]).localtime("+09:00").strftime(...)
end

# Cleanup: remove source fields
record.delete("time"); record.delete("msg"); record.delete("filepath"); record.delete("req")

Stage 4: Nested JSON Processing

Converts nested objects (requestBody, responseBody) to JSON strings to prevent Elasticsearch mapping explosion.

["requestBody", "responseBody", "requestHeader"].each do |field|
  if record[field].is_a?(Hash) || record[field].is_a?(Array)
    record[field] = record[field].to_json
  end
end

Stage 5: Stack Trace Unescape

Processes escaped newlines in error stack traces for proper display in Kibana.
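
A minimal sketch of that stage, assuming the trace is carried in a stack field (the field name is an assumption):

# Convert escaped "\n" sequences back into real newlines so Kibana shows multi-line traces
if record["stack"].is_a?(String)
  record["stack"] = record["stack"].gsub('\n', "\n")
end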


Elasticsearch Output (Dynamic Index)

Fluentd routes logs to different Elasticsearch indices based on the log_source field set by Fluent Bit’s metadata.

<match myapp.**>
  @type elasticsearch
  host elasticsearch-master
  port 9200
  scheme https
  ssl_verify false
  logstash_format false
  index_name ${log_source}
  # Results in indices like: dev-myapp-game, stg-myapp-game, qa-myapp-game

  # log_source must be a buffer chunk key for the ${log_source} placeholder to resolve
  <buffer tag, log_source>
    @type file
    path /var/log/fluent/elasticsearch-buffers
    flush_interval 10s
    flush_thread_count 4
    chunk_limit_size 16MB
    retry_type exponential_backoff
    retry_max_times 10
  </buffer>
</match>


Installation Guide


Fluent Bit Installation (Helm)

Fluent Bit is deployed as a local Helm chart since the configuration includes custom templates for NFS PV/PVC mounts.

# helmfile.yaml
repositories:
  - name: fluent
    url: https://fluent.github.io/helm-charts

releases:
  - name: fluent-bit
    namespace: monitoring
    chart: .              # Local chart (custom templates)
    version: 0.56.0
    values:
      - values/mgmt.yaml

# Deploy
cd observability/logging/fluent-bit
helmfile -f helmfile.yaml diff
helmfile -f helmfile.yaml apply

# Verify
kubectl get pods -n monitoring -l app.kubernetes.io/name=fluent-bit
kubectl logs -n monitoring -l app.kubernetes.io/name=fluent-bit

Key values (mgmt.yaml):

kind: Deployment
replicaCount: 1

image:
  repository: cr.fluentbit.io/fluent/fluent-bit
  tag: "3.1.4"

env:
  - name: FLUENT_ELASTICSEARCH_HOST
    value: "elasticsearch-master"
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "9200"

resources:
  requests:
    cpu: 100m
    memory: 100Mi
  limits:
    cpu: 1000m
    memory: 2Gi

# Prometheus ServiceMonitor
serviceMonitor:
  enabled: true
  namespace: monitoring
  interval: 30s
  additionalLabels:
    release: kube-prometheus-stack


Fluentd Installation (Helm)

Fluentd uses an external Helm chart from the Fluent community.

# helmfile.yaml
repositories:
  - name: fluent
    url: https://fluent.github.io/helm-charts

releases:
  - name: fluentd
    namespace: monitoring
    chart: fluent/fluentd
    version: 0.5.3
    values:
      - values/mgmt.yaml

# Deploy
cd observability/logging/fluentd
helmfile -f helmfile.yaml diff
helmfile -f helmfile.yaml apply

# Verify
kubectl get pods -n monitoring -l app.kubernetes.io/name=fluentd
kubectl logs -n monitoring -l app.kubernetes.io/name=fluentd

Key values (mgmt.yaml):

kind: StatefulSet
variant: elasticsearch7
replicaCount: 1

image:
  repository: fluent/fluentd-kubernetes-daemonset
  tag: "v1.16-debian-elasticsearch7-1"

# Forwarder service (receives from Fluent Bit)
service:
  type: ClusterIP
  ports:
    - name: forwarder
      containerPort: 24224

# Elasticsearch connection
env:
  - name: FLUENT_ELASTICSEARCH_HOST
    value: "elasticsearch-master"
  - name: FLUENT_ELASTICSEARCH_PORT
    value: "9200"
  - name: FLUENT_ELASTICSEARCH_SCHEME
    value: "https"
  - name: FLUENT_ELASTICSEARCH_SSL_VERIFY
    value: "false"

# Persistent buffer storage
persistence:
  enabled: true
  storageClass: nfs-client-server
  size: 10Gi
  accessMode: ReadWriteOnce

resources:
  requests:
    cpu: 200m
    memory: 256Mi
  limits:
    cpu: 1000m
    memory: 1Gi

# Elasticsearch certificates
volumes:
  - name: elasticsearch-certs
    secret:
      secretName: elasticsearch-master-certs
volumeMounts:
  - name: elasticsearch-certs
    mountPath: /certs
    readOnly: true

# Prometheus ServiceMonitor
serviceMonitor:
  enabled: true
  namespace: monitoring
  interval: 30s
  additionalLabels:
    release: kube-prometheus-stack


Deployment Order

1. Elasticsearch (must be running first)
2. Fluentd (connects to Elasticsearch, listens on port 24224)
3. Fluent Bit (forwards logs to Fluentd:24224)
4. Kibana (connects to Elasticsearch for visualization)
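
Assuming the repository layout used above, this maps to a straightforward sequence of helmfile applies (the Elasticsearch and Kibana paths are assumptions; only the Fluent Bit and Fluentd layouts are shown in this post):

# 1. Elasticsearch first (path assumed)
helmfile -f observability/logging/elasticsearch/helmfile.yaml apply

# 2. Fluentd, so port 24224 is listening before Fluent Bit starts forwarding
helmfile -f observability/logging/fluentd/helmfile.yaml apply

# 3. Fluent Bit
helmfile -f observability/logging/fluent-bit/helmfile.yaml apply

# 4. Kibana (path assumed)
helmfile -f observability/logging/kibana/helmfile.yaml apply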


Verification

# Check all EFK components are running
kubectl get pods -n monitoring | grep -E "elastic|fluent|kibana"

# Verify Fluent Bit → Fluentd connection
kubectl logs -n monitoring -l app.kubernetes.io/name=fluent-bit | grep -i "forward"

# Verify Fluentd → Elasticsearch connection
kubectl logs -n monitoring -l app.kubernetes.io/name=fluentd | grep -i "elasticsearch"

# Check Elasticsearch indices are being created
curl -sk -u elastic:<your-password> https://elasticsearch.example.com:9200/_cat/indices?v
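
If the indices show up, a quick sanity check that documents are actually flowing (index name taken from the examples above):

# Count documents in one of the per-environment indices
curl -sk -u elastic:<your-password> "https://elasticsearch.example.com:9200/dev-myapp-game/_count?pretty"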


Current Stack Summary

| Component | Version | Type | Role |
|---|---|---|---|
| Elasticsearch | 8.5.1 | StatefulSet (1 replica) | Log storage & search |
| Fluent Bit | 3.1.4 | Deployment (1 replica) | Lightweight log collection from NFS |
| Fluentd | v1.16 | StatefulSet (1 replica) | Log processing & Elasticsearch output |
| Kibana | 8.5.1 | Deployment (1 replica) | Log visualization dashboard |


Optional: Loki + Promtail Stack

As an alternative to EFK, Loki + Promtail is available as an optional stack for Grafana-native log aggregation.

Container Logs → Promtail (DaemonSet) → Loki (SingleBinary) → Grafana

This is useful for teams already using Grafana who prefer label-based log querying over full-text search. Both stacks can coexist.
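
For reference, a minimal helmfile sketch for that stack, assuming the upstream grafana/loki and grafana/promtail charts (values shown are placeholders, not the configuration we run):

# helmfile.yaml
repositories:
  - name: grafana
    url: https://grafana.github.io/helm-charts

releases:
  - name: loki
    namespace: monitoring
    chart: grafana/loki
    values:
      - deploymentMode: SingleBinary   # matches the SingleBinary topology above
  - name: promtail
    namespace: monitoring
    chart: grafana/promtail            # runs as a DaemonSet, tailing container logs on each node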


Migration Lessons Learned



References