 
Setting up EFK Stack on Kubernetes
How to Set Up the EFK Stack on Kubernetes: A Step-by-Step Guide
Introduction
EFK (Elasticsearch, Fluentd, Kibana) is a powerful open-source stack for centralized log aggregation, analysis, and visualization. When managing multiple applications and services on Kubernetes, centralizing logs ensures better monitoring and troubleshooting.
This guide walks you through setting up the EFK stack on a Kubernetes cluster.
What is the EFK Stack?
- Elasticsearch: A distributed search and analytics engine that stores and retrieves large log volumes. It’s designed for scalability and efficiency.
- Fluentd: A flexible log collector that unifies data collection and forwards logs to various destinations like Elasticsearch.
- Kibana: A user-friendly interface for querying, visualizing, and analyzing log data.
Elasticsearch is optimized for managing unstructured log data, Fluentd serves as the log shipper, and Kibana provides visualization tools for better insights.
Architecture of EFK Stack
The EFK stack architecture for Kubernetes involves:
- Fluentd: Deployed as a DaemonSet to gather logs from all nodes. It forwards logs to the Elasticsearch endpoint.
- Elasticsearch: Deployed as a StatefulSet for persisting log data. It provides a service endpoint for Fluentd and Kibana.
- Kibana: Deployed as a Deployment to visualize the data stored in Elasticsearch.
High-Level Architecture
The deployment includes Fluentd for log collection, Elasticsearch for storage, and Kibana for visualization.
Step-by-Step Setup
- You can clone https://github.com/guneycansanli/k8s-training.git and find all the YAML manifests under the kubernetes-efk-yamls directory.
-  Note: All the EFK components get deployed in the default namespace. 
git clone https://github.com/guneycansanli/k8s-training.git
1. Deploy Elasticsearch StatefulSet
Create a Headless Service
- Define the headless service (es-svc.yaml) for internal communication between Elasticsearch pods:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node

- Apply the service:
kubectl apply -f es-svc.yaml
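- The service is headless (clusterIP: None), so it won't get a cluster IP; you can still confirm it exists and, once the Elasticsearch pods are running, that it resolves to their endpoints:
kubectl get svc elasticsearch
kubectl get endpoints elasticsearch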
- Before we create the StatefulSet for Elasticsearch, recall that a StatefulSet requires a StorageClass to be defined beforehand, which it uses to create volumes whenever required.
- Since I work with a local cluster in my home lab, there is no native storage backend: local Kubernetes clusters (e.g., minikube or kubeadm setups) don't ship with one. Instead, storage is tied to the local filesystem of the worker nodes, and you have to manually specify the paths or directories that the PVs map to, using mechanisms like hostPath or local volumes.
- You need to create a StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
kubectl apply -f local-storage-class.yaml
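- A quick check that the StorageClass was created with the expected binding mode:
kubectl get storageclass local-storage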
- We also need a PersistentVolume for each replica so that its claim can bind:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-worker-0
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /opt/k8s/es-cluster-0
  nodeAffinity:
    required:
      nodeSelectorTerms:
      - matchExpressions:
        - key: kubernetes.io/hostname
          operator: In
          values:
            - kworker1
kubectl apply -f local-persistent-volume-kworker1.yaml
- Note: Make sure the path directory already exists on the node referenced in nodeAffinity; you may run into issues if you create a PV pointing to a worker or master node where the directory is missing.
- I have created 3 different PVs so that the Elasticsearch instances can claim them (we have 3 replicas in the cluster); see the quick checks below.
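- A minimal sketch of preparing a data directory on a node and checking that the PVs are available, assuming SSH access to the workers and the example hostname/path used above:
# example for kworker1; repeat with the matching path on each node that backs a PV
ssh kworker1 'sudo mkdir -p /opt/k8s/es-cluster-0'
# all three PVs should show up as Available until the StatefulSet claims them
kubectl get pv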
- Deploy Elasticsearch
- Use the following StatefulSet (es-sts.yaml) to deploy Elasticsearch:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
      - name: elasticsearch
        image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        ports:
        - containerPort: 9200
          name: rest
          protocol: TCP
        - containerPort: 9300
          name: inter-node
          protocol: TCP
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
        env:
          - name: cluster.name
            value: k8s-logs
          - name: node.name
            valueFrom:
              fieldRef:
                fieldPath: metadata.name
          - name: discovery.seed_hosts
            value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
          - name: cluster.initial_master_nodes
            value: "es-cluster-0,es-cluster-1,es-cluster-2"
          - name: ES_JAVA_OPTS
            value: "-Xms512m -Xmx512m"
      initContainers:
      - name: fix-permissions
        image: busybox
        command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
        securityContext:
          privileged: true
        volumeMounts:
        - name: data
          mountPath: /usr/share/elasticsearch/data
      - name: increase-vm-max-map
        image: busybox
        command: ["sysctl", "-w", "vm.max_map_count=262144"]
        securityContext:
          privileged: true
      - name: increase-fd-ulimit
        image: busybox
        command: ["sh", "-c", "ulimit -n 65536"]
        securityContext:
          privileged: true
      tolerations:
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Exists"
        effect: "NoSchedule"
      nodeSelector:
        kubernetes.io/hostname: kworker1
  volumeClaimTemplates:
  - metadata:
      name: data
      labels:
        app: elasticsearch
    spec:
      accessModes: [ "ReadWriteOnce" ]
      storageClassName: "local-storage"
      resources:
        requests:
          storage: 2Gi
- Apply the StatefulSet:
kubectl apply -f es-sts.yaml
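- You can watch the rollout and wait until all three pods are Running before moving on:
kubectl rollout status statefulset/es-cluster
kubectl get pods -l app=elasticsearch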
Verify Elasticsearch Deployment
- After the Elasticsearch pods come into the running state, let us verify the Elasticsearch StatefulSet. The easiest way to do this is to check the status of the cluster. To check the status, port-forward the Elasticsearch pod's 9200 port.
kubectl port-forward es-cluster-0 9200:9200
- You can send a curl request to check the cluster health:
curl http://localhost:9200/_cluster/health/?pretty
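- In a healthy 3-node cluster the status field should be green (yellow usually means some replica shards are unassigned). While the port-forward is still running, you can also list the individual nodes:
curl "http://localhost:9200/_cat/nodes?v"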


Deploy Kibana Deployment & Service
Kibana can be created as a simple Kubernetes Deployment. In the following Kibana deployment manifest, we define an env var ELASTICSEARCH_URL to configure the Elasticsearch cluster endpoint; Kibana uses this URL to connect to Elasticsearch.
- Create the Kibana deployment manifest as kibana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
      - name: kibana
        image: docker.elastic.co/kibana/kibana:7.5.0
        resources:
          limits:
            cpu: 1000m
          requests:
            cpu: 100m
        env:
          - name: ELASTICSEARCH_URL
            value: http://elasticsearch:9200
        ports:
        - containerPort: 5601
- Create the deployment:
kubectl create -f kibana-deployment.yaml
- To access the Kibana UI via the node's IP address, we'll create a NodePort service. This is a simple way to expose the service for demonstration purposes. A NodePort service assigns a specific port on each cluster node, allowing you to access the application using the node's IP and the assigned port. In real-world projects, however, it's more common to use Kubernetes Ingress in combination with a ClusterIP service. This setup provides better control, security, and scalability by routing external traffic through the ingress, while the service itself remains accessible only within the cluster.
- Save the following manifest as kibana-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana-np
spec:
  selector: 
    app: kibana
  type: NodePort  
  ports:
    - port: 8080
      targetPort: 5601 
      nodePort: 30000
- Create the Kibana service:
kubectl create -f kibana-svc.yaml
- Now you will be able to access Kibana at http://<node-ip>:30000.

- You may need to open port 30000 in the firewall of your cluster nodes.
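- You can list the node IPs to use with the NodePort; any node's INTERNAL-IP (or EXTERNAL-IP, if set) will work, depending on where you are connecting from:
kubectl get nodes -o wide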
Verify Kibana Deployment
- After the pod comes into the running state, let us verify the Kibana deployment. The easiest way to do this is through the UI.
To check the status, port-forward the Kibana pod's 5601 port. If you have created the NodePort service, you can also use that.
kubectl port-forward <kibana-pod-name> 5601:5601
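- If you don't know the pod name, you can look it up first; once the port-forward (or NodePort) is in place, Kibana's status endpoint is a quick way to confirm it is up, assuming the default base path:
kubectl get pods -l app=kibana
curl "http://localhost:5601/api/status"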


Deploy Fluentd Kubernetes Manifests
Deploying Fluentd as a DaemonSet
Fluentd is typically deployed as a DaemonSet to ensure it can collect logs from all nodes in the cluster. To accomplish this, it also needs elevated permissions to access and retrieve pod metadata across all namespaces.
Service Accounts in Kubernetes
To grant specific permissions to components within a Kubernetes cluster, service accounts are used in conjunction with cluster roles and cluster role bindings. Let’s proceed to create the necessary service account and define appropriate roles.
Creating a Fluentd Cluster Role
A cluster role in Kubernetes defines rules that outline a specific set of permissions. For Fluentd, we need to grant permissions to access both pods and namespaces.
- Create a manifest fluentd-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
- apiGroups:
  - ""
  resources:
  - pods
  - namespaces
  verbs:
  - get
  - list
  - watch
- Apply the manifest:
kubectl create -f fluentd-role.yaml
- Create Fluentd Service Account
- A service account in Kubernetes is an entity that provides an identity to a pod. Here, we create a service account to be used by the Fluentd pods.
- Create a manifest fluentd-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  labels:
    app: fluentd
- Apply the manifest
kubectl create -f fluentd-sa.yaml
- Fluentd Cluster Role Binding
- Create a manifest fluentd-rb.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
- kind: ServiceAccount
  name: fluentd
  namespace: default
- Apply the manifest:
kubectl create -f fluentd-rb.yaml
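- You can verify that the binding gives the Fluentd service account the permissions defined in the cluster role:
kubectl auth can-i list pods --all-namespaces --as=system:serviceaccount:default:fluentd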
- Deploy Fluentd DaemonSet
- Save the following as fluentd-ds.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      containers:
      - name: fluentd
        image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
        env:
          - name:  FLUENT_ELASTICSEARCH_HOST
            value: "elasticsearch.default.svc.cluster.local"
          - name:  FLUENT_ELASTICSEARCH_PORT
            value: "9200"
          - name: FLUENT_ELASTICSEARCH_SCHEME
            value: "http"
          - name: FLUENTD_SYSTEMD_CONF
            value: disable
          - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
            value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
        resources:
          limits:
            memory: 512Mi
          requests:
            cpu: 100m
            memory: 200Mi
        volumeMounts:
        - name: varlog
          mountPath: /var/log
        - name: varlibdockercontainers
          mountPath: /var/lib/docker/containers
          readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
      - name: varlog
        hostPath:
          path: /var/log
      - name: varlibdockercontainers
        hostPath:
          path: /var/lib/docker/containers
- Apply
kubectl create -f fluentd-ds.yaml
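- Since Fluentd runs as a DaemonSet, you should see one pod per schedulable node:
kubectl get daemonset fluentd
kubectl get pods -l app=fluentd -o wide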
- Verify Fluentd Setup
- In order to verify the fluentd installation, let us start a pod that creates logs continuously. We will then try to see these logs inside Kibana.
- Save the following as test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
  - name: count
    image: busybox
    args: [/bin/sh, -c, 'i=0; while true; do echo "Thanks for visiting devopscube! $i"; i=$((i+1)); sleep 1; done']
- Apply
kubectl create -f test-pod.yaml
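- Confirm the pod is actually emitting log lines before looking for them in Kibana:
kubectl logs counter --tail=5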
- At the end, we should have the following resources:

- We can now verify that the logs show up in Kibana.
- We need to port-forward again (or use the NodePort service).
- Step 1: Open the Kibana UI using the port-forward or the NodePort service endpoint, and head to the Management console inside it.

- Step 2: Select the “Index Patterns” option under the Kibana section.
- Step 3: Create a new index pattern using the pattern “logstash-*”.
- Step 4: Select “@timestamp” as the time filter field.
- Now we should be able to see the indices and their logs.
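- If the logstash-* pattern does not show any indices, you can check from the Elasticsearch side that Fluentd is writing them (this assumes the Elasticsearch port-forward from earlier is still running):
curl "http://localhost:9200/_cat/indices/logstash-*?v"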


Thanks for reading…
Guneycan Sanli.
 
 