Setting Up the EFK Stack on Kubernetes: A Step-by-Step Guide
Introduction
EFK (Elasticsearch, Fluentd, Kibana) is a powerful open-source stack for centralized log aggregation, analysis, and visualization. When managing multiple applications and services on Kubernetes, centralizing logs ensures better monitoring and troubleshooting.
This guide walks you through setting up the EFK stack on a Kubernetes cluster.
What is the EFK Stack?
- Elasticsearch: A distributed search and analytics engine that stores and retrieves large log volumes. It’s designed for scalability and efficiency.
- Fluentd: A flexible log collector that unifies data collection and forwards logs to various destinations like Elasticsearch.
- Kibana: A user-friendly interface for querying, visualizing, and analyzing log data.
Elasticsearch is optimized for managing unstructured log data, Fluentd serves as the log shipper, and Kibana provides visualization tools for better insights.
Architecture of EFK Stack
The EFK stack architecture for Kubernetes involves:
- Fluentd: Deployed as a DaemonSet to gather logs from all nodes. It forwards logs to the Elasticsearch endpoint.
- Elasticsearch: Deployed as a StatefulSet for persisting log data. It provides a service endpoint for Fluentd and Kibana.
- Kibana: Deployed as a Deployment to visualize the data stored in Elasticsearch.
High-Level Architecture
The deployment includes Fluentd for log collection, Elasticsearch for storage, and Kibana for visualization.
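Once everything below is applied, a quick way to sanity-check the moving parts is with standard kubectl commands (the resource names here match the manifests used later in this guide):
# Workloads: Elasticsearch StatefulSet, Kibana Deployment, Fluentd DaemonSet
kubectl get statefulset es-cluster
kubectl get deployment kibana
kubectl get daemonset fluentd

# Services: headless "elasticsearch" for the cluster, "kibana-np" NodePort for the UI
kubectl get svc elasticsearch kibana-np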
Step-by-Step Setup
- You can clone https://github.com/guneycansanli/k8s-training.git and find all the YAML manifests under the kubernetes-efk-yamls directory.
- Note: All the EFK components get deployed in the default namespace.
git clone https://github.com/guneycansanli/k8s-training.git
1. Deploy Elasticsearch StatefulSet
Create a Headless Service
- Define the headless service (es-svc.yaml) for internal communication between Elasticsearch pods:
apiVersion: v1
kind: Service
metadata:
  name: elasticsearch
spec:
  selector:
    app: elasticsearch
  clusterIP: None
  ports:
    - port: 9200
      name: rest
    - port: 9300
      name: inter-node
- Apply the service:
kubectl apply -f es-svc.yaml
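A quick check that the service exists (the CLUSTER-IP column should show None, since this is a headless service):
kubectl get svc elasticsearch
The headless service is what later lets the Elasticsearch pods resolve each other by name, e.g. es-cluster-0.elasticsearch, which is exactly what discovery.seed_hosts in the StatefulSet below relies on.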
- Before we begin creating the StatefulSet for Elasticsearch, recall that a StatefulSet requires a storage class defined beforehand, which it uses to create volumes whenever required.
- Since I work with a local cluster in my home lab, there is no native storage backend: local Kubernetes clusters (e.g., minikube or kubeadm setups) don't come with a built-in storage provisioner. Storage is instead tied to the local filesystem of the worker nodes, and you have to manually specify the paths or directories that the PVs map to, using mechanisms like hostPath or local volumes.
- You need to create a StorageClass:
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: local-storage
provisioner: kubernetes.io/no-provisioner
volumeBindingMode: WaitForFirstConsumer
kubectl apply -f local-storage-class.yaml
- We also need a PersistentVolume for each replica to bind to:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-worker-0
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /opt/k8s/es-cluster-0
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kworker1
kubectl apply -f local-persistent-volume-kworker1.yaml
- Note: Make sure the path directory already exists on the node; you may also run into issues when creating PVs on worker or master nodes.
- I have created 3 different PVs so the Elasticsearch instances can each claim one (we have 3 replicas in the cluster); a second example is shown below.
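For reference, a second PV could look like the one below. The PV name, directory path, and target node here are assumptions from my lab setup, so adjust them to wherever you want the data to live:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: local-pv-worker-1
spec:
  capacity:
    storage: 2Gi
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Retain
  storageClassName: local-storage
  local:
    path: /opt/k8s/es-cluster-1
  nodeAffinity:
    required:
      nodeSelectorTerms:
        - matchExpressions:
            - key: kubernetes.io/hostname
              operator: In
              values:
                - kworker1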
- Deploy Elasticsearch
- Use the following StatefulSet (es-sts.yaml) to deploy Elasticsearch:
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: es-cluster
spec:
  serviceName: elasticsearch
  replicas: 3
  selector:
    matchLabels:
      app: elasticsearch
  template:
    metadata:
      labels:
        app: elasticsearch
    spec:
      containers:
        - name: elasticsearch
          image: docker.elastic.co/elasticsearch/elasticsearch:7.5.0
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          ports:
            - containerPort: 9200
              name: rest
              protocol: TCP
            - containerPort: 9300
              name: inter-node
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
          env:
            - name: cluster.name
              value: k8s-logs
            - name: node.name
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: discovery.seed_hosts
              value: "es-cluster-0.elasticsearch,es-cluster-1.elasticsearch,es-cluster-2.elasticsearch"
            - name: cluster.initial_master_nodes
              value: "es-cluster-0,es-cluster-1,es-cluster-2"
            - name: ES_JAVA_OPTS
              value: "-Xms512m -Xmx512m"
      initContainers:
        - name: fix-permissions
          image: busybox
          command: ["sh", "-c", "chown -R 1000:1000 /usr/share/elasticsearch/data"]
          securityContext:
            privileged: true
          volumeMounts:
            - name: data
              mountPath: /usr/share/elasticsearch/data
        - name: increase-vm-max-map
          image: busybox
          command: ["sysctl", "-w", "vm.max_map_count=262144"]
          securityContext:
            privileged: true
        - name: increase-fd-ulimit
          image: busybox
          command: ["sh", "-c", "ulimit -n 65536"]
          securityContext:
            privileged: true
      tolerations:
        - key: "node-role.kubernetes.io/control-plane"
          operator: "Exists"
          effect: "NoSchedule"
      nodeSelector:
        kubernetes.io/hostname: kworker1
  volumeClaimTemplates:
    - metadata:
        name: data
        labels:
          app: elasticsearch
      spec:
        accessModes: [ "ReadWriteOnce" ]
        storageClassName: "local-storage"
        resources:
          requests:
            storage: 2Gi
- Apply the StatefulSet:
kubectl apply -f es-sts.yaml
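Rolling out three pods with local volumes can take a little while; you can watch the progress and the volume claims like this:
kubectl rollout status statefulset/es-cluster
kubectl get pods -l app=elasticsearch
kubectl get pvc -l app=elasticsearch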
Verify Elasticsearch Deployment
- After the Elasticsearch pods come into the running state, let us verify the Elasticsearch StatefulSet. The easiest way to do this is to check the status of the cluster. To check the status, port-forward the Elasticsearch pod's 9200 port.
kubectl port-forward es-cluster-0 9200:9200
- You can send a curl request to check the cluster health:
curl http://localhost:9200/_cluster/health/?pretty
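A healthy cluster should report "status" : "green" and "number_of_nodes" : 3. A couple of other handy checks using the standard Elasticsearch cat APIs:
curl http://localhost:9200/_cat/nodes?v
curl http://localhost:9200/_cat/indices?v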
Deploy Kibana Deployment & Service
Kibana can be created as a simple Kubernetes deployment. In the following Kibana deployment manifest, the env var ELASTICSEARCH_URL is defined to configure the Elasticsearch cluster endpoint. Kibana uses this endpoint URL to connect to Elasticsearch.
- Create the Kibana deployment manifest as kibana-deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: kibana
  labels:
    app: kibana
spec:
  replicas: 1
  selector:
    matchLabels:
      app: kibana
  template:
    metadata:
      labels:
        app: kibana
    spec:
      containers:
        - name: kibana
          image: docker.elastic.co/kibana/kibana:7.5.0
          resources:
            limits:
              cpu: 1000m
            requests:
              cpu: 100m
          env:
            - name: ELASTICSEARCH_URL
              value: http://elasticsearch:9200
          ports:
            - containerPort: 5601
- Apply the manifest:
kubectl create -f kibana-deployment.yaml
To access the Kibana UI via the node's IP address, we'll create a NodePort service. This is a simple way to expose the service for demonstration purposes: a NodePort service assigns a specific port on each cluster node, allowing you to access the application using the node's IP and the assigned port.
In real-world projects, however, it's more common to use Kubernetes Ingress in combination with a ClusterIP service. This setup provides better control, security, and scalability by routing external traffic through the ingress, while the service itself remains accessible only within the cluster.
- Save the following manifest as kibana-svc.yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana-np
spec:
  selector:
    app: kibana
  type: NodePort
  ports:
    - port: 8080
      targetPort: 5601
      nodePort: 30000
- Create the kibana-np service:
kubectl create -f kibana-svc.yaml
- Now you will be able to access Kibana at http://<node-ip>:30000
- You may need to open that port in your firewall to reach the cluster node.
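As mentioned above, in real-world setups Kibana is usually exposed through an Ingress instead of a NodePort. The sketch below assumes an NGINX ingress controller and a made-up hostname (kibana.example.local); it points at the kibana-np service we just created, though in practice you would typically put a plain ClusterIP service behind the ingress:
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: kibana
spec:
  ingressClassName: nginx
  rules:
    - host: kibana.example.local
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: kibana-np
                port:
                  number: 8080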
Verify Kibana Deployment
- After the pods come into the running state, let us verify the Kibana deployment. The easiest way to do this is through the Kibana UI.
To check the status, port-forward the Kibana pod's 5601 port. If you have created the NodePort service, you can also use that.
kubectl port-forward <kibana-pod-name> 5601:5601
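With the port-forward in place, you can also query Kibana's status API from the terminal; once Kibana has connected to Elasticsearch it should report an overall green state:
curl http://localhost:5601/api/status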
Deploy Fluentd Kubernetes Manifests
Deploying Fluentd as a DaemonSet
Fluentd is typically deployed as a DaemonSet to ensure it can collect logs from all nodes in the cluster. To accomplish this, it also needs elevated permissions to access and retrieve pod metadata across all namespaces.
Service Accounts in Kubernetes
To grant specific permissions to components within a Kubernetes cluster, service accounts are used in conjunction with cluster roles and cluster role bindings. Let’s proceed to create the necessary service account and define appropriate roles.
Creating a Fluentd Cluster Role
A cluster role in Kubernetes defines rules that outline a specific set of permissions. For Fluentd, we need to grant permissions to access both pods and namespaces.
- Create a manifest fluentd-role.yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: fluentd
  labels:
    app: fluentd
rules:
  - apiGroups:
      - ""
    resources:
      - pods
      - namespaces
    verbs:
      - get
      - list
      - watch
- Apply the manifest
kubectl create -f fluentd-role.yaml
- Create Fluentd Service Account
- A service account in Kubernetes is an entity that provides an identity to a pod. Here, we want to create a service account to be used by the Fluentd pods.
- Create a manifest fluentd-sa.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: fluentd
  labels:
    app: fluentd
- Apply the manifest
kubectl create -f fluentd-sa.yaml
- Fluentd Cluster Role Binding
- Create a manifest fluentd-rb.yaml
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: fluentd
roleRef:
  kind: ClusterRole
  name: fluentd
  apiGroup: rbac.authorization.k8s.io
subjects:
  - kind: ServiceAccount
    name: fluentd
    namespace: default
- Apply the manifest
kubectl create -f fluentd-rb.yaml
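To double-check the RBAC wiring, you can impersonate the service account and ask the API server whether it is allowed to do what the cluster role grants (both commands should answer yes):
kubectl auth can-i list pods --as=system:serviceaccount:default:fluentd
kubectl auth can-i list namespaces --as=system:serviceaccount:default:fluentd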
- Deploy Fluentd DaemonSet
- Save the following as fluentd-ds.yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: fluentd
  labels:
    app: fluentd
spec:
  selector:
    matchLabels:
      app: fluentd
  template:
    metadata:
      labels:
        app: fluentd
    spec:
      serviceAccount: fluentd
      serviceAccountName: fluentd
      containers:
        - name: fluentd
          image: fluent/fluentd-kubernetes-daemonset:v1.4.2-debian-elasticsearch-1.1
          env:
            - name: FLUENT_ELASTICSEARCH_HOST
              value: "elasticsearch.default.svc.cluster.local"
            - name: FLUENT_ELASTICSEARCH_PORT
              value: "9200"
            - name: FLUENT_ELASTICSEARCH_SCHEME
              value: "http"
            - name: FLUENTD_SYSTEMD_CONF
              value: disable
            - name: FLUENT_CONTAINER_TAIL_PARSER_TYPE
              value: /^(?<time>.+) (?<stream>stdout|stderr) [^ ]* (?<log>.*)$/
          resources:
            limits:
              memory: 512Mi
            requests:
              cpu: 100m
              memory: 200Mi
          volumeMounts:
            - name: varlog
              mountPath: /var/log
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      terminationGracePeriodSeconds: 30
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers
          hostPath:
            path: /var/lib/docker/containers
- Apply
kubectl create -f fluentd-ds.yaml
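Fluentd should now run one pod per node. You can check the rollout and peek at the logs of one of the pods (kubectl picks a pod for you when you pass the DaemonSet name):
kubectl get ds fluentd
kubectl get pods -l app=fluentd -o wide
kubectl logs daemonset/fluentd --tail=20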
- Verify Fluentd Setup
- In order to verify the Fluentd installation, let us start a pod that creates logs continuously. We will then try to see these logs inside Kibana.
- Save the following as test-pod.yaml
apiVersion: v1
kind: Pod
metadata:
  name: counter
spec:
  containers:
    - name: count
      image: busybox
      args: [/bin/sh, -c, 'i=0; while true; do echo "Thanks for visiting devopscube! $i"; i=$((i+1)); sleep 1; done']
- Apply
kubectl create -f test-pod.yaml
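Before looking for the logs in Kibana, confirm the pod is actually emitting them:
kubectl logs counter --tail=5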
- At the end, we should have all of the above resources deployed.
- We can now verify that the logs are in Kibana.
- We need to do the port-forward again (or use the NodePort service).
- Step 1: Open the Kibana UI using the port-forward or the NodePort service endpoint, and head to the management console inside it.
- Step 2: Select the "Index Patterns" option under the Kibana section.
- Step 3: Create a new index pattern using the pattern "logstash-*".
- Step 4: Select "@timestamp" as the time field.
- Now we should be able to see the indexes and logs.
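In the Discover tab you can then narrow the view down to the test pod. Assuming the Kubernetes metadata filter that ships with the fluentd-kubernetes-daemonset image is enabled (it is by default), each log document carries kubernetes.* fields, so a KQL query like the following should show only the counter pod's output:
kubernetes.pod_name : "counter"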
Thanks for reading…
Guneycan Sanli.