Apache Kafka is the most common buffer solution deployed together with the ELK Stack. Kafka is deployed between the log shippers and the indexing tier, acting as a buffer that decouples the collection of data from its indexing:
In this blog, we’ll see how to deploy all the components required to set up a resilient log pipeline with Apache Kafka and the ELK Stack:
To perform the steps below, I set up a single Ubuntu 18.04 VM on AWS EC2 using local storage. In real-life scenarios, you will probably have all these components running on separate machines.
I started the instance in the public subnet of a VPC and then set up a security group to enable access from anywhere using SSH and TCP 5601 (for Kibana).
We will use Apache access logs for the pipeline, but you can just as easily use VPC Flow Logs, ALB access logs, and so on.
Log in to your Ubuntu system with a user that has sudo privileges. For a remote Ubuntu server, use SSH to access it; Windows users can use PuTTY or PowerShell to open the session.
Finally, I added a new Elastic IP address and associated it with the running instance.
Step 1: Installing Elasticsearch
We will start by installing the main component in the stack — Elasticsearch. Since version 7.x, Elasticsearch is bundled with Java so we can jump right ahead with adding Elastic’s signing key:
Download and install the public signing key:
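The key can be fetched from Elastic's standard GPG key location and added with apt-key:

```shell
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo apt-key add -
```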
Now you may need to install the apt-transport-https package on Debian before proceeding:
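On a fresh Debian or Ubuntu system this looks like:

```shell
sudo apt-get update
sudo apt-get install apt-transport-https
```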
Our next step is to add the repository definition to our system:
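One way to add the Elastic 7.x repository (adjust the version branch if you are installing a different release):

```shell
echo "deb https://artifacts.elastic.co/packages/7.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-7.x.list
```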
You can install the Elasticsearch Debian package with:
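After refreshing the package index so apt picks up the new repository:

```shell
sudo apt-get update && sudo apt-get install elasticsearch
```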
Before we bootstrap Elasticsearch, we need to apply some basic configurations using the Elasticsearch configuration file at: /etc/elasticsearch/elasticsearch.yml:
Since we are installing Elasticsearch on AWS, we will bind Elasticsearch to localhost.
Also, we need to define the private IP of our EC2 instance as a master-eligible node:
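The relevant settings in /etc/elasticsearch/elasticsearch.yml might look like the following sketch, where `<InstancePrivateIp>` is a placeholder for your EC2 instance's private IP:

```yaml
# Bind Elasticsearch to localhost only
network.host: "localhost"
http.port: 9200
# Replace with the private IP of the EC2 instance (placeholder)
cluster.initial_master_nodes: ["<InstancePrivateIp>"]
```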
Save the file and run Elasticsearch with:
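On Ubuntu, the service can be started with:

```shell
sudo service elasticsearch start
```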
To confirm that everything is working as expected, point curl to: http://localhost:9200, and you should see something like the following output (give Elasticsearch a minute or two before you start to worry about not seeing any response):
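For example:

```shell
curl http://localhost:9200
```

If Elasticsearch is up, the response is a small JSON document containing the node name, cluster name, and version details.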
Step 2: Installing Logstash
Next up, the “L” in ELK — Logstash. Since we already defined the Elastic repository on the system, all we have to do to install Logstash is run:
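```shell
sudo apt-get install logstash
```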
Next, we will configure a Logstash pipeline that pulls our logs from a Kafka topic, processes these logs, and ships them on to Elasticsearch for indexing.
Let’s create a new config file:
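A minimal pipeline sketch, placed for example at /etc/logstash/conf.d/apache.conf, might look like this — the Kafka broker address (localhost:9092) and topic name (apache) are illustrative assumptions, and the grok filter assumes Apache combined-format access logs:

```
input {
  kafka {
    # Assumed local Kafka broker and topic name
    bootstrap_servers => "localhost:9092"
    topics => "apache"
  }
}
filter {
  # Parse Apache combined access log lines
  grok {
    match => { "message" => "%{COMBINEDAPACHELOG}" }
  }
  # Use the log's own timestamp as the event timestamp
  date {
    match => [ "timestamp", "dd/MMM/yyyy:HH:mm:ss Z" ]
  }
  # Enrich with geo information based on the client IP
  geoip {
    source => "clientip"
  }
}
output {
  elasticsearch {
    hosts => ["localhost:9200"]
  }
}
```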
As you can see — we’re using the Logstash Kafka input plugin to define the Kafka host and the topic we want Logstash to pull from. We’re applying some filtering to the logs and we’re shipping the data to our local Elasticsearch instance.
Step 3: Installing Kibana
Let’s move on to the next component in the ELK Stack — Kibana. As before, we will use a simple apt command to install Kibana:
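```shell
sudo apt-get install kibana
```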
We will then open up the Kibana configuration file at: /etc/kibana/kibana.yml, and make sure we have the correct configurations defined:
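A minimal configuration sketch for /etc/kibana/kibana.yml — binding to 0.0.0.0 is an assumption here so that Kibana is reachable via the instance's public IP:

```yaml
server.port: 5601
# Listen on all interfaces so Kibana is reachable from outside the instance
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://localhost:9200"]
```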
Then enable and start the Kibana service:
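```shell
sudo systemctl enable kibana
sudo systemctl start kibana
```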
We also need to install Filebeat. Use:
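```shell
sudo apt-get update && sudo apt-get install filebeat
```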
Open up Kibana in your browser with http://<PUBLIC_IP>:5601. You will be presented with the Kibana home page.
5+ years of experience in database development. Worked with numerous companies on projects including database optimization, performance tuning, automation, and SQL programming.