In my last blog post, I mentioned that a new blog on honeypot visualization would be coming soon. My apologies for the long delay to those who have been waiting for this post. I thought that I had settled on the best course of action regarding honeypot visualization options but, as you will see, it seems that I was premature. Be forewarned – this post is quite long.
I’ve discussed visualizing honeypot data in previous blog articles. At the time, the best options that I had found were Kippo-Graph for my Cowrie honeypot data and DionaeaFR for my Dionaea data.
My previous blog posts detailed a few twists and tricks that you need to know to get them both going. The problem is that the list of twists and tricks kept growing or changing as new versions of the underlying software dependencies were released. While much of that dependency software is still being updated, the visualization packages themselves are not.
Even if you can get Kippo-Graph and DionaeaFR running successfully on the honeypot server, it’s only a matter of time before you’ll need to set everything up anew. I usually rebuild my honeypots pretty regularly – maybe every month or two.
Neither creating a new droplet nor reinstalling the honeypots is a problem – it’s now a very quick process (see my last post). I can redo everything in probably 5-10 minutes.
Having to reinstall the front-end software each time is another story and that’s what started me on this quest.
Towards a Solution
I concluded that the best practice, in my situation, was to install the visualization software locally in a virtual machine. I could do whatever tweaks were necessary to get it working in the local VM and then no longer need to worry about it whenever I rebuilt my honeypot server. I would use the SCP command to copy the data from the honeypot server to my local VM and then perform the visualizations there.
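As a rough illustration of the copy step, here is a minimal sketch of a helper that pulls the Cowrie JSON logs down to the local VM with scp. The login, host name, remote log path and local folder are all placeholders that I made up for the example, so adjust them to match your own honeypot.

```shell
# Assumed values (hypothetical): replace the login, host, and paths with your own.
HONEYPOT="${HONEYPOT:-user@honeypot.example.com}"
REMOTE_LOGS="${REMOTE_LOGS:-/home/cowrie/cowrie/log/cowrie.json*}"
LOCAL_DIR="${LOCAL_DIR:-$HOME/honeypot-data/cowrie}"

# Copy the Cowrie JSON logs from the honeypot server into a local folder.
fetch_cowrie_logs() {
  mkdir -p "$LOCAL_DIR" || return 1
  # The remote pattern is passed as one quoted argument so that the
  # local shell does not try to expand the wildcard itself.
  scp "${HONEYPOT}:${REMOTE_LOGS}" "$LOCAL_DIR/"
}
```

Calling fetch_cowrie_logs after sourcing this file copies whatever matches the remote pattern into the local folder, creating that folder first if needed.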
Once I had settled on this method, I realized that there were now other possible visualization options that I could use.
Ever since my initial experiments with the Modern Honey Network (MHN) I had been very interested in using an ELK stack for visualizing my honeypot data. For those who aren’t familiar with it, the ELK stack is an open-source conglomeration of software, all made by the same company – Elastic. The acronym stands for:
E – Elasticsearch
L – Logstash
K – Kibana
In a nutshell, Logstash processes log files in a variety of formats and sends the data to an Elasticsearch index for categorization, organization and retrieval. Lastly, Kibana provides a way to visualize the data stored in an Elasticsearch index using a web browser. Using Kibana allows one to create a highly customizable visualization dashboard.
Now that I was planning to use a local VM as the visualization server, using ELK became a possibility.
If you’re new to using the ELK stack components, the best video overview that I found was this one by Minsuk Heo. Sometimes his accent is a little tricky but, in my opinion, his are the best “ELK from the ground up” videos on the internet:
Experiments With ELK
Often, an ELK stack environment includes an additional component that is used to send the data from the server to the ELK computer. Elastic provides a tool called Filebeat to do this. The basic data flow looks something like this:
Data source(s) --> Filebeat --> Logstash --> Elasticsearch --> Kibana
I ran into my first hurdle when I realized that this pipeline wouldn’t be an option for me since my ELK stack was running on a VM on one of my home computers and my honeypots were in the cloud. The second part to this problem came about when virtually all of the documentation and tutorials I found were assuming this type of usage. I needed to find a way to place locally stored data into an Elasticsearch index. This is when I found a blog post and companion video by Mirko Nasato. Mirko runs a social media game called Wordismic. In his blog and video he details how he imports his locally stored game data from his PC into Elasticsearch just like what we want to do.
Using Mirko’s tutorial, I was able to index my Cowrie honeypot data and then use Kibana to visualize it. This was really easy since Cowrie can natively log its data in JSON format. In no time, I had a working dashboard like this:
Using ELK for Dionaea was another story, however.
Dionaea has an option that is supposed to allow it to log to a JSON file. It’s a .yaml file that you just have to enable (as with many other services and handlers in Dionaea). I’ve used this feature in the past while testing. For some reason, the JSON logging seems to no longer be working. At least that was the case a couple of weeks ago.
To circumvent this problem, I concluded that I could just install DionaeaFR on the local visualization VM to view my Dionaea data. That sounded like a winning plan until I had a direct message on Twitter from Ignacio Sanmillan about Dionaea and MHN. The very next day, he sent me a link to check out the basic visualization that he had worked up. That was really impressive! He’s added a few additional features since then and it seems to be really coming along.
Seeing how quickly Ignacio was able to put together his front-end started me thinking about making my own Dionaea dashboard. This proved to be much more difficult for me than expected (my poor PHP skills became clear!). For now, I would stick to a local version of DionaeaFR for my Dionaea data.
Setup and Installation
Here are the details of the installation procedures that I use to get my honeypot visualization solution up and running.
VirtualBox with Linux Mint 64-bit (and GuestAdditions)
1 Processor / 3 GB RAM / 50 GB HD
1 Shared Folder
If you’re using a Shared Folder to transfer data back and forth between the VM and the host, the following command will help with any permissions issues you may encounter.
sudo adduser username vboxsf
sudo apt-get update
sudo apt-get upgrade
sudo apt-get install curl
sudo apt-get install software-properties-common
sudo apt-get install ubuntu-restricted-extras
The ELK stack requires a recent version of the Java Development Kit.
Download Java SE Development Kit from here.
I downloaded jdk1.8.0_131-linux-x64.tar.gz just before writing this article. Now copy the archive to the /opt folder:
sudo cp jdk1.8.0_131-linux-x64.tar.gz /opt
cd /opt
sudo tar -zxvf jdk1.8.0_131-linux-x64.tar.gz
sudo chown -R root jdk1.8.0_131
Now you can delete the original .tar.gz file to free up space.
Next, you need to set up some alternatives for the Java components. The paths must point at the folder you just extracted (jdk1.8.0_131 here).
sudo update-alternatives --install /usr/bin/java java /opt/jdk1.8.0_131/bin/java 1
sudo update-alternatives --install /usr/bin/javac javac /opt/jdk1.8.0_131/bin/javac 1
sudo update-alternatives --install /usr/bin/jar jar /opt/jdk1.8.0_131/bin/jar 1
You’ll have to then set the Java version to be used manually.
sudo update-alternatives --config java
Select the version you just installed.
If you want, you can test by running them at the command line (run java -version, for instance, and be sure the correct version is reported).
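If you would rather check this from a script, the following is a minimal sketch that prints only the version string of whichever java comes first on the PATH. The function name is my own invention; the only real command it relies on is java -version, which writes its report to stderr.

```shell
# Print the version string reported by the java that is first on the PATH.
active_java_version() {
  # java -version writes to stderr, so redirect it into the pipe,
  # then take the quoted part of the first line that mentions "version".
  java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}
```

After the steps above, active_java_version should print the version of the JDK you placed in /opt. If it prints something else, run the update-alternatives --config java step again.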
The ELK Stack
There are a couple of ways to install and run the ELK components. One method downloads the components as .DEB files and installs them using GDebi. After installation, you can run the components as services and configure them to start automatically. This is very easy and it’s the way I did it initially.
In my initial experiments with ELK and Cowrie, I was not using Logstash. After all, we only need to get the local data into an index and then visualize it. This is also how Mirko Nasato imported his data in the referenced blog post/video. We will continue this and not deal with using Logstash.
Option 1 – Running ELK Components as Services
Go to the Elastic website and download the latest version of each of the three components: Elasticsearch, Logstash and Kibana
(Users of Linux distributions should get the .DEB files)
Once downloaded, use GDebi to install each component. This will install to /usr/share/elasticsearch (similar folder for other components).
Now simply start the services with:
sudo service elasticsearch start
sudo service kibana start
Setting ELK to Start On Bootup
Check to see whether your system is using SysV or systemd
ps -p 1
Running Elasticsearch with systemd (e.g., Elementary OS)
Start and stop Elasticsearch by:
sudo systemctl start elasticsearch.service
sudo systemctl stop elasticsearch.service
To configure Elasticsearch to start on boot, run the following:
sudo /bin/systemctl daemon-reload
sudo /bin/systemctl enable elasticsearch.service
Running Elasticsearch with SysV init (e.g., Linux Mint)
Start and stop Elasticsearch by:
sudo -i service elasticsearch start
sudo -i service elasticsearch stop
To configure Elasticsearch to start automatically:
sudo update-rc.d elasticsearch defaults 95 10
You can do the same to start Kibana and Logstash as services and on boot.
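To tie the two cases together, here is a minimal sketch that looks at the process running as PID 1 and prints (without running) the matching "start on boot" command for each of the three services. The function names are made up for the example, and it simply assumes that the same update-rc.d arguments shown above for Elasticsearch are acceptable for Kibana and Logstash too.

```shell
# Name of the process running as PID 1, e.g. "systemd" or "init".
init_name() {
  ps -p 1 -o comm= | tr -d '[:space:]'
}

# Print the command that enables a service at boot for the given init system.
enable_cmd() {
  case "$1" in
    systemd) echo "sudo systemctl enable $2.service" ;;
    init)    echo "sudo update-rc.d $2 defaults 95 10" ;;
    *)       echo "unrecognized init system: $1" >&2; return 1 ;;
  esac
}

# Show (but do not run) the enable command for each ELK service.
show_enable_cmds() {
  local init svc
  init="$(init_name)"
  for svc in elasticsearch kibana logstash; do
    enable_cmd "$init" "$svc" || return 1
  done
}
```

Running show_enable_cmds prints three commands that you can review and then paste into the terminal. Nothing is changed on the system until you do.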
Option 2 – Running ELK Components as Needed
When I configured my VM to run ELK only as needed, I downloaded the compressed ELK Stack component files from here (https://www.elastic.co/downloads). I got the .tar.gz files.
From a Terminal window, copy the files to the /opt folder (from the Downloads folder):
sudo cp elasticsearch-5.4.3.tar.gz /opt
sudo cp kibana-5.4.3-linux-x86_64.tar.gz /opt
sudo cp logstash-5.4.3.tar.gz /opt
Now, change to the /opt folder and untar each file:
cd /opt
sudo tar -zxvf elasticsearch-5.4.3.tar.gz
sudo tar -zxvf kibana-5.4.3-linux-x86_64.tar.gz
sudo tar -zxvf logstash-5.4.3.tar.gz
To start Elasticsearch, for instance, you just do the following:
cd elasticsearch-5.4.3
./bin/elasticsearch
You can use the same process to start Kibana and Logstash.
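Before starting the components, I made the minor configuration changes below, which are probably not strictly necessary. Edit the Elasticsearch configuration file (this is its location for the .DEB install; with the .tar.gz option it is config/elasticsearch.yml inside the extracted folder):
sudo nano /etc/elasticsearch/elasticsearch.yml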
Now, change the Cluster and Node names:
I set mine to:
Change network.host: 192.168.0.1 to network.host: localhost
Save and exit.
Starting and accessing ELK
Now that you have ELK running, we should be able to use a web browser and access Elasticsearch at localhost:9200 and Kibana at localhost:5601.
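The same check can be done from the terminal. This is a minimal sketch, assuming the default ports mentioned above, that reports whether each of the two web endpoints answers at all. The function names are invented for the example, and it only tests that something responds over HTTP, not that the service is fully healthy.

```shell
# Succeed if the URL returns any HTTP response within five seconds.
is_up() {
  curl --silent --output /dev/null --max-time 5 "$1"
}

# Report the state of Elasticsearch and Kibana on their default ports.
check_elk() {
  local pair name url
  for pair in "Elasticsearch=http://localhost:9200" "Kibana=http://localhost:5601"; do
    name="${pair%%=*}"   # text before the first "="
    url="${pair#*=}"     # text after the first "="
    if is_up "$url"; then
      echo "$name is up at $url"
    else
      echo "$name is NOT reachable at $url"
    fi
  done
}
```

Both components can take a little while to start, so if check_elk reports one as not reachable right after starting it, wait a few seconds and try again.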
Working with ELK
I’ve uploaded the scripts that I mention in the next section to my GitHub page here.
The first thing that we need is an Elasticsearch index where we can store our data. Generally speaking, creating an index is simple. You can create an index called “test” by typing the following at a terminal prompt:
curl -XPUT 'localhost:9200/test'
You should see an “acknowledged”: true result. You can verify that an index named “test” exists by typing:
curl -XGET 'localhost:9200/test?pretty'
This script creates the Cowrie index in Elasticsearch and also provides some mapping information:
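As a rough sketch of the general shape (this is not the script from the GitHub page), an index with a mapping could be created along these lines. The type name "event" is invented for the example, and the choice of which fields to map is an assumption. The field names themselves (timestamp, src_ip, username, password) are ones that Cowrie writes to its JSON log. The typed mapping layout shown here matches the Elasticsearch 5.x series used in this post; later major versions dropped mapping types.

```shell
# Print an index definition for Cowrie events.
# Assumption: the type name "event" is hypothetical, chosen for this sketch.
cowrie_index_body() {
  cat <<'EOF'
{
  "mappings": {
    "event": {
      "properties": {
        "timestamp": { "type": "date" },
        "src_ip":    { "type": "ip" },
        "username":  { "type": "keyword" },
        "password":  { "type": "keyword" }
      }
    }
  }
}
EOF
}

# Send the definition to a local Elasticsearch to create the "cowrie" index.
create_cowrie_index() {
  cowrie_index_body |
    curl --silent -XPUT 'http://localhost:9200/cowrie?pretty' \
         -H 'Content-Type: application/json' --data-binary @-
}
```

Mapping src_ip as an ip field and timestamp as a date is what lets Kibana treat the index as time-based and filter on address ranges. Fields that are not listed are left for Elasticsearch to map dynamically.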
Now, you need to get your data into the new Elasticsearch index. The following script loops through the folder named ‘cowrie’ and processes each file using the ./bulk_index.sh script. It prepends the appropriate data to each line of the Cowrie data files putting them in the proper JSON format for Elasticsearch and adds each to the ‘cowrie’ index that we just created.
for JSON in cowrie/*.*; do ./bulk_index.sh "$JSON"; done
Now that your data has been saved to an Elasticsearch index, you can use Kibana to start visualizing it.
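The real bulk_index.sh is on the GitHub page. Purely to illustrate the idea of "prepending the appropriate data to each line", here is a minimal sketch of what such a helper might do. Elasticsearch's bulk API expects an action line before every document line, so the sketch inserts one ahead of each Cowrie event. The function names and the type name "event" in the URL are assumptions made for the example.

```shell
# Assumed names (hypothetical): index "cowrie" and type "event".
ES_BULK_URL="${ES_BULK_URL:-http://localhost:9200/cowrie/event/_bulk}"

# Read Cowrie JSON (one event per line) on stdin and write bulk-API input:
# an {"index":{}} action line before every non-empty event line.
to_bulk() {
  awk 'NF { print "{\"index\":{}}"; print }'
}

# Convert one log file and post it to Elasticsearch in a single bulk request.
bulk_index() {
  to_bulk < "$1" |
    curl --silent -XPOST "$ES_BULK_URL" \
         -H 'Content-Type: application/x-ndjson' --data-binary @- >/dev/null
}
```

With these functions sourced, the loop body would call bulk_index on each file instead of ./bulk_index.sh. Each file goes up as one request, so very large log files may need to be split first.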
- Open your browser and go to localhost:5601
- In the field asking you to define an index name or pattern, erase logstash-* and type cowrie
- Be sure to check the box that says Index contains time-based events
- Click the create button
- Now click “Discover” on the left menu bar to verify and view your data
- Most likely, it is only showing the “last 15 minutes” time frame (upper right) – adjust this range for your data
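If Discover still shows nothing after widening the time range, it can help to confirm from the terminal that documents actually reached the index. The following is a minimal sketch, assuming Elasticsearch on its default local port; the function name is made up, and it uses the _count endpoint to print just the number of documents.

```shell
# Print how many documents the given index holds (default: cowrie).
index_count() {
  curl --silent "http://localhost:9200/${1:-cowrie}/_count" |
    sed -n 's/.*"count"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}
```

A result of 0, or no output at all, points to a problem with the import step rather than with the Kibana time filter.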
Next, start adding some visualizations.
- Click “Visualize” on the left menu
- Click “Create Visualization” and off you go.
(Note: it is not my intention to give a tutorial on creating visualizations in Kibana. There are many great resources for that).
The only missing piece of our ELK puzzle is turning the IP addresses into geo-points so that we can plot them on a map. To do this, we’ll need to use the powers of Logstash. To date, I haven’t gotten this working properly. I’ll do a short blog post as soon as I can get it working.
Visualizing Dionaea Data
Since I’ve covered DionaeaFR in a previous post, I won’t go into any details about it. The only thing that’s new about the DionaeaFR installation is that there’s now a script that will do all of the heavy lifting. I received a comment on my original DionaeaFR blog post from someone who goes by R1ckyz1. He wrote a script to install DionaeaFR and all of its dependencies. I tweaked it a bit and you can find it here.
Once DionaeaFR has been installed, you can start it using this script. After that, you should be able to access the interface in a browser using localhost:8000
Conclusion and Still Unfinished
For the most part, I’m not happy with this blog entry – there’s still too much left undone. Hopefully, this lengthy foray into honeypot visualization options has given you something to work with as you consider how to best display and evaluate the data that you capture.
Basically, here’s what I still would like to do with this topic:
- Find a way to display multiple fields in one visualization (e.g., username/password)
- Clean up the Kibana dashboard – remove visualization labels
- Convert the IP addresses into geo-points to be able to display them on a map
- Figure out a way to get Dionaea JSON working again so I can display the data with ELK as well
- Try to recreate the DionaeaFR display data using amCharts
I guess that means that there will most likely be a sequel to this blog post at some point (hopefully soon).