In this article I’ll demonstrate, step by step, how to set up Splunk analytics to detect successful known and unknown malware attacks on web hosting systems in real time.
In addition, the same solution will include instructions to deploy fully automated investigative analytics to discover the origins of attackers (IP addresses) as well as any modifications within the file system.
This information is essential to discover and immediately eliminate all possible backdoors and exploits that the attacker tried to plant.
Real-time alerts will be delivered via email to the system administrator as soon as an attack occurs. The same information will be available via the Splunk web interface for further analysis.
The information presented will help system administrators as well as hosting service providers devise measures to detect close to 100% of possibly successful cyber attacks and take immediate action before malware tries to propagate further and cause significant damage or loss to the business.
Deploying such a system can not only help prevent significant monetary and business losses for the enterprise, but can also help avoid the embarrassment and negative publicity that follow when customers and news outlets learn about a successful hack before the business itself does.
While the steps described are somewhat specific to CentOS Linux hosting based on WHM/cPanel, the same approach can be adjusted to any type of hosting and operating environment thanks to Splunk’s multi-platform support.
If you’ve ever administered web hosting servers or been the webmaster of a self-hosted website, the issues with trojans, backdoors, exploits, viruses, defacement and all kinds of web malware should be very familiar to you.
Dealing with malware becomes critical when sites are built with popular content management systems like WordPress. Self-hosted WordPress sites are among the most attractive and easiest targets for hackers and spammers.
Scanning for outdated plugins and buggy themes and exploiting 0-day vulnerabilities, hackers try to penetrate web hosting defences 24/7/365. Once they succeed, the hacked website, and quite often the hosting server itself, becomes a zombie in the hands of malicious attackers.
This means I have had to deal with many cases of malware infection, defacement, sudden outgoing spam activity, injection of malicious content, planted phishing pages, redirects to spammy sites, as well as uploads of malware and trojans.
Using anti-malware and security scanning software helps to detect the presence of malware and to receive alerts on detected infections, but only up to a point.
The problem is that it detects only about 75-85% of malware occurrences and provides no information about how the malware got in, and that is the big problem.
Just like a living virus, malware ensures its own survival by planting copies of itself and installing hidden backdoors all over the hosting file system space.
Deleting 9 out of 10 malware occurrences means that one hidden backdoor still remains somewhere, and it will be reused to reinfect and take control of the system again. Detecting all occurrences of malware is essential to protect the file system space.
In addition, it is no less important to detect the attacker’s origin in order to set firewall blocks, and especially to trace the steps that led to the successful break-in. This allows system administrators to discover weak spots and previously unknown vulnerabilities and quickly close the loopholes.
Reminder note: some folder paths and settings mentioned here are specific to cPanel/WHM based hosting on CentOS 6 Linux. However, the concepts described can be ported to any configuration, even Windows environments, thanks to the flexibility of Splunk.
The problem I encountered with other malware detection systems is their delayed, signature-based scans, which not only miss unknown infections but also provide no insight into the who, how and when of successful cyber attacks.
To solve this problem I looked into setting up file monitoring daemons such as inotify, auditd and similar tools. The idea is that the monitoring daemon generates a log of alerts, which is subsequently imported into Splunk.
The problem with these tools is that they either require complex configuration, produce tons of useless extra event information, or fail to monitor subtrees.
I also dislike adding more and more moving parts to an existing solution, as this complicates the maintenance work of overseeing all of them.
To monitor for the presence of malware we need to:
- Be able to monitor the file system (user account web home directories) in real time, recursively, starting from a given directory. In my case it’s /home/* and its subtrees.
- Be able to monitor for all essential events of interest: adds, updates, renames.
- Be able to scan modified file contents for suspicious fragments in real time.
- Be able to discover the web origins of suspicious modifications, such as the IP addresses of the users who caused the modification to occur.
This will make it possible to trace back malicious activity and discover the root cause of a security breach.
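To make the first two requirements concrete, here is a minimal Python sketch of recursive change detection by comparing two snapshots of a directory tree. It is only an illustration of what "adds, updates, renames" means in practice and is not the Splunk-based method this article builds; the function names `snapshot` and `diff_snapshots` are hypothetical. A rename shows up here as one removed path plus one added path.

```python
import os


def snapshot(root):
    """Map every file path under root (recursively) to (mtime_ns, size)."""
    state = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # the file vanished between listing and stat
            state[path] = (st.st_mtime_ns, st.st_size)
    return state


def diff_snapshots(old, new):
    """Return (added, updated, removed) path lists between two snapshots."""
    added = sorted(p for p in new if p not in old)
    removed = sorted(p for p in old if p not in new)
    updated = sorted(p for p in new if p in old and new[p] != old[p])
    return added, updated, removed
```

Polling like this is exactly the kind of extra moving part that is better avoided on a production server, but it shows what information any monitoring solution has to produce.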
I decided to solve the first three tasks in a non-traditional way: instead of installing and configuring monitoring daemons and twisting their arms and legs to make them do what I need, why not use Splunk for the same purpose? Splunk is perfectly suited to monitoring changes within subtrees, so why not use this ability to monitor user file spaces?
So I decided to throw all the garbage of user file system content into Splunk and see how it feels.
Splunk loves garbage! 🙂 And that’s the beauty of it: throw any type of data at Splunk, regardless of format, and it will do its best to make sense of it, index it and make it available for searches. Even with all defaults, no data will be confusing to Splunk.
Essentially, this approach tells Splunk to treat user file system content as data inputs. All user scripts (*.php, *.js, *.py, *.pl and similar) will be considered by Splunk as a source of “eventing” data.
In our case the actual event of interest consists of the ‘source’ field (the actual file name) and the _raw content of that file.
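The idea of treating a script file as an event with a ‘source’ and a _raw body can be illustrated outside Splunk with a small Python sketch. The two patterns below are illustrative assumptions about what a "suspicious fragment" might look like in PHP, not the search terms used later in this article, and `make_event` and `suspicious_matches` are hypothetical helper names.

```python
import re

# Illustrative patterns only (assumptions): obfuscated-code idioms
# that are often worth a second look in PHP files.
SUSPICIOUS_PATTERNS = [
    re.compile(r"eval\s*\(\s*base64_decode\s*\(", re.IGNORECASE),
    re.compile(r"gzinflate\s*\(\s*base64_decode\s*\(", re.IGNORECASE),
]


def make_event(source, raw):
    """Model an indexed file the way this article uses it: name plus content."""
    return {"source": source, "_raw": raw}


def suspicious_matches(event):
    """Return the patterns (as strings) that occur in the event's raw content."""
    return [p.pattern for p in SUSPICIOUS_PATTERNS if p.search(event["_raw"])]


bad = make_event(
    "/home/johnsmith/public_html/bad-script.php",
    "<?php eval(base64_decode('aGVsbG8=')); ?>",
)
clean = make_event("/home/johnsmith/public_html/index.php", "<?php echo 'hi'; ?>")
```

Calling `suspicious_matches(bad)` returns a non-empty list while `suspicious_matches(clean)` returns an empty one, which is the same yes/no signal an alert would be built on.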
To set this up I installed the Splunk forwarder on the actual hosting server and used a second server (a VPS or a dedicated machine works equally well) to install the Splunk Enterprise receiving side and configure all the alerting logic.
This way I put minimal load on the hosting server: its only task is to send data in real time to the indexer for further analysis. Technically it is possible to set everything up on a single machine if you want to.
The Splunk forwarder will send all data about web traffic (Apache access_combined events), all events about modifications to user script files, and the contents of the Apache configuration file, httpd.conf.
We need the httpd.conf information to be able to map a user account and source file, such as:
/home/johnsmith/public_html/bad-script.php
to an actual domain name.
httpd.conf contains configuration entries such as this:
<VirtualHost 220.127.116.11:80>
    ServerName blog.johnsmith.com
    ServerAlias www.blog.johnsmith.com
    DocumentRoot /home/johnsmith/public_html
    ...
</VirtualHost>
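The mapping from a file path to domain names can be sketched in a few lines of Python. This is a simplified illustration, not a full Apache configuration parser: it assumes one directive per line inside each VirtualHost block and ignores includes, and the names `parse_vhosts` and `domains_for_file` are hypothetical.

```python
import re

VHOST_RE = re.compile(
    r"<VirtualHost\s+[^>]*>(.*?)</VirtualHost>", re.DOTALL | re.IGNORECASE
)

SAMPLE_CONF = """
<VirtualHost 220.127.116.11:80>
    ServerName blog.johnsmith.com
    ServerAlias www.blog.johnsmith.com
    DocumentRoot /home/johnsmith/public_html
</VirtualHost>
"""


def parse_vhosts(conf_text):
    """Return a list of (document_root, [domain names]) pairs."""
    vhosts = []
    for body in VHOST_RE.findall(conf_text):
        names, docroot = [], None
        for line in body.splitlines():
            parts = line.split()
            if not parts or parts[0].startswith("#"):
                continue  # skip blank lines and comments
            directive = parts[0].lower()
            if directive in ("servername", "serveralias"):
                names.extend(parts[1:])
            elif directive == "documentroot" and len(parts) > 1:
                docroot = parts[1].strip('"').rstrip("/")
        if docroot:
            vhosts.append((docroot, names))
    return vhosts


def domains_for_file(path, vhosts):
    """List every domain whose DocumentRoot contains the given file path."""
    found = []
    for docroot, names in vhosts:
        if path.startswith(docroot + "/"):
            found.extend(names)
    return found
```

With the sample above, `domains_for_file("/home/johnsmith/public_html/bad-script.php", parse_vhosts(SAMPLE_CONF))` yields both the ServerName and the ServerAlias of that virtual host.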
When we detect that bad-script.php is suspicious, we want to find the IP address of the malicious attacker.
The way a WHM/cPanel setup works is that each cPanel user’s space is separate from every other user’s. In other words, it’s impossible (without root access to the server itself) for a web attacker to plant anything from the ‘johnsmith’ user space into any other user’s space. However, if cPanel user ‘johnsmith’ hosts a number of websites within the same cPanel account, such as blog.johnsmith.com and store.johnsmith.com, an attacker could easily copy or modify script files within each of these sites. When we know the ‘source’ field of a suspicious file, which is nothing other than the actual file name, /home/johnsmith/public_html/bad-script.php, we can derive the account name, which is ‘johnsmith’.
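Deriving the account name from the ‘source’ value is a simple pattern match. Here is a minimal Python sketch, assuming all user home directories live directly under /home as in the example path; `account_from_source` is a hypothetical helper name.

```python
import re

# Assumption: account home directories are /home/<account>/...
ACCOUNT_RE = re.compile(r"^/home/([^/]+)/")


def account_from_source(source):
    """Return the account name embedded in a file path, or None."""
    match = ACCOUNT_RE.match(source)
    return match.group(1) if match else None
```

For /home/johnsmith/public_html/bad-script.php this returns ‘johnsmith’; for a path outside /home it returns None.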
Then, using the configuration data from httpd.conf and the time of the attack, we can “triangulate” the list of IP addresses that visited any site belonging to the ‘johnsmith’ account within that timeframe. From this point we will have enough data to identify the attacker and trace the steps backwards to see where and how the break-in occurred.
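The “triangulation” step boils down to filtering access log entries by domain and by time window. Below is a hedged Python sketch of that logic. It assumes each log line is already paired with the domain it belongs to and follows the common Apache access log layout (client IP first, timestamp in square brackets); the five-minute window, the sample entries and the function names are all hypothetical.

```python
import re
from datetime import datetime, timedelta, timezone

LOG_RE = re.compile(r"^(\S+) \S+ \S+ \[([^\]]+)\]")


def parse_entry(line):
    """Extract (client_ip, timestamp) from one access log line, or None."""
    match = LOG_RE.match(line)
    if not match:
        return None
    ip, stamp = match.groups()
    return ip, datetime.strptime(stamp, "%d/%b/%Y:%H:%M:%S %z")


def suspect_ips(entries, account_domains, attack_time, window=timedelta(minutes=5)):
    """entries: iterable of (domain, raw log line). Returns sorted unique IPs."""
    ips = set()
    for domain, line in entries:
        if domain not in account_domains:
            continue  # traffic for another account's sites
        parsed = parse_entry(line)
        if parsed is None:
            continue
        ip, when = parsed
        if abs(when - attack_time) <= window:
            ips.add(ip)
    return sorted(ips)


# Made-up sample data using documentation-only IP ranges.
SAMPLE_ENTRIES = [
    ("blog.johnsmith.com", '203.0.113.7 - - [01/Mar/2014:11:58:30 +0000] "POST /upload.php HTTP/1.1" 200 512 "-" "-"'),
    ("store.johnsmith.com", '198.51.100.9 - - [01/Mar/2014:09:00:00 +0000] "GET / HTTP/1.1" 200 1024 "-" "-"'),
    ("other.example.com", '192.0.2.1 - - [01/Mar/2014:12:00:10 +0000] "GET / HTTP/1.1" 200 1024 "-" "-"'),
]
ATTACK_TIME = datetime(2014, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
JOHNSMITH_DOMAINS = {"blog.johnsmith.com", "store.johnsmith.com"}
```

Calling `suspect_ips(SAMPLE_ENTRIES, JOHNSMITH_DOMAINS, ATTACK_TIME)` keeps only the visitor who hit one of the account’s sites close to the modification time. The size of the window is a tuning choice: wider catches slow attackers but adds more innocent visitors to the list.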
We will set up the Splunk forwarder to monitor historical logs for investigative purposes as well as real-time web traffic logs.
Here is what the architecture of this solution looks like:
The Splunk indexer will be configured to deliver customized email alerts about file modifications as well as about detected suspicious patterns.
Installing Splunk on indexer (destination machine)
- Follow Splunk installation instructions. My usual sequence of steps is: