10 / 05 / 2015
One of my enterprise clients observed that a certain class of attacks had a number of distinctive characteristics: an attacker who possessed valid user account credentials would not engage in malicious behavior right away.
Initial activity would involve a certain degree of reconnaissance and gathering of the future victim's specific data, such as account balances, trade histories and other details. So the usual "red flags" and risk-scoring metrics would not generate any alerts.
However, in many cases such pre-fraud activity still carried marks of unusual behavior: either session velocity (average number of seconds per hit), session density (number of hits within the session), or both exceeded the normal baseline session patterns typical for the average client application user.
Abnormally high session velocity is also a typical pattern of an automated, script-driven session; both fraudsters and competitors were using such sessions to siphon data from the client's servers.
One possible way to detect these activities would be to calculate average session velocity and density and then use these values to trigger alerts when session metrics exceed thresholds.
The issue here is that, due to the specifics of the client's business, these averages vary greatly depending on the time of day, the day of the week, and the month of the year.
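As a rough sketch of that idea in SPL (assuming the web events carry a `session_id` field; the field names, the 24-hour window and the 2-sigma thresholds are all illustrative, not the client's actual configuration), per-session velocity and density can be compared against a baseline bucketed by hour of day and day of week:

```
index=logs earliest=-24h
| stats count AS hits, min(_time) AS start, max(_time) AS end BY session_id
| eval velocity = (end - start) / hits
| eval hour = strftime(start, "%H"), wday = strftime(start, "%a")
| eventstats avg(velocity) AS avg_v, stdev(velocity) AS sd_v,
             avg(hits) AS avg_h, stdev(hits) AS sd_h BY hour, wday
| where velocity < (avg_v - 2 * sd_v) OR hits > (avg_h + 2 * sd_h)
```

Since velocity here is seconds per hit, script-driven sessions surface as unusually *low* seconds-per-hit, while density anomalies surface as unusually high hit counts. In practice the baseline would come from a longer lookback (for example, a lookup table refreshed by a scheduled search) rather than from `eventstats` over the same window being inspected.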
10 / 05 / 2015
Back in my days at the IBM T.J. Watson Research Center, where we were working on techniques to detect known and unknown malware, the fastest-growing challenge was the rising threat of polymorphic malware.
Malicious snippets of code encrypted themselves, making it very difficult to apply conventional signature-based detection techniques.
We developed a tiny virtual machine in C that could load malware code in real time and analyze its behavior without needing to figure out how to decrypt it. Score metrics were assigned to key points and function calls, and logic was put in place to trigger an alert if the "risk" score exceeded a certain heuristic threshold.
That technique allowed us to deliver a top-quality enterprise security solution (later purchased by Symantec) capable of detecting previously unknown threats. That was more than 15 years ago.
While working with financial clients and technology companies today, I can see that good old behavior pattern analysis is as strong as ever, helping enterprises discover new types of suspicious behavior and investigate malicious activities with previously unknown patterns from previously unknown sources.
29 / 03 / 2015
As a reminder, here is the challenge we want to solve:
Detect and alert when a C-class IP subnet tries to access at least 5 different accounts within an hour, and at least 75% of the total accounts touched have never been accessed from this subnet *and* from this USER_AGENT within the last 45 days.
And, as you may remember from Part 1, here's the basic logic we need to implement to make it happen:
- Scan the last hour of access log data and find the list of subnets that tried to access multiple accounts within that hour.
- For each of these accounts, take the username, IP and USER_AGENT and scan the previous 45 days of traffic history to find usernames that have never been touched by this IP/USER_AGENT combo.
- Alert if the number of accounts found is above the threshold.
I've spent quite a bit of effort coming up with a single query that does all of the above, and in a pretty efficient manner.
The biggest part of the challenge is that the query needs to find events (#1 above), but then it needs to run a very custom search for each event against the summary index we've created (#2 above). The icing on this cake is that the query needs to return results only if there are *no matches* found for the second part of the search.
This quickly gets mind-boggling, and it is a rather interesting puzzle to solve with SPL.
The way I solved it is with a combination of macros and an advanced subsearch. But instead of returning traditional results, the subsearch returns a new, custom-crafted Splunk search query to be executed by the outer search.
I named this approach the Advanced Negative Look Behind (ANLB) query.
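As an illustrative sketch (not the full ANLB query), step #1 of the logic might look like this in SPL, using the field names assumed in Part 2; the login-page URL and the naive C-class subnet extraction are placeholders:

```
index=logs earliest=-1h method=POST page="/login"
| eval subnet = replace(ip, "\.\d+$", ".0")
| stats dc(username) AS num_accounts, values(username) AS users BY subnet, ua
| where num_accounts >= 5
```

The ANLB twist is that a subsearch shaped like this does not hand its result rows to the outer search directly. Instead, it is wrapped in a macro that turns those rows into a brand-new search string, which the outer search then executes against the 45-day summary index, alerting only when at least 75% of the accounts have *no* prior matches for the subnet/USER_AGENT combo.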
29 / 03 / 2015
Summary indexing is a great way to speed up Splunk searches by pre-creating a subset of only the data necessary for a specific purpose. In our case we need to filter only the login events out of all available web traffic data. This gives us a very fast, much smaller data subset with all the information we need to reference when matching against new, suspicious login events.
To proceed with building the summary index we need to make a set of assumptions. These assumptions are needed to build the query and all other elements of the solution. You'll be able to substitute names specific to your environment later on if you want to.
- Let's assume you have your web logs with all the event data indexed in Splunk already.
All web events are located within an index named: logs.
- Field names (or aliases):
- HTTP request method (GET, POST, HEAD, etc..): method
- URL of page accessed: page
- Username field: username
- IP address of visitor: ip
- USER_AGENT value: ua
To generate the summary index of login data, we need to create the index itself first.
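Once the index exists (say, `summary_logins`; this name and the login-page URL below are placeholders), a scheduled search along these lines can populate it with just the login events, using the field names assumed above:

```
index=logs method=POST page="/login"
| table _time, username, ip, ua
| collect index=summary_logins
```

In practice such a search runs on a schedule (for example, every hour over the previous hour), so the summary index stays current without ever re-scanning the full web traffic index.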
29 / 03 / 2015
- Detecting Bank Accounts Takeover Fraud Cyberattacks with Splunk. Part 2: Building Reference Summary Index of Logins Data.
- Detecting Bank Accounts Takeover Fraud Cyberattacks with Splunk. Part 3: The Advanced Negative Look Behind Query.
In this series of posts I'll cover a complete strategy for using Splunk Enterprise to detect customer account takeover fraud, as well as for setting up automated alerts when such activity is detected.
While I helped implement these measures for a large financial firm, the same approach can be applied to any online enterprise where it is essential to protect online customer accounts, quickly detect suspicious activity, and act to prevent monetary and business losses.
The techniques I am going to describe generate a pretty low level of false positives and include efficient ways to adjust trigger thresholds across multiple metrics for specific business needs. In addition, the approach is tested and works really well for portals that generate up to 3,000,000-5,000,000 hits a day.
The specific use case covered in these posts applies to the situation where the credentials of multiple clients (sometimes thousands or more) end up in the hands of attackers who will try to take advantage of them for monetary, competitive or political gain. With the help of Splunk, an enterprise will be able to quickly and automatically detect such a situation and take the necessary measures to protect the business and its clients.
Account takeover fraud comes into play when a fraudster gains access to customer account credentials by any means: phishing campaigns, malware, spyware, or buying sets of stolen customer credential data on darknet black markets and forums.
I won't get into the details of the many possible ways customer credentials may be compromised, but the end result is the ability of unauthorized persons to access multiple customer accounts and cause significant damage to customers and the business, including large monetary losses.
The worst way an enterprise can learn about a cyberattack on its own customers is from CNN.
24 / 02 / 2015
Once you have all the beautiful, rich traffic data exported from Tealeaf and imported into Splunk, the possibilities are virtually endless for creating a very powerful search, cross-referencing analytics and security investigation tool.
In my consulting work for a major financial services firm, I built a specialized Splunk App that allows a single dashboard to execute multiple searches and visualize results and trends by leveraging Tealeaf data.
In addition, a number of custom searches and alerts were created to build summary indexes and to automatically detect and alert on possible malware infections, notify about suspicious activity patterns, and flag out-of-bounds activity.
Before sharing more details about visualizing trends and malicious traffic patterns, here are a few tips on the general design of custom Splunk security analytics apps and dashboards.
While I cannot offer screenshots of the actual dashboards used within the financial firm due to obvious client security concerns, I can describe the general approach we took to overcome Tealeaf's limitations with Splunk, as well as a number of important points on how to take the most advantage of Splunk as a security research tool.
18 / 01 / 2015
Traffic Ray is a real-time web traffic analytics Splunk App I built for web server administrators and web hosting service providers.
Traffic Ray leverages raw Apache log files to visualize incoming web server traffic, allowing you to discover incoming IP activity patterns, detect malicious activity, view bandwidth consumption trends, and gain insight into web visitors' origins and behaviors on a single dashboard. OK, on two :).
Being a webmaster as well as a web hosting server administrator myself, I often wanted an unobstructed, quick, visual, real-time view into incoming web traffic stats and patterns. Having worked with many different reporting and analytics solutions, I found most of them to be either too convoluted, overly generic, suspiciously intrusive or unacceptably restrictive. I needed an easy way to gain comprehensive server-wide web traffic insights, as well as ways to do quick drilldowns into a specific IP address's behavior patterns or a specific website's bandwidth consumption trends.
Also, quite often an ill-behaving or outright malicious incoming traffic source causes the server to send automated, generic, non-descriptive system alerts about excessive server load, suspicious running processes and the like, which require further root cause analysis.
Previously I had to rely on multiple tools to put together the big picture of events, as well as log into the system shell and manually grep through raw logs to discover the culprits of suspicious activity, which was a time-consuming and unpleasant process.
11 / 11 / 2014
Let's get our hands dirty. The first step in building a fraud investigation and security analytics platform with TeaLeaf is making TeaLeaf's data available to Splunk. Then Splunk will take care of all the deep security queries and specialized investigative dashboarding.
Disclaimer: all data you see on this site was auto-generated for demonstration purposes. It demonstrates concepts and ideas but does not show any real names, IP addresses or any other information that matches real-world events.
TeaLeaf comes with the cxConnect for Data Analysis component.
“Tealeaf cxConnect for Data Analysis is an application that enables the transfer of data from your Tealeaf CX datastore to external reporting environments. Tealeaf cxConnect for Data Analysis can deliver data in real-time to external systems such as event processing systems or enable that data to be retrieved in a batch mode. Extraction of customer interaction data into log files, SAS, Microsoft SQL Server or Oracle databases are supported. Data extraction jobs can be run on a scheduled or ad-hoc basis. Flexible filters and controls can be used to include or exclude any sessions or parts of sessions, according to your business reporting needs“.
Source: IBM TeaLeaf.
Although in my experience the "real-time" claim is a long shot (at least I didn't find a way to accomplish the above in real time), I managed to set up pretty successful regular, hourly, detailed TeaLeaf log exports.
If you try to use cxConnect right off the bat for log exports and select all the default options, you'll end up with a humongous set of files containing mountains of data you don't really need, wasting your disk space. It took me quite a while to configure cxConnect to export the data I need and to exclude the data I don't.
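Once cxConnect is writing hourly export files, pointing Splunk at them takes only a small `inputs.conf` monitor stanza; the path and sourcetype below are placeholders for whatever your export job actually produces:

```
[monitor:///opt/tealeaf/exports/*.log]
index = logs
sourcetype = tealeaf_cxconnect
disabled = false
```

With this in place, Splunk picks up each new export file as it lands, and the events flow into the same `logs` index used by the searches in the earlier posts.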
10 / 11 / 2014
IBM TeaLeaf is one of IBM's leading customer experience management platforms.
IBM TeaLeaf is a set of tools allowing enterprises to record all customer interactions with their web application portals, with further capabilities for visual session replay. IBM TeaLeaf also offers a set of interfaces for designing custom events, alerts, dashboards and visual reports.
TeaLeaf lets you define custom reporting dimensions that can be very specific to given business needs.
Tracking clicks, conversions and customer struggles, optimizing sales funnels, analyzing mobile experiences, and presenting any kind of data in a visually appealing way are only a few of the many benefits TeaLeaf offers.
As a consultant helping a large brokerage and financial firm manage a firm-wide TeaLeaf deployment, I see another fast-growing application for IBM TeaLeaf: financial fraud investigations, security analytics, forensic analysis and investigation of suspicious activities.
When corporate security departments receive suspicious activity reports and requests to investigate possible ATO (Account TakeOver: a case where fraudsters buy on the black market a set of valid customer credentials obtained, for example, through targeted phishing attacks), they come to TeaLeaf to dig into the raw data and do forensic investigations.
After finding the necessary visitor hits or customer session data within the TeaLeaf database, security investigators launch the TeaLeaf RTV viewer to visually preview the actual sessions that potentially involve fraudulent activity.
TeaLeaf allows searching for data by predefined metrics (IP address, browser OS, browser version, User Agent, text in the request, text in the response) as well as via raw text fragments found within hit or session data.
TeaLeaf generally offers two ways to search for pieces of information: via its browser interface (the "Search" menu option) or using the RealiTea Viewer (RTV), a separate desktop application that runs raw searches via direct connections to the TeaLeaf data repository (data canisters).
The main advantage of RTV is that it allows running searches on currently active sessions (in other words, with almost no delay in obtaining fresh traffic data), and it is quite fast.
Building an online membership site business is an exciting step on the road to building your own source of residual income.
We all have talents; we all love and know how to do certain things better than anyone else. We all know how to solve certain problems that others would love to learn from us. Taking the time to put our skills, experience, passion and knowledge on the web, with a chance to monetize them, brings triple excitement:
- You do more of what you love.
- You share what you love to do with other people, helping them solve their problems.
- You can make money doing what you love.
One way to monetize your skills and experience is to build a site and share them in the form of posts, helpful articles, downloadable ebooks, text, video and audio tutorials, or any other kind of "digital" format. The idea is to share your talents with other people, helping them solve their problems and making money along the way by charging for access to the information.
Today, people on the internet happily pay for useful, practical information.