IBM Tealeaf plus Splunk
IBM TeaLeaf is one of the leading customer experience management platforms.
IBM TeaLeaf is a set of tools that lets enterprises record all customer interactions with their web application portals and replay those sessions visually. It also offers a set of interfaces for designing custom events, alerts, dashboards and visual reports.
TeaLeaf lets you define custom reporting dimensions that can be very specific to a given business need.
Tracking clicks, conversions and customer struggles, optimizing sales funnels, analyzing mobile experiences and presenting any kind of data in a visually appealing way are only a few of the many benefits that TeaLeaf offers.
As a consultant helping a large brokerage and financial firm manage a firm-wide TeaLeaf deployment, I see another fast-growing application for IBM TeaLeaf: financial fraud investigations, security analytics, forensic analysis and investigation of suspicious activities.
When corporate security departments receive suspicious activity reports and requests to investigate a possible ATO (Account TakeOver: a case where fraudsters buy, on a black market, a set of valid customer credentials obtained, for example, by targeted phishing attacks), they come to TeaLeaf to dig into raw data and do forensic investigations.
After finding the necessary visitor hits or customer session data within the TeaLeaf database, security investigators launch the TeaLeaf RTV viewer to visually preview the actual sessions that potentially involve fraudulent activity.
TeaLeaf allows searching for pieces of data by predefined metrics (IP address, browser OS, browser version, User Agent, text in request, text in response) as well as by raw text fragments that may appear within hit or session data.
TeaLeaf generally offers two ways to search for information: via its browser interface (the “Search” menu option) or using RealiTea Viewer (RTV), a separate desktop application that runs raw searches via direct connections to the TeaLeaf data repository (data canisters).
The main advantages of RTV are that it can search currently active sessions (in other words, with almost no delay in obtaining fresh traffic data) and that it is quite fast.
The disadvantage of RTV is that when you need to search on more than one metric (say, a number of IP addresses, pages and referrers), configuring the more complex queries quickly becomes clunky and complicated. Even after a search completes, you can either visually replay the results session by session or you are left with a very basic, flat, notepad-like window showing a long set of raw data. Besides the actual HTTP request/response information, that data contains many TeaLeaf-specific variables injected by post-processing logic, which makes sifting through it even more complicated.
Enterprise information security departments are often interested in detecting, and being alerted, when visitors to a banking portal come from a range of “bad” IP addresses, either with a high risk score or previously used to commit fraud.
Security investigators are often tasked with urgently investigating banking or trading accounts affected by a range of attacking subnets (IP address ranges defined by CIDR masks) or by countries of origin.
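To make the subnet task concrete, here is a minimal Python sketch of the underlying check. It is only an illustration, not TeaLeaf or Splunk functionality: the watchlist, the (client_ip, account_id) hit format and the function names are all assumptions, and the addresses come from ranges reserved for documentation.

```python
import ipaddress

# Hypothetical watchlist of attacking subnets in CIDR notation
# (documentation-reserved ranges, not real threat data).
BAD_SUBNETS = [
    ipaddress.ip_network(cidr)
    for cidr in ("203.0.113.0/24", "198.51.100.64/26")
]


def is_bad_ip(ip):
    """Return True if the address falls inside any watchlisted subnet."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in BAD_SUBNETS)


def affected_accounts(hits):
    """Collect the account ids that were accessed from watchlisted subnets.

    `hits` is an iterable of (client_ip, account_id) pairs, an assumed
    simplification of exported hit data.
    """
    return sorted({account for ip, account in hits if is_bad_ip(ip)})


sample_hits = [
    ("203.0.113.7", "ACC-1001"),     # inside 203.0.113.0/24
    ("192.0.2.15", "ACC-1002"),      # not on the watchlist
    ("198.51.100.70", "ACC-1003"),   # inside 198.51.100.64/26
    ("198.51.100.200", "ACC-1004"),  # same /24, but outside the /26
]
print(affected_accounts(sample_hits))
```

The check itself is trivial; the difficulty described here is running it across every hit for a whole range of subnets at once.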
Another, more complicated challenge is to assemble the list of all IP addresses that were used to access multiple accounts within a short period of time (a typical case of access credentials bought on a black market or obtained via a phishing attack).
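The logic behind this second task can be sketched as a sliding time window per IP address. The Python below is an illustrative sketch under assumptions: the (timestamp, client_ip, account_id) event format, the threshold of three accounts and the 30-minute window are all hypothetical choices, not values from any real deployment.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def ips_touching_many_accounts(events, min_accounts=3,
                               window=timedelta(minutes=30)):
    """Return IPs that accessed at least `min_accounts` distinct accounts
    within some time span no longer than `window`.

    `events` is an iterable of (timestamp, client_ip, account_id) tuples.
    """
    by_ip = defaultdict(list)
    for ts, ip, account in events:
        by_ip[ip].append((ts, account))

    flagged = []
    for ip, rows in by_ip.items():
        rows.sort()                # chronological order
        start = 0                  # left edge of the sliding window
        counts = defaultdict(int)  # account -> hits inside the window
        for ts, account in rows:
            counts[account] += 1
            # Drop events from the left until the window spans <= `window`.
            while ts - rows[start][0] > window:
                old_account = rows[start][1]
                counts[old_account] -= 1
                if counts[old_account] == 0:
                    del counts[old_account]
                start += 1
            if len(counts) >= min_accounts:
                flagged.append(ip)
                break
    return sorted(flagged)


t0 = datetime(2015, 1, 1, 12, 0)
sample_events = [
    # Three accounts in ten minutes from one address: suspicious.
    (t0, "203.0.113.7", "ACC-1001"),
    (t0 + timedelta(minutes=5), "203.0.113.7", "ACC-1002"),
    (t0 + timedelta(minutes=10), "203.0.113.7", "ACC-1003"),
    # Three accounts spread over four hours: not flagged.
    (t0, "192.0.2.15", "ACC-1001"),
    (t0 + timedelta(hours=2), "192.0.2.15", "ACC-1004"),
    (t0 + timedelta(hours=4), "192.0.2.15", "ACC-1005"),
]
print(ips_touching_many_accounts(sample_events))
```

Even this small version needs grouping, sorting and distinct counting over time, which is exactly the kind of query that a session-by-session viewer is not designed for.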
All of the above tasks either take a prohibitively long time to complete or are outright impossible to accomplish using TeaLeaf’s own facilities (as of TeaLeaf version 8.7).
TeaLeaf is great at generating structured data, and the information is all there, but there is no convenient way to tap into it for in-depth fraud analytics.
To bridge the very rich set of TeaLeaf data and the ability to run complicated forensic security analytics and investigative queries, I decided to deploy a Splunk instance as a proof of concept and connect it to TeaLeaf as a data source.
My vision was to build a powerful and fast security analytics platform using IBM TeaLeaf and Splunk combined.
TeaLeaf generates and makes available a unique and rich set of data that contains far more information than regular web server logs.
In addition, TeaLeaf is easy to customize: you can add new fields, variables and dimensions without changing the actual data structure much.
Splunk is a data analytics software platform capable of managing and indexing extremely large sets of loosely structured data in real time and of running queries of any complexity against them. Splunk also supports customizable real-time alerts and very specialized, business-specific dashboards and apps, which make data search, alerting and presentation tasks much easier and more visual.
Using the above tools I was able to build a custom Splunk App optimized for a banking and financial industry client’s data set, which comes directly from TeaLeaf on an hourly basis.
This resulted in the deployment of a powerful and flexible security and fraud investigation platform that searches through all available data at a rate of up to 10,000,000 events per second and presents the results in a rich set of tables and graphs.
For example, it took about 10 seconds to scan for all instances of Shellshock and SQL injection attacks launched within the last 30 days against one of the major client-facing banking portals.
I will show in detail how to build such a setup in future posts.
The final App’s interface does not require much typing and offers clickable drill-downs to quickly investigate specific metrics across different dimensions.
What was impossible, or took hours and sometimes days, now takes seconds or minutes.
One well-received feature is the investigation dashboard’s ability to accept the content of an unstructured fraud report via a simple copy/paste operation, parse it automatically and convert it into a fast-executing Splunk query.
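The general idea behind that copy/paste feature can be illustrated with a short Python sketch. This is not the App’s actual code, which will be covered separately. The index name `tealeaf`, the field name `client_ip`, the 30-day time range and the sample report text are all assumptions made up for the example, and it extracts only plain IPv4 addresses.

```python
import ipaddress
import re

# Loose candidate pattern; every match is validated with ipaddress below.
IPV4_CANDIDATE = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def extract_ips(report_text):
    """Return the unique, valid IPv4 addresses in order of first appearance."""
    seen = []
    for candidate in IPV4_CANDIDATE.findall(report_text):
        try:
            ipaddress.IPv4Address(candidate)
        except ValueError:
            continue  # e.g. 999.1.1.1 is not a real address
        if candidate not in seen:
            seen.append(candidate)
    return seen


def build_search(report_text, index="tealeaf", field="client_ip",
                 earliest="-30d"):
    """Turn free-form report text into a Splunk search string.

    The index and field names are hypothetical and depend on how the
    TeaLeaf data was indexed. Returns None if no address was found.
    """
    ips = extract_ips(report_text)
    if not ips:
        return None
    clause = " OR ".join(f'{field}="{ip}"' for ip in ips)
    return f"index={index} earliest={earliest} ({clause})"


sample_report = (
    "Customer reported an unauthorized wire transfer. Logins were seen "
    "from 203.0.113.7 and 198.51.100.70; 203.0.113.7 was also used last "
    "week. Ignore 999.1.1.1 (typo in the original ticket)."
)
print(build_search(sample_report))
```

A real parser would also need to recognize account numbers, dates, CIDR ranges and other identifiers, but the principle is the same: pull out the recognizable tokens and assemble them into one search.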
Step-by-step details of the above implementation will follow soon.