Reporting and Forensics are the principal products of a SIEM system. We have pushed, prodded, and poked at the data to get it into a manageable format, so now we need to put it to use. Reports and forensic analysis are the features most users work with on a day to day basis. Collection, normalization, correlation and all the other things we do are just to get us to the point where we can conduct forensics and report on our findings. These features play a big part in customer satisfaction, so while we’ll dig in to describe how the technology works, we will also discuss what to look for when making buying decisions.
For those of us who have been in the industry for a long time, the term ‘reporting’ brings back bad memories. It evokes hundreds of pages of printouts on tractor feed paper, with thousands of entries, each row looking exactly the same as the last. It brings to mind hours of scanning these lines, yellow highlighter in hand, marking unusual entries. It brings to mind the tailoring of reports to include new data, excluding unneeded columns, importing files into print services, and hoping nothing got messed up which might require restarting from the beginning.
Those days are fortunately long gone, as SIEM and Log Management have evolved their capabilities to automate a lot of this work, providing graphical representations that allow viewing data in novel ways. Reporting is a key capability because this process was just plain hard work. To evaluate reporting features included in SIEM/LM, we need to understand what it is, and the stages of a reporting process. You will notice from the description above that there are several different steps to the production of reports, and depending on your role, you may see reporting as basically one of these subtasks. The term ‘reporting’ is a colloquialism used to encompass a group of activities: selecting, formatting, moving, and reviewing data are all parts of the reporting process.
So what is reporting? At its simplest, reporting is just selecting a subset of the data we previously captured for review, focused analysis, or a permanent record (‘artifact’) of activity. Its primary use is to put data into an understandable form, so we can analyze activity and substantiate controls without having to comb through lots of irrelevant stuff. The report comprises the simplified view needed to facilitate review or, as we will discuss later, forensic analysis. We also should not be constrained by the traditional definition of a report, which is a stack of papers (or in modern days a PDF). Our definition of reporting can embrace views within an interface that facilitate analysis and investigation.
The second common use is to capture and record events that demonstrates completion of an assigned task. These reports are historic records kept for verification. Trouble-ticket work orders and regulatory reports are common examples, where a report is created and ‘signed’ by both the producer of the report and an auditor. These snapshots of events may be kept within, or stored separately from, the SIEM/LM system.
There are a couple basic aspects to reporting that we that we want to pay close attention to when evaluating SIEM/LM reporting capabilities:
- What reports are included with the standard product?
- How easy is it to manage and automate reports?
- How easy is it to create new, ad-hoc reports?
- What export and integration options are available?
For many standard tasks and compliance needs, pre-built reports are provided by the vendor to lower costs and speed up product deployment. At minimum, vendors provide canned reports for PCI, Sarbanes-Oxley, and HIPAA. We know that compliance is the reason many of you are reading this series, and will be the reason you invest in SIEM. Reports embody the tangible benefit to auditors, operations, and security staff. Just keep in mind that 2000 built-in reports is not necessarily better than 100, despite vendor claims. Most end users typically use 10-15 reports on an ongoing basis, and those must be automated and customized to the user’s requirements.
Most end users want to feel unique, so they like to customize the reports – even if the built-in reports are fine. But there is a real need for ad-hoc reports in forensic analysis and implementation of new rules. Most policies take time to refine, to be sure that we collect only the data we need, and that what we collect is complete and accurate. So the reporting engine needs to make this process easy, or the user experience suffers dramatically.
Finally, the data within the reports is often shared across different audiences and applications. The ability to export raw data for use with third party-reporting and analysis tools is important, and demands careful consideration during selection.
People say end users buy interface and reports, and that is true for the most part. We call that broad idea _user experience_m and although many security professionals minimize the focus on reporting during the evaluation process, it can be a critical mistake. Reports are how you will show value from the SIEM/LM platform, so make sure the engine can support the information you need to show.
It was just this past January that I read an “analyst” report on SIEM, where the author felt forensic analysis was policy driven. The report claimed that you could automate forensic analysis and do away with costly forensic investigations. Yes, you could have critical data at your fingertips by setting up policies in advance! I nearly snorted beer out my nose! Believe me: if forensic analysis was that freaking easy, we would detect events in real time and stop them from happening! If we know in advance what to look for, there is no reason to wait until afterwards to perform the analysis – instead we would alert on it. And this is really the difference between alerting on data and forensic analysis of the same data. We need to correlate data from multiple sources and have a real live human being make a judgement call. Let’s be clear: these pseudo-analyst claims and vendor promotional fluff (you know who they are) are complete BS, and do a disservice to end users by creating absurd expectations.
Now that I’m off the soapbox, let’s take a step back. Forensic analysis is conducted by trained security and network analysts to investigate an event, or more likely a sequence of events, indicating fraud or misuse. An analyst may have an idea what to look for in advance, but more often you don’t actually know what you are looking for, and you need to navigate through thousands of events to piece together what happened and understand the breadth of the damage. This involves rewriting queries over and over to drill down and look at data, using different methods of graphing and visualization before finding the proverbial needle in the haystack.
The use cases for forensic analysis are numerous, including examination of past events and data to determine what happened in your network, OS, or application. This may be to verify something that was supposed to happen actually occurred, or to better understand whether strange activity was fraud or misuse. You might need forensic analysis for simple health checks on equipment and business operations. You may need it to scan user activity to support disciplinary actions against employees. You might even need to provide data to law enforcement to pursue criminal data breaches.
Unlike correlation and alerting, where we have automated analysis of events, forensic analysis is largely manual. Fortunately we can leverage collection, normalization, and correlation – much of the data has already been collected, aggregated, and indexed within the SIEM/LM platform.
A forensic analysis usually starts with data provided by a report, an alert, or a query against the SIEM/LM repository. We start with an idea of whether we are interested in specific application traffic, strange behavior from a host, or pretty much an infinite number of things that could be suspicious. We select data with the attributes we are interested in, gathering information we need to analyze events and validate whether the initial suspicious activity is much ado about nothing, or indicates a major issue.
These queries may be as simple as “Show all failed logins for user ‘mrothman’”, or as specific as “Show events from all firewalls, between 1 and 4 am, that involved this list of users”. It is increasingly common to examine application-layer or database activity to provide context for business transactions – for example, “list all changes to the general ledger table where the user was not ‘
GA_Admin’ or the application was not ‘
There are a couple important capabilities we need to effectively perform forensic analysis:
- Custom queries and views of data in the repository
- Access to correlated and normalized data
- Drill-down to view non-normalized or supplementary data
- Ability to reference and access older data
- Speed, since forensics is usually a race against time (and attackers)
Basically the most important capability is to enable a skilled analyst to follow their instincts. Forensics is all about making their job easier by facilitating access, correlation, and viewing of data. They may start with a set of anomalous communications between two devices, but end up looking at application logs and database transactions to prove a significant data breach. If queries take too long, data is manipulated, or data is not collected, the investigator’s ability to do his/her job is hindered. So the main role of SIEM/LM in forensics is to streamline the process.
To be clear, the tool only makes the process faster and more accurate. Without a strong incident response process, no tool can solve the problem. Although we all get very impressed by a zillion built-in reports and cool drill-down investigations during a vendor demo, don’t miss the forest for the trees. SIEM/Log Management platforms can only streamline a process that already exists. And if the process is bad, you’ll just execute on that bad process faster.