When you think about it, security success in today’s environment comes down to a handful of key imperatives. First, we need to improve the security of our environment. We are losing ground to the bad guys, so we’ve got to get better at quickly figuring out what’s being attacked and stopping it.
Next, we’ve got to do more with less. Yes, it seems the global economy is improving, but we can’t expect to get back to the halcyon days of spend first, ask questions later – ever. With more systems under management we have more to worry about and less time to spend poring over reports, looking for the proverbial needle in the haystack. Given the number of new attacks – counted by any measure you like – we’ve got to use our resources far more efficiently.
Finally, auditors show up a few times a year, and they want their reports. Summary reports, detail reports, and reports that validate other reports. The entire auditor dance focuses on convincing the audit team that you have the proper security controls implemented and operating effectively. That involves a tremendous amount of data gathering, analysis, and reporting just to get set up, with continued tweaking over time. Getting ready for the audit is basically a full-time job, dropped on folks who already have full-time jobs. So we’ve got to automate those functions to the greatest degree possible.
Yes, there are lots of other reasons organizations embrace SIEM and Log Management technology, but these three make up the vast majority of the projects we see funded. So let’s dig into each use case and understand exactly what problem we are trying to solve.
Use Case #1: React Faster
Imagine the typical day of a security analyst. They sit down at their desk, check out their monitors, and start seeing events scroll past. A lot of events, probably millions. Their job is to look at that information and figure out what’s wrong and identify the root cause of each problem.
They probably have alerts set up to report critical issues within their individual system consoles, in an effort to cull the millions of events down to some finite set of things to investigate – per system. So the analyst goes back and forth between the firewall, IPS, and network traffic analysis consoles. If a WAF or a database activity monitoring product is deployed, they have to deal with those consoles as well. An office chair that swivels easily is a good investment to keep your neck from wearing out.
Security analysts tend to be pretty talented folks, so they do find stuff, based on their understanding of the networks and devices and their own familiarity with normal – which allows them to recognize not normal. Some events just look weird but cannot be captured in a policy or rule. Successful reviews hinge on the human analyst’s ability to interpret alerts across the various systems and identify attacks.
The issues with this scenario are numerous:
- Too much data, not enough information: With anywhere from 10 to 2,000 devices to monitor, each generating a couple thousand logs and/or alerts a day, there is plenty of data (at the high end, that is on the order of 4 million events a day). The analyst has to turn that data into information, which is a tall order for anyone.
- Low signal-to-noise ratio: With that much data, the analyst is likely only going to investigate the most obvious attacks. And without some way to reduce the number of alerts to deal with, there will be lots of false positives to wade through, hurting productivity.
- No “situational awareness”: The new new term in security circles is situational awareness: the concept that anomalous situations get lost in a sea of detail unless the bigger business context is considered. With only raw events to wade through, a human analyst loses context and cannot keep track of the big picture.
- Too many tools to isolate root cause: Without centralizing data from multiple systems, there is no way to know whether an IPS alert was related to a web attack or some other issue. So the analyst needs to move quickly from system to system to validate and confirm the attack, and to understand the depth of the issue. That approach isn’t particularly efficient, and in an incident, time is the enemy.
We’ve written on numerous occasions about the need to react faster, since we can’t predict where the next attack is coming from. The promise of SIEM and Log Management solutions is to help us react faster – and better – and make the world a better place, right? The features and functions a security analyst will employ are:
- Data aggregation: SIEM/LM solutions aggregate data from many sources, including network, security, servers, databases, applications, etc. – providing the ability to monitor everything. Having all of the events in one place helps avoid missing subtle but important ones.
- Correlation: Correlation looks for common attributes and links events together into meaningful bundles. Being able to look at all events in a particular window of time, or everything a specific user did, gives us a meaningful way to investigate security events. SIEM/LM technology provides a variety of correlation techniques to integrate different sources and turn data into useful information (see the sketches after this list). Check out our more detailed view of correlation.
- Alerting: Automated analysis of correlated events can produce more substantial and detailed alerts, and help identify what needs to be investigated right now.
- Dashboards: With liberal use of eye candy, SIEM/LM tools take event data and turn it into fancy charts. These charts can assist the analyst in seeing patterns, and more importantly in seeing activity that is not a standard pattern, or not visible when looking at individual log entries.
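To make the aggregation idea a bit more concrete, here is a minimal sketch, in Python, of the normalization step a collector performs. The log formats, field names, and schema below are invented for illustration; every product defines its own collectors and event schema.

```python
# Toy illustration of SIEM/LM event aggregation: each source logs in its
# own format, so the collector's first job is to normalize everything
# into one common schema. Formats and field names are invented.
import json
import re
from dataclasses import dataclass

@dataclass
class Event:
    timestamp: str  # when it happened
    source: str     # which device/system reported it
    user: str       # account involved, if any
    action: str     # what happened (deny, login_failed, etc.)
    src_ip: str     # origin address, if any

FIREWALL_RE = re.compile(r"(?P<ts>\S+) DENY src=(?P<ip>\S+) user=(?P<user>\S+)")

def normalize_firewall(line: str) -> Event:
    """Map a (made-up) firewall log line into the common schema."""
    m = FIREWALL_RE.match(line)
    return Event(m.group("ts"), "firewall", m.group("user"), "deny", m.group("ip"))

def normalize_app(line: str) -> Event:
    """Map a (made-up) JSON application log into the common schema."""
    rec = json.loads(line)
    return Event(rec["time"], "webapp", rec["user"], rec["event"],
                 rec.get("client_ip", "-"))

# Two very different raw records end up side by side in one format.
events = [
    normalize_firewall("2010-05-03T09:14:02 DENY src=10.1.1.7 user=bob"),
    normalize_app('{"time": "2010-05-03T09:14:05", "user": "bob", '
                  '"event": "login_failed", "client_ip": "10.1.1.7"}'),
]
for e in events:
    print(e)
```

Once everything shares a schema, a single query can cover the firewall, the WAF, and the application at once, instead of three consoles and a swivel chair.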
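And a similar sketch of the correlation and alerting steps, operating over normalized events like those above: a toy rule that flags repeated failed logins from one source address followed by a success inside a time window. The window size and threshold are arbitrary illustrations, not tuning guidance.

```python
# Toy correlation rule: N failed logins from the same source address
# within a time window, followed by a success, raises an alert.
# WINDOW and THRESHOLD are arbitrary, for illustration only.
from collections import defaultdict, deque
from datetime import datetime, timedelta

WINDOW = timedelta(minutes=5)
THRESHOLD = 3
failures = defaultdict(deque)  # src_ip -> timestamps of recent failures

def check_event(ts_str: str, src_ip: str, action: str):
    """Run one normalized event through the rule; return an alert or None."""
    ts = datetime.fromisoformat(ts_str)
    recent = failures[src_ip]
    while recent and ts - recent[0] > WINDOW:  # expire failures outside the window
        recent.popleft()
    if action == "login_failed":
        recent.append(ts)
    elif action == "login_success" and len(recent) >= THRESHOLD:
        return f"ALERT: {len(recent)} failures then a success from {src_ip} at {ts}"
    return None

stream = [
    ("2010-05-03T09:14:01", "10.1.1.7", "login_failed"),
    ("2010-05-03T09:14:20", "10.1.1.7", "login_failed"),
    ("2010-05-03T09:14:45", "10.1.1.7", "login_failed"),
    ("2010-05-03T09:15:02", "10.1.1.7", "login_success"),
]
for ts, ip, action in stream:
    alert = check_event(ts, ip, action)
    if alert:
        print(alert)
```

A real correlation engine runs thousands of rules like this across every source at once, which is exactly why doing it by hand, console by console, doesn’t scale.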
So ultimately this use case provides the security analyst with a set of automatic eyes and ears to wade through all the data and help identify what’s most important and requires attention now.
This is the first white paper Mike and I have written together, and as you can tell, we’re kinda verbose. As such, I am splitting this post into two segments, with the other use cases coming Monday, and we will follow up with the business justification. Later in this series, we’ll discuss specifically how to address this use case using the SIEM/LM toolset, and manage expectations for the amount of time and effort required to build the system and feed it on an ongoing basis.
2 Replies to “Understanding and Selecting SIEM/LM: Use Cases, Part 1”
Adding to what Dwayne said, it is also important to capture “changes of interest” and then be able to correlate them to log events of interest — whether that is using SIEM technology or simple forensics of the raw log data.
The fact is, the activities that leave infrastructure and data at risk of compromise are not caused by log activities alone, or even change activities alone. They are often the result of a combination of both. Without the ability to find and filter suspect changes “and” suspect log events, you may miss the problem.
As Dwayne said, that is what we are working on at Tripwire.
Adrian, you are singing my song. This is exactly the issue I’m focused on (at Tripwire, a technology solution provider).
Reacting faster, to me, means starting with visibility so you have awareness of all the events & changes in your environment, but using intelligence & automation to systematically evaluate what’s happening and focus the organization on the things that matter.
The challenge is compounded by silos (of technology & expertise) and we need to make it easier for organizations to evaluate disparate data in a way that adds context.
I liken it to a human model – our ears hear all the sounds around us, but our brain focuses our attention on things that represent danger, pleasure, or “things that make you go hmmmm.”
Security solutions should do the same thing: focus an organization’s attention on things that represent risks or attacks, things that you want to validate or reinforce, or the events & changes of interest (the IT equivalent of “things that make you go hmmmm”).
Keep up the great analysis.