It seems like BigData is all the rage. With things like NoSQL and Hadoop getting all the database wonks hot under the collar, smart forward-thinking folks like Amrit and Hoff increasingly point out the applicability of these techniques to security, and they’re right. I certainly agree that many of these new technologies will have a huge impact on our ability to figure out what’s happening in our environments. And not a moment too soon.
Hoff wrote a couple of recent posts discussing the coming renaissance of Big Data and Security (“InfoSec Fail: The Problem with BigData is Little Data” and “More on Security and BigData…Where Data Analytics and Security Collide”), and Amrit followed up with “BigData, Hadoop, and the Impending Informationpocalypse”, making great points about the fragility of any (relatively) new technology, as well as the need to really know what we are looking for.
That’s the biggest fly in this BigData/security ointment. We need proper context to draw useful conclusions about anything. More data does not provide more context. If anything, it provides less because these analysis tools are only as good as the rules they use to alert us to stuff. It’s non-trivial to get this right. Even with the best infrastructure, monitoring everything all the time, you still need to know what to look for.
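To make the "only as good as the rules" point concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration (the `Event` record, the `failed_login_burst` rule, the threshold, and the sample IP addresses); it is not how any particular SIEM or BigData product works. It only shows that the same pile of events yields nothing until someone writes down what to look for.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    """Hypothetical log record; real logs carry far more fields."""
    source_ip: str
    action: str  # e.g. "login_failed" or "login_ok"


def failed_login_burst(events, threshold=3):
    """Illustrative rule: flag any source IP with >= threshold failed logins."""
    counts = Counter(e.source_ip for e in events if e.action == "login_failed")
    return [f"{ip}: {n} failed logins" for ip, n in counts.items() if n >= threshold]


def analyze(events, rules):
    """Run every rule over the events and collect whatever alerts they raise."""
    alerts = []
    for rule in rules:
        alerts.extend(rule(events))
    return alerts


# Made-up sample data: one noisy source, two unremarkable ones.
events = [
    Event("10.0.0.5", "login_failed"),
    Event("10.0.0.9", "login_ok"),
    Event("10.0.0.5", "login_failed"),
    Event("10.0.0.7", "login_failed"),
    Event("10.0.0.5", "login_failed"),
]

# Piling on more data changes nothing if no rule knows what to look for.
more_data = events + [Event("10.0.0.9", "login_ok")] * 1000

no_rule_alerts = analyze(more_data, [])
with_rule_alerts = analyze(more_data, [failed_login_burst])
```

With an empty rule list the analysis returns nothing, no matter how many events go in. Add the one rule and the noisy source surfaces. The hard part is knowing which rules to write, not storing the events.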
And it won’t get any easier. Knowing what to look for will get much more complicated. The volume of data promises to mushroom over the next few years, as full packet capture starts to hit the mainstream and more folks start seriously monitoring databases and applications. This will ripple through the entire monitoring ecosystem. Now any company claiming the ability to do security management/analysis will need to not only have some security ninja on staff (to know what to look for), but also some legitimate BigData qualifications.
This isn’t a new direction for the SIEM players. More than one vendor calls what they do security intelligence, modeled after the business intelligence market. That entails a BigData approach to business analysis. To get there, the SIEM vendors have built their own BigData platforms. This means they each have a purpose-built data store that can provide the kind of analysis and correlation required to find the proverbial needle in a stack of haystacks. They invested not because they wanted to build their own data stores, but because no commercial or open source technology could satisfy their requirements.
Do Hadoop and these other technologies change that? Maybe. As Amrit points out, new technologies can be brittle, so it will be a while before tools (or services) based on these latest technologies are ready for prime time. But the writing is on the wall. Security is a BigData problem, and it’s not a stretch to think that some enterprising souls will apply BigData technologies to the security intelligence problem. Which is a great thing – we certainly have not solved the problem.
OMG, maybe we will see some innovation in security soon. But I’m not holding my breath.
Reader interactions
4 Replies to “Security has always been a BigData problem”
Hoff,
I think we are all in violent agreement:
1. We need data (see Adam’s comment)
2. We need methods to process the data (Big data solutions)
3. We need intelligence on top of the data stores and data processing capabilities, such as visual analytics, knowledge management, collaboration, etc. to leverage the data.
I felt the blog post focused a bit too much on the data layer, omitting numbers 1 and 3.
Happy Holidays!
-raffy
Raffy —
I disagree with your disagreement 😉
The part you missed/ignored (especially in not reading the blog post to which Mike refers) is that Big Data will ultimately help the “…security intelligence” problem.
See his second-to-last paragraph: “Security is a BigData problem, and it’s not a stretch to think that some enterprising souls will apply BigData technologies to the security intelligence problem. Which is a great thing – we certainly have not solved the problem.”
^^^ Check again that blog of mine you didn’t read 😉
I think the point that got missed in your retort is that Mike is suggesting -rightly- that Big Data is an enabler for then leveraging an analytics/intelligence layer ATOP capabilities like MapReduce.
Big Data isn’t a silver bullet. What you do with that data matters. A lot. Visualization matters. A lot.
…but don’t throw the baby out with the big data brown water.
If you don’t have the data, you have nothing to visualize.
/Hoff
Security doesn’t have a big data problem. Security has a no data problem. We keep suppressing information about the outcomes of our activities, and crushing any feedback loop before it gets started.
Hadoop can’t change that, but we can, by doing things like admitting we have problems. Here’s my contribution: http://newschoolsecurity.com/2011/12/owning-up-to-pwning-part-2/
I disagree. Security is not a big data problem. Security, the way you portray it, is lacking the capabilities to understand the data. Big data is not helping anyone gain insights. It’s just the data store and the processing capabilities on top of the data store. Just because you can process petabytes of data through MapReduce and column-based data stores doesn’t mean that you actually know how to process the data to find interesting things.
That’s where data visualization enters the picture! It is the key to gaining understanding and insights into your security data! That’s what I have been working on with things like http://secviz.org and now I am starting a company to address exactly this problem. Stay tuned!