Security Analytics with Big Data [New Series]By Adrian Lane
Big Data is being touted as a ‘transformative’ technology for security event analysis – promised to detect threats in the ever-increasing volume of event data generated from in-house, mobile, and cloud-based services. But a combination of PR hype, vendor positioning, and customer questions has pushed it to the top of my research agenda. Many customers are asking “Wait, don’t I already have SIEM for event analysis?” Yes, you do. And SIEM is designed and built solve the same problems – but 7-8 years ago – and it is failing to keep up with current problems. It’s not just that we’re trying to scale up to a much larger set of data, but we also need to react to events an order of magnitude faster than before. Still more troubling is that we are collecting multiple types of data, each requiring new and different analysis techniques to detect advanced attacks. Oh, and while all that slows down SIEM and log management systems, you are under the gun to identify attacks faster than before.
This trifecta of issues limit the usefulness of SIEM and Log Management – and makes customers cranky. Many SIEM platforms can’t scale to the quantity of data they need to manage. Some are incapable of even storing basic data as fast as it comes in – forget about storing and analyzing non-standard data types. ‘Real-time’ analysis is a commonly cited as SIEM feature but after collection, storage, normalization, correlation, and enrichment, you are lucky to access new events within an hour – much less within a minute. The good news is that big data, correctly deployed, can solve these issues. In this paper we will examine how big data addresses scalability and performance, improves analysis, can accommodate multiple data types, and will be leveraged with existing environments. Or goal is to help users differentiate reality from wishful thinking, and to provide enough information to make informed purchasing decisions.
To do this we need to demystify big data and contrast how it differs from traditional data management systems. We will offer a clear and unique definition of big data and explain how it helps overcome current technical limitations. We will offer a pragmatic way for customers to leverage big data, enabling them to select a solution strategically. We will highlight the limitations of SIEM and Log Management, key areas of customer dissatisfaction, areas where big data excels in comparison. We will also discuss some changes required for big data analysis and data management, as well as a change in mindset necessary to take full advantage.
This is not all theory and speculation – big data is currently being employed to detect security threats, address new requirements for IT security, and even help gauge the effectiveness of other security investments. Big data natively addresses ever-increasing event volume and the rate at which we need to examine new events. There is no question that it holds promise for security intelligence, both in the numerous ways it can parse information and through its native capabilities to sift proverbial needles from monstrous haystacks. Cloud and mobile architectures force us to reexamine how we manage security data, and to scale across broader sets of systems and events – neither of which mesh with the structured data repositories on which most organizations rely. But most IT and security practitioners do not yet fully understand big data or how to employ it so they are unable to weed through all the hype, FUD, and hyperbole. To take full advantage, however, requires both a deeper understanding of the technology and a subtle shift in mindset to enable informed decisions on incorporate big data into existing IT systems, perhaps by shifting to newer big data platforms.
This research paper will highlight several areas:
- Use Cases: We will discuss issues customers cite with performance and scalability, particularly for security event analysis. We will discuss in detail how SIEM, Log Management, and event-centric systems struggle under new requirements for data velocity and data management, and why existing technologies aren’t cutting it. We will also discuss the inflexibility of pre-BD analysis, alerting, and reporting – and how they demand a new approach to security and forensics, as we struggle to keep pace with the evolution of IT.
- New Events and Approaches: This post will explain why we need to consider additional data types that go beyond events. Existing technologies struggle to meet emerging needs because threat data does not conform to traditional syslog and netflow event types. There is a clear trend toward broader data analysis to detect advanced attacks and better understand risks.
- What is Big Data and how does it work? This post will offer a basic definition of big data, along with a discussion of the native capabilities that make big data different than traditional analysis tools. We will discuss how features like HDFS, MapReduce, Hive, and Pig work together to address issues of scale, velocity, performance, and multiple data types.
- The promise of big data: We will explain why big data is viewed as a disruptive technology for security analytics. We will show how big data solutions mitigate problems and change security and event analysis. We will discuss how big data platforms handle collecting and parsing event data, and cover different queries and reports that support new threat analyses.
- How big data changes security platforms: This post will discuss how to supplement existing systems – through standalone instances, partial integration of big data with existing systems, systems that natively leverage big data infrastructure, or fully integrated systems that run atop NoSQL structures. We will also discuss operational changes to SIEM usage, including the growing importance of data scientists to security.
- Integration roadmap and planning: In this section we will address the common concerns, limitations, and realities of merging big data into your IT systems. Specifically, we will discuss:
- Integration and deployment issues
- Platform selection (diversity of platforms and data)
- Policy and report development
- Data privacy and sharing
- Big data platform security basics
Our next post will cover use cases, the key areas where SIEM needs to improve, and some key areas of customer dissatisfaction.