React Faster and Better: Incident Response Gaps

In our introduction to this series we mentioned that the current practice of incident response isn’t up to dealing with the compromises and penetrations we see today. The incident response process itself isn’t broken; the problem is how companies implement it.

Today’s incident responders are challenged on multiple fronts. First, the depth and complexity of attacks are significantly more advanced than commonly discussed. We can’t even say this is a recent trend – advanced attacks have existed for many years – but we do see them affecting a wider range of organizations, with a higher degree of specificity and targeting than ever before. It’s no longer merely the defense industry and large financial institutions that need to worry about determined persistent attackers. Second, in the midst of this onslaught, the businesses we protect are using a wider range of technology – including consumer tools – in far more distributed environments. Finally, responders face the double-edged sword of a plethora of tools: some are highly effective, while others contribute to information overload.

Before we dig into the gaps we need to provide a bit of context. First, keep in mind that we are focusing on larger organizations with dedicated incident response resources. Practically speaking, this probably means at least a few thousand employees and a dedicated IT security staff. Smaller organizations should still glean insight from this series, but probably don’t have resources to implement the recommendations.

Second, these issues and recommendations are based on discussions with real incident response teams. Not everyone has the same issues – especially across large organizations – nor the same strengths. So don’t get upset when we start pointing out problems or making recommendations that don’t apply to you – as with any research, we generalize to address a broad audience.

Across the organizations we talk with, some common incident response gaps emerge:

Too much reliance on prevention at the expense of monitoring and response. We still find even large organizations that rely too heavily on their defensive security tools rather than balancing prevention with monitoring and detection. This imbalance of resources leads to gaps in the monitoring and alerting infrastructure, with inadequate resources for response. All organizations are eventually breached, and targeted organizations always have some kind of attacker presence. Always.
Too much of the wrong kinds of information too early in the process. While you do need extensive auditing, logging, and monitoring data, you can’t use every feed and alert to kick off your process or in the initial investigation. And to expect that you can correlate all of these disparate data sources as an ongoing practice is ludicrous. Effective prioritization and filtering is key.
Too little of the right kinds of information too early (or late) in the process. You shouldn’t have to jump right from an alert into manually crawling log files. By the same token, after you’ve handled the initial incident you shouldn’t need to rely exclusively on SIEM for your forensics investigation and root cause analysis. This again goes back to filtering and prioritization, along with sufficient collection. This also requires two levels of collection for your key device types – the first being what you can do continuously. The second is the much more detailed information you need to pinpoint root cause or perform post-mortem analysis.
Poor alert filtering and prioritization. We constantly talk about false positives because those are the most visible, but the problem is less that an alert triggered and more how to determine its importance in context. This ties directly to the previous two gaps, and requires finding the right balance between alerting, continuous collection of information for initial response, and gathering more granular information for after-action investigation.
Poorly structured escalation options. One of the most important concepts in incident response is the capability to smoothly escalate incidents to the right resources. Your incident response process and organization must take this into account. You just can’t effectively escalate with a flat response structure; tiering based on multiple factors such as geography and expertise is key. And this process must be determined well in advance of any incident. Escalation failure during response is a serious problem.
Response whack-a-mole. Responding without the necessary insight and intelligence leads to an ongoing battle where the organization is always one step behind the attacker. While you can’t wait for full forensic investigations before clamping down on an incident to contain the damage, you need enough information to make informed and coordinated decisions that really stop the attack – not merely a symptom. Balancing hair-trigger response against analysis paralysis is critical to minimizing damage and potential data loss.
**Your goal in incident response is to detect and contain attacks as quickly as possible – limiting the damage by constraining the window within which the attacker operates.** To pull this off you need an effective process with graceful escalation to the right resources, to collect the right amount of the right kinds of information to streamline your process, to do ongoing analysis to identify problems earlier, and to coordinate your response to kill the threat instead of just a symptom.

But all too often we see flat response structures, too much of the wrong information early in the process with too little of the right information late in the process, and a lack of coordination and focus that allow the bad guys to operate with near impunity once they establish their first beachhead. And let’s be clear, they have a beachhead. Whether you know about it is another matter.
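To make the prioritization and escalation gaps a bit more concrete, here is a minimal Python sketch of scoring an alert in context and routing it to a response tier. Everything in it is an assumption for illustration only: the `Alert` fields, the `ASSET_CRITICALITY` weights, the tier names, and the score thresholds are hypothetical, and a real team would derive them from its own environment and process.

```python
from dataclasses import dataclass

# Hypothetical weights and tier names, chosen only for illustration.
ASSET_CRITICALITY = {"workstation": 1, "file-server": 3, "domain-controller": 5}
TIERS = ["tier1-regional", "tier2-specialist", "tier3-forensics"]


@dataclass
class Alert:
    source: str         # reporting tool, e.g. "ids", "av", "dlp"
    asset_type: str     # key into ASSET_CRITICALITY
    severity: int       # 1 (low) to 5 (high), as reported by the tool
    corroborated: bool  # True if a second, independent source agrees


def priority(alert: Alert) -> int:
    """Score an alert by its context, not just by the fact that it fired."""
    score = alert.severity * ASSET_CRITICALITY.get(alert.asset_type, 1)
    if alert.corroborated:
        score *= 2  # agreement between sources raises confidence
    return score


def escalation_tier(score: int) -> str:
    """Map a score to a response tier (thresholds are assumptions)."""
    if score >= 30:
        return TIERS[2]
    if score >= 10:
        return TIERS[1]
    return TIERS[0]


def triage(alerts):
    """Return (alert, score, tier) tuples, highest priority first."""
    scored = sorted(((a, priority(a)) for a in alerts),
                    key=lambda pair: pair[1], reverse=True)
    return [(a, s, escalation_tier(s)) for a, s in scored]
```

The point of the sketch is the structure rather than the numbers: the escalation path is defined before any incident occurs, and a high-severity alert on a low-value asset can rank below a moderate, corroborated alert on a critical one.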

In our next couple of posts Mike will start talking about what information to collect and how to define and manage your triggers for alerts. Then I’ll close out by talking about escalation, investigations, and intelligently kicking the bad guys out.
