Incident Response Fundamentals: Introduction
By Mike Rothman
Over the past year, as an industry we have come to realize that we are dealing with different adversaries using different attack techniques with different goals. Yes, the folks looking for financial gain by compromising devices are still out there. But add a well-funded, potentially state-sponsored, persistent and patient adversary to the mix, and we need to draw a new conclusion. Basically, we now must assume our networks and systems are compromised. That is a tough realization, but any other conclusion doesn’t really jibe with reality, or at least the reality of everyone we talk to.
For a number of years, we’ve been calling bunk on the concept of “getting ahead of the threat” – and on most of the things viewed as proactive. Anyone trying to take such action has been disappointed by their ability to stop attacks, regardless of how much money or political capital they expended to drive change. Basing our entire security strategy on the belief that we can stop attacks if we just spend enough, tune enough, or comply enough is no longer credible – if it ever was. We need to change our definition of success from stopping an attack (which would be nice, but isn’t always practical) to reacting faster and better to attacks, and containing the damage.
We’re not saying you should give up on trying to prevent attacks – but place as much (or more) emphasis on detecting, responding to, and mitigating them. This has been a common theme in Securosis research since the beginning, and now we will document exactly what that means and how to get there.
We don’t get a lot of push-back anymore on our position that organizations can’t stop all attacks. From a certain perspective that is progress, and we also believe many security professionals have spent a lot of time managing expectations internally, so there is an understanding that perfect security cannot be achieved (or that management is unwilling to fund it and compromise everything else in favor of security improvements). But following that concept to the next step means we need to get much better at detecting attacks sooner. We have already documented a number of approaches at the network layer, in terms of monitoring everything and looking for “not normal”. They also apply to the application (part 1 & part 2) and database (part 1 & part 2), which we have been talking about in our Monitoring up the Stack series.
So in the first part of this new series, we will talk about the data collection infrastructure you should be thinking about, what kind of organizational model allows you to react faster, and what to do before the attack is detected. If you know you are being attacked, you are already ahead of the vast majority of companies out there. But what then?
Once you understand you are under attack, then your incident response process needs to kick in. Most organizations do this poorly because they have neither the process nor the skills to figure out what’s happening and do something useful about it. Many organizations have a documented incident response program, but that doesn’t mean it’s effective or that the organization has embraced what it really means to respond to an incident. And this is about much more than just tools and flowcharts. Unless the process is well established and somewhat second nature, it will fail under duress – which is the definition of an incident.
It is also important to remember that this process touches much more than just IT. It must involve other parts of the organization (legal, HR, operational risk, etc.) in order to actually manage or mitigate the organizational risk of any attack. One of the things that Rich’s emergency response experience has shown is that chain of command is critical, and everyone must be in alignment on process, responsibilities, and accountabilities before the incident happens. Again, a lot of this stuff seems like common sense (and it is!), but we have seen few organizations that do this well, so we’ll walk through what we mean by reacting better throughout the series.
Before, During, and After
The concept we will come back to throughout this series is before, during, and after the attack. This will provide context for the different things that must happen based on where you are within the attack lifecycle.
- Before: Figuring out what data to monitor, how much of it is useful, how to make use of it, and how long to retain it is key to building the infrastructure for persistent monitoring. This must happen before the attack, because you only get one chance to collect that data: while things are happening. You don’t get to go back and record it after the fact (unless you completely fail to learn from the first attack, and they hit you again – not a good way to get a second chance!).
- During: How can you contain the damage as quickly as possible? By identifying root cause accurately and remediating effectively. We’ll dig into how to identify the attack, who to work with to provide the data you need, and how to do this in the heat of battle.
- After: Once the attack has been contained, focus shifts to making sure it doesn’t happen again. In these posts we’ll discuss the forensics process, and necessary tools and skills – as well as how to maintain chain of custody and the post mortem required to learn something from a difficult situation.
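To make the "before" decisions a little more concrete, here is a minimal Python sketch of writing down what to collect and how long to keep it, then checking which sources would still hold data from the start of an incident. Everything in it is an assumption for illustration: the source names, the retention windows, and the helpers `is_retained` and `sources_covering` are hypothetical, not recommendations and not part of any tool discussed in this series.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RetentionRule:
    """How long data from one monitored source is kept (illustrative only)."""
    source: str
    keep_for: timedelta


# Hypothetical sources and windows; real values depend on your environment,
# storage budget, and compliance requirements.
RULES = {
    "firewall_logs": RetentionRule("firewall_logs", timedelta(days=90)),
    "netflow": RetentionRule("netflow", timedelta(days=30)),
    "full_packet_capture": RetentionRule("full_packet_capture", timedelta(days=3)),
}


def is_retained(source: str, recorded_at: datetime, now: datetime) -> bool:
    """Return True if data recorded at `recorded_at` is still within retention."""
    rule = RULES.get(source)
    if rule is None:
        # A source you never monitored has no data to go back to.
        return False
    return now - recorded_at <= rule.keep_for


def sources_covering(incident_start: datetime, now: datetime) -> list[str]:
    """List the sources that still hold data from when the incident began."""
    return sorted(name for name in RULES if is_retained(name, incident_start, now))


if __name__ == "__main__":
    now = datetime(2024, 1, 31)
    incident_start = datetime(2024, 1, 21)  # attack began 10 days before detection
    print(sources_covering(incident_start, now))
```

The point of the sketch is the gap it exposes: if detection lags the start of an attack by longer than a source's retention window, that source cannot help during the response, no matter how good the tooling is.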
We’ll also discuss the current state of threat management tools, including SIEM, IDS/IPS, and network packet capture, to define their place in our approach. Finally, we will consider how network security is evolving and what kind of architectural constructs you should be thinking about as you revisit your data collection and defensive strategies.
At the end of this series you will have a good overview of how to deal with all sorts of threats, and a high-level process for identifying the issues, containing the damage, and using the feedback loop to ensure you don’t make the same mistakes again. That’s the plan, anyway.