React Faster and Better: Organizing for Response
By Rich
Now that we have a sense of what data to focus on at the beginning of an incident, it’s time to dig into the response and investigation process itself and talk specifically about what it entails. In larger enterprises, organizing the response process and teams can be extremely complex, due both to the volume of incidents and the complexity of the organizational structure (politics). Some teams align with business units, others with tools, and yet others are centralized.
Leading organizations we speak with consistently display a range of established best practices for responding to threats. Each is a little different on specifics, but they all have tiered escalation plans, optimized for specific threat types and planned out in advance. Occasionally we see a radical re-architecting of these structures and incident response processes due to significant changes in the nature of security risks, regulatory changes, or volume of incidents. Support tools and technology also evolve to support changing processes.
We start the process once an alert has triggered and front-line personnel are initiating the response process. This involves multiple teams and tiers, depending on the nature of the incident. Before detailing the organizational structure, there are a few points to keep in mind:
- There is no ‘right’ organization: Team organization is influenced by the overall organizational layout and nature of the business. We describe a hierarchical and centralized structure, but we have talked with organizations which spread these functions across different teams to align with business units. That said, nearly every organization has a top-tier team or individual responsible for major incidents and those crossing business or agency lines.
- Organize for longevity: Organize around skills and responsibilities rather than tools. Tools come and go, and it’s important that the team use platform-specific skills without devolving into a focus on specific tools.
- Communicate early, even if you don’t have answers yet: It’s important to communicate the basic nature of incidents up the chain early, but not necessarily the details. Higher-level tiers need to know that an incident is occurring and the basics, even if they won’t be directly involved. This helps them prepare resources early and identify incidents with broad scope, even if the early responder doesn’t realize the full impact. Not every incident needs to be passed on, especially as many low-level incidents are handled pretty much immediately, but anything with broader potential should result in a ‘heads-up’ notification.
- Carefully define containment policies: Advanced attacks, as well as those potentially involving law enforcement, require different handling than a simple external intrusion attempt. Cutting off malware or instantly cleaning systems could trigger an attacker response and result in a deeper and more complex infection. Our instinct is to cut all attacks off when we detect them, but this may result in more damage over a longer term; sometimes partial containment, monitoring, or other action (or even inaction) is more appropriate. Plan containment scenarios for major attack types early, communicate them, and make sure junior personnel are trained to react properly.
- Clearly define roles and responsibilities: Every team member should know when to escalate, as well as who to notify and when. All too often, a crisis occurs because junior folks tried to manage risk which they lacked the scope, authority, or ability to handle.
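For teams that like to see policy written down in executable form, here is a minimal sketch of how the points above might be captured: a containment policy agreed on in advance per attack type, an escalation rule, and a 'heads-up' flag. Everything in it is an assumption made for illustration, including the three tier numbers, the 1 to 5 severity scale, the attack type names, and the `triage` function itself. Your own thresholds and tiers will differ.

```python
from dataclasses import dataclass
from enum import Enum


class Containment(Enum):
    BLOCK = "block immediately"
    MONITOR = "partial containment and monitoring"
    HOLD = "take no action until a senior tier decides"


# Hypothetical containment policy per attack type, planned in advance.
CONTAINMENT_POLICY = {
    "external_intrusion_attempt": Containment.BLOCK,
    "advanced_attack": Containment.MONITOR,
    "law_enforcement_matter": Containment.HOLD,
}


@dataclass
class Incident:
    attack_type: str
    severity: int  # 1 (low) to 5 (critical); an illustrative scale
    crosses_business_units: bool = False


@dataclass
class Decision:
    owner_tier: int  # 1 = front line, 3 = top-tier team (assumed numbering)
    heads_up: bool   # notify higher tiers early, even without details
    containment: Containment


def triage(incident: Incident) -> Decision:
    """Decide who owns the incident, whether to notify upward, and how to contain."""
    # Unknown attack types default to HOLD so junior staff do not improvise.
    containment = CONTAINMENT_POLICY.get(incident.attack_type, Containment.HOLD)

    if incident.crosses_business_units or incident.severity >= 4:
        owner_tier = 3
    elif incident.severity >= 2 or containment is not Containment.BLOCK:
        owner_tier = 2
    else:
        owner_tier = 1

    # Routine tier-1 incidents are handled on the spot; anything else gets a heads-up.
    heads_up = owner_tier > 1
    return Decision(owner_tier, heads_up, containment)
```

The point of the sketch is not the specific numbers but that the front-line responder never has to decide containment or escalation from scratch: both come from a table and a rule that someone more senior signed off on beforehand.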
The key to managing incidents in large environments is to focus on people and process. The right foundation optimizes incident response and enables nimble and graceful escalation. Making incident response look easy is actually very, very hard, and takes a lot of work and practice. But the benefits are there. The faster and more effectively you can engage the right resources, the less time the attacker has to wreak havoc in your environment.
In our next posts we will walk through the response tiers and talk about types of incidents, tools, and skills involved at each level.