I know what you’re thinking to yourself right now: “They promised me a cool series of posts on the cutting edge of incident response, and now we’re talking management principles and boxes on an org chart? What a rip.”
But believe it or not, the most important aspect of incident response is the right organization, followed by the right process. How do I know this? Because I’ve been through a ton of incident response training with local and federal agencies, and have directly responded to everything from single-rescuer ski accidents to Hurricane Katrina. (And a few IT things in the middle, but those don’t sound nearly as exciting.)
While working as an emergency responder I fall under something known as the National Incident Management System, which uses a formalized process and structure called the Incident Command System (ICS). ICS consists of a standard management hierarchy and processes for managing temporary incidents of any size and nature. ICS was originally developed for managing large wildfires in the 1970s, and has since expanded into a national standard that’s also used (and adapted) by a variety of other countries and groups. While our React Faster and Better series won’t teach you all of ICS, everything we will talk about in terms of process and organization is adapted directly from it.
There’s no reason to reinvent the wheel when you have something with over 30 years of battle-hardened testing available. Additionally, those of you in larger companies or verticals like healthcare or public utilities may be required to learn and use ICS in your own incidents.
Incident Command System Principles
ICS solves a lot of the problems we encounter in incidents. Its focus is on clear communications and accountability, with a structure that expands and contracts as needed, allowing disparate groups to combine even if they’ve never worked together before. ICS includes 5 key concepts:
- Unity of command: Each person involved in an incident reports to only one supervisor.
- Common terminology: It’s hard to communicate when everyone uses their own lingo. Common terminology applies to both the organizational structure (with defined roles, like “Incident Commander”, that everyone understands) and use of plain English (or the language of your choice) in incident communications. You can still talk RPC flaws all you want, but when communicating with management and non-techies you’ll use phrases like “The server is down because we were hacked.”
- Management by objectives: Responders have specific objectives to achieve, in priority order, as defined in a response plan. No running around fighting fires without central coordination.
- Flexible and modular organization: Your org structure should expand and contract as needed based on the nature and size of the incident. The organizational structure can be as small as a single individual, and as large as the entire company.
- Span of control: No one should manage fewer than 3 or more than 7 other individuals, with 5 being the sweet spot. This one comes from many years of management science, which have repeatedly confirmed that attempting to directly manage more is ineffective, while managing fewer is an inefficient use of resources.
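If you like to see rules as code, here is a minimal Python sketch of how two of these concepts, unity of command and span of control, could be checked against a response roster. This is not part of ICS itself; the function name `check_structure`, the roster format (a list of responder/supervisor pairs), and the sample names are all assumptions made up for illustration. It also applies the 3-7 rule literally, so a very small response team will be flagged as under-span.

```python
from collections import defaultdict

# Span-of-control limits as described above (3-7 direct reports).
MIN_SPAN, MAX_SPAN = 3, 7


def check_structure(reports):
    """Check a roster of (responder, supervisor) pairs.

    Hypothetical helper for illustration only. Returns a list of
    human-readable problems; an empty list means both rules hold.
    """
    supervisors_of = defaultdict(set)
    team_of = defaultdict(set)
    for responder, supervisor in reports:
        supervisors_of[responder].add(supervisor)
        team_of[supervisor].add(responder)

    problems = []
    # Unity of command: each responder answers to exactly one supervisor.
    for responder, supervisors in sorted(supervisors_of.items()):
        if len(supervisors) > 1:
            problems.append(
                f"{responder} reports to {len(supervisors)} supervisors"
            )
    # Span of control: each supervisor manages between 3 and 7 people.
    for supervisor, team in sorted(team_of.items()):
        if not MIN_SPAN <= len(team) <= MAX_SPAN:
            problems.append(
                f"{supervisor} has {len(team)} direct report(s)"
            )
    return problems


if __name__ == "__main__":
    # Made-up roster: dave answers to two people, and ops manages only one.
    roster = [
        ("alice", "commander"),
        ("bob", "commander"),
        ("carol", "commander"),
        ("dave", "commander"),
        ("dave", "ops"),
    ]
    for problem in check_structure(roster):
        print(problem)
```

Run against the sample roster, this reports that dave has two supervisors and that ops manages only one person. In a real incident you would treat the span check as a prompt to restructure, not as a hard failure.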
If you want to learn more about ICS you can run through the same self-training course used by incident responders at FEMA’s online training site. Start with ICS 100, which covers the basics. While the process we’ll outline in this series is based on ICS principles, it’s specific to information security incident response. We won’t be using terms like “branch” and “section” because they would distract from our focus, but you can certainly plug them in if you want to standardize on ICS. But if you need the Air Ops branch for a cyberattack, something is very, very wrong.
For the next post we will focus on three of the key concepts related to organizational structure: unity of command, flexible and modular organization, and span of control, as we talk about the key response roles and structure.