So far in this series, we’ve discussed the challenges of security operations, making sense of security data, and refining detection/analytics, which are all critical components of building a modern, scalable SOC. Yet, there is an inconvenient fact that warrants discussion. Unless someone does something with the information, the best data and analytics don’t result in a positive security outcome. Security success depends on consistent and effective operational motions. Sadly, this remains a commonly overlooked aspect of building the SOC. As we wrap up the series, we’re going to go from alert to action and do it effectively and efficiently, every time (consistently), which we’ll call the 3 E’s. The goal is to automate everything that can be automated, enabling the carbon (you know, humans) to focus on the things that suit them best. Will we get there by 2025? That depends on you, as the technology is available, it’s a matter of whether you use it. The 3 E’s First, let’s be clear on the objective of security operations, which is to facilitate positive security outcomes. Ensuring these outcomes is to focus on the 3 E’s. Effectiveness: With what’s at stake for security, you need to be right because security is asymmetric. The attackers only need to be right once, and defenders need to defeat them every time. In reality, it’s not that simple, as attackers do need to string together multiple successful attacks to achieve their mission, but that’s beside the point. A SOC that only finds stuff sometimes is not successful. You want to minimize false positives and eliminate false negatives. If an alert fires, it should identify an area of interest with sufficient context to facilitate verification and investigation. Efficiency: You also need to do things as quickly as possible, consuming a minimum of resources due to limited available resources and the significant damage (especially against an attack like ransomware) that can happen in minutes. You need tooling that makes the analyst’s job easier, not harder. You also need to facilitate the communication and collaboration between teams to ensure escalation happens cleanly and quickly. Breaking down the barriers between traditional operational silos becomes a critical path to streamlining operations. Every Time (Consistency): Finally, you need the operational motions to be designed and executed the same way, every time. But aren’t there many ways to solve a problem? Maybe. But as you scale up your security team, having specific playbooks to address issues makes it easy to onboard new personnel and ensure they achieve the first two goals: Effectiveness and Efficiency. Strive to streamline the operational motions (as associated playbooks) over time, as things change and as you learn what works in your environment. Do you get to the 3 E’s overnight? Or course not. It takes years and a lot of effort to get there. But we can tell you that you never get there unless you start the journey. Defining Playbooks The first step to a highly functioning SOC is being intentional. You want to determine the proper operational motions for categories of attacks before you have to address them. The more granular the playbook, the less variance you’ll get in the response and the more consistent your operations. Building the playbooks iteratively allows you to learn what works and what doesn’t, tuning and refining the playbook every time you use it. These are living documents and should be treated as such. So how many playbooks should you define? As a matter of practice, the more playbooks, the better; but you can’t boil the ocean, especially as you get started. Begin by enumerating the situations you see most frequently. These typically include phishing, malware attacks/compromised devices, ransomware, DDoS, unauthorized account creation, and network security rule changes. To be clear, pretty much any alert could trigger a playbook, so ultimately you may get to dozens, if not hundreds. But start with maybe the top 5 alerts detected in your environment and start with those. What goes into a playbook? Let’s look at the components of the playbook: Trigger: Start with the trigger, which will be an alert and have some specific contextual information to guide the next steps. Enrichment: Based on the type of alert, there will be additional context and information helpful to understanding the situation and streamlining the work of the analyst handling the issue. Maybe it’s DNS reputation on a suspicious IP address or an adversary profile based on the command and control traffic. You want to ensure the analyst has sufficient information to dig into the alert immediately. Verification: At this point, a determination needs to be made as to whether the issue warrants further investigation. What’s required to make that call? For a malware attack, maybe it’s checking the email gateway for a phishing email that arrived in the user’s inbox. Or a notification from the egress filter that a device contacted a suspicious IP address. For each trigger, you want to list the facts that will lead you to conclude this is a real issue and assess the severity. Action: Upon verification, what actions need to be taken? Should the device be quarantined and a forensic image of the device be captured? Should an escalation of privileges or firewall rule change get rolled back? You’ll want to determine what needs to be done and document that motion in granular detail, so there are no questions about what should be done. You’ll also look for automation opportunities, which we’ll discuss later in the post. Confirmation: Was the action step(s) successful? Next, confirm whether the actions dictated in the playbook happened successfully. This may involve scanning the device (or service) to ensure the change was rolled back or making sure the device is not accessible anymore to an attacker. Escalation: What’s next? Does it get routed to a 2nd tier for further verification and research? Is it sent directly to an operations team to be fixed if it can’t be automated? Can the issue be closed out because you’ve gotten the confirmation that