The Future of Security Operations: Embracing the Machines
To state the obvious, traditional security operations is broken. Every organization faces more sophisticated attacks, the possibility of targeted adversaries, and far more complicated infrastructure; compounding the problem, we have fewer skilled resources to execute on security programs. Obviously it’s time to evolve security operations by leveraging technology to both accelerate human work and take care of rote, tedious tasks which don’t add value. So security orchestration and automation are terms you will hear pretty consistently from here on out. Some security practitioners resist the idea of automation, mostly because if done incorrectly the ramifications are severe and likely career-limiting. So we’ve advocated a slow and measured approach, starting with use cases that won’t crater the infrastructure if something goes awry. We discussed two of those in depth: enriching alerts and accelerating incident response, in our Regaining Balance post. The value of being able to respond to more alerts, better, is obvious. So we expect technologies focused on this (small) aspect of security operations to become pervasive over the next 2-3 years. But the real leverage lies not just in making post-attack functions work better. The question is: How can you improve your security posture and make your environment more resilient by orchestrating and automating security controls? That’s what this post will dig into. But first we need to set some rules of engagement for what automation of this sort looks like. And more importantly, how you can establish trust in what you are automating. Ultimately the Future of Security Operations hinges on this concept. Without trust, you are destined to remain in the same hamster wheel of security pain (h/t to Andy Jaquith). Attack, alert, respond, remediate, repeat. Obviously that hasn’t worked too well, or we wouldn’t continue having the same conversations year after year. The Need for Trustable Automation It’s always interesting to broach the topic of security automation with folks who have had negative experiences with early (typically network-centric) automation. They instantaneously break out in hives when discussing automatically reconfiguring anything. We get it. When there is downtime or another adverse situation, ops people get fired and can’t pay their mortgages. Predictably, survival instincts kick in, limiting use of automation. Thus our focus on Trustable Automation – which means you tread carefully, building trust in both your automated processes and the underlying decisions that trigger them. Iterate your way to broader use of automation with a simple phased approach. Human approval: The first step is to insert a decision point into the process where a human takes a look and ensures the proper functions will happen as a result of automation. This is basically putting a big red button in the middle of the process and giving an ops person the ability to perform a few checks and then hit it. It’s faster but not really fast, because it still involves waiting on a human. Accept that some processes are so critical they never get past human approval, because the organization just cannot risk a mistake. Automation with significant logging: The next step is to take the training wheels off and let functions happen automatically, while making sure to log pretty much everything and have humans keep close tabs on it. Think of this as taking the training wheels off, but staying within a few feet of the bike, just in case it tips over. Or running an application in Debug mode so you can see exactly what is happening. If something does happen which you don’t expect, you’ll be right there to figure out what didn’t work as expected and correct it. As you build trust in the process, we recommend you continue to scrutinize logs, even when things go perfectly. This helps you understand the frequency of changes, and which changes are made. Basically you are developing a baseline of your automated process, which you can use in the next phase. Automation with guardrails: Finally you reach the point where you don’t need to step through every process. The machines are doing their job. That said, you still don’t want things to go haywire. Now you leverage the baseline you developed using automation with logging. With these thresholds you can build guardrails to make sure nothing happens outside your tolerances. For example, if you are automatically adding entries to an egress IP blacklist to stop internal traffic going to known bad locations, and all of a sudden your traffic to your SaaS CRM system is due to be added to your blacklist due to a fault threat intel update, you can prevent that update and alert administrators to investigate the threat intel update. Obviously this requires a fundamental understanding of the processes being automated and an ability to distinguish between low-risk changes which should be made automatically from those which require human review. But that level of knowledge is what engenders trust, right? Once you have built some trust in your automated process, you still want a conceptual net to make sure you don’t go splat if something doesn’t work as intended. The second requirement for trustable automation is rollback. You need to be able to quickly and easily get back to a known good configuration. So when rolling out any kind of automation (whether via scripting or a platform), you’ll want to make sure you store state information, and have the capability to reverse any changes quickly and completely. And yes, this is something you’ll want to test extensively, both as you select an automation platform and once you start using it. The point is that as you design orchestration and automation functions, you have a lot of flexibility to get there at your own pace. Some folks have a high threshold for pain and jump in with both feet, understanding at some point they will likely need to clean up a mess. Others choose to tiptoe toward an automated future, adding use cases as they build comfort in the ability of their controls to work without human involvement. There is no right answer