The Future of SecOps: Regaining Balance

By Mike Rothman

The first post in this series, Behind the 8 Ball, raised a number of key challenges of practicing security in our current environment. These include continual advancement and innovation by attackers seeking new ways to compromise devices and exfiltrate data, the increasing complexity of technology infrastructure, frequent changes to that infrastructure, and finally the systemic skills shortage, which limits the resources available to handle the challenges created by the other issues. Basically, practitioners are behind the 8-ball in getting their job done and protecting corporate data.

As we discussed in that earlier post, thinking differently about security entails changing things up to take a (dare we say it?) more enlightened approach: focusing the right resources on the right functions. We know it seems obvious that having expensive staff focused on rote and tedious functions is a suboptimal way to deploy resources. But most organizations do it anyway. We prefer to have our valuable, constrained, and usually highly skilled humans doing what humans are good at, such as:

  • identifying triggers that might indicate malicious activity
  • drilling into suspicious activity to understand the depth of attacks and assess potential damage
  • figuring out workarounds to address attacks

Humans in these roles generally know what to look for, but aren’t very good at looking at huge amounts of data to find those patterns. Many don’t like doing the same things over and over again – they get bored and less effective. They don’t like graveyard shifts, and they want work that teaches them new things and stretches their capabilities. Basically they want to work in an environment where they do cool stuff and can grow their skills. And (especially in security) they can choose where they work. If they don’t get the right opportunity in your organization, they will find another which better suits their capabilities and work style.

On the other hand machines have no problem working 24/7 and don’t complain about boring tasks – at least not yet. They don’t threaten to find another place to work, nor do they agitate for broader job responsibilities or better refreshments in the break room. We’re being a bit facetious here, and certainly don’t advocate replacing your security team with robots. But in today’s asymmetric environment, where you can’t keep up with the task list, robots may be your only chance to regain balance and keep pace.

So we will expand a bit on a couple concepts from our Intro to Threat Operations paper, because over time we expect our vision of threat operations to become a subset of SecOps.

  • Enriching Alerts: The idea is to take an alert and add the common information you know an analyst will want, before sending it to the analyst. This way the analyst doesn’t need to spend time gathering information from various systems and information sources, and can get right to work validating the alert and determining potential impact.
  • Incident Response: Once an alert has been validated, a standard set of activities is generally part of response. Some of these activities can be automated via integration with affected systems (networks, endpoint management, SaaS, etc.), and the time saved enables responders to focus on higher-level tasks such as determining proliferation and assessing data loss.

Enriching Alerts

Let’s dig into enriching alerts from your security monitoring systems, and how this can work without human intervention. We start with a couple different alerts, and some educated guesses as to what would be useful to an analyst.

  • Alert: Connection to a known bad IP: Let’s say an alert fires for connectivity to a known bad IP address (thanks, threat intel!). With source and destination addresses, an analyst would typically start gathering basic information.
    1. Identity: Who uses the device? With a source IP it’s usually straightforward to see who the address is allocated to, and then what devices that person tends to use.
    2. Target: Using the destination IP, the external site comes into focus. An analyst would probably perform geo-location to figure out where the IP is and a whois query to figure out who owns it. They could also figure out the hosting provider and search their threat intel service to see if the IP belongs to a known botnet, and dig up any associated tactics.
    3. Network traffic: The analyst may also check out network traffic from the device to look for strange patterns (possibly C&C or reconnaissance) or uncharacteristically large volumes to or from that device over the past few days.
    4. Device hygiene: The analyst also needs to know specifics about the device, such as when it was last patched and whether it has a non-standard configuration.
    5. Recent changes: The analyst would probably be interested in software running on the device, and whether any programs have been installed or configurations changed recently.
  • Alert: Strange registry activity: In this scenario an alert is triggered because a device has had its registry changed, but the change cannot be traced back to authorized patches or software installs. The analyst could use similar information to the first example, but device hygiene and recent device changes would be of particular interest. The general flow of network traffic would also be of interest, given that the device may have been receiving instructions or configuration changes from external devices. In isolation registry changes may not be a concern, but in close proximity to a large inbound data transfer the odds of trouble increase. Additionally, checking out web traffic logs from the device could provide clues to what the user was doing that might have resulted in compromise.

  • Alert: Large USB file transfer: We can also see the impact of enrichment in an insider threat scenario. Maybe an insider used their USB port for the first time recently, and transferred 1GB of data in a 3-hour window. That could generate a DLP alert. At that point it would be good to know which internal data sources the device has been communicating with, and any anomalous data volumes over the past few days, which could indicate information mining in preparation for exfiltration. It would also be helpful to review inbound connections and recent device changes, because the device could have been compromised by an external actor using a remote Trojan to attack the device.

In these scenarios, and another 1,000 we could concoct, all the information the analyst needs to get started is readily available within existing systems and security data/intelligence sources. Whatever tool an analyst uses to triage can be pre-populated with this information.
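To make this a bit more concrete, here is a minimal Python sketch of pre-populating an alert before an analyst sees it. Everything in it is hypothetical: the Alert class, the lookup functions, and the in-memory tables that stand in for an asset inventory, a threat intel feed, and a patch management system. A real deployment would call those systems' own APIs instead.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict

# Hypothetical stand-ins for real systems (asset inventory, threat intel,
# patch management). All values are made-up sample data, and the public
# addresses come from documentation-only ranges.
ASSET_DB = {"10.1.2.3": {"user": "jdoe", "device": "LAPTOP-042"}}
THREAT_INTEL = {"203.0.113.7": {"botnet": "example-botnet", "country": "ZZ"}}
PATCH_DB = {"LAPTOP-042": {"last_patched": "2017-03-01", "standard_config": True}}


@dataclass
class Alert:
    kind: str
    source_ip: str
    dest_ip: str
    context: Dict[str, dict] = field(default_factory=dict)


def identity(alert: Alert) -> dict:
    """Who uses the device behind the source IP?"""
    return ASSET_DB.get(alert.source_ip, {})


def target(alert: Alert) -> dict:
    """What do we know about the destination IP?"""
    return THREAT_INTEL.get(alert.dest_ip, {})


def device_hygiene(alert: Alert) -> dict:
    """When was the device last patched, and is its configuration standard?"""
    device = ASSET_DB.get(alert.source_ip, {}).get("device")
    return PATCH_DB.get(device, {})


ENRICHERS: Dict[str, Callable[[Alert], dict]] = {
    "identity": identity,
    "target": target,
    "device_hygiene": device_hygiene,
}


def enrich(alert: Alert) -> Alert:
    """Run every enricher; a failing source is recorded, not fatal."""
    for name, lookup in ENRICHERS.items():
        try:
            alert.context[name] = lookup(alert)
        except Exception as exc:  # keep going if one source is down
            alert.context[name] = {"error": str(exc)}
    return alert


alert = enrich(Alert("known_bad_ip", "10.1.2.3", "203.0.113.7"))
print(alert.context)
```

Keeping the enrichers in a simple name-to-function table is one possible design choice: adding network traffic or recent-change lookups would mean adding one more entry rather than changing the triage flow.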

The ability to enrich alerts doesn’t end there. If files are involved in the connection, the system could automatically poll an external file reputation service to see whether they are recognized as malicious. File samples could be sent to a sandbox to report on what each file actually does, and whether it tends to be part of a known attack pattern. Additionally, if the file turns out to be part of a malware kit, the system could then search for other files known to be related, perhaps across other devices within the organization.
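As a rough illustration of the file check described above, the sketch below hashes a file sample and looks the hash up. The KNOWN_BAD table and the sandbox_queue list are hypothetical placeholders for an external reputation service and a sandbox submission queue; only the hashing uses a real library call (Python's hashlib).

```python
import hashlib
from typing import Dict, List

# Hypothetical reputation data keyed by SHA-256. A real system would query
# an external file reputation service instead of a local table.
KNOWN_BAD: Dict[str, str] = {
    hashlib.sha256(b"pretend-malware-bytes").hexdigest(): "example-malware-kit",
}

# Hashes waiting for (hypothetical) sandbox analysis.
sandbox_queue: List[str] = []


def check_file(content: bytes) -> dict:
    """Hash a file sample and attach a verdict to it."""
    digest = hashlib.sha256(content).hexdigest()
    family = KNOWN_BAD.get(digest)
    if family is not None:
        return {"sha256": digest, "verdict": "malicious", "family": family}
    # Unknown to the reputation source: queue the sample for the sandbox.
    sandbox_queue.append(digest)
    return {"sha256": digest, "verdict": "unknown", "family": None}


print(check_file(b"pretend-malware-bytes")["verdict"])
print(check_file(b"some-other-file")["verdict"])
```

A "malicious" verdict with a known family is what would kick off the search for related files on other devices.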

All this can be done before an analyst ever starts processing an alert. These simple examples should be enough to illustrate the potential of automated enrichment to give analysts a chunk of what they need to figure out whether an alert is legitimate, and if so then how much risk it poses.

Incident Response

Once an analyst validates an alert and performs an initial damage assessment, the incident would be sent along to the response team. At this point a number of activities can be performed without a responder’s direct involvement or attention to accelerate response. If you consider potential responses to the alerts above, you can see how orchestration and automation can make responders far more efficient and reduce risk.

  • Connection to known bad IP: Let’s say an analyst determines that a device connected to a known bad IP, because it was compromised and added to a botnet. What would a responder then want to do?
    1. Isolate the device: First the device should be isolated from the network and put on a quarantine network with full packet capture to enable deeper monitoring, and to prevent further data exfiltration.
    2. Forensic images: The responder will need to take an image of the device for further analysis and to maintain chain of custody.
    3. Load forensics tools on the imaged device: The standard set of forensic tools is then loaded up, and the images connected for both disk and memory forensics.

All these functions can happen automatically once an alert is validated and escalated to the response team. The responder starts with images from the compromised device, forensics tools ready to go, and a case file with all available information about the attack and potential adversary at their fingertips when they begin response.
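The three steps above can be sketched as an ordered runbook. This is only an illustration under stated assumptions: the step functions are hypothetical stubs that return a description of what they would do, where real ones would call your network, imaging, and forensics tooling.

```python
from typing import Callable, Dict, List

CaseFile = Dict[str, object]
Step = Callable[[CaseFile], str]


def isolate_device(case: CaseFile) -> str:
    # Hypothetical: would move the device to a quarantine network
    # with full packet capture.
    return f"moved {case['device']} to quarantine network"


def take_forensic_image(case: CaseFile) -> str:
    # Hypothetical: would trigger disk and memory imaging.
    return f"captured disk and memory images of {case['device']}"


def load_forensic_tools(case: CaseFile) -> str:
    # Hypothetical: would attach the images to an analysis environment.
    return f"forensic tools loaded with images of {case['device']}"


BOTNET_RUNBOOK: List[Step] = [isolate_device, take_forensic_image, load_forensic_tools]


def run_runbook(case: CaseFile, steps: List[Step]) -> CaseFile:
    """Execute steps in order, logging each; stop at the first failure."""
    log: List[str] = []
    case["log"] = log
    case["status"] = "ready_for_responder"
    for step in steps:
        try:
            log.append(f"{step.__name__}: {step(case)}")
        except Exception as exc:
            log.append(f"{step.__name__}: FAILED ({exc})")
            case["status"] = "needs_manual_attention"
            break
    return case


case = run_runbook({"device": "LAPTOP-042"}, BOTNET_RUNBOOK)
for entry in case["log"]:
    print(entry)
```

Stopping at the first failed step is a deliberate choice in this sketch: imaging a device that was never isolated is probably something a human should decide on.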

But opportunities to work faster and better don’t end here. If the responder discovers a system file that has been changed on the compromised device, they can further automate their process. They can search the security analytics system to see whether that file or a similar one has been downloaded to any other devices, run the file through a sandbox to observe and then search for its behaviors, and (if they get hits on other potentially compromised devices) incorporate additional devices into the response process, isolating and imaging them automatically.

These same techniques apply to pretty much any kind of alert or case that comes across a responder’s desk. Responding to the registry alert above relies mostly on memory forensics, but the same general processes apply.

Ditto for the large USB file transfer indicating an insider attack. But if you suspect an insider it’s generally more prudent not to isolate the device, to avoid alerting them. So that alert would trigger a different automated runbook, likely involving full packet capture of the device, analysis of file usage over the past 60-90 days, and notifying Human Resources and Legal of a potential malicious insider.
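One way to picture "a different automated runbook" per alert type is a simple lookup table. The alert kinds and step names below are hypothetical labels, and the registry entry is our guess at a reasonable variation; the point is only that the insider runbook deliberately leaves out isolation.

```python
from typing import Dict, List

# Hypothetical step names; each would map to an automated action.
RUNBOOKS: Dict[str, List[str]] = {
    "known_bad_ip": ["isolate_device", "take_forensic_image", "load_forensic_tools"],
    "registry_change": ["isolate_device", "capture_memory_image", "load_forensic_tools"],
    # Insider case: no isolation, so the suspect is not tipped off.
    "large_usb_transfer": [
        "start_full_packet_capture",
        "analyze_file_usage_last_90_days",
        "notify_hr_and_legal",
    ],
}


def select_runbook(alert_kind: str) -> List[str]:
    """Return the steps for a validated alert, or escalate if none is defined."""
    return RUNBOOKS.get(alert_kind, ["escalate_to_responder"])


print(select_runbook("large_usb_transfer"))
```

Falling back to a human for unknown alert types keeps the automation from guessing when no runbook has been written yet.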

What is the common thread across all these scenarios? The ability to accelerate SecOps by planning out activities in the form of runbooks, and then orchestrating and automating execution to the greatest extent possible.


The benefits seem self-evident, but let’s be masters of the obvious and state them anyway. This potential future for security operations enables you to:

  • React Faster and Better: Your analysts have better information because the alerts they receive include information they currently spend time gathering. Your responders work better because they already have potentially compromised devices isolated and imaged, and a wealth of threat intel about what the attack might be, who is behind it, and their likely next move.
  • Operationalize Process: Your best folks just know what to do, but your other folks typically have no idea, so they stumble and meander through each incident; some figure it out alone, and others give up and look for another gig. If your best folks build runbooks which define proper processes for the most common situations, you can minimize performance variation and make everyone more productive.
  • Improve Employee Retention: Employees who work in an environment where they can be successful, with the right tools to achieve their objectives, tend to stay. It’s not about the money for most security folks – they want to do their jobs. If you have systems in place to keep humans doing what they are good at, and your competition (for staff) doesn’t, it becomes increasingly hard for employees to leave. Some will choose to build a similar environment somewhere else – that’s great, and how the industry improves overall. But many realize how hard it is, and what a step backwards it would be to manually do the work you have already automated.

So what are you waiting for? We never like to sell past the close, but we’ll do it anyway. Enriching alerts and incident response are only the tip of the iceberg relative to the SecOps processes which can be accelerated and improved with a dose of orchestration and automation. We will wrap up with our next post, detailing a few more use cases which provide overwhelming evidence of our need to embrace the future.
