The key to dealing with advanced attackers is not closing off every window of vulnerability. As we have discussed throughout this series, advanced attackers will figure out a way to gain a foothold in your environment. Actually they will find multiple ways into your environment. So if you hope for any semblance of success, your goal cannot be to stop them – instead you need to work on shorteneing the window between compromise and detection. We have called that Reacting Faster and Better for years. 5 years to be exact, but who’s counting?

The general concept is that you want to monitor your environment, gathering key security information that can either identify typical attack patterns as they are happening (yes, a SIEM-like capability), or more likely searching for indicators identified via intelligence activities.

Collecting All the Security Data

We say “all the security data” a bit tongue-in-cheek, but not too much. We have been saying Monitor Everything almost as long as we have been talking about Reacting Faster, because if you fail to collect data you won’t have an opportunity to get it later. Unfortunately most organizations don’t realize their security data collection leaves huge gaps until the high-priced forensics folks let you know they can’t truly isolate the attack, or the perpetrator, or the malware, or much of anything, because you just don’t have the data.

Most folks only need to learn that lesson once.

So the first order of business is to lay down a collection infrastructure to store all your security data. The good news is that you have likely been collecting security data for quite some time, and your existing investment and infrastructure should be directly useful for dealing with advanced attackers. This means existing log management system may be useful after all. But perhaps not – you might have tools that aren’t at all suited to helping you find advanced attackers in your midst. One step at a time – now let’s delve into the data you need to collect.

  1. Network Security Devices: Your firewalls and IPS devices generate huge logs of what’s blocked, what’s not, and which rules are effective. You will receive intelligence that typically involves port/protocol/destination combinations or application identifiers for next-generation firewalls, which can identify potential attack traffic.
  2. Configuration Data: One key area to mine for indicators is the configuration data from your devices. It enables you to look for very specific files and/or configurations that have been identified as indicators of compromise.
  3. Identity: Similarly information about logins, authentication failures, and other identity-related data is useful for matching against attack profiles from third-party threat intelligence providers.
  4. NetFlow: This is another data type commonly used in SIEM environments; it provides information on protocols, sources, and destinations for network traffic as it traverses devices. NetFlow records are similar to firewall logs but far smaller, making them more useful for high-speed networks. Flows can identify lateral movement by attackers, as well as large exfiltration file transfers.
  5. Network Packet Capture: The next frontier for security data collection is actually to capture all network traffic on key segments. Forensics folks have been doing this for years during investigations, but proactive continuous full packet capture – for the inevitable incident responses which haven’t even started yet – is still an early market. For more detail on how full packet capture impacts security operations check out our Network Security Analytics research.
  6. Application/Database Logs: Application and database logs are generally less relevant, unless they come from standard applications or components likely to be specifically targeted by attackers. But you might be able to discover unusual application and/or database transactions – which might represent bulk data removal, injection attempts, or efforts to attack your critical data.
  7. Vulnerability Scans: This is another information source with limited value, detailing which devices are vulnerable to specific attacks. They help eliminate devices from your search criteria to streamline search activities.

Of course this isn’t an exhaustive list, and you are likely already capturing much of this data. That’s a good thing, but capturing and analyzing data within the context of a compliance audit is fundamentally different than trying to detect advanced attacker activity.

We are sticking to the CISO view for this series so we won’t dig into the technical nuances of the collection infrastructure. But they must be built on a strong analytical foundation which provides a threat-centric view of the world rather than one a focused on compliance reporting. More advanced organizations may already have a Security Operations Center (SOC) leveraging a SIEM platform for more security-oriented correlation and forensics to pinpoint and investigate attacks. That’s a start, but you will likely require some kind of Big Data thing, which should be clear after we discuss what we need this detection platform to do.

Attack Patterns FTW

As much as we have talked about the futility of blocking every advanced attack, that doesn’t mean we shouldn’t learn from both the past and the misfortune of others. We spent a time early in this process on sizing up the adversary for some insight into what is likely to be attacked, and perhaps even how. That enables you to look for those attack patterns within your security data – the promise of SIEM technology for years.

The ultimate disconnect with SIEM was the hard truth that you needed to know what you were looking for. Far too many vendors forgot to mention that little requirement when selling you a bill of goods. Perhaps they expected attackers to post their plans on Facebook or something? But once you do the work to model the likely attacks on your key information, and then enumerate those attack patterns in your tool, you can get tremendous value. Just don’t expect it to be fully automated.

The best case is that you receive an alert about a very likely attack because it’s something you were looking for. But the quickest way to get killed is to plan for the best case. So we also need to ensure we are ready for the worst case. That is advanced attackers using attacks you haven’t seen before, in ways you don’t expect. That’s when all of your gathered intelligence comes into play.

Mining for Indicator Gold

We have already listed a number of different threat intelligence feeds, which can be used to search for specific malware files, command and control traffic, DNS request patterns, and a variety of other indicators. Mining for indicators isn’t that much different from early gold prospecting. They were trying to find gold among millions of rocks moving down the stream. Their main tool was a metal strainer – a filter.

The advantage that we have today in security is that we can tune our filters to search through billions of ‘rocks’ at a pretty fast clip. So you can search your security data infrastructure for almost anything you are collecting – or even better, for a series of events and/or files within your environment – quickly and accurately to narrow down your searches to the most likely attacks.

Which brings us to the overhyped use of Big Data Analytics in the process of mining for attack indicators. Not that some of the hype isn’t justified. The use of tools to more effectively index and search huge security data sets is critical to finding advanced attackers quickly. But like SIEM, its predecessor technology, Big Data vendors are especially prone to hyperbole about the short-term value of their security analytics platforms.

Keeping this discussion at a high level, we recently summed up how Big Data will impact security analytics:

We have every confidence that big data holds promise for security intelligence, both because we have witnessed attacker behavior captured in event data just waiting to be pulled out, and because we have also seen miraculous ideas sprout from people just playing around with database queries. In the same way hackers stumble on vulnerabilities while playing with protocols, security analysts stumble on interesting data just by asking the same question (query) different ways. The data holds promise. The mining of most data, and all of the work that will be required in writing M-R (MapReduce) scripts to locate actionable intelligence, is not yet here. It will take years of dedicated work – and it’s will take script development on different data types for different NoSQL varieties.

In other words it still early days for this technology to solve these problems. You are clearly constrained in terms of internal capabilities (you will be looking for a lot of data scientists over the next few years), as well as the lack of maturity of technologies such as Hadoop, MapReduce, Pig, Hive, and a variety of others in the security context. So remain skeptical about promises that a magic box that will ingest scads of security data and pop out advanced attackers.

But companies seriously looking to detect advanced attackers within their environments will be capturing packets to supplement the other data they already collect, and subsequently starting to use Big Data technologies to mine it all. Sounds easy, right? Unfortunately it is thankless work, so make sure you swing by the cubes of your forensics folks to give them a big thank-you. They spend a lot of time chasing down false positives, all for the occasional times they find an active attack. That brings us to the next step in finding advanced attackers: verification of the attack.

Photo credit: “There’s Gold In Them There Hills!” originally uploaded by Podknox