Our first post in Network-based Threat Intelligence delved into the kill chain. We outlined the process attackers go through to compromise a device and steal its data. Attackers are very good at their jobs, so it’s best to assume any endpoint is compromised. But with recent advances in obscuring attacks (through tactics such as VM awareness) and the sad fact that many compromised devices lie in wait for instructions from their C&C network, you need to start thinking a bit differently about finding these compromised devices – even if they don’t act compromised.
Network-based threat intelligence is all about using information gleaned from network traffic to determine which devices are compromised. We call that following the Trail of Bits, to reflect the difficulty of undertaking modern malware activities (flexible and dynamic malware, various command and control infrastructures, automated beaconing, etc.) without leveraging the network. Attackers try to hide in plain site and obscure their communications within the tens of billions of legitimate packets traversing enterprise networks. But they always leave a trail or evidence of the attack, if you know what to look for.
It turns out we learned most of what we need in kindergarten. It’s about asking the right questions. The five key questions are Who?, What?, Where?, When?, and How?, and they can help us determine whether a device may be compromised. So let’s dig into our questions and see how this would work.
The first key set of indicators to look for are based on where devices are sending requests. This important because modern command and control requires frequent communication with each compromised device. So the malware downloader must first establish contact with the C&C network; then it can get new malware or other instructions.
The old reliable network indicator is reputation. First established in the battle against spam, we tag each IP address as either ‘good’ or ‘bad’. Yes, this looks an awful lot like the traditional black list/negative security approach of blocking bad. History has shown the difficulty of keeping a black list current, accurate, and comprehensive over time. Combined with advances by attackers, we are left with blind spots in reputation’s ability to identify questionable traffic.
One of these blind spots results from attackers using legitimate sites as C&C nodes or for other nefarious uses. In this scenario a binary reputation (good or bad) is inadequate – the site itself is legitimate but not behaving correctly. For instance, if an integrated ad network or other third party web site is compromised, a simplistic reputation system could flag the entire site as malicious. A recent example of that was the Netseer hack, where browser-based web filters flagged traffic to legitimate sites as malicious due to integration with a compromised ad network. They threw the proverbial baby out with the bathwater.
Another issue with IP reputation is the fact that IP addresses change constantly based on what command and control nodes are operational at any given time. Much of the sophistication in today’s C&C infrastructure has to do with how attackers associate domains with IP addresses on a dynamic basis. With the increasing use of domain generating algorithms (DGA), malware doesn’t need to be hard-coded with specific IP addresses – instead it cycles through a set of domains (based on the DGA) searching for a C&C controller. This provides tremendous flexibility, enabling attackers to protect the ability of newly compromised devices to establish contact, despite domain takedowns and C&C interruptions.
This makes the case for DNS traffic analysis in the identification of C&C traffic, along with monitoring packet stream. Ultimately domain requests (to find active C&C nodes) will be translated into IP addresses, which requires a DNS request. By monitoring these DNS requests for massive amounts of traffic (as you would see in a very large enterprise or a carrier network), patterns associated with C&C traffic and domain generation algorithms can be identified.
If we look to the basics of network anomaly detection, by tracking and trending all ingress and egress traffic; flow patterns can be used to map network topology, track egress points, etc. By identifying a baseline of normal communication patterns we can pinpoint new destinations, communications outside ‘normal’ activity, and perhaps spikes in traffic volume. For example, if you see traffic originating from the marketing group during off hours, without a known reason (such as a big product launch or ad campaign), that might warrant investigation.
The next question involves what kind of requests and/or files are coming in and going out. We have written a paper on Network-based Malware Detection, so we won’t revisit it here. But we need to point out that by analyzing and profiling how each piece of malware uses the network, you can monitor for those traffic patterns on your own network.
In addition, this enables you to work around VM-aware malware. The malware escapes detection as it enters the network, because it doesn’t do anything when it detects it’s running in a sandbox VM. But on an bare-metal device it executes the malicious code to compromise the device. To take the analysis to the next level, you can track the destination of the suspicious file, and then monitor specifically for evidence that the malware has executed and done damage. Again, it’s not always possible to block the malware on the way in, but you can shorten the window between compromise and detection by searching for identifying communication pattern that indicate a successful attack.
You can also look for types of connection requests which might indicate command and control, or other malicious traffic. This could include looking strange or unusual protocols, untrusted SSL, spoofed headers, etc. You can also try to identify requests from automated actors, which have predictable patterns even when randomized to simulate a human being.
But this means all egress and ingress traffic is in play; it all needs to be monitored and analyzed in order to isolate patterns and answer the where, when, what, and how questions. Of course that still leaves an key final question.
Now we get to the issue of analyzing the specific device for atypical behavior. Clearly the device in question should behave in a certain way, depending on what it should be doing, so behavioral anomalies may indicate compromise. We will defer discussion of device profiling and monitoring network traffic to verify proper behavior for our next post.
Turning Data into Dynamic Intelligence
Asking the key questions can be done within any organization. But any single organization will only see small subset of attacks under way. So without factoring in external information – the intelligence part of network-based threat intelligence – gleaned by leveraging some type of information sharing network, you can only look for stuff already happening to you. That defeats the purpose of Early Warning.
So to make network-based threat intelligence work you need access to a significant amount of traffic, in order to find and recognize useful patterns. Failing that you need to partner with a provider with this kind of data, and the ability to deliver it to you in a useful fashion.
Network-based threat intelligence must be dynamic because the targets, algorithms, domains, and pretty much everything else, are constantly changing. Attackers change their approaches constantly to improve efficiency, obscurity, and resilience. So you need to continually adapt the patterns you search for.
One other essential point is defining a compromise. A single indicator does not necessarily mean a device is compromised. And we all know the hazards of false positives in security practice. So the success or failure of any security control will hinge on your ability to make efficient and accurate determinations of compromise. The next post will talk about deploying sensors, making determinations, and ultimately isolating compromised devices on your network to achieve the desired Quick Win.