As we discussed in the last post, detecting today’s advanced malware requires more than just looking at the file (the classic AV technique) – we now also need to leverage behavioral indicators. To make things more interesting, even suspicious behavior can be legitimate in certain circumstances. So for accurate and effective detection you need better context on what the code does, where it came from, and who it came from, in order to reach a reasonable verdict on whether to allow or block execution.

What happens when you don’t have that context? Let’s jump into the time machine and harken back to the early days of host intrusion prevention (HIPS) and HIPS-like products. They ran on devices and scanned for both attack signatures and behaviors that indicated malware. Without proper context, these controls blocked all sorts of things – generating scads of false positives – and generally wreaked havoc on operations. That didn’t work out very well for organizations which actually needed their devices up and running, even if that imposed a cost in terms of security. Go figure.

But the concept of watching for attacks on devices is solid. It was more of an implementation problem; nowadays additional context reduces false positives, increases accuracy, and limits disruption of operations – all worthy goals for a control to manage new attack vectors. So let’s dig into a few data sources (beyond behavioral indicators) that can help identify bad stuff.

From Where: the Dropper

In the last post we mentioned that malware writers use droppers to gain a presence on devices, and then download current and/or additional attacks, instead of attempting to get the entire malware onto the device as part of the initial compromise. Of course droppers are malware just as much as anything else, but they morph more frequently, which makes initial detection difficult. And as we described in Malware Analysis Quant, the only thing worse than being infected is getting re-infected by the same malware.

So profiling malware droppers enables you to search for these files in your environment. By tracing the path of those droppers you can identify devices which have been compromised but not yet activated. The key to this effort is analysis of data about which files are on which devices; when a file is discovered to be bad, if you have the data and analytics in place it becomes easy to determine which devices have the bad file installed.
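The mechanics of that lookup are straightforward once the inventory data exists. Here is a minimal sketch, assuming you already collect per-device file hashes (say, from an endpoint agent); the device names, hashes, and data layout are all hypothetical, purely for illustration:

```python
# Hypothetical file inventory: device -> set of SHA-256 hashes observed on it.
# In practice this would come from an endpoint agent or asset database.
inventory = {
    "laptop-017":  {"aaa111", "bbb222"},
    "desktop-042": {"bbb222", "ccc333"},
    "server-003":  {"ddd444"},
}

# Hashes of profiled droppers (fed by malware analysis / threat intel).
known_bad = {"bbb222"}

def find_compromised(inventory, known_bad):
    """Return devices holding any known-bad file hash."""
    return sorted(dev for dev, hashes in inventory.items()
                  if hashes & known_bad)

print(find_compromised(inventory, known_bad))
# ['desktop-042', 'laptop-017']
```

The point is that the hard part isn’t the query – it’s having the file-to-device data collected and indexed before you learn a file is bad.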

Of course this is still a reactive effort. But the presence of a dropper (or similar known bad file), combined with any other bad behavior, is fairly damning evidence of a compromised device. Tracing the droppers back far enough points you to the origination point of the malware; eliminate any vestiges, and you can prevent reinfection.

Who Dat: Reputation

The other useful source for detecting advanced malware is the reputation of a file, sender, or IP address. Initially developed to improve the effectiveness of anti-spam gear, reputation has emerged as a fundamental aspect of every vendor’s threat intelligence offering. The larger security vendors have access to considerable amounts of data from hundreds of millions of installed endpoints and network devices; they mine their datasets to determine which files, devices, and network addresses tend to do bad things.

This is all an inexact science – especially in light of the simplicity of morphing a file, spoofing an IP address, or fiddling with a device fingerprint. You need to expect advanced adversaries to look like something innocent, even when they aren’t. You cannot afford to rest your malware-or-clean verdict strictly on reputation – but you can use it as a supporting data source, for additional context when analyzing a possible attack.
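One way to treat reputation as supporting context rather than a sole verdict is to weight it below behavioral evidence when scoring. The weights and thresholds below are illustrative assumptions, not any vendor’s actual values:

```python
def verdict(behavior_score, reputation_score):
    """Both scores in [0, 1]; higher means more suspicious.
    Behavior dominates; reputation only tips borderline cases."""
    combined = 0.7 * behavior_score + 0.3 * reputation_score
    if combined >= 0.8:
        return "block"
    if combined >= 0.5:
        return "quarantine-and-analyze"
    return "allow"

# Clearly bad behavior plus bad reputation: block.
print(verdict(0.9, 0.9))    # block

# Bad reputation alone (easily spoofed) isn't enough to block.
print(verdict(0.2, 0.95))   # allow
```

The design choice worth noting: because reputation is cheap for an adversary to manipulate, no reputation score on its own can cross the blocking threshold – it can only reinforce what behavioral analysis already suggests.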

Of course malware writers don’t make it easy to figure out what they are doing. Your best bet is to assemble as much data as you can, analyze what’s going on within the device (behavioral analysis), and combine that with data from outside sources to judge the nature and intent of code running (or attempting to run) on your devices – this at least gives you a fighting chance. So far we have focused on analysis and detection, but detection doesn’t help without a mechanism to actually block attacks once they are detected. So we will wrap up this series next week, with an assessment of the different classes of security controls that can leverage this context data to block specific attacks.