As we wrap up our Evolving Endpoint Malware Detection series, it’s time to take it to the next level. We spent the first three posts on why detection is challenging, the types of behavioral indicators you should look for, and some additional data sources for added context to improve effectiveness and reduce false positives. Now we need to do something with the information we have gathered – basically to provide a verdict on whether something is malware or not, and if it is to block it. Alas, this is where you need to understand the trade-offs between different controls and decide what is best for your environment.
The Malware Detection ‘Cocktail’
Let’s jump back in the time machine, to the good old days on the cutting edge of spam detection. Spammers got pretty good and evolved their techniques to evade every new defense the email security folks came up with. 3-4 years in, around 2004-2005, the vendors used 15-20 different tactics to determine whether any particular email message was unsolicited. Sound familiar? Malware detection has reached a similar point. Lots of techniques, none foolproof, and severe consequences for false positives.
What can we learn from how the anti-spam vendors evolved? Aside from the fact that over time the effectiveness you can achieve and maintain is limited? The best approach for dealing with a number of different detection techniques is to use a cocktail approach. This involves scoring each technique (possibly quite coarsely), feeding it into an algorithm with appropriate weighting for each technique, and then determining a threshold that indicates something bad. Obviously the secret sauce is in the algorithm, and it’s the vendor’s responsibility to handle it.
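To make the cocktail idea concrete, here is a minimal sketch of weighted scoring against a threshold. The technique names, weights, and threshold are hypothetical values chosen for illustration; a real product tunes these behind the curtain and almost certainly uses something more sophisticated than a weighted sum.

```python
from typing import Mapping

# Hypothetical techniques and weights, for illustration only.
# A higher weight means more trust in that technique's score.
WEIGHTS = {
    "signature_match": 0.5,
    "bad_reputation": 0.3,
    "suspicious_behavior": 0.2,
}

# Assumed cut-off; in practice this is tuned against false positive rates.
THRESHOLD = 0.6


def cocktail_score(scores: Mapping[str, float]) -> float:
    """Return the weighted sum of per-technique scores.

    Each score is clamped to the range [0, 1]; techniques that
    reported nothing count as 0.
    """
    total = 0.0
    for name, weight in WEIGHTS.items():
        raw = scores.get(name, 0.0)
        total += weight * min(max(raw, 0.0), 1.0)
    return total


def verdict(scores: Mapping[str, float]) -> str:
    """Map the combined score to a coarse verdict."""
    return "malicious" if cocktail_score(scores) >= THRESHOLD else "clean"
```

With these assumed numbers no single technique can convict a file on its own: a signature match alone scores 0.5 and stays under the threshold, while a signature match plus suspicious behavior scores 0.7 and crosses it.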
Yes, a lot of this happens (and should remain) behind the curtain, but we are trying to explain how the process works so you can be an educated shopper for new devices and products that claim to detect advanced malware.
But we have also learned from the anti-spam folks that you cannot be right every time. So we need to plug our research on incident response and forensics, including Incident Response Fundamentals, React Faster and Better, and Network Security Analysis, to ensure you are prepared for the inevitable failures of even the best malware detection.
Let’s take a look at the components and controls you will rely on:
Traditional Endpoint Protection
Thanks to your friendly compliance mandate and check-box-centric auditors, you still need endpoint protection – often called anti-virus. But most endpoint security suites encompass much more than traditional anti-virus signatures, including some of the tactics we have discussed in this series. Obviously with 15-20 players remaining in this market, the quality of detection is all over the map and quite dynamic. Each vendor goes through ups and downs in detection effectiveness.
So how do we recommend choosing an endpoint suite? That could be an entire series itself, but suffice it to say that the effectiveness of detection probably shouldn’t be the most important selection criterion. It is too hard to verify, and the vendors each do a decent job of finding known malware and a mediocre job of finding the advanced attacks we have focused this series on. You need endpoint protection for compliance, so you should minimize price, ensure that agents can be effectively managed (especially if you have thousands of endpoints), and make sure that the agents are as thin as possible. It’s bad enough having to use a control that doesn’t work as well as it needs to, but crushing device performance adds insult to injury. By all means, check the latest comparative effectiveness rankings, but understand they go out of date pretty quickly.
Network-based Malware Detection
We believe that the earlier you can detect malware and block it, the less mess you will inevitably have to clean up. That means working to eliminate attacks at the perimeter or even in the cloud, before an attack ever gets near your desktop. How can you do this? A new type of network security device scrutinizes ingress traffic to detect malware files before they enter your corporate network. We expect this capability to become a feature of pretty much every perimeter device over time, but for now you will need to deal with specialist companies and separate devices. We published some research on this earlier in 2012, so check out Network-based Malware Detection for details on the approaches, limitations, and roles of these devices in your network security strategy.
Advanced Endpoint Controls
We all understand that traditional endpoint security suites leave too much attack surface exposed to advanced attackers. Depending on your pain threshold (how likely you are to be targeted by an advanced attacker), an additional level of endpoint protection may be necessary. So let’s discuss some of these alternatives – which detect and block based on behavioral indicators, track file trajectories and proliferation, and/or allow only authorized executables.
The first category of advanced endpoint control is really next-generation host intrusion prevention (HIPS) technology. As we have mentioned, HIPS looks for funky behavior within the endpoint, but has lacked sufficient context to be truly effective. A few technologies have emerged to address these concerns, leveraging the kind of malware detection cocktail discussed above. Analyzing what’s happening on the endpoint, and applying proper context based on the application and the specific behavior, can reduce false positives and improve effectiveness. These tools impact user experience by blocking things (which is usually a good thing), but need to be put through proper diligence before broad deployment. But you do that with all new technologies anyway, right?
As we talked about in Providing Context, malware proliferation analytics can be very useful for tracking the spread of malware within your environment, securing the origin point, and reducing the possibility of constant reinfection. So we are fans of this kind of analysis as another layer of defense. You have two main options for gathering the information for this kind of analysis: either on the endpoint or within the network. Endpoint solutions provide a thin agent which sends information up to a cloud-based repository. Obviously this involves another agent on the desktop and another interface to manage, but it can leverage outbreak data from many other organizations, to yield interesting information.
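As a rough sketch of what proliferation analytics computes, the following groups file sightings by hash and reports the earliest host (the likely origin point) along with the order of spread. The `(timestamp, host, file_hash)` record layout and the function name are assumptions made for this example, not any vendor's data model.

```python
def origin_and_spread(sightings):
    """Summarize where each file was first seen and how it spread.

    sightings: iterable of (timestamp, host, file_hash) tuples, as
    might be reported by endpoint agents (hypothetical record layout).
    Returns {file_hash: {"origin": host, "hosts": [hosts in order seen]}}.
    """
    summary = {}
    # Sorting the tuples orders them by timestamp first.
    for _timestamp, host, file_hash in sorted(sightings):
        entry = summary.setdefault(file_hash, {"origin": host, "hosts": []})
        if host not in entry["hosts"]:
            entry["hosts"].append(host)
    return summary


# Example with made-up hosts and hashes:
sightings = [
    (30, "pc-c", "hash-1"),
    (10, "pc-a", "hash-1"),
    (20, "pc-b", "hash-1"),
    (15, "pc-b", "hash-2"),
]
report = origin_and_spread(sightings)
```

Here `report["hash-1"]["origin"]` is `"pc-a"`, the device to secure first if you want to stop reinfection.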
You can also look for known C&C traffic on your network by monitoring egress traffic. This is one step removed – the device is already compromised – but can produce an accurate assessment of which devices need to be cleaned up immediately. Looking at the network does not provide definitive identification of the malware origin point, which limits its utility for reducing reinfection.
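The egress check can be sketched as a simple match of outbound destinations against a list of known C&C addresses. The network list below is a stand-in for a real threat intelligence feed and uses documentation-only address ranges; the flow record format is likewise an assumption.

```python
import ipaddress

# Stand-in for a threat intelligence feed. These are reserved
# documentation ranges (RFC 5737), not real C&C infrastructure.
KNOWN_CC_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.7/32"),
]


def compromised_hosts(flows):
    """Return internal hosts that contacted a known C&C address.

    flows: iterable of (source_host, destination_ip) pairs taken
    from egress logs (hypothetical format).
    """
    hits = set()
    for source_host, destination_ip in flows:
        address = ipaddress.ip_address(destination_ip)
        if any(address in network for network in KNOWN_CC_NETWORKS):
            hits.add(source_host)
    return sorted(hits)


flows = [
    ("pc-1", "203.0.113.9"),
    ("pc-2", "192.0.2.1"),
    ("pc-3", "198.51.100.7"),
    ("pc-1", "198.51.100.7"),
]
cleanup_list = compromised_hosts(flows)
```

The result tells you which devices to clean up, but nothing in the flow data says how they were infected in the first place.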
Finally, you still have a draconian option: application whitelisting (AWL). This takes a “default deny” approach on endpoints: a defined set of authorized executables is allowed to run, and everything else is blocked. This is draconian because it dramatically impacts the user experience – generally not in a good way. This technology offers grace periods to allow execution of programs until an administrator approves or rejects the request, but this compromise violates the security model. Some vendors perform memory analysis and are introducing other behavioral approaches to make the grace period less risky, but any grace period introduces significant risk. So we see AWL as more appropriate to fixed function devices such as kiosks, call centers, control systems, etc., where general purpose software shouldn’t be running.
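A strict default deny check, with no grace period, might look like the sketch below. Identifying executables by SHA-256 hash is an assumption for the example (products may also use publisher certificates or paths), and the class and method names are hypothetical.

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Return the hex SHA-256 digest used to identify an executable."""
    return hashlib.sha256(data).hexdigest()


class Allowlist:
    """Default deny: only pre-approved executables may run."""

    def __init__(self, approved_hashes):
        self._approved = set(approved_hashes)
        # Denied hashes queued for an administrator to review.
        self.pending = set()

    def may_execute(self, binary: bytes) -> bool:
        digest = sha256_of(binary)
        if digest in self._approved:
            return True
        # No grace period: block now, let an administrator decide later.
        self.pending.add(digest)
        return False


policy = Allowlist([sha256_of(b"approved-tool")])
allowed = policy.may_execute(b"approved-tool")   # approved, so it runs
blocked = policy.may_execute(b"unknown-binary")  # denied and queued
```

A grace period would amount to returning `True` for pending items until a decision arrives, which is exactly the window an attacker needs.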
Of course most of these advanced tactics will eventually be subsumed into existing controls, either via acquisition or internal development. That’s just how security markets (and most other technology markets) work. What’s advanced today will be standard tomorrow. That said, the process can take 2-3 years, and most organizations cannot afford to wait, so you can evaluate many of these technologies to fill the gap.
Of course not all these controls run exclusively on endpoints. Despite the title of this series, you need to use all the controls at your disposal, and some work better at other places within your IT infrastructure. It is important to factor these controls into the trade-offs you make when designing malware defenses.
Detection before, during, and after Attacks
When you are trying to detect advanced malware, you need to include time in your planning. Different approaches are more or less effective depending on when you use them. For instance, the reputation of a sender or file is most valuable early. If you get a hit with reputation-based intelligence, you can skip other more demanding analyses. Likewise, malware file signature checks are quick and should be performed early.
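One way to picture this ordering is a staged pipeline that runs the cheapest checks first and stops at the first hit. The stage names, lookup sets, and sample fields below are all hypothetical; in particular, the behavior stage is reduced to a lookup here, whereas real behavioral analysis requires the sample to be running.

```python
# Hypothetical intelligence data, for illustration only.
BAD_SENDERS = {"mailer.example.net"}
BAD_HASHES = {"deadbeef"}


def reputation_hit(sample):
    return sample.get("sender") in BAD_SENDERS


def signature_hit(sample):
    return sample.get("sha256") in BAD_HASHES


def behavior_hit(sample):
    # Placeholder for the demanding analysis that runs last.
    return "modifies_autorun" in sample.get("behaviors", ())


# Ordered from cheapest to most demanding.
CHECKS = [
    ("reputation", reputation_hit),
    ("signature", signature_hit),
    ("behavior", behavior_hit),
]


def staged_verdict(sample, checks=CHECKS):
    """Run checks in order and stop at the first one that fires.

    Returns (verdict, names of the stages that actually ran).
    """
    ran = []
    for name, check in checks:
        ran.append(name)
        if check(sample):
            return "malicious", ran
    return "no_hit", ran
```

A sample from a known-bad sender is convicted after the reputation stage alone, so the more demanding stages never run for it.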
But it is getting much harder to detect attacks before the malware executes. Many behavioral indicators are only available when the malware is running, and others appear after the malware has activated. We have a quick chart to show what we mean.
Device/Location Variance
Another aspect to consider when designing control sets is the amount of control (or lack thereof) you have over the device, and what kind of device it is. There is substantial variation in what you can do between mobile devices and PCs. Obviously you have the most control over corporate PCs, where you can perform a credentialed scan and install a device agent to check file signatures and reputation, and check for behavioral indicators. On devices you don’t control, such as those belonging to contractors and customers and perhaps employees, you might be able to scan at connection to the network or install a browser plug-in to protect a specific web application or set of domains. But you need to tread carefully – privacy is often a major concern on devices you don’t control. If installing an agent or plug-in is a non-starter, then you need after-attack network monitoring as a fallback, hoping to catch the attacker misbehaving before they loot your shop.
Smartphones are a bit different. You may be able to install an agent, but functionality varies widely between the various mobile operating systems. Don’t expect much ability to check behavioral indicators on a smartphone, as agents rarely have real-time access to the mobile kernel. Android provides more access than iOS, so anti-malware agents are available on Android. But understand that any mobile agent is quite limited compared to a PC-based agent. Unless the mobile device is jailbroken or rooted, which is a whole different discussion.
Remember that the current location of a device affects your ability to protect it. When the device is connected to your network (whether physically or via VPN) it gains ingress and egress protection from your perimeter security controls. You can block attacks at the perimeter via network-based malware detection or email security devices. You can also perform egress filtering to check for C&C traffic or data exfiltration, which indicate an attack.
When the device is not connected to your network those network-based controls are unavailable. So you need some type of agentry for a fighting chance – and also to scrutinize the device when it connects to your network, just to make sure nothing bad happened to the device while it was out in the wild.
Compromises
The controls at your disposal range from monitoring to locking down devices. Detecting advanced malware requires all of them, but you need to be conscious of your disruptive impact on end users. Find a balance that is sufficiently secure but not too disruptive, navigating the constraints of device ownership and control, and workable across device locations and network connectivity scenarios. There is no simple right answer – just opportunity to manage expectations and ensure that decision makers understand the compromises they make.
And with that we wrap up the Evolving Endpoint Malware Detection series. As usual we will assemble these posts into a paper over the next week or so, and we appreciate any feedback you have. We will factor that feedback into the final paper.