Network-based Malware Detection: The Impact of the Cloud
Is it that time already? Yep, it’s time to wrap up our series on Network-based Malware Detection. We started with the need to block malware more effectively on the perimeter, particularly because you know you have users who are not the sharpest tools in the shed. Then we discussed the different techniques involved in detecting malware. Finally we tackled location, assessing critically whether the traditional endpoint protection model has outlived its usefulness. So far we have made the case for considering gateway-based malware detection as one of the next key capabilities needed on your perimeter. Now it’s about wading through the hyperbole and evaluating the strengths and weaknesses of each approach.

AV on the Box

To provide a full view of all the alternatives we need to start with the status quo, which is a traditional AV engine (typically OEMed from an endpoint AV vendor) on your gateway. Yes, this is basically what lower-end UTM devices do. This approach focuses on detecting malware within the content stream (think email/web filtering), and (just like traditional AV approaches) it isn’t very effective for detecting modern malware. AV doesn’t work very well on your endpoint, and alas it’s not much better on perimeter gateways.

Sandboxing on the Box

The latest iteration, beyond running a traditional AV engine on the box, involves executing malware in a protected sandbox on the perimeter device and observing what it does. Depending on the behavior of the file – whether it does bad things – it can be blocked in real time. Virtualizing victim devices on perimeter platforms to test malware at network speeds is a substantial advance. And we have seen these devices provide a measurable improvement in the ability to block malware at the gateway.

But of course this entails trade-offs. First of all, do you really want to be executing malware within your production network? Of course it is supposed to be an isolated environment, but it’s still a risk – even if a small one.
The second trade-off is performance. You are limited to the performance of the perimeter device. Only so many virtual victims can be spun up on a given network device at a time, so at some point you will hit a scalability wall. You can throw bigger boxes at the problem but local analysis is inherently limiting.

And remember that these are new and additional dedicated devices. For some organizations that isn’t a problem – they simply get a new box to solve a new problem. Others are more resistant to spending rack space on the perimeter on one more niche device.

Finally, this model provides no leverage. This approach requires you to execute every suspicious file locally, even if the same malware has been sent to every company in the world. And because detecting malware is an inexact science, you will probably miss the first time something comes in, and suffer the consequences. You need a feedback loop so that what you learn during incident response and malware analysis (as described in the Malware Analysis Quant research) makes it back onto the device. Shame on you if you do all the work to analyze the malware, but don’t make sure it cannot strike again.

So to net this out, doing more sophisticated malware detection on the perimeter gateway represents a major advance, and has helped to detect a lot of the lower-hanging fruit missed by traditional AV. It is at a disadvantage against truly targeted, unique malware, but then again nothing aside from unplugging from the Internet can really solve that problem.

Leveraging the Cloud for Malware Detection

We often point out there is rarely anything really new – just recycled ideas packaged a bit differently. We see this again with network-based malware detection, as we did for endpoint AV.
When it became impractical to continue pushing a billion malware signatures to each protected endpoint, AV vendors started leveraging the cloud to track the reputation of individual files, determine whether they are bad, and then tell endpoints to block them. The vendor’s AV cloud would analyze unknown files and make a determination of goodness or badness depending on what the file does. Of course that analysis isn’t real-time, so the first couple iterations of each new attack end poorly for the victims. But over time the malware is profiled, and then blocked when it shows up again.

This concept also applies to detecting malware on the perimeter security gateway. A list of bad files can be cached on the devices, and new unrecognized files can be uploaded to the cloud service for analysis and an approve/block verdict. This addresses a number of the issues inherent to local analysis, as described above. You send the malware off to someone else’s cloud service rather than executing it locally. You have no local performance limitations (assuming the network itself is reasonably fast) because the analysis isn’t on your hardware, and this capability adds little overhead to perimeter security gateways, which are likely already overburdened dealing with all these new application-aware policies.

And you can take full advantage of the leverage the vendor’s cloud service provides. If organization A sees a new malware file and the cloud service learns it’s bad, all subscribers to the cloud service can automatically block that malware and any recognizable cousins. So the larger the network, the less likely you are to see (and be infected by) the first specimen of any particular malware file – instead you can learn from other people’s misfortune and block the malware.

So what’s the catch? It’s about the same as with the latest generation of endpoint AV: the latency between when you first see an attack and when the specific malware files are known to be bad.
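To make the lookup flow and the resulting exposure window a bit more concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `CloudService` and `Gateway` classes, the SHA-256 file fingerprint, and the policy of letting unknown files through while analysis is pending are all hypothetical, not any vendor’s actual design.

```python
import hashlib
from dataclasses import dataclass, field
from enum import Enum


class Verdict(Enum):
    ALLOW = "allow"
    BLOCK = "block"
    UNKNOWN = "unknown"


@dataclass
class CloudService:
    """Hypothetical stand-in for a vendor analysis cloud shared by all subscribers."""
    known_bad: set = field(default_factory=set)
    pending: set = field(default_factory=set)

    def submit(self, digest: str) -> Verdict:
        # Analysis is not real-time: an unseen file is queued and reported as unknown.
        if digest in self.known_bad:
            return Verdict.BLOCK
        self.pending.add(digest)
        return Verdict.UNKNOWN

    def finish_analysis(self, digest: str, malicious: bool) -> None:
        # Called whenever the (slow) analysis of a queued file completes.
        self.pending.discard(digest)
        if malicious:
            self.known_bad.add(digest)


@dataclass
class Gateway:
    """Hypothetical perimeter device holding a local cache of known-bad files."""
    cloud: CloudService
    bad_cache: set = field(default_factory=set)

    def sync(self) -> None:
        # Pull the latest known-bad list from the cloud into the local cache.
        self.bad_cache |= self.cloud.known_bad

    def inspect(self, payload: bytes) -> Verdict:
        digest = hashlib.sha256(payload).hexdigest()
        if digest in self.bad_cache:
            return Verdict.BLOCK
        if self.cloud.submit(digest) is Verdict.BLOCK:
            self.bad_cache.add(digest)
            return Verdict.BLOCK
        # Assumed policy: unknown files pass. This is the window of exposure.
        return Verdict.ALLOW


cloud = CloudService()
org_a, org_b = Gateway(cloud), Gateway(cloud)
sample = b"pretend this is a malicious attachment"

first = org_a.inspect(sample)  # organization A sees it first: ALLOW
cloud.finish_analysis(hashlib.sha256(sample).hexdigest(), malicious=True)
org_b.sync()
second = org_b.inspect(sample)  # organization B benefits from A's misfortune: BLOCK
print(first, second)
```

In this sketch organization A is the unlucky first victim, while organization B blocks the same file once the cloud has finished its analysis. A real gateway could just as well hold unknown files until a verdict arrives, trading delivery delay for a smaller window.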
That latency could be days at this point, but as the technology improves (and it will) the window will shrink to hours. But there will always be a window of exposure, since you aren’t actually analyzing the malware at the perimeter. And detection will never be perfect – malware writers already make it