Endpoint Advanced Protection Buyer’s Guide: Key Technologies for Detection and Response
Now let’s dig into some key EDR technologies which appear across all the use cases: detection, response, and hunting.

Agent

The agent is deployed to each monitored endpoint, so you need to be sensitive to its size and its performance hit on devices. A main complaint about older endpoint protection was its performance impact on devices. The smaller the better, and the less performance impact the better (duh!), but just as important is agent deployability and maintainability.

Full capture versus metadata: There are strong, differing opinions on how much telemetry to capture and store from each device. Similar to the question of whether to do full network packet capture or to derive metadata from the packet stream, full endpoint capture offers a level of granularity which isn’t available via metadata, allowing session reconstruction and more detail about what an adversary actually did. But full capture is very resource and storage intensive, so depending on the sophistication of your response team and process, metadata may be your best option. Also consider products that can gather more device telemetry when triggered, perhaps by an alert or connection to a suspicious network.

Offline collection: Mobile endpoints are not always on the network, so agents must be able to continue collecting event data when disconnected. Once back on the network, cached endpoint telemetry should be uploaded to the central repository, which can then perform aggregate analysis.

Multi-platform support: It’s a multi-platform world, and your endpoint security strategy needs to factor in not just Windows devices, but also Macs and Linux. Even if these platforms aren’t targeted they could be used in sophisticated operations as repositories, staging grounds, and file stores. Different operating systems offer different levels of telemetry access. Security vendors have less access to the kernel on both Mac and Linux systems than on Windows.
Also dig into how vendors leverage built-in operating system services to provide sufficiently granular data for analysis. Finally, mobile devices access and store critical enterprise data, although their vulnerability is still subject to debate. We do not consider mobile devices as part of these selection criteria, although for many organizations an integrated capability is an advantage.

Kernel vs. user space: There is a semi-religious battle over whether a detection agent needs to live at the kernel level (with all the potential device instability risks that entails), or whether accurate detection can take place exclusively in user space. Any agent must be able to detect attacks at lower levels of the operating system, such as rootkits, as well as any attempts at agent tampering (again, likely outside user space). We don’t get religious, and we appreciate that user-space agents are less disruptive, but we are not willing to compromise on detecting all attacks.

Tamper proof: Speaking of tampering, to address another long-standing issue with traditional EPP, you’ll want to dig into the product security of any agent you install on any endpoint in your environment. We can still remember the awesome Black Hat talks where EPP agent after EPP agent was shown to be more vulnerable than some enterprise applications. Let’s learn from those mistakes and dig into the security and resilience of the detection agents to make sure you aren’t just adding attack surface.

Scalability: Finally, scale is a key consideration for any enterprise. You might have 1,000 or 100,000 devices, or even more. Regardless, you need to ensure the tool will work for the number of endpoints you need to support and for the staff on your team, in terms of both numbers and sophistication. Of course you need to handle deployment and management of agents, but don’t forget the scalability and responsiveness of analysis and searching.
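To make the offline collection requirement above more concrete, here is a minimal store-and-forward sketch. It is an illustration under stated assumptions, not how any particular EDR agent works: the class name OfflineEventBuffer, the choice of SQLite as the local cache, and the upload callback are all hypothetical. The idea is simply that events are always written locally first, and are removed from the cache only after the central repository has accepted them.

```python
import json
import sqlite3
import time


class OfflineEventBuffer:
    """Hypothetical store-and-forward cache for endpoint events."""

    def __init__(self, path=":memory:"):
        # Assumption: a local SQLite file (in-memory here) serves as the cache.
        self.db = sqlite3.connect(path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS events ("
            "id INTEGER PRIMARY KEY AUTOINCREMENT, ts REAL, body TEXT)"
        )

    def record(self, event):
        """Write the event locally, whether or not the network is up."""
        self.db.execute(
            "INSERT INTO events (ts, body) VALUES (?, ?)",
            (time.time(), json.dumps(event)),
        )
        self.db.commit()

    def pending(self):
        """Number of events still waiting to be uploaded."""
        return self.db.execute("SELECT COUNT(*) FROM events").fetchone()[0]

    def flush(self, upload, batch_size=100):
        """Upload cached events oldest-first; delete a batch only after
        `upload` returns without raising. Returns the number sent."""
        sent = 0
        while True:
            rows = self.db.execute(
                "SELECT id, body FROM events ORDER BY id LIMIT ?",
                (batch_size,),
            ).fetchall()
            if not rows:
                return sent
            batch = [json.loads(body) for _, body in rows]
            try:
                upload(batch)  # hypothetical call to the central repository
            except ConnectionError:
                return sent  # still offline: keep events for the next attempt
            self.db.execute("DELETE FROM events WHERE id <= ?", (rows[-1][0],))
            self.db.commit()
            sent += len(rows)
```

An agent built this way would call record() for every event and flush() whenever connectivity returns. When evaluating real products, the questions this sketch raises are worth asking: how large the local cache can grow, what happens when it fills, and whether cached telemetry is protected against tampering while the device is offline.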
Machine Learning

Machine learning is a catch-all term which endpoint detection/response vendors use for sophisticated mathematical analysis across a large dataset to generate models intended to detect malicious device activity. Many aspects of advanced mathematics are directly relevant to detection and response.

Static file analysis: With upwards of a billion malicious file samples in circulation, mathematical malware analysis can pinpoint commonalities across malicious files. With a model of what malware looks like, detection offerings can then search for these attributes to identify ‘new’ malware. False positives are always a concern with static analysis, so part of diligence is ensuring the models are tested constantly, and static analysis should be only one part of malware detection.

Behavioral profiles: Similarly, behaviors of malware can be analyzed and profiled using machine learning. Malware profiling produces a dynamic model which can be used to look for malicious behavior.

Those are the main use cases for machine learning in malware detection, but there are a number of considerations when evaluating machine learning approaches, including:

Targeted attacks: With an increasing number of attacks specifically targeting individual organizations, it is again important to distinguish delivery from compromise. Targeted attacks use custom (and personalized) methods to deliver attacks, which may or may not involve custom malware. But once the attacker has access to a device they use tactics similar to a traditional malware attack, so machine learning models don’t necessarily need to do anything unusual to deal with targeted attacks.

Cloud analytics: The analytics required to develop malware machine learning models are very computationally intensive. Cloud computing is the most flexible way to access that kind of compute power, so it makes sense that most vendors perform their number crunching and modeling in the cloud.
Of course the models must be able to run on endpoints to detect malicious activity, so they are typically built in the cloud and executed locally on every endpoint. But don’t get distracted by where computation happens, so long as performance and accuracy are acceptable.

Sample sizes: Some vendors claim that their intel is better than another company’s. That’s a hard claim to prove, but sample sizes matter. Looking at a billion malware samples is better than looking at 10,000. Is there a difference between looking at a hundred million samples and at a billion?