Malware Analysis Quant: Process Descriptions
I’m happy to report that we have finished the process description posts for the Malware Analysis Quant project. Not all of you follow our Heavy Feed (even though you should), so here is a list of all the posts. The Malware Analysis Quant project addresses how organizations confirm, analyze and then address malware infections. This is important because today’s anti-malware defenses basically don’t work (hard to argue), and as a result way too much malware makes it through defenses. When you get an infection you start a process to figure out what happened. First you need to figure out what the attack is, how it works, how to stop or work around it, and how far it has spread within your organization. That’s all before you can even think about fixing it. So let’s jump in with both feet. Process Map Confirm Infection Subprocess This process typically starts when the help desk gets a call. How can they confirm a device has been infected? Notification: The process can start in a number of ways, including a help desk call, an alert from a third party (such as a payment processor or law enforcement), or an alert from an endpoint suite. However it starts, you need to figure out whether it’s a real issue. Quarantine: The initial goal is to contain the damage, so the first step is typically to remove the device from the network to prevent it from replicating or pivoting. Triage: With the device off the net, now you have a chance to figure out how sick it is. This involves all sorts of quick and dirty analysis to figure out whether it’s a serious problem – exactly what it is can wait. Confirm: At this point you should have enough information to know whether the device is infected and by what. Now you have to decide what to do next. Confirm Infection Process Descriptions Based on what you found you will either: 1) stop the process (if the device isn’t infected), 2) analyze the malware (if you have no idea what it is), or 3) assess malware proliferation (if you know what it is and have a profile). Analyze Malware Subprocess By now you know there is an infection, but you don’t know what it is. Is it just an annoyance, or is it stealing key data and presenting a clear and present danger to the organization? Here are some typical malware analysis steps for building a detailed profile. Build Testbed: It’s rarely a good idea to analyze malware on production devices connected to production networks. So your first step is to build a testbed to analyze what you found. This tends to be a one-time effort, but you’ll always be adding to the testbed based on the evolution of your attack surface. Static Analysis: The first actual analysis step is static analysis of the malware file to identify things like packers, compile dates, and functions used by the program. Dynamic Analysis: There are three aspects of what we call Dynamic Analysis: device analysis, network analysis, and proliferation analysis. To dig a layer deeper, first we look at the impact of the malware on the specific device, dynamically analyzing the program to figure out what it actually does. Here you are seeking perspective on the memory, configuration, persistence, new executables, etc. involved in execution of the program. This is done by running the malware in a sandbox. After understanding what the malware does to the device you can start to figure out the communications paths it uses. You know, isolating things like command and control traffic, DNS tactics, exfiltration paths, network traffic patterns, and other clues to identify the attack. The Malware Profile: Finally we need to document what we learned during our malware analysis, packaged in what we call a Malware Profile. With a malware profile in our hot little hands we need to figure out how widely it spread. That’s the next process. Malware Proliferation Subprocess Now you know what the malware does, you need to figure out whether it’s spreading, and how much. This involves: Define Rules: Take your malware profile and turn it into something you can search on with the tools at your disposal. This might involve configuring vulnerability scan attributes, IDS/IPS rules, asset management queries, etc. Define Rules: Process Description Find Infected Devices: Then take your rules and use them to try to find badness in your environment. This typically entails two separate functions: first run a vulnerability and/or configuration scan on all devices, then search logs for indicators defined in the Malware Profile. If you find matching files or configuration settings, you need to be alerted of another compromised device. Then search the logs, as malware may be able to hide itself from a traditional vulnerability scan but might not be able to hide its presence from log files. Of course this assumes you are actually externalizing device logs. Likewise, you may be able to pinpoint specific traffic patterns that indicate compromised devices, so look through your network traffic logs, which might include flow records or even full packet capture streams. Find Infected Devices: Process Description Remediate: Finally you need to figure out whether you are going to remediate the malware, and if so, how. Can your endpoint agent clean it? Do you have to reimage? Obviously there is significant cost impact to clean up, which must be weighed against the likelihood of reinfection. Remediate: Process Description Monitor for Reinfection One of the biggest issues in the fight against malware is reinfection. It’s not like these are static attacks you are dealing with. Malware changes constantly – especially targeted malware. Additionally, some of your users might make the same mistake and become infected with the same attack. Right, oh joy, but it happens – a lot. So making sure you update the malware profile as needed, and continuously check for new infections, are key parts of the process as well. Monitor for Reinfection: Process Description At this point we’re ready to start Phase 2 of Quant, which is to take each of the process steps and define a set of metrics to