Monday, April 02, 2012

Malware Analysis Quant: Index of Posts

By Mike Rothman

Here is the complete list of posts in the Malware Analysis Quant research project. Enjoy…

The Malware Analysis Process

Process Map, Draft 1: Check out how we started the project – it’s always interesting to see how the research evolves as we work through it.

Process Descriptions

  1. Confirm Infection
  2. Build Testbed
  3. Static Analysis
  4. Dynamic Analysis
  5. The Malware Profile
  6. Defining Rules
  7. Find Infected Devices
  8. Remediate
  9. Monitoring for Reinfection

Updated Process Map and Process Descriptions Paper

Survey

By the way, the survey is still open and will be for the next 4-5 months. We’ll take another run at driving up responses in August/September, but in the meantime feel free to fill it out if you haven’t already.

Metrics Posts

  1. Metrics – Confirm Infection
  2. Metrics – Build Testbed
  3. Metrics – Static Analysis
  4. Metrics – Dynamic Analysis
  5. Metrics – The Malware Profile
  6. Metrics – Defining Rules
  7. Metrics – Find Infected Devices
  8. Metrics – Remediate
  9. Metrics – Monitoring for Reinfection

–Mike Rothman

Friday, February 24, 2012

Malware Analysis Quant: Metrics—Monitor for Reinfection

By Mike Rothman

Yesterday’s Metrics for Remediation pointed out that looking for malware is not a one-time action. It must be done frequently – or even better, continuously. So any cost model needs to factor in this reality. We outlined the process of Monitoring for Reinfection, where the first step is to define how often you want to check and make sure you aren’t infected again by something you have already seen. Then run through the preceding steps in the Malware Proliferation subprocess: scan your devices with testing tools, search the logs, and figure out how to fix what you find.

Any serious estimate of malware costs must factor in this testing and retesting. So let’s see what it looks like:

Monitor for Reinfection

Variable | Notes
Time to run testing tool periodically | More likely: time to load rule into tool, test effectiveness, and build testing schedule – automation and scheduling yield huge dividends.
Time to run search queries | More likely: time to load rules into a SIEM or Log Management product, test rules, and set appropriate alerting thresholds. Searching for malware is rarely the sole justification for buying a SIEM, but that is one way SIEM returns on its investment.
When receiving an alert, analyze results | Similar to the Find Infected Devices section – you need to figure out whether an alert represents an actual infection.
Document results of alert | Assemble information for other teams to use to remediate the infection.
Go back to remediation if new infections are found. | You found something – now fix it.

And with that we have documented all the metrics involved in this Quant project. We are still fine tuning the process maps, so there will be some shifting around over the next few weeks as we finalize things. So there is still time to weigh in either on the Malware Analysis Quant survey or through comments on any of these Quant posts.

–Mike Rothman

Thursday, February 23, 2012

Malware Analysis Quant: Metrics—Remediate

By Mike Rothman

You know you have an infection, and you know which devices are affected. So what do you do? Fix them, of course! But now it gets fun – you have to decide whether to remediate the devices, fix them (assuming you don’t have a compelling reason to wait), test to make sure they are fixed, and then ensure you eradicated everything. Joy.

Let’s look specifically at what this really takes:

Remediate

Variable | Notes
Time to determine remediation strategy | Do you fix the device? Wipe it? Do something else?
Time to gain consensus on remediation strategy | Make sure everyone agrees, especially if the decision is to not remediate for some reason.
Time to remediate device |
Time to test remediation | Today’s malware is hard to kill, so you need to make sure you’ve really gotten rid of it, unless you’ve wiped the device.
Time to isolate “Patient Zero” | Identify initiator/root cause of the infection to ensure all examples are identified and remediated.
Time to determine whether inoculation is necessary | Do you need to change a configuration setting or implement a specific control to address this infection?
Time to inoculate, if necessary | Implement the additional controls and/or change the configurations.

We’re almost done. The next step is to define some metrics to monitor the environment continually for future outbreaks. You didn’t think you were done yet, did you? Good – attackers are never done, so neither are you. And you need to factor in the costs of monitoring for reinfection to accurately capture the cost of fighting malware.

–Mike Rothman

Wednesday, February 22, 2012

Malware Analysis Quant: Metrics—Find Infected Devices

By Mike Rothman

Per the process description, the Find Infected Devices subprocess involves scanning devices with testing tools, using the rules defined earlier; and also searching logs for additional indicators of the malware. Of course in practice many of these steps are conducted simultaneously with automation. But Quant research breaks down exactly what needs to happen, in order to capture all the costs. Many of these steps can and should be streamlined using products and/or automation, but Quant costs out manual procedures, for comparison against (hopefully) improved workflows.

Let’s take a look at the entire process:

Find Infected Devices

Scan Devices

Variable | Notes
Time to deploy rule on testing tool | Load the rules developed earlier into the scanner or other tool.
Time to run rule |
Time to analyze results | Identify false positives and prioritize which devices have the most serious issues.
Time to document results | Prepare documentation for the ops teams tasked with remediation.
Time to escalate infected devices to remediation |

Search Logs

Variable | Notes
Time to aggregate logs | This can (and should) be leveraged with a log management initiative. It usually entails setting up collection from monitored devices. See Network Security Quant for detail on how.
Time to run ad hoc search queries | Based on the queries defined for the malware, search the aggregated log data to identify potentially compromised devices.
Time to analyze results | Identify false positives and prioritize which devices have the most serious issues.
Time to document results | Prepare documentation for the ops teams tasked with remediation.
Time to escalate infected devices to remediation |

Now you have a list of devices which have been compromised by the malware you are looking for. Or which at least show strong indications of compromise. So next we move on to remediation.

–Mike Rothman

Tuesday, February 21, 2012

Malware Analysis Quant: Metrics—Define Rules and Search Queries

By Mike Rothman

We now enter the last subprocess of our Malware Analysis Quant project: Malware Proliferation. You now have a Malware Profile so you know what the malware does and how it does it. Now it’s time to find out whether it’s present elsewhere in your environment, which means using tools. But they need some direction – basically an idea of what to look for – so in this step you turn the indicators from the profile into rules and/or queries (when using log data) which can be used to find the bad stuff.

Let’s dig a bit deeper:

Define Rules and/or Search Queries

Variable | Notes
Time to analyze malware profile | Determine how to search for each indicator defined in the profile. Can you scan for it with a tool? Search logs?
Time to develop rules/queries for the indicator | This can be tedious, but the tighter you make the search criteria, the fewer false positives you will need to deal with later.
Time to test rules/queries | You need to set up a vulnerable device with the malware, on an isolated network, and then make sure your rules/queries actually find it.
Time to refine (and retest) rules/queries | You will likely have to iterate a few times to get a set of rules/queries that work well. Again, the longer you spend getting the rules right, the fewer mistakes you’ll need to track down.
Time to document rules/queries | As with all good processes, document what you did, and hopefully why.
Repeat for each indicator in the profile | Lather, rinse, repeat. You’ll need a rule and/or query for each indicator identified in the profile.
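
To illustrate why tighter search criteria pay off, here is a minimal Python sketch of the develop/test loop. It is not tied to any particular scanner or SIEM: the function names, the sample log lines, and the indicator strings are all hypothetical, and the "rule" is just a regular expression run against known-infected and known-clean test data.

```python
import re


def build_rule(indicator):
    """Compile a case-insensitive literal match for one indicator."""
    return re.compile(re.escape(indicator), re.IGNORECASE)


def evaluate_rule(rule, infected_lines, clean_lines):
    """Run a rule against known-infected and known-clean test data."""
    detected = any(rule.search(line) for line in infected_lines)
    false_positives = sum(1 for line in clean_lines if rule.search(line))
    return {"detected": detected, "false_positives": false_positives}


# Hypothetical test data captured from the isolated testbed.
infected = ["dns query evil.example.invalid from 10.0.0.5"]
clean = ["dns query www.example.com from 10.0.0.7",
         "dns query intranet.local from 10.0.0.8"]

# A loose rule and a tight rule for the same indicator.
loose = evaluate_rule(build_rule("example"), infected, clean)
tight = evaluate_rule(build_rule("evil.example.invalid"), infected, clean)
print("loose:", loose)
print("tight:", tight)
```

Both rules find the infected device, but only the loose one also flags a clean device. That extra hit is the kind of false positive someone has to chase down later.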

Next you turn your rules/queries loose and try to find the bad stuff. So we’ll run through the Find Infected Devices metrics tomorrow.

–Mike Rothman

Monday, February 20, 2012

Malware Analysis Quant: Metrics—The Malware Profile

By Mike Rothman

Now we wrap up the Analyze Malware subprocess by looking at what it costs to document what we have learned. That’s the Malware Profile.

As we wrote in the original posts, the profile is used both to find out how widely malware has proliferated throughout your environment, and also to provide a point in time view of what the malware looks like, which you revisit periodically to factor in the inevitable changes as malware writers change their tactics. Let’s dig a bit deeper:

The Malware Profile

Variable | Notes
Time to aggregate findings | Gather the indicators identified during the analysis steps, including file attributes, registry settings, processes & services, new executables, domains & protocols, command and control activity, and persistence.
Time to document findings | You should have a standard format for the profile, depending on the operational constituencies who will use it.
Time to distribute the profile | You need to include the time to deliver the information and do the formal hand-off to make sure nothing falls through the cracks.
Time to revisit the profile | Malware is not a static entity, so you need to budget time to revisit each malware profile periodically, to account for changes in attack vectors, payloads, etc.
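
One way to picture a "standard format" for the profile is a small structured record. The following Python sketch is only an assumption about what such a format could look like: the field names mirror the indicator categories listed above, while the class name, the `last_reviewed` field, and the sample values are hypothetical.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class MalwareProfile:
    """Indicators gathered during analysis, in one shareable record."""
    name: str
    last_reviewed: str  # ISO date; a reminder for the periodic revisit
    file_attributes: dict = field(default_factory=dict)
    registry_settings: list = field(default_factory=list)
    processes_and_services: list = field(default_factory=list)
    new_executables: list = field(default_factory=list)
    domains_and_protocols: list = field(default_factory=list)
    command_and_control: list = field(default_factory=list)
    persistence: list = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize the profile so it can be handed to other teams."""
        return json.dumps(asdict(self), indent=2, sort_keys=True)


# Hypothetical values for illustration only.
profile = MalwareProfile(
    name="sample-001",
    last_reviewed="2012-02-20",
    file_attributes={"size_bytes": 48128, "packer": "unknown"},
    registry_settings=[r"HKCU\Software\Example\Run"],
    domains_and_protocols=["evil.example.invalid:443/tcp"],
)
print(profile.to_json())
```

A fixed set of fields makes the hand-off easier to check: an empty list is a visible statement that nothing was found in that category, rather than an omission.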

With that, we are ready to start looking at the costs of finding malware in your environment, which means we will start breaking down the Malware Proliferation process next.

–Mike Rothman

Sunday, February 19, 2012

Malware Analysis Quant: Metrics—Dynamic Analysis

By Mike Rothman

Now that we have done what we can passively, through static analysis of the malware sample, it’s time to run it and see what happens. Fun! More detail on what that involves is in our process description of Dynamic Analysis.

There are three aspects of dynamic analysis: device analysis, network analysis, and proliferation analysis. When you run the malware you need to look for different indicators, depending on the type of analysis. Here is the entire process:

Dynamic Analysis

Variable | Notes
Time to run malware against victim devices | Execute the sample on the victim devices in the isolated testbed and see what happens.

Device Analysis

Variable | Notes
Time to capture and analyze volatile memory |
Time to analyze configuration & registry changes |
Time to assess and log file activity |
Time to capture and analyze processes & services |
Time to restart victim to test persistence |
If no visible impact on VM, time to test against a physical machine | Testing VM awareness generally requires a physical victim device, rather than a virtualized victim.
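
Much of device analysis comes down to comparing the victim before and after the malware runs. Here is a minimal Python sketch of that comparison, assuming you have already captured each state as a dictionary (for example file path to hash, or registry key to value). The function name, paths, and hash values are hypothetical.

```python
def diff_snapshots(before, after):
    """Compare two {item: value} snapshots of a victim device."""
    added = sorted(k for k in after if k not in before)
    removed = sorted(k for k in before if k not in after)
    changed = sorted(k for k in after
                     if k in before and before[k] != after[k])
    return {"added": added, "removed": removed, "changed": changed}


# Hypothetical file snapshots (path -> hash) taken around execution.
before = {
    "C:/Windows/System32/drivers/etc/hosts": "a1",
    "C:/Users/victim/notes.txt": "b2",
}
after = {
    "C:/Windows/System32/drivers/etc/hosts": "ff",
    "C:/Users/victim/AppData/dropper.exe": "c3",
}

for kind, items in diff_snapshots(before, after).items():
    print(kind, items)
```

The same helper works for configuration and registry snapshots, and for a snapshot taken after the restart when testing persistence.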

Network Analysis

Variable | Notes
Time to capture network traffic |
Time to search for suspicious destinations | Using IP reputation and C&C analysis.
Time to analyze C&C traffic | Determine what is being sent and where.
Time to analyze exfiltrated information | If available.
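
As a rough illustration of the "search for suspicious destinations" step, the Python sketch below checks destination addresses from a capture against a list of bad networks. The function name is hypothetical, and the reputation list is a stand-in: a real one would come from whatever IP reputation source your testbed uses. The addresses are reserved documentation ranges.

```python
import ipaddress


def suspicious_destinations(dest_ips, bad_networks):
    """Return the destination IPs that fall inside a known-bad network."""
    nets = [ipaddress.ip_network(n) for n in bad_networks]
    flagged = []
    for ip in dest_ips:
        addr = ipaddress.ip_address(ip)
        if any(addr in net for net in nets):
            flagged.append(ip)
    return flagged


# Hypothetical destinations seen in the victim's traffic.
destinations = ["192.0.2.10", "203.0.113.45", "198.51.100.7"]
# Stand-in for an IP reputation feed.
reputation_list = ["203.0.113.0/24"]

print(suspicious_destinations(destinations, reputation_list))
```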

Proliferation Analysis

Variable | Notes
Time to set up another vulnerable victim | This additional victim will be the target of any attempts by the malware to spread, pivot, or otherwise infect another device.
Time to capture network traffic (again) | Rather than initial traffic patterns, you are now looking to see how the malware searches for devices and follows up.
Time to isolate reconnaissance traffic |
Time to observe and assess proliferation activity | Malware may use different tactics to compromise additional devices once established; so observe not just for the initial attack vector, but for anything else.

At this point, you have put in the effort to understand what the malware is doing, so the next step is to document those findings into a malware profile useful for other constituencies.

–Mike Rothman

Friday, February 17, 2012

Malware Analysis Quant: Metrics—Static Analysis

By Mike Rothman

As we continue with the Malware Analysis subprocess the next step is Static Analysis.

It’s useful to start your analysis with the file itself because malware authors leave plenty of clues within their executable files, which can be used to identify future infections. So we spend some time on the file before we worry about what it does. Let’s take a look at what’s involved:

Static Analysis

Variable | Notes
Time to check file fingerprint | Check the file hash against database(s) of known malware files.
Time to analyze file packer |
Time to classify file structure |
Time to analyze text strings | This involves finding clues in snippets of the malware and searching for similar snippets to find potential patterns.
Time to disassemble malware | Using a disassembler can provide a lot of insight into what the malware is doing, and again can provide identifiable patterns to help identify the attackers.
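
Two of these steps, the fingerprint check and the text string analysis, are simple enough to sketch in a few lines of Python. This is an illustration under stated assumptions, not a recommended tool: the sample bytes, the set of known hashes, and the function names are hypothetical, and the choice of SHA-256 is ours rather than anything Quant prescribes.

```python
import hashlib
import re


def sha256_of(data: bytes) -> str:
    """Return the SHA-256 fingerprint of a file's contents as hex."""
    return hashlib.sha256(data).hexdigest()


def printable_strings(data: bytes, min_len: int = 4) -> list:
    """Return runs of printable ASCII at least min_len bytes long."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]


# Stand-in for a sample read from disk (hypothetical contents).
sample = (b"MZ\x90\x00" + b"http://example.invalid/gate.php" + b"\x00\x01"
          + b"cmd.exe /c" + b"\x00ab\x00")

# Stand-in for a database of known malware hashes.
known_bad_hashes = {sha256_of(sample)}

fingerprint = sha256_of(sample)
print("known malware:", fingerprint in known_bad_hashes)
for s in printable_strings(sample):
    print("string:", s)
```

The extracted strings (a URL and a command line in this made-up sample) are the sort of snippets you would then search for elsewhere to find patterns.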

Of course, static analysis is only the beginning of the fun. Next we’ll look at what’s involved in doing the different aspects of dynamic analysis, including viewing the impact on a device and network, as well as studying how the malware proliferates.

–Mike Rothman

Thursday, February 16, 2012

Malware Analysis Quant: Metrics—Build Testbed

By Mike Rothman

Now that we have started defining metrics for how much it really costs to analyze malware, let’s dive into the Malware Analysis subprocess. The first step is Build Testbed.

You set up a testbed because you need somewhere to play. The work is mainly during initial setup, but keeping your environment and tools current requires ongoing effort which we build into the cost model. So let’s take a look at what’s involved:

Build Testbed

Variable | Notes
Time to research and design testbed | This should include both the equipment required (network, devices, etc.), as well as external data sources.
Cost of equipment and tools |
Time to build isolated test network |
Time to configure victim devices | You will likely need both virtualized and physical victim devices.
Time to install network services | May include DNS, Internet access, IP and/or file reputation data sources.
Time to research and select testing tools |
Time to install/configure testing tools | May include device imaging, file/data analysis, registry/configuration analysis, sandbox, log analyzers, and/or network capture/analysis tools.
Time to build repositories for comparison | As you perform analysis, you will have data to compare new attacks against, so this step involves building the environment to store and index other malware analysis findings.
Time to determine format/structure of malware profile | The malware profile is how you will communicate findings about the malware to other operational groups.
Revisit tool selection and testbed design, as needed | You should survey the latest/greatest in analysis tools fairly frequently to take advantage of innovation.

The good news is that you only have to do this once, and revisit the tool selection every so often. Perhaps quarterly, depending on the type of analysis you do. Next we’ll get into static analysis of the malware file.

–Mike Rothman

Wednesday, February 15, 2012

Malware Analysis Quant: Metrics—Confirm Infection

By Mike Rothman

After our little break, it’s time to dig back into the Malware Analysis Quant project. We’re in the home stretch now, and will be tearing through each subprocess to define a set of metrics that can measure what each step in the process costs.

But the primary cost for all aspects of analyzing malware is time. Sure, there are tools to buy and a testbed to build, but compared to the cost of the people you have pulling apart the malware, it’s all minimal. So we’ll break down the specific tasks, but to figure out what malware analysis costs you, you will need to track your own costs. This means measuring your activity down to a pretty granular level, which may or may not be possible in your environment.

So take what you can from this project. The last thing you want to do is spend more time gathering data and counting hours than doing your job. But in order to understand what it costs to analyze malware, you need to understand where you spend your time – there is just no way around it.
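
Since the model is mostly a list of "Time to..." variables, the arithmetic is simple: add up the hours and multiply by what those hours cost you. Here is a minimal Python sketch of that roll-up. It is not part of the Quant model itself; the hours, the hourly rate, and the names `TaskTime` and `incident_cost` are hypothetical values chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class TaskTime:
    """One 'Time to ...' variable, measured in hours per incident."""
    name: str
    hours: float


def incident_cost(tasks, hourly_rate, tool_cost=0.0):
    """Return the cost of one incident: staff time plus any tool spend."""
    staff_hours = sum(t.hours for t in tasks)
    return staff_hours * hourly_rate + tool_cost


# Hypothetical measurements for a single notification.
notification = [
    TaskTime("Time to receive notification", 0.25),
    TaskTime("Time for fact finding", 0.5),
    TaskTime("Time to find device", 0.75),
    TaskTime("Time to determine escalation", 0.25),
    TaskTime("Time to document initial findings", 0.25),
]

cost = incident_cost(notification, hourly_rate=80.0)
print(f"Notification cost per incident: ${cost:.2f}")
```

Even rough numbers like these show where the hours go, which is the point of measuring at all.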

So without further ado, let’s jump into the Confirm Infection subprocess. As you may recall, there are four steps in this subprocess:

  1. Notification: The process can start in a number of ways, including a help desk call, an alert from a third party (payment processor, law enforcement, etc.), or an endpoint suite alert. However it starts, you need to figure out whether it’s a real issue.
  2. Quarantine: The first priority is to contain the damage, so the first step is typically to remove the device from the network to prevent it from replicating or pivoting (jumping to another device on your network).
  3. Triage: Once the device is off the grid you get to figure out how sick it is. This involves all sorts of quick and dirty analysis – it’s not about figuring out exactly what it is, but simply whether it’s a problem.
  4. Confirm: At this point you should have enough information to know whether the device is infected and by what. Now you have a decision to make: what to do next.

Here are the operational metrics:

Notification

Variable | Notes
Time to receive notification |
Time for fact finding | Ask questions to validate the issue and identify symptoms.
Time to find device |
Time to determine escalation | The results from fact finding should drive a clear decision process to determine response escalation.
Time to document initial findings | Clear documentation is important for escalation to the next level.

Quarantine

Variable | Notes
Time to decide whether to isolate device | The criteria for when to pull a device from the network should be clearly defined.
Time to isolate device |
Time to capture and store device image | The image is captured for both forensic and investigative purposes.

Triage

Variable | Notes
Time to observe device behavior | Here you are looking for something obvious. Like a zillion pop-up windows.
Time to run AV scan | You need to cover all your bases, and sometimes the AV scan actually finds something.
Time to test configuration/registry |
Time to analyze open processes |
Time to examine file activity | Malware usually changes or otherwise impacts files.
Time to analyze memory |

Confirm

Variable | Notes
Time to decide whether the device is infected | Based on the triage results, using criteria defined in advance for determining whether a device is infected.
Time to attempt to identify malware | Check internal and external resources to attempt to identify the infection.
Time to document findings and decision | Regardless of the decision in the next step, you need to document what you’ve found.
Time to escalate/hand off to next step | Depending on the decision, the file and documentation go either to one group to analyze malware, or to another group to assess proliferation.

In the next post we will focus on costs (in terms of both time and tools) to build the testbed.

–Mike Rothman

Monday, January 16, 2012

Malware Analysis Quant: Monitoring for Reinfection

By Mike Rothman

As we bring Phase 1 of our Malware Analysis Quant project to a close, let’s talk a bit about recursion. Because detecting malware is not a one-time effort. Malware writers constantly morph and evolve their attacks – using the same base techniques but tuning to account for current detection methods and anti-virus signatures, and generally to improve their wares. That’s why we advocate a process for revisiting malware profiles periodically as described in the Malware Profile step.

But that’s not the only recursion necessary in the Malware Analysis process. You need to be constantly vigilant to reinfections of the same attacks. Why? Because users continue to click on things. They fall for social engineering attacks and tend to get compromised even when you tell them not to. So you need to keep searching for indications of each attack on an ongoing basis. How frequently depends on many factors, including resources and automation, but we need to be clear about the ongoing costs of tracking malware proliferation.

The steps in monitoring for reinfection are:

  1. Define testing frequency: First step is to figure out how often you want to check whether you have been infected again. This varies depending on the frequency of attack, automation, resource realities, and a general assessment of risk. You can also test some devices – perhaps those with higher risk factors, such as mobile devices and/or those with very sensitive information – more frequently. Clearly you should test as frequently as practical, but every security decision requires a risk/benefit analysis.
  2. Run testing tools: You defined the rules already and even ran them once when you looked for infected devices, so add the rules to the persistent testing rules in your scanners and other tools and then run the tests. You don’t need to run all tests all the time, but can rely on frequency decisions made in the first step.
  3. Search Logs: As with the testing tools, a key aspect of finding infected devices is to monitor the logs of target devices and other network/security devices for indications of attack. This could be an ongoing script run against aggregated logs or a more sophisticated rule deployed on a SIEM. We are fans of continuous monitoring, so if a SIEM is in place we prefer to see the rule running there continuously. But of course that depends on the availability of automation.
  4. Analyze results: As if you didn’t have enough to do, you need to wade through the findings at some point. Alerting can be your friend here, highlighting specific situations and kicking your incident response process into gear.
  5. Document results: As in the original process step, you need to document your findings, which might be as simple as adding a device name or IP address on the back of a napkin. But if you have many hands in the process, with separate groups responsible for response and remediation, your documentation needs to be a bit more formal. Don’t assume the operations team (or whoever is responsible for remediation) has any background on this type of malware; don’t assume anything about the impact of the attack; don’t assume everybody feels the full urgency of fixing it; don’t assume anything. Everything must be spelled out in your documentation.
  6. If infected proceed to remediation: If there is a clear sign of infection then continue to remediate, leveraging your top-notch documentation. Obviously this entails many decisions, but we won’t put the cart before the horse. We will get into that once we have finished looking for all the indicators of a potential attack.
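
The first two steps above, picking a testing frequency per device and then running only the tests that are due, can be sketched in a few lines of Python. The risk tiers, the intervals, the device names, and the function names are all hypothetical; your own frequency decisions will depend on the risk/benefit analysis described in step 1.

```python
from datetime import date, timedelta

# Hypothetical testing intervals, in days, per risk tier.
INTERVAL_DAYS = {"high": 1, "medium": 7, "low": 30}


def next_scan(last_scan, risk):
    """Return the date a device is next due for a reinfection test."""
    return last_scan + timedelta(days=INTERVAL_DAYS[risk])


def devices_due(devices, today):
    """Return the names of devices whose next test date has arrived."""
    return [d["name"] for d in devices
            if next_scan(d["last_scan"], d["risk"]) <= today]


# Hypothetical inventory with the date each device was last tested.
inventory = [
    {"name": "laptop-17", "risk": "high", "last_scan": date(2012, 1, 15)},
    {"name": "file-server", "risk": "medium", "last_scan": date(2012, 1, 12)},
    {"name": "kiosk-3", "risk": "low", "last_scan": date(2011, 12, 10)},
]

print(devices_due(inventory, today=date(2012, 1, 16)))
```

Higher-risk devices simply get a shorter interval, so they show up in the due list more often without anyone having to remember them.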

That wraps up the descriptions of each step in the process map. We will be putting together the survey instrument and launching that aspect of this research over the next week or so. Stay tuned for that, and as always feel free to contribute feedback on the process map.

BTW, if you have trouble adding a comment, please send us a note to info (at) securosis (dot) com. Since our migration to a new host, some blog features have been a little, uh, flaky.

–Mike Rothman

Friday, January 13, 2012

Malware Analysis Quant: Remediate

By Mike Rothman

At this point in the process you have confirmed the presence of malware on a device. Then you systematically analyzed that malware to build a profile of what it does and how. Finally you searched high and low in your environment to find all the devices it infected. Now what? The next step in the malware proliferation subprocess is to Remediate. That’s right, you finally get to address the issue.

Of course there are individuals who simply don’t want to know. They go about their business, trying to avoid work, and definitely aren’t searching for malware – especially the nasty, hard-to-find and even harder to clean variety. But you are reading this research, so we can assume you are not one of those lazy security folks – instead you want to solve the problem, do the right thing, and protect the information assets of your organization. If you are the former, you have stumbled across the wrong post – feel free to get back to your captivating game of Angry Birds.

But we digress. You’d think that once you find the malware you’d just clean it up. Right? Surprisingly enough, the answer is maybe. A lot more thinking goes into remediation than just a do-or-don’t decision. But before we get there we will outline the steps:

  1. Determine remediation strategy: Figure out whether you want to clean up the malware, and if so how. Yes, there are situations where you would choose not to clean an infection, as we will describe below.
  2. Remediate: Once you have decided to clean the device, do it. This may involve removing the malware or wiping the machine entirely, depending on the malware’s nature, what it does, and the value of the data on the infected machine. It is generally better to wipe and reimage machines where possible. With modern malware, you cannot be sure you have expunged it using any lesser method, so it’s easier and more reliable to just start over.
  3. Test remediation: Regardless of whether you cleaned or reimaged a device, take the time to test your remediation. As with patch management (as described in our Patch Management Quant research), we are talking about software, and software doesn’t always work. There is little rhyme or reason behind why changes sometimes don’t stick, but verify the remediation was effective. If it worked, great. If it didn’t, try again.
  4. Isolate Patient Zero: Far too many security folks focus on the initial removal of the malware, but the sad truth is that you are never done fighting an attack. That old adage, about those who forget history being doomed to repeat it, holds for malware as well. Hopefully, by finding the devices that were attacked, you can understand the malware’s trajectory through your environment. If you follow this thought to its logical conclusion, hopefully you can find the first malware victim in your environment, and ultimately identify the initial attack vector that resulted in the compromise. That’s what we call Patient Zero. Why is this important? Because you don’t want to be infected by the same malware again, so you need to identify and fix the root cause of the attack, or be doomed to repeat it.

To remediate or not to remediate? That is the question.

Earlier we mentioned a scenario where you would choose not to remediate a malware attack. We understand that is counterintuitive, but it does happen and you need to consider it. If you are the victim of a targeted attack by a persistent (most likely state-sponsored) attacker – and no, we aren’t going to use the acronym – then you may want to quarantine the compromised device, rather than simply fix it.

We know it’s counterintuitive, but this is important. A persistent attacker will have a presence in your environment. That’s their mission and they will do whatever it takes, for however long it takes, to achieve and maintain that presence. Once you remediate a compromised device, they will initiate another process to gain a new foothold. So the race just starts over. Alternatively, you might choose to quietly pull any sensitive data off that machine and then monitor it very closely – perhaps even implementing a special semi-quarantined network where it isn’t completely cut off but cannot do much damage. Then you can feed the device disinformation, monitor it, and track the tactics of your adversary – rather than tipping them off to compromise a new target.

More likely, you are not particularly targeted for this kind of attack; if so then simply carry on. Remediate the device and be sure to keep an eye on things on an ongoing basis. The first priority is tracking how the malware changes so you can keep your profile current. We discussed that when building the Malware Profile. The rest of your work is to check your infrastructure for signs of reinfection – the subject of our final process-oriented post. That is a critical part (and cost) of your battle against malware.

–Mike Rothman

Wednesday, January 11, 2012

Malware Analysis Quant: Find Infected Devices

By Mike Rothman

We have reached the middle of our Malware Proliferation subprocess, and you have defined rules to find the malware in question. Now we get to actually do something and look for a ‘smoking gun’. In these steps we will use testing tools and log analysis to pinpoint infections.

Scan Devices

Let’s start with testing tools. Here are the steps involved in scanning devices to find infections:

  1. Deploy rule on testing tool: This may be a scanner, pen testing tool, configuration manager, forensics tool, etc. You need to generate a rule for your tool from the profile developed in the last step.
  2. Run tool: Here you run the rule on the tool. We know this is obvious – it’s just part of laying out the whole process in sufficiently obvious detail for Quant, and it’s part of the cost model. Which means we have to list everything.
  3. Analyze results: Once the tool completes you analyze the results. Maybe a number of devices have clearly been compromised. Perhaps it’s less obvious, and you need to start looking for other markers. Either way you need to wade through the results to determine which devices have in fact been compromised.
  4. Document results: If you are performing the analysis and scanning, then the documentation may be as simple as a device name or IP address on the back of a napkin. But if you have many hands in the process, with separate groups responsible for response and remediation, your documentation will need to be a bit more formal. Don’t assume the operations team (or whoever is responsible for remediation) has any background on this type of malware; don’t assume anything about the impact of the attack; don’t assume everybody feels the full urgency of fixing it; don’t assume anything. Everything must be spelled out in your documentation.
  5. If infected, proceed to remediation: If there is a clear sign of infection then continue to remediate, leveraging your top-notch documentation. Obviously remediation entails many decisions, but let’s not put the cart before the horse. We will get into that once we have finished looking for all the indicators of a potential attack.

Search Logs

The steps involved in searching the logs are very similar to running the testing tools. So let’s not belabor the point, and just jump right in:

  1. Search Logs: Similar to the ‘rules’ used by the testing tools, search your logs for the indicators defined in the malware profile. You need a reasonably sophisticated search capability to find the proverbial needle in your haystack(s). Perhaps you are looking for C&C servers, so you can search network logs for signs. Maybe you are looking for a specific executable loaded, or a particular running process, in which case you would search device logs. Our research shows active testing (as described above) usually provides the quickest way to find infected devices, but you cannot afford to overlook logs, especially to pinpoint potentially dormant malware – which might not yet have executed, or might be inactive in virtual machines, for instance.
  2. Analyze results: Again, working from your search results, you may need to dig deeper into indicated devices to complete your investigation.
  3. Document results: As above. Nothing to add here.
  4. If infected, proceed to remediation: Same as above.
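
The log search steps above can be sketched as a simple indicator match over aggregated logs. This Python example is a minimal illustration, assuming each log line starts with a device name followed by a space and the message. That format, the indicator strings, and the function name are hypothetical; a real log management product will have its own query language.

```python
def search_logs(lines, indicators):
    """Map each device to the set of indicators seen in its log lines."""
    hits = {}
    for line in lines:
        device, _, message = line.partition(" ")
        for indicator in indicators:
            if indicator.lower() in message.lower():
                hits.setdefault(device, set()).add(indicator)
    return hits


# Hypothetical aggregated log lines: "<device> <message>".
log_lines = [
    "host-a GET http://evil.example.invalid/gate.php",
    "host-b process start notepad.exe",
    "host-a process start svch0st.exe",
    "host-c process start SVCH0ST.EXE",
]
# Hypothetical indicators taken from the malware profile.
indicators = ["evil.example.invalid", "svch0st.exe"]

hits = search_logs(log_lines, indicators)
# Devices matching more indicators are listed first for deeper analysis.
ranked = sorted(hits, key=lambda d: (-len(hits[d]), d))
for device in ranked:
    print(device, sorted(hits[device]))
```

Ranking by the number of matched indicators is just one way to decide where to dig deeper first. A single match still needs a human to confirm it.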

So in these steps we are finding all the devices on your network that show any sign of the malware you analyzed and profiled. This is not a one-time activity, and we will talk about the need to search for these indicators on an ongoing basis, when we wrap up the process model descriptions.

At the end of this step you know which devices have been compromised. Next you need to remediate. Of course you have a number of options for remediation, and which you choose depends on the particular situation. We will delve into that next.

–Mike Rothman

Monday, January 09, 2012

Malware Analysis Quant: Defining Rules

By Mike Rothman

We closed out the Malware Analysis subprocess in the last post, producing a comprehensive profile of the malware. Now we can decompose the final subprocess, figuring out how badly your organization has been infected. The first step in this Malware Proliferation subprocess is to define the rules you will use to find the malware with the tools you have.

It looks like this:

  1. Develop rule: First develop the rule. Sorry, but we have to be a bit pedantic for Quant. This depends heavily on the tool you will use to (try to) isolate infected devices. For instance, you might need to build a custom rule for your vulnerability scanner. Maybe you will build an IDS rule to look for command and control targets in your egress traffic. Perhaps you will search your CMDB (configuration management database) for specific configuration/registry settings or executables. Or more likely all of the above.
  2. Test rule: Be happy you have a testbed environment – now you get to test your shiny new rule. That means actually infecting a victim machine (in an isolated environment of course) and seeing whether the rule works. If so, move on to the Document step. If not, figure out what needs to be changed and move on to the next step.
  3. Refine rule (and retest): After failing the test (or perhaps just not exactly passing), make the appropriate changes and try again. Yes, depending on how complicated the malware is, this might involve a few rules (typically 3-4) or many. And if you have sophisticated malware analysts on staff you might not need to define as many rules, as analysts can confirm other indicators defined in the profile, without using other tools to confirm.
  4. Document rule: Once the rule is tested and passes muster, you need to document what the rule(s) look like. Again, how formally you document the rule(s) depends on how many different groups you have involved in incident response. If it’s a small team you might be able to get by with streamlined documentation. But for a large team, potentially involving third parties, you will need to be fairly formal with the documentation. Especially if a distinct operations group needs to run the scans or set up the rules on devices they control.
  5. Go back to Step 1: As described above, it’s pretty rare there will be one smoking gun indicator that enables a simple rule to identify malware and determine proliferation. Once you finish one rule you can go back to Step One and start the next rule, based on the indicators of the malware profile.
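
The test and refine loop above can be sketched as a tiny harness. This is an illustration under stated assumptions, not a real scanner integration: a rule is modeled as a plain Python function, and the testbed is modeled as a handful of host snapshots. The host names, the process name, and the registry key are all hypothetical.

```python
def evaluate_rule(rule, infected_hosts, clean_hosts):
    """Run one detection rule against testbed snapshots and report the misses."""
    missed = [h["name"] for h in infected_hosts if not rule(h)]
    false_alarms = [h["name"] for h in clean_hosts if rule(h)]
    return {
        "passed": not missed and not false_alarms,
        "missed": missed,
        "false_alarms": false_alarms,
    }


# Hypothetical testbed snapshots: one clean machine, two deliberately infected ones.
clean = [{"name": "clean-1", "processes": {"explorer.exe"}, "registry": set()}]
infected = [
    {"name": "victim-1", "processes": {"explorer.exe", "example-bad.exe"}, "registry": set()},
    {"name": "victim-2", "processes": {"explorer.exe"}, "registry": {r"HKCU\Software\ExampleBad"}},
]


def rule_v1(host):
    # First attempt: look only for the (hypothetical) process name.
    return "example-bad.exe" in host["processes"]


def rule_v2(host):
    # Refined rule: also accept the (hypothetical) registry marker.
    return rule_v1(host) or r"HKCU\Software\ExampleBad" in host["registry"]


first = evaluate_rule(rule_v1, infected, clean)
second = evaluate_rule(rule_v2, infected, clean)
print(first)
print(second)
```

Here the first rule misses a victim whose malware process is not currently running, so it goes back for refinement; the second version passes and is ready to document.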

Finding ZeuS

To see how this process works let’s take a look at the ZeuS malware. You can find the profile of this attack on the OpenIOC site, and if you can parse the XML you will see a few indicators that identify this particular attack. Without going through all the indicators, you can quickly see a number of process indicators which describe the processes ZeuS tends to use. You can scan all your vulnerable devices for these processes.
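
If parsing the XML by hand sounds tedious, a short script can pull the indicator items out for you. The sketch below assumes a simplified document that mimics the general shape of an OpenIOC file (IndicatorItem elements that each hold a Context and a Content element). The sample values are hypothetical, and you should check element and attribute names against the actual schema before relying on this.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical sample in the general shape of an OpenIOC document.
SAMPLE_IOC = """<ioc xmlns="http://schemas.mandiant.com/2010/ioc">
  <short_description>Example profile (hypothetical)</short_description>
  <definition>
    <Indicator operator="OR">
      <IndicatorItem condition="is">
        <Context document="ProcessItem" search="ProcessItem/name" type="mir"/>
        <Content type="string">example-bad.exe</Content>
      </IndicatorItem>
      <IndicatorItem condition="contains">
        <Context document="Network" search="Network/DNS" type="mir"/>
        <Content type="string">bad-c2.example</Content>
      </IndicatorItem>
    </Indicator>
  </definition>
</ioc>"""


def local_name(tag):
    """Strip the '{namespace}' prefix ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def extract_indicator_items(xml_text):
    """Return (search_term, condition, value) for each IndicatorItem found."""
    root = ET.fromstring(xml_text)
    items = []
    for element in root.iter():
        if local_name(element.tag) != "IndicatorItem":
            continue
        context = content = None
        for child in element:
            if local_name(child.tag) == "Context":
                context = child
            elif local_name(child.tag) == "Content":
                content = child
        if context is not None and content is not None:
            value = (content.text or "").strip()
            items.append((context.get("search"), element.get("condition"), value))
    return items


items = extract_indicator_items(SAMPLE_IOC)
process_names = [value for search, _, value in items if search == "ProcessItem/name"]
print(process_names)
```

The resulting list of process names is the kind of input you would feed into a scan of your vulnerable devices.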

If you want to look for the specific network sites typically associated with ZeuS, you could consider the approach documented on Sourcefire’s VRTLabs site. They reference the cool ZeuS Tracker, which lists the C&C servers and fake URLs it uses.
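
As a rough sketch of the network side, the snippet below checks DNS query records against a list of known C&C domains and groups the hits by internal host. It assumes you have already exported a domain list from a tracker-style source and extracted (source host, queried name) pairs from your DNS or proxy logs. The function name, the addresses, and the domains are hypothetical.

```python
from collections import defaultdict


def hosts_contacting_blocklist(records, blocklist):
    """Map each internal host to the blocklisted domains it looked up.

    A queried name matches if it equals a blocklisted domain or is a
    subdomain of one (for example update.bad-c2.example).
    """
    normalized = {domain.lower().rstrip(".") for domain in blocklist}
    hits = defaultdict(set)
    for source_host, queried_name in records:
        labels = queried_name.lower().rstrip(".").split(".")
        # Walk from the full name up through each parent domain.
        for start in range(len(labels)):
            candidate = ".".join(labels[start:])
            if candidate in normalized:
                hits[source_host].add(candidate)
                break
    return {host: sorted(domains) for host, domains in hits.items()}


# Hypothetical blocklist and DNS records, for illustration only.
blocklist = {"bad-c2.example", "other-c2.example"}
records = [
    ("10.0.0.5", "update.bad-c2.example."),
    ("10.0.0.6", "www.example.org"),
    ("10.0.0.5", "other-c2.example"),
    ("10.0.0.7", "notbad-c2.example"),
]

suspects = hosts_contacting_blocklist(records, blocklist)
print(suspects)
```

Matching on whole domain labels rather than substrings keeps a name like notbad-c2.example from triggering a false alarm.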

We tend to have a decent amount of information available on how to find the widespread attacks within our environments. That also means you’ll need to figure out the best approach for tracking proliferation in your environment. Depending on the attack you might want to run each test in a different order, or skip certain tests entirely. Finding malware tends to be a rather particular endeavor, and that makes it, uh, fun. If you’re into that kind of thing.

In the next post we will take our rules and run some scans, as the typical first step for trying to find infected devices.

–Mike Rothman

Sunday, January 08, 2012

Malware Analysis Quant: The Malware Profile

By Mike Rothman

As we resume our dive through the steps in our Malware Analysis process map, we wrap up the Malware Analysis subprocess by leveraging all the analysis we have done in the previous few steps (Static Analysis and Dynamic Analysis), and building a profile of what we know about the malware attack.

We need to keep our goals in mind when building the profile.

  1. Assess Malware Proliferation: There are times when only one device gets infected during a malware attack. But other times it becomes an outbreak. So your first job after developing the profile is to figure out whether the malware has spread. That is our third subprocess, which we will dig into next week.
  2. Prevent reinfection: The other purpose of the malware profile is to make sure you don’t get reinfected. We will dig into how later, as well.

The key for this step is specificity. The more work you do now to describe the malware, the easier it will be later to build rules which achieve those goals. So part of the static and dynamic analysis is about digging deep, figuring out exactly what the malware does, and identifying markers which will help find it. We need to describe those markers now, in language the folks who build the rules to find the malware later can understand. So the process of packaging your malware profile looks like this:

  1. Aggregate findings: This first step is to take all the information from your analysis (including the device, network, and proliferation analyses) and put it all in one place. Depending on the size of your malware analysis team you may pull stuff from a bunch of different places. Here is a short (and not comprehensive) list of types of information you might have – we described them all in previous posts.
    • File attributes
    • Registry settings
    • Processes/services
    • New executables
    • Domains/protocols
    • Command and control obfuscation
    • Persistence/VM awareness
  2. Package Profile: In this step you document what you have found, in a way the folks looking for malware can leverage. If there is separation between the malware analyst and the incident responders who look for infections, then you need to work out the preferred packaging for this information. According to our research, the closer the analysts can get to packaged rules which responders can just plug into their scanners and forensics tools, the better the relationship will be between them.
  3. Distribute Profile: Again, depending on the size of your security team, there may be a number of folks who need access to the profile. So those interactions must be defined at the start of the process. Some organizations also share their analysis findings with key strategic vendors, industry information sharing groups, or mailing lists. So your profile might also be used externally, which may impact the type and depth of documentation you produce.
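
To make the aggregate and package steps a bit more concrete, here is a minimal sketch of a profile as a structured record that can be handed to responders as JSON. The field names loosely follow the list of information types above, but the `MalwareProfile` class, the `package_profile` function, and every sample value are hypothetical. The format you actually use should be whatever your responders' tools can consume.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class MalwareProfile:
    name: str
    file_attributes: list = field(default_factory=list)
    registry_settings: list = field(default_factory=list)
    processes: list = field(default_factory=list)
    new_executables: list = field(default_factory=list)
    domains: list = field(default_factory=list)
    notes: str = ""


def package_profile(profile):
    """Serialize the profile to JSON so it can be distributed and versioned."""
    return json.dumps(asdict(profile), indent=2, sort_keys=True)


# Hypothetical findings, for illustration only.
profile = MalwareProfile(
    name="example-sample",
    registry_settings=[r"HKCU\Software\ExampleBad"],
    processes=["example-bad.exe"],
    domains=["bad-c2.example"],
    notes="Stays dormant when it detects a virtual machine.",
)

packaged = package_profile(profile)
print(packaged)
```

Keeping the profile in a structured form also makes it easier to spot what changed when you revisit it later.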

Typically we don’t highlight vendor activity in process maps, but we need to mention the work Mandiant has done with the OpenIOC initiative. They have produced a set of XML schemas which describe the types of information necessary to identify and find malware in your organization, and provided them as open source. Of course this is self-serving: Mandiant’s incident response tools leverage the formats, so the more broadly OpenIOC is adopted, the better for them. But we haven’t found another comprehensive set of descriptors for malware indicators.

Revisiting the Malware Profile

The only thing we can count on from malware writers is that they will not stand still. They will continue to adapt and evolve their malware to avoid detection, to increase its infection rate, and perhaps to add control features to better leverage infected systems. So we need to revisit malware profiles periodically, looking for changes in their indicators. So how often will you revisit the profile? Basically every time you find the malware in your environment, as there might be new or changed indicators that require an update to the profile.

We also recommend that, if you can identify the name of the malware once anti-malware vendors have profiled and named it, you watch malware lists and other information sources for new information about it. For each of the high-profile attacks (ZeuS and Stuxnet come to mind), you will notice that the research community continues to find different variants, which means you will need to update your profile.

So as part of the metrics model we will build later in this Quant project, we will factor in this continuing revision process for morphing malware.

With a very detailed (and preferably technical) profile, the incident response team can go about trying to figure out whether and how badly the malware has spread throughout your organization. That’s the focus of Malware Analysis Quant’s third process, and we will dig into those steps next.

–Mike Rothman