Incident Response Fundamentals: Mop up, Analyze, and QA
You did well. You followed your incident response plan and the fire is out. Too bad that was the easy part: now you get to start the long journey from the end of the crisis all the way back to normal. Returning to our before, during, and after segmentation, this is the 'after' part. In the vast majority of incidents the real work begins once the immediate incident is over, when you're faced with the task of returning operations to status quo ante, determining the root cause of the problem, and putting controls in place to ensure it doesn't happen again. The after part of the process consists of three phases (Mop up, Analyze, and QA), two of which overlap and can be performed concurrently. And remember: we are describing a full incident response process and tend to use major situations in our examples, but everything we are talking about scales down for smaller incidents too, which might be managed by a single person in a matter of minutes or hours. The process should scale both up and down, depending on the severity and complexity of an incident, but even what seems to be the simplest incident deserves a structured process. That way you won't miss anything.

Mop up

We steal the term "mop up" from the world of firefighting, where cleaning up after yourself may literally involve a mop. Hopefully we won't need to break out the mops in an IT incident (though stranger things have happened), but the concept is the same: clean up after yourself, and do what's required to restore normal operations. This usually occurs concurrently with your full investigation and root cause analysis. There are two aspects to mopping up, each performed by different teams:

Cleaning up incident response changes: During a response we may take actions that disrupt normal business operations, such as shutting down certain kinds of traffic, filtering email attachments, and locking down storage access. During the mop up we carefully return to our pre-incident state, but only as we determine it's safe to do so, and some controls implemented during the response may remain in place. For example, during an incident you might have blocked all traffic on a certain port to disable the command and control network of a malware infection. During the mop up you might reopen the port, or open it and filter certain egress destinations. Mop up is complete when every change has either been backed out to its pre-incident state or accepted as a permanent part of your standards and configurations (one way to keep track of these changes is sketched below). Some changes, such as updated patch levels, will clearly stay, while others, including temporary workarounds, need to be backed out as a permanent solution goes into place.

Restoring operations: While the incident responders focus on the investigation and on cleaning out the temporary controls they put in place during the incident, IT operations handles updating software and restoring normal operations. This could mean updating patch levels on all systems, checking for and cleaning malware, restoring systems from backup and bringing them back up to date, and so on. The incident response team defines the plan to safely return to operations and cleans up the remnants of its own actions, while IT operations teams face the tougher task of getting all the systems and networks where they need to be on a 'permanent' basis (not that anything in IT is permanent, but you know what we mean).
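To make that first aspect a little more concrete, here is a minimal sketch of one way a response team might record each temporary control alongside its rollback, so mop up becomes a deliberate review of what to revert and what to keep rather than an exercise in memory. This is purely illustrative, not part of the process itself: the iptables rule is an example, and the `gateway-ctl` command is a hypothetical stand-in for whatever email gateway tooling you actually use.

```python
"""Track temporary incident-response controls and back them out during mop up.

A minimal sketch: each control is recorded with the command that applied it
and the command that reverses it. During mop up, controls marked 'accepted'
stay in place (they become part of the standard configuration); everything
else gets rolled back. The commands below are illustrative only.
"""
import subprocess
from dataclasses import dataclass
from typing import List


@dataclass
class TemporaryControl:
    description: str
    apply_cmd: List[str]      # command used during the response
    rollback_cmd: List[str]   # command that restores the pre-incident state
    accepted: bool = False    # True = keep permanently, skip rollback


controls = [
    TemporaryControl(
        description="Block outbound TCP/6667 to cut off suspected C&C traffic",
        apply_cmd=["iptables", "-A", "OUTPUT", "-p", "tcp", "--dport", "6667", "-j", "DROP"],
        rollback_cmd=["iptables", "-D", "OUTPUT", "-p", "tcp", "--dport", "6667", "-j", "DROP"],
    ),
    TemporaryControl(
        description="Strip executable email attachments at the gateway",
        apply_cmd=["gateway-ctl", "attachments", "--block", "exe"],    # hypothetical CLI
        rollback_cmd=["gateway-ctl", "attachments", "--allow", "exe"],  # hypothetical CLI
        accepted=True,  # decided to keep this control after the incident
    ),
]


def mop_up(dry_run: bool = True) -> None:
    """Roll back every temporary control that was not accepted as permanent."""
    for control in controls:
        if control.accepted:
            print(f"KEEP   : {control.description}")
            continue
        print(f"REVERT : {control.description}")
        if not dry_run:
            subprocess.run(control.rollback_cmd, check=True)


if __name__ == "__main__":
    mop_up(dry_run=True)  # review the plan first; rerun with dry_run=False to execute
```

Whatever form the record takes (a script like this, a ticket queue, or a shared spreadsheet), the point is that every response-time change leaves mop up with an explicit disposition: reverted, or accepted into the standard configuration.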
Investigation and Analysis

The initial incident is under control, and operations are being restored to normal as a result of the mop up. Now is when you start the in-depth investigation of the incident to determine its root cause and figure out what you need to do to prevent a similar incident in the future. Since you've handled the immediate problem, you should already have a good idea of what happened, but that's a far cry from a full investigation. To use a medical analogy, think of it as switching from treating the symptoms to treating the source of the infection. To go back to our malware example, you can often manage the immediate incident without even knowing how the initial infection took place. Or in the case of a major malicious data leak, you switch from containing the leak and taking immediate action against the employee to building the forensic evidence required for legal action, and ensuring the leak remains an isolated incident rather than a systematic loss of data.

In the investigation we piece together all the information we collected as part of the incident response with as much additional data as we can find, to produce an accurate timeline of what happened and why. This is a key reason we push heavy monitoring so strongly as a core process throughout your organization: modern incidents and attacks can easily slip through the gaps of 'point' tools and basic logs. Extensive monitoring of all aspects of your environment (both the infrastructure and up the stack), often using a variety of technologies, provides more complete information for investigation and analysis. We have already talked about various data sources throughout this series, so instead of rehashing them, here are a few key areas that tend to provide more useful nuggets of information:

Beyond events: Although IDS/IPS, SIEM, and firewall logs are great for managing an ongoing incident, they may provide an incomplete picture during your deeper investigation. They tend to record information only when they detect a problem, which doesn't help much if you didn't have the right signature or trigger in place. That's where a network forensics (full network packet capture) solution comes in: by recording everything going on within the network, these devices allow you to look for the trails you would otherwise miss and piece together exactly what happened using real data (a small sketch of this kind of after-the-fact traffic analysis appears at the end of this section).

System forensics: Some of the most valuable tools for analyzing servers and endpoints are system forensics tools. OS and application logs are all too easy to fudge during an attack. These tools are also
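As a rough illustration of the 'Beyond events' point above, the sketch below pulls every TCP packet touching a suspect host out of a full packet capture and prints a simple timeline. It assumes the third-party scapy library is installed, and the capture file name and suspect address are made up for the example. A dedicated network forensics platform does this at far greater scale, but the principle is the same: the raw traffic is available for reconstruction whether or not anything triggered an alert at the time.

```python
"""Build a simple connection timeline to a suspect host from a full packet capture.

A minimal sketch assuming the third-party scapy library (`pip install scapy`);
the capture file and suspect address below are hypothetical examples.
"""
from datetime import datetime, timezone

from scapy.all import IP, TCP, rdpcap  # pcap reader and protocol layer classes

SUSPECT_HOST = "203.0.113.50"            # address identified during the response (example)
CAPTURE_FILE = "perimeter-capture.pcap"  # full packet capture from the forensics tool (example)


def connection_timeline(pcap_path: str, suspect: str):
    """Yield (timestamp, src, dst, dport) for every TCP packet touching the suspect host."""
    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt:
            ip = pkt[IP]
            if suspect in (ip.src, ip.dst):
                ts = datetime.fromtimestamp(float(pkt.time), tz=timezone.utc)
                yield ts, ip.src, ip.dst, pkt[TCP].dport


if __name__ == "__main__":
    for ts, src, dst, dport in connection_timeline(CAPTURE_FILE, SUSPECT_HOST):
        print(f"{ts.isoformat()}  {src} -> {dst}:{dport}")
```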