Network Security Fundamentals: Monitor Everything
As we continue on our journey through the fundamentals of network security, the idea of network monitoring must be integral to any discussion. Why? Because we don’t know where the next attack is coming from, so we need to get better at compressing the window between successful attack and detection, which then drives remediation activities. It’s a concept I coined back at Security Incite in 2006 called React Faster, which Rich subsequently improved upon by advocating Reacting Faster and Better.

React Faster (and better)

I’ve written extensively on the concept of React Faster, so here’s a quick description I penned back in 2008 as part of an analysis of Security Management Platforms, which hits the nail on the head.

New attacks are happening at a fast and furious pace. It is a fool’s errand to spend time trying to anticipate where the issues are. REACT FASTER first acknowledges that all attacks cannot be stopped. Thus, focus remains on understanding typical traffic and application usage trends and monitoring for anomalous behavior, which could indicate an attack. By focusing on detecting attacks earlier and minimizing damage, security professionals both streamline their activities and improve their effectiveness.

Rich’s corollary made the point that it’s not enough to just react faster; you also need a plan for how to react:

Don’t just react – have a response plan with specific steps you don’t jump over until they’re complete. Take the most critical thing first, fix it, move to the next, and so on until you’re done. Evaluate, prioritize, contain, fix, and clean.

So monitoring done well compresses the time between compromise and detection, and also accelerates root cause analysis to determine what the response should involve.

Network Security Data Sources

It’s hard to argue with the concept of reacting faster and collecting data to facilitate that activity. But with an infinite amount of data to collect, where do we start? What do we collect? How much of it?
For how long? All of these are reasonable questions that need answers as you construct your network monitoring strategy. The major data sources from your network security infrastructure include:

Firewall: Every monitoring strategy needs to correspond to the most prevalent attack vectors, and that means from the outside in. Yes, the insider threat is real, but script kiddies are alive and well and that means we need to start by looking at our Internet-facing devices. First we pull log and activity information from our firewalls and UTM devices on the perimeter. We look for strange patterns, which usually indicate something is wrong. We want to keep this data long enough to have sufficient history in the event of a well-executed low and slow attack, which means months rather than days.

IPS: The next layer in tends to be IPS, looking for patterns of traffic that indicate a known attack. We want the alerts first and foremost, but we also want to collect the raw IPS logs. Just because the IPS doesn’t think specific traffic is an attack doesn’t mean it isn’t. It could be a dreaded 0-day, so we want to pull all the data we can off this box as well, since the forensic analysis can pinpoint when attacks first surfaced and also provide guidance as to the extent of the compromise.

Vulnerability scans: Are those devices vulnerable to a specific attack? Vulnerability scan data is one of the key inputs to SIEM/correlation products. The best way to reduce false positives is not to fire an alert if the target is not vulnerable. Thus we keep scan data on hand, and use it both for real-time analysis and for forensics. If an attack happens during a window of vulnerability (like while you debate the merits of a certain patch with the ops guys), you need to know that.
Network Flow Data: I’ve always been a big fan of network flow analysis and continue to be mystified that the market never took off, given the usefulness of understanding how traffic flows within and out of a network. All is not lost, since a number of security management products use flow data in their analyses and a few lower end management products use flow data as well. Each flow record is small, so there is no reason not to keep a lot of it. Again, we use this data both to pinpoint potential badness and to replay attacks to understand how they spread within the organization.

Device Change Logs: If your network devices get compromised, it’s pretty much game over. Traffic can be redirected, logging suppressed, and lots of other badness can result. So keep track of device configuration and, more importantly, when those changes happen – which helps isolate the root causes of breaches. Yes, if the logs are turned off, you lose visibility, which can itself indicate an issue. Through the wonders of SNMP, you should collect data from all your routers, switches, and other pipes.

Content security: Now we can climb the stack a bit to pull information off the content security gateways, since a lot of attacks still show up via phishing emails and malware-laden web links. Again, we aren’t necessarily pulling this data in to stop an attack (hopefully the anti-spam box will figure out you aren’t interested in the little blue pill), but rather to gather more information about the attack vectors and how an attack proliferates through your environment. Reacting faster is about learning all we can about what is compromised and responding in the most efficient and effective manner.

Keeping things focused and pragmatic, you’d like to gather all this data all the time across all the networks. Of course, Uncle Reality comes to visit, and understandably, collection of everything everywhere isn’t an option. So how do you prioritize? The objective is to use the data you already have.
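
A quick aside to make the vulnerability scan point above concrete. Here is a minimal Python sketch of the "don't fire an alert if the target is not vulnerable" idea. Everything in it is an assumption for illustration: the Alert record, the signature IDs, and the triage function are hypothetical names, not the format or API of any real IPS, scanner, or SIEM. Alerts against hosts the scan says are not vulnerable are not thrown away, just routed to a log-only pile so they remain available for forensics.

```python
from dataclasses import dataclass
from typing import Dict, List, Set, Tuple


@dataclass(frozen=True)
class Alert:
    """Hypothetical, simplified IPS alert (not a real product's format)."""
    target_ip: str
    signature_id: str  # made-up ID for the attack pattern the IPS matched


def triage(
    alerts: List[Alert],
    scan_data: Dict[str, Set[str]],
) -> Tuple[List[Alert], List[Alert]]:
    """Split alerts into (fire_now, log_only) using vulnerability scan data.

    scan_data maps a host IP to the set of signature IDs that the latest
    scan found the host vulnerable to (an assumed, illustrative layout).
    """
    fire_now: List[Alert] = []
    log_only: List[Alert] = []
    for alert in alerts:
        vulnerable_to = scan_data.get(alert.target_ip)
        if vulnerable_to is None:
            # Host was never scanned: we can't rule anything out, so alert.
            fire_now.append(alert)
        elif alert.signature_id in vulnerable_to:
            # Attack matches a known weakness on this host: alert.
            fire_now.append(alert)
        else:
            # Target isn't vulnerable: keep the record, skip the page.
            log_only.append(alert)
    return fire_now, log_only


if __name__ == "__main__":
    alerts = [
        Alert("10.0.0.5", "SIG-1"),
        Alert("10.0.0.6", "SIG-1"),
        Alert("10.0.0.7", "SIG-2"),
    ]
    scan_data = {"10.0.0.5": {"SIG-1"}, "10.0.0.6": set()}
    fire_now, log_only = triage(alerts, scan_data)
    print("alert:", [a.target_ip for a in fire_now])
    print("log only:", [a.target_ip for a in log_only])
```

Treating an unscanned host as potentially vulnerable is a deliberate choice in this sketch: stale or missing scan data should make you noisier, not quieter.
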
Most organizations have all of the devices listed above. So all the data sources exist, and should be prioritized based on importance to the