So it’s probably apparent that Mike and I have slightly different opinions on some security topics, such as Monitoring Everything (or not). But sometimes we have exactly the same viewpoint, for slightly different reasons. Correlation is one of these later examples.
I don’t like correlation. Actually, I am bitter that I have to do correlation at all. But we have to do it because log files suck. Please, I don’t want log management and SIEM vendors to get all huffy with that statement: it’s not your fault. All data sources for forensic information pretty much lack sufficient information for forensics, and deficient in options for tuning and filtering capabilities to make them better. Application developers did not have security in mind when they created the log data, and I have no idea what the inventors of Event Log had in mind when they spawned that useless stream of information, but it’s just not enough.
I know that this series is on network fundamentals, but I want to raise an example outside of the network space to clearly illustrate the issue. With database auditing, the database audit trail is the most accurate reflection of the database transactional history and database state. It records exactly what operations are performed. It is a useful centerpiece of auditing, but it is missing critical system events not recorded in the audit trail, and it does not have the raw SQL queries sent to the database. The audit trail is useful to an extent, but to enforce most policies for compliance or to perform a meaningful forensic investigation you must have additional sources of information (There are a couple vendors out there who, at this very moment, are tempted to comment on how their platform solves this issue with their novel approach. Please, resist the temptation). Relational database platforms do a better job of creating logs than most networks, platforms, or devices.
Log file forensics are a little like playing a giant game of 20 questions, and each record is the answer to a single question. You find something interesting in the firewall log, but you have to look elsewhere to get a better idea of what is going on. You look at an access control log file, and now it really looks like something funny is going on, but now you need to check the network activity files to try to estimate intent. But wait, the events don’t match up one for one, and activity against the applications does not map one-to-one with the log file, and the time stamps are skewed. Now what? Content matching, attribute matching, signatures, or heuristics?
Which data sources you select depends on the threats you are trying to detect and, possibly, react to. The success of correlation is largely dependent on how well you size up threats and figure out which combination of log events are needed. And which data sources you choose. Oh, and then how well develop the appropriate detection signatures. And then how well you maintain those policies as threats morph over time. All these steps take serious time and consideration.
So do you need correlation? Probably. Until you get something better. Regardless of the security tool or platform you use for threat detection, the threat assessment is critical to making it useful. Otherwise you are building a giant Cartesian monument to the gods of useless data.