By Mike Rothman
Every time I think I’m making progress on controlling my cynical gene, I see something that sets me back almost to square one. I’ve been in this game for a long time, and although I think subconsciously I know some things are going on, it’s still a bit shocking to see them in print.
What set me off this time is Richard Bejtlich’s brief thoughts on the WEIS 2010 (Workshop on the Economics of Information Security) conference. His first thoughts are around a presentation on cyber insurance. The presenter admitted that the industry has no expected loss data and no financial impact data. Really? They actually admitted that. But it gets better.
Your next question must be, “So how do they price the policies?” It certainly was mine. Yes! They have an answer for that: price the policies high and see what happens. WHAT? Does Dr. Evil head their policy pricing committee? I can’t say I’m a big fan of insurance companies, and this is why. They are basically making it up – pulling the premiums out of their butts. And they would never err in favor of the folks buying the policies, so you see high prices.
Clearly this is a chicken & egg situation. They don’t have data because no one shares it. So they write some policies to start collecting data, but they price the policies probably too high for most companies to actually buy. So they still have no data. And those looking for insurance don’t really have any options.
I guess I need to ask why folks are looking for cyber-insurance anyway? I can see the idea of trying to get someone else to pay for disclosure – those are hard costs. Maybe you can throw clean-up into that, but how could you determine what is clean-up required from a specific attack, and what is just crappy security already in place? It’s not like you are insuring Sam Bradford’s shoulder here, so you aren’t going to get a policy to reimburse for brand damage.
Back when I worked for TruSecure, the company had an “insurance” policy guaranteeing something in the event of a breach on a client certified using the company’s Risk Management Methodology. At some point the policy expired, and when trying to renew it, we ran across the same crap. We didn’t know how to model loss data – there was none because the process was perfect. LOL! And they didn’t either. So the quote came back off the charts. Then we had to discontinue the program because we couldn’t underwrite the risk.
Seems almost 7 years later, we’re still in the same place. Actually we’re in a worse place because the folks writing these policies are now aggressively working the system to prevent payouts (see Colorado Casualty/University of Utah) when a breach occurs.
I guess from my perspective cyber-insurance is a waste of time. But I could be missing something, so I’ll open it up to you folks – you’re collectively a lot smarter than me. Do you buy cyber-insurance? For what? Have you been able to collect on any claims? Is the policy just to make your board happy? To cover your ass and shuffle blame to the insurance company? Do tell. Please!
Photo credit: “Dr Evil 700 Billion” originally uploaded by Radio_jct
Posted at Monday 19th July 2010 5:04 pm
(6) Comments •
By Mike Rothman
Everyone in security knows data isn’t the problem. We have all sorts of data – tons of it. The last two steps in the Monitor process (Collect and Store) were focused on gathering data and putting it where we can get to it. What’s in short supply is information. Billions of event/log records don’t mean much if you can’t pinpoint what’s really happening at any given time and send actionable alerts.
So analyzing the data is the next subprocess. Every alert that fires requires a good deal of work – putting all that disparate data into a common format, correlating it, reducing it, finding appropriate thresholds, and then maybe, just maybe, you’ll be able to spot a real attack before it’s too late. Let’s decompose each of these steps to understand whether and how to do this in your environment.
In the high level monitoring process map we described the Analyze step as follows:
The collected data is then analyzed to identify potential incidents based on alerting policies defined in Phase 1. This may involve numerous techniques, including simple rule matching (availability, usage, attack traffic policy violations, time-based rules, etc.) and/or multi-factor correlation based on multiple device types (SIEM).
Wow, that was a mouthful. To break this into digestible steps that you might actually be able to perform, here are 5 subprocesses:
- Normalize Events/Data: In the Collect step we didn’t talk about putting all that data into a common format. Most analyses require some level of event data normalization. As vendor tools become more sophisticated and can do more analysis on unstructured data, the need for normalization is reduced, but we always expect some level of normalization to be required – if only so we can compare apples to apples, given data types that start out as apples and oranges.
- Correlate: Once we have the data in a common format we look for patterns which may indicate some kind of attack. Of course we need to know what we are looking for to define the rules that we hope will identify attacks. We spoke extensively about setting up these policies in Define Policies. A considerable amount of correlation can be automated but not all of it, so human analysts aren’t going away.
- Reduce Events: Our systems are so interrelated now that any attack touches multiple devices, resulting in many similar events being received by the central event repository. This gets unwieldy quickly, so a key function of the analysis is to eliminate duplicate events. Basically you need to increase the signal-to-noise ratio, and filter out irrelevant events. Note that we don’t mean delete – merely move out of the main analysis engine to keep things streamlined. For forensics we want to retain the full log record.
- Tune Thresholds: If we see 2 failed logins, that might be an attack – or not. If we spot 100 in 10 minutes, something funky is going on. Each rule needs thresholds, below which there is no alert. Rules tend to look like “If [something] happens X times within Y minutes, fire an alert.” But defining X and Y is hard work. You start with a pretty loose threshold that generates many alerts, then quickly tune and tighten those thresholds to keep the number of alerts manageable. When building compound policies, like “if [this] happens X times within Y minutes AND [that] happens Z times within W minutes”, it’s even more fun. Keep in mind that thresholds are more art than science, and require plenty of testing and observation to determine the correct mix. You may also be setting thresholds on baselines established via data capture (as explained in Define Policies), and these need good thresholds just as much.
- Trigger Alerts: Finally, after you’ve normalized, correlated, reduced, and blown past the threshold, you need to alert someone to something. In this step you send the alert based on the policies defined previously. The alert needs to go to the right person or team, with sufficient information to allow validation and support response. Depending on your internal workflow, the alert might be sent using the monitoring tool, a help desk system, paper, or smoke signals. Okay, smoke signals are out of style nowadays, but you get the point.
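To make the threshold mechanics concrete, here’s a minimal sketch of the “X times within Y minutes” rule described above. The function name `record_event` and the specific numbers are mine, not any vendor’s – it just shows a sliding time window over normalized events:

```python
import time
from collections import defaultdict, deque

# Hypothetical sketch: fire an alert when an event type occurs more
# than THRESHOLD times within WINDOW_SECONDS (the "X times within
# Y minutes" rule).
THRESHOLD = 100        # X
WINDOW_SECONDS = 600   # Y (10 minutes)

event_times = defaultdict(deque)  # event key -> recent timestamps

def record_event(key, timestamp=None):
    """Record one normalized event; return True when the threshold is crossed."""
    now = timestamp if timestamp is not None else time.time()
    window = event_times[key]
    window.append(now)
    # Drop timestamps that have aged out of the window
    while window and now - window[0] > WINDOW_SECONDS:
        window.popleft()
    return len(window) > THRESHOLD

# 101 failed logins inside the window trips the rule; the first few don't
hits = [record_event("failed_login", t) for t in range(101)]
```

Tightening the rule is then just a matter of adjusting THRESHOLD and WINDOW_SECONDS per policy, which is exactly the tuning loop described above.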
The Most Powerful Correlation Engine
As you can imagine, there is a lot of number crunching involved in this kind of analysis. That usually requires a lot of computing power cranking away on zillions of records in huge data stores – at least it does if you deployed a tool to facilitate monitoring. I’ve met more than a few security professionals who use the world’s most powerful correlation engine much more effectively than any programmed tool. Yes, I’m talking about the human brain. These folks are rare, but there are people who can monitor event streams flying past them and ‘instinctively’ know when something is not normal.
Is this something you can count on? Not entirely, but we don’t think you can solve this problem purely in software, or without using software. As usual, somewhere in the middle is best for most organizations. We have seen many situations where risk priorities are set via SIEM, and an analyst can then very quickly determine whether the issue requires further investigation/escalation. We’ll talk more about that when we discuss validating the alert.
The attack space is very dynamic, which means the correlation rules and thresholds for alerts that you use today will need to be adapted tomorrow. This doesn’t happen by itself, so your process needs to systematically factor in feedback from analysts about which rules are working and which aren’t. The rules and alerting thresholds get updated accordingly, and hopefully over time the system increases its effectiveness and value.
Device Type Variances
As discussed under Define Policies, each device type generates different data and so requires customized rules, reduction, and thresholds. But the biggest challenge in monitoring the different device types is figuring out the dependencies of rules that incorporate data from more than one device type. Many vendors ship their tools with a set of default policies that map to specific attack vectors and that’s a good start, but tuning those policies for your environment takes time. So when modeling these process steps (to understand the cost of delivering this service internally), we need to factor that demanding tuning process into the mix.
Large vs. Small Company Considerations
Although it’s probably not wise to make assumptions about large company behavior, on the surface a larger enterprise should be able to provide far more feedback and invest more in sophisticated instrumentation to automate a lot of the heavy-duty analysis. This is important because larger environments generate much more data, which makes manual/human analysis infeasible.
Small companies are generally more resource constrained – especially for the tuning process. If the thresholds are too loose, many of the alerts require validation, which is time consuming. With thresholds too tight, things can slip through the cracks. And getting adequate feedback to even go through the tuning process is a challenge when the entire team – which might just be one person – has an overflowing task list.
Monitoring systems aim to improve security and increase efficiency, but recognize that a significant time investment is required to get the system to a point where it generates value. And that investment is ongoing, because the system must be constantly tuned to ensure relevance over time.
Now that we have our alerts generated, we need to figure out if there is anything there. That’s Validation and Escalation, our next set of subprocesses. Stay tuned.
Posted at Monday 19th July 2010 3:13 pm
(1) Comments •
We’ve been writing a lot on tokenization as we build the content for our next white paper, and in Adrian’s response to the PCI Council’s guidance on tokenization. I want to address something that’s really been ticking me off…
In our latest post in the series we described the details of token generation. One of the options, which we had to include since it’s built into many of the products, is encryption of the original value – then using the encrypted value as the token.
Here’s the thing: if you encrypt the value, it’s encryption, not tokenization! Encryption obfuscates the original data; tokenization removes it entirely.
Conceptually the major advantages of tokenization are:
- The token cannot be reversed back to the original value.
- The token maintains the same structure and data type as the original value.
While format preserving encryption can retain the structure and data type, it’s still reversible back to the original if you have the key and algorithm. Yes, you can add per-organization salt, but this is still encryption. I can see some cases where using a hash might make sense, but only if it’s a format preserving hash.
I worry that marketing is deliberately muddling the terms.
Opinions? Otherwise, I declare here and now that if you are using an encrypted value and calling it a ‘token’, that is not tokenization.
Posted at Monday 19th July 2010 6:35 am
(21) Comments •
By Adrian Lane
In this post we’ll dig into the technical details of tokens: what they are, how they are created, and some of the options for security, formatting, and performance. For those of you who read our stuff and tend to skim the more technical posts, I recommend you stop and pay a bit more attention to this one. Token generation and structure affect the security of the data, the ability to use the tokens as surrogates in other applications, and the overall performance of the system. In order to differentiate the various solutions, it’s important to understand the basics of token creation.
Let’s recap the process quickly. Each time sensitive data is sent to the token server, three basic steps are performed. First, a token is created. Second, the token and the original data are stored together in the token database. Third, the token is returned to the calling application. The goal is not just to protect sensitive data, but to do so without losing functionality within applications, so we cannot simply substitute any random blob of data. The format of the token needs to match the format of the original data, so it can be used exactly as if it were the original (sensitive) data. For example, a Social Security token needs to have at least the same size (if not data type) as a Social Security number. Supporting applications and databases can accept the substituted value as long as it matches the constraints of the original value.
Let’s take a closer look at each of the steps.
There are three common methods for creating tokens:
- Random Number Generation: This method substitutes data with a random number or alphanumeric value, and is our recommended method. Completely random tokens offer the greatest security, as the content cannot be reverse engineered. Some vendors use sequence generators to create tokens, grabbing the next value in the series – this is not nearly as secure as a fully randomized number, but is very fast and secure enough for most (non-PCI) use cases. A major benefit of random numbers is that they are easy to adapt to any format constraints (discussed in greater detail below), and they can be generated in advance to improve performance.
- Encryption: This method generates a ‘token’ by encrypting the data. Sensitive information is padded with a random salt to prevent reverse engineering, and then encrypted with the token server’s private key. The advantage is that the ‘token’ is reasonably secure from reverse engineering, but the original value can be retrieved as needed. The downsides, however, are significant – performance is very poor, Format Preserving Encryption algorithms are required, and data can be exposed when keys are compromised or guessed. Further, the PCI Council has not officially accepted format preserving cryptographic algorithms, and is awaiting NIST certification. Regardless, many large and geographically dispersed organizations that require access to the original data favor the utility of encrypted ‘tokens’, even though this isn’t really tokenization.
- One-way Hash Function: Hashing functions create tokens by running the original value through a non-reversible mathematical operation. This offers reasonable performance, and tokens can be formatted to match any data type. Like encryption, hashes must be created with a cryptographic salt (some random bits of data) to thwart dictionary attacks. Unlike encryption, tokens created through hashing are not reversible. Hashed tokens are not as secure as fully random tokens, but security, performance, and formatting flexibility are all improved over encryption.
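To illustrate the random generation approach, here’s a minimal sketch – a hypothetical `random_token` function, not any vendor’s implementation – that produces a fully random token while matching the original value’s length and character classes:

```python
import secrets
import string

def random_token(original):
    """Generate a fully random token matching the original's length
    and character classes: digits stay digits, letters stay letters,
    and separators such as dashes are preserved in place."""
    out = []
    for ch in original:
        if ch.isdigit():
            out.append(secrets.choice(string.digits))
        elif ch.isalpha():
            out.append(secrets.choice(string.ascii_letters))
        else:
            out.append(ch)  # keep separators as-is
    return "".join(out)

# Same shape as the input, but no mathematical relationship to it
token = random_token("4111-1111-1111-1111")
```

Because the output is not derived from the input, there is nothing to reverse engineer – recovering the original requires the token server’s lookup table, which is the whole point.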
Beware that some open source and commercial token servers use token generation methods of dubious value. Some use reversible masking, others use unsalted encryption algorithms, and both can be easily compromised and defeated.
We mentioned the importance of token formats earlier, and token solutions need to be flexible enough to handle multiple formats for the sensitive data they accept – such as personally identifiable information, Social Security numbers, and credit card numbers. In some cases, additional format constraints must be honored. As an example, a token representing a Social Security Number in a customer service application may need to retain the real last digits. This enables customer service representatives to verify user identities, without access to the rest of the SSN.
When tokenizing credit cards, tokens are the same size as the original credit card number – most implementations even ensure that tokens pass the LUHN check. Because the token still resembles a card number, systems that use card numbers need not be altered to accommodate tokens. But unlike real credit card or Social Security numbers, tokens cannot be used as financial instruments, and have no value other than as a reference to the original transaction or real account. The relationship between a token and a card number is unique for any given payment system, so even if someone compromises the entire token database sufficiently to commit transactions in that system (a rare but real possibility), the numbers are worthless outside the single environment they were created for. And most important, real tokens cannot be decrypted or otherwise restored back into the original credit card number.
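The LUHN check mentioned above is a simple mod-10 checksum, so generating tokens that pass it is cheap. A straightforward implementation of the standard algorithm (nothing product-specific) looks like this:

```python
def luhn_valid(number):
    """Standard LUHN (mod 10) check: double every second digit from
    the right, subtract 9 from any doubled digit above 9, and verify
    the sum is divisible by 10. Non-digits are ignored."""
    digits = [int(d) for d in number if d.isdigit()]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

# "4111111111111111" is the classic Visa test number and passes
assert luhn_valid("4111111111111111")
assert not luhn_valid("4111111111111112")
```

A token generator can simply draw random digits and adjust the final digit until the checksum passes, so LUHN compliance costs essentially nothing.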
Each data type has different use cases, and tokenization vendors offer various options to accommodate them.
Tokens, along with the data they represent, are stored within a heavily secured database with extremely limited access. The data is typically encrypted (per PCI recommendations), ensuring sensitive data is not lost in the event of a database compromise or stolen media. The token (database) server is the only point of contact with any transaction system, payment system, or collection point, which reduces risk and compliance scope. Access to the database is highly restricted, with administrative personnel denied read access to the data, and even authorized access to the original data limited to carefully controlled circumstances.
As tokens are used to represent the same data for multiple events, possibly across multiple systems, most token servers can issue different tokens for the same user data. A credit card number, for example, may get a different unique token for each transaction. The token server not only generates a new token to represent the new transaction, but is responsible for storing many tokens per user ID. Many use cases require that the token database support multiple tokens for each piece of original data – a one-to-many relationship. This provides better privacy and isolation, if the application does not need to correlate transactions by card number. Applications that rely on the sensitive data (such as credit card numbers) to correlate accounts or other transactions will require modification to use data which is still available (such as a non-sensitive customer number).
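A toy sketch of that one-to-many mapping – a hypothetical `TokenVault` class, purely illustrative and nothing like a production token server – shows why only the token database can map a token back to the original value:

```python
import secrets

class TokenVault:
    """Minimal sketch of a one-to-many token store: each tokenization
    of the same PAN yields a new random token, and only the vault can
    map a token back to the original value."""

    def __init__(self):
        self.token_to_value = {}   # token -> original PAN
        self.value_to_tokens = {}  # PAN -> list of issued tokens

    def tokenize(self, pan):
        token = "".join(secrets.choice("0123456789") for _ in range(len(pan)))
        while token in self.token_to_value:  # avoid (unlikely) collisions
            token = "".join(secrets.choice("0123456789") for _ in range(len(pan)))
        self.token_to_value[token] = pan
        self.value_to_tokens.setdefault(pan, []).append(token)
        return token

    def detokenize(self, token):
        return self.token_to_value[token]

vault = TokenVault()
t1 = vault.tokenize("4111111111111111")
t2 = vault.tokenize("4111111111111111")  # second transaction, new token
```

Since t1 and t2 share nothing but a vault entry, an application holding only the tokens cannot tell they refer to the same card – which is exactly the privacy benefit (and the correlation headache) described above.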
Token servers may be internally owned and operated, or provided as a third party service. We will discuss deployment models in an upcoming post.
Token Storage in Applications
When the token server returns the token to the application, the application must safely store the token and effectively erase the original sensitive data. This is critical – not just to secure sensitive data, but also to maintain transactional consistency. An interesting side effect of preventing reverse engineering is that a token by itself is meaningless. It only has value in relation to some other information. The token server has the ability to map the tokenized value back to the original sensitive information, but is the only place this can be done. Supporting applications need to associate the token with something like a user name, transaction ID, merchant customer ID, or some other identifier. This means applications that use token services must be resilient to communications failures, and the token server must offer synchronization services for data recovery.
This is one of the largest potential weaknesses – whenever the original data is collected or requested from the token database, it might be exposed in memory, log files, or virtual memory. This is the most sensitive part of the architecture, and later we’ll discuss some of the many ways to split functions architecturally in order to minimize risk.
At this point you should have a good idea of how tokens are generated, structured, and stored. In our next posts we’ll dig deeper into the architecture as we discuss tokenization server requirements and functions, application integration, and some sample architectural use cases.
Posted at Monday 19th July 2010 4:29 am
(3) Comments •
By Adrian Lane
If you are interested in tokenization, check out Visa’s Tokenization Best Practices guide, released this week. The document is a very short four pages. It highlights the basics and is helpful in understanding minimum standards for deployment. That said, I think some simple changes would make the recommendations much better and deployments more secure.
From a security standpoint my issues are twofold: I think they fell far short with their recommendations on token generation, and that salting should be implemented differently than they suggest. I also believe that, given how prescriptive the advice is in several sections, Visa should clarify what they mean by encrypting the “Card Data Vault”, but that’s a subject for another day. First things first: let’s dig into the token generation issues.
The principle behind tokenization is to substitute a token for a real (sensitive) value, so you cannot reverse engineer the token into PAN data. But when choosing a token creation strategy, you must decide whether you want to be able to retrieve the value or not. If you will want to convert the token back to the original value, use encryption. But if you don’t need to do this, there are better ways to secure PAN data than encryption or hashing!
My problem with the Visa recommendations is that their first suggestion should have been simply to use a random number. If the output is not generated by a mathematical function applied to the input, it cannot be reversed to regenerate the original PAN data. The only way to discover PAN data from a real token is a (reverse) lookup in the token server database. Random tokens are simple to generate, and the size & data type constraints are trivial to satisfy. This should be the default, as most firms should neither need nor want PAN data retrievable from the token.
As for encryption, rather than suggest a “strong encryption cipher”, why not take this a step further and recommend a one time pad? This is a perfect application for that kind of substitution cipher. And one time pads are as secure a method as anything else. I’m guessing Visa did not suggest this because a handful of very large payment processors, with distributed operations, actually want to retrieve the PAN data in multiple locations. That means they need encryption, and they need to distribute the keys.
As for hashing, I think the method they prescribe is wrong. Remember that a hash is deterministic. You put in A, the hash digests the PAN data, and it produces B. Every time. Without fail. In order to avoid dictionary attacks you salt the input with a number. But the recommendation is “… hashing of the cardholder data using a fixed but unique salt value per merchant”! If you use a static merchant ID as the salt, you are really not adding much in the way of computational complexity (or trying very hard to stop attacks). Odds are the value will be guessed or gathered at some point, as will the hashing algorithm – which subjects you to precomputed attacks against all the tokens. It seems to me that for PAN data you can pick any salt you want, so why not make it different for each and every token? The token server can store the random salt with the token, and attacks become much tougher.
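To show the difference, here’s a small sketch contrasting the static per-merchant salt Visa describes with a per-token random salt. The function names are mine, and SHA-256 just stands in for whatever hash a real token server would use:

```python
import hashlib
import os

def tokenize_static_salt(pan, merchant_salt):
    """What the Visa guidance describes: one fixed salt per merchant.
    Identical PANs always hash to the same token, so a leaked salt
    enables precomputed-table attacks against every token at once."""
    return hashlib.sha256(merchant_salt + pan.encode()).hexdigest()

def tokenize_random_salt(pan):
    """Per-token random salt, stored alongside the token in the token
    server. Identical PANs now produce different tokens, and
    precomputation buys an attacker nothing."""
    salt = os.urandom(16)
    return hashlib.sha256(salt + pan.encode()).hexdigest(), salt

pan = "4111111111111111"
a = tokenize_static_salt(pan, b"MERCHANT-42")
b = tokenize_static_salt(pan, b"MERCHANT-42")
c, _ = tokenize_random_salt(pan)
d, _ = tokenize_random_salt(pan)
# a == b (deterministic), while c and d differ
```

The storage cost is one extra salt column per row – a cheap price for defeating precomputed attacks against the whole token set.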
Finally, Visa did not even discuss format preservation. I am unaware of any tokenization deployment that does not retain the format of the original credit card number/PAN. In many cases they preserve data types as well. Punting on this subject is not really appropriate, as format preservation is what allows token systems to slide into existing operations without entirely reworking the applications and databases. Visa should have stepped up to the plate and fully endorsed format-preserving strong cryptography. This was not addressed in the Field Level Encryption Best Practices in 2009, and remains conspicuous by its absence.
The odds are that if you are saddled with PCI-DSS responsibilities, you will not write your own ‘home-grown’ token servers. So keep in mind that these recommendations are open enough that vendors can easily provide botched implementations and still meet Visa’s guidelines. If you are only interested in getting systems out of scope, then any of these solutions is fine because QSAs will accept them as meeting the guidelines. But if you are going to the trouble of implementing a token server, it’s no more work to select one that offers strong security.
Posted at Friday 16th July 2010 3:00 pm
(6) Comments •
By Mike Rothman
Most of us active security types aren’t big fans of planning. We’d much rather be doing stuff. Hopefully, though, from the first two posts in the Monitor subprocess (Enumerate/Scope & Define Policies) you see that without a structured and complete planning process, your monitoring efforts are unlikely to succeed.
But now it’s time to get busy and start actually monitoring these devices. The first two steps in this part of the Monitor process are Collect and Store – building the infrastructure and systems to actually collect the data, and defining where we’ll store it. Operationally these two subprocesses tend to be integrated, as most organizations select tools with storage built-in. But along with our philosophy of not assuming that anyone will buy (or deploy) a tool for any of these processes, we’ll spell everything out in sufficient detail to model a manual activity.
In the high-level Monitoring process map, we described the Collect step as follows:
Collect alerts and log records based on the policies defined in Phase 1. Can be performed within a single-element manager or abstracted into a broader Security Information and Event Management (SIEM) system for multiple devices and device types.
The subprocesses are:
- Deploy Tool: It’s hard to collect data without any technology, so this first step is about selecting and procuring the mechanisms for data collection. This doesn’t necessarily mean buying a tool – there are other options. But it does mean taking the time to research and select some technology, as well as to install and configure your choice.
- Integrate with Data Sources: After the technology is deployed we get to integrate with the data sources identified during the Enumeration and Scoping stage. This involves listing all the data sources, collecting any required permissions for those devices, and configuring collection both on the receiving technology (collectors) and on the devices being monitored (if required). Finally you need to test your rig to make sure you are pulling the right data from the right devices.
- Deploy Policies: After we have data in the system, it’s time to add the policies. This means correlation and alerting policies, and automating the validation/escalation process as much as possible. You’ll also need to test the policies to make sure they work as desired.
- Test Controls: The last step of the collection subprocess is effectively a system test of the whole shebang. You’ve tested the individual aspects (data sources, correlation, & alerts), so now you need to make sure everything works together. In order to do this right, you’ll spend time designing a test scenario and building a testbed – including developing a test data set. Then simulate some attacks and analyze the test data at a high level to be sure the system provides the data you need.
As you can tell, testing is a key part of the Collect step – if any of the subsystems fail, the system isn’t worth much. One way to bootstrap all this testing is through the proof of concept process during tool selection (assuming you select a tool). This way, the vendor works with your team to set up the system, collect some data, and run through the use case(s) driving the procurement. Obviously as you move to production use you’ll be adding a lot more data sources, correlation rules, and alerts – and extensively tuning the environment. But the proof of concept can provide a very useful running start.
To Tool or Not to Tool
It’s very easy to assume you will be deploying a specific tool to do the monitoring. Most folks look at a log collection device or perhaps even a SIEM platform to perform this monitoring, and there are now several managed service options to consider. So a big part of this process is figuring out whether you need a tool, and if so whether it makes more sense to buy or build it. If you decide to build it then you need to decide whether you want a Cadillac of monitoring tools, or if an open source ‘Kia’ will suffice.
There is no right or wrong answer – it gets back to how much time you have to deploy and maintain the system. Monitoring devices in a native management console is certainly possible, but understand this means a console for each device type, and you’ll lose the benefits of aggregation and correlation of multiple data types. Depending on your use case, that may be sufficient. But in practice we see many organizations moving to centralize monitoring of all the in-scope devices in an attempt to increase efficiency and streamline the compliance process.
Device Type Variances
There is quite a bit of difference between data collection from firewalls and IDS/IPS, versus servers. Most security devices support a syslog feed, so you can just point them at the collection point and be on your way. Some devices (like Check Point firewalls) have their own proprietary collection protocols, but pretty much all the commercial and open source products support the leading security device vendors.
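As a small illustration of what a collector does with that syslog feed, here’s a sketch of parsing the RFC 3164 `<PRI>` header of a raw syslog record (the sample log line is made up):

```python
# Minimal sketch of parsing the priority header of a syslog record
# (RFC 3164 style). The <PRI> value encodes facility * 8 + severity,
# so <34> means facility 4 (security/auth) at severity 2 (critical).
def parse_pri(record):
    """Return (facility, severity) from a raw syslog line like
    '<34>Oct 11 22:14:15 fw1 login failed'."""
    if not record.startswith("<"):
        raise ValueError("missing <PRI> header")
    end = record.index(">")
    pri = int(record[1:end])
    return pri // 8, pri % 8

facility, severity = parse_pri("<34>Oct 11 22:14:15 fw1 login failed")
```

Facility and severity are usually the first fields a collector normalizes, because they feed directly into the reduction and threshold rules discussed in the Analyze step.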
Servers are a very different animal, and most of the time you’ll need to go get the log/event files. Depending on the type of server, you may be using either an OS-specific protocol like WMI, or an open protocol like ssh. There are also open source collection tools for servers like Snare, and more generic logging platforms like Splunk – all of which can and do automate the process of collecting data from these servers.
Your policies will really dictate the kind of tool you get for these collection processes, as collecting very detailed information from many different devices, and performing deep analysis on that data, are cost prohibitive without a tool.
Large vs. Small Company Considerations
The real (and obvious) difference in collection between a large company and a smaller entity is scale. Large companies not only have more people and more funding for these monitoring projects (yes, that is a crass generalization), but they also have a lot more devices and device types, and generally want to do much more analysis and automation. They tend to be willing to pay for this scale and detail as well, but in a tight economy that’s not always the case.
The more modest collection requirements of smaller companies make things much easier. But keeping everything up to date and current can be problematic given the number of other things on your plate.
Buy vs. Build
We see a decent amount of interest in outsourcing the monitoring functions now – especially for firewalls and IDS/IPS – because the service providers can bring a great deal of leverage to the table, to drive economies of scale and deliver services at lower prices.
One of the main deliverables for our NSO Quant project is a model to help understand the cost to deliver monitoring services for your environment. We aren’t religious about whether outsourcing or doing it yourself is the right choice, but we are religious about providing the wherewithal to make an informed and objective decision.
Storing is keeping all this good stuff you are collecting. Duh. Actually it’s a bit more than that, because deciding where to store, and the associated architecture, are critical to scaling your monitoring system. You also need to make some tough decisions about archiving because the larger your data set the more compute-intensive the analysis, which slows everything down. We all know speed is our friend when trying to monitor stuff, so it’s about balancing sufficient data for detailed analysis/correlation against keeping the system manageable and responsive.
The key when dealing with storage is leverage, so you’ll use the same storage for data from all your devices – not just firewalls, IDS/IPS, or servers – assuming you actually want to do multi-data-type analysis, at least.
Here are the subprocesses for Store:
- Select Event/Log Storage: Now that you have all this data, what are you going to do with it? That’s a key question – even if you select a tool to provide monitoring, you’ll need to do something with the data at some point. So you’ll be researching and selecting storage options, and also be spending some time in your data center to figure out what options you have for leveraging your organization’s existing storage infrastructure. Once you decide on the storage strategy it’s time to implement and deploy the collection targets.
- Deploy Storage and Retention Policies: Back in the Define Policies subprocess, you spent time figuring out how long to keep the data and building consensus for those decisions. Now you need to deploy the policies on the storage device(s), which usually involves a configuration and testing step. It’s not a good idea to lose data, especially if there are forensics requirements, so testing is key to this effort.
- Archive Old Events: Finally you’ll need to move older events out of the main storage system at some point. So you’ll be configuring the archival targets – likely leveraging existing archiving capabilities, as well as configuring the event/log storage environment to send old events to the archival system. Yes, once again we have to mention testing, especially to ensure archived data is accessible with sufficient data integrity.
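The archive step can be sketched in a few lines. This is a toy illustration – the directory layout, retention window, and gzip format are assumptions, not anything from a specific product – but it shows the shape of the work, including the verify-before-delete testing the subprocess calls for:

```python
import gzip
import os
import shutil
import time

RETENTION_DAYS = 90  # assumed retention window from your Define Policies step

def archive_old_logs(live_dir: str, archive_dir: str,
                     retention_days: int = RETENTION_DAYS) -> list:
    """Move log files older than the retention window into a compressed
    archive directory. Returns the list of archived filenames."""
    cutoff = time.time() - retention_days * 86400
    os.makedirs(archive_dir, exist_ok=True)
    archived = []
    for name in sorted(os.listdir(live_dir)):
        path = os.path.join(live_dir, name)
        if os.path.isfile(path) and os.path.getmtime(path) < cutoff:
            target = os.path.join(archive_dir, name + ".gz")
            with open(path, "rb") as src, gzip.open(target, "wb") as dst:
                shutil.copyfileobj(src, dst)
            # Verify the archived copy decompresses intact before deleting
            # the original - losing data here defeats the whole exercise.
            with gzip.open(target, "rb") as check, open(path, "rb") as orig:
                assert check.read() == orig.read()
            os.remove(path)
            archived.append(name)
    return archived
```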
Leverage and Storage
One of the key decisions for those building their own collection systems is to figure out whether you can leverage your organization’s existing storage environment. There is no easy answer – especially once you start thinking about the sensitivity of the data, the common compliance requirement for separation of duties, and access to private data on a need-to-know basis. Obviously using existing spindles is the most cost-effective way to do things, but if you buy a SIEM/Log Management platform for monitoring it will include its own storage, so the question reduces to archival. Obviously if you outsource monitoring, storage is a non-issue, though some organizations keep a local copy just in case. There are costs involved in that, balanced against peace of mind.
Is Forensics Data Your Friend?
The other consideration in building your storage environment is the need for forensically clean data. If you ever want to move toward prosecuting malfeasance, you’ll need to collect and store your data in a certain way. Our pals at NIST have defined some guidelines for log management (PDF), but in a nutshell you’ll need to be able to prove chain of custody and integrity of data – basically to demonstrate that the records weren’t tampered with.
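One common way to demonstrate that records weren’t tampered with is a hash chain: each record’s digest covers the record plus the previous digest, so changing any record invalidates every link after it. A minimal sketch – the seed value and record format here are made up for illustration, and a real deployment would anchor the chain somewhere write-once:

```python
import hashlib

def chain_records(records, seed: str = "log-chain-seed"):
    """Return (record, digest) pairs; each digest covers the record plus
    the previous digest, so tampering breaks every later link."""
    prev = hashlib.sha256(seed.encode()).hexdigest()
    chained = []
    for rec in records:
        prev = hashlib.sha256((prev + rec).encode()).hexdigest()
        chained.append((rec, prev))
    return chained

def verify_chain(chained, seed: str = "log-chain-seed") -> bool:
    """Recompute the chain and confirm every stored digest matches."""
    prev = hashlib.sha256(seed.encode()).hexdigest()
    for rec, digest in chained:
        prev = hashlib.sha256((prev + rec).encode()).hexdigest()
        if prev != digest:
            return False
    return True
```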
Large vs. Small Company Considerations
Once again, the obvious major difference between how large and small companies look at storage is scale. Depending on the event/log volume and the number of devices being monitored, you’ll have some specific architectural decisions on how to both collect and store the volume of data. As you research and select your collection points and their associated storage, you’ll be doing a lot of analysis into the appropriate deployment architecture. If you are looking at specific tools for monitoring, the vendor or reseller should be able to provide some guidance on what architectures they’ve used successfully as well.
Now that we have the data, we need to figure out what it means, and that means it’s time for analysis. The Analyze step in our Monitoring process is next, and we’ll be posting it next week.
Posted at Friday 16th July 2010 2:41 pm
(1) Comments •
I’ve been living full time in Phoenix, Arizona for about 5 years now, and about 2 years part time before that. This was after spending my entire adult life in Boulder, Colorado thanks to parole at the age of 18 from New Jersey. Despite still preferring the Broncos over the Cardinals, I think I’ve mostly adjusted to the change.
But damn, sometimes I wonder about this place.
First there’s the heat. It’s usually pretty tolerable up to about 100-105 Fahrenheit, thanks to the low humidity. When the humidity starts to creep up in the summer it leads directly to the monsoon rains which cool things down. Usually. Right now we’re hitting humidity as high as 30% with temps breaking 110F. The high this week is expected to hit 116F. We’re talking it’s so hot that the health department issued a warning that kids could get second degree burns from the pavement.
The heat also seems to be frying brains a bit.
- First up is our politicians, who don’t seem to realize that when you claim to be the number 2 kidnapping capital in the world (totally untrue), people might not come and visit no matter how many tourism ads you fund.
- Then there’s my neighbor. I’m not sure exactly which neighbor, but the one that lets their dog out in the morning while the coyotes are out. I was running today when I saw the dog outside, completely unmonitored, during prime snack time. Coyotes might actually play with bigger dogs, but the little ones are pretty much yappy snausages with legs.
- Finally, back on crime, we find the most awesome local news crime story in history. The good news? An armed home invader was shot. The bad news? By the 3 better-armed invaders already in the home holding the family hostage. It’s like the DefCon CTF. With guns.
I think it’s kind of cool I live in a state where I don’t need a gun, because if someone breaks into my home the odds are I’ll either be holed up in a trunk while my family finds ransom money, or the early bird bad guys will defend their turf and property (me and what was previously my property).
On to the Summary:
Webcasts, Podcasts, Outside Writing, and Conferences
Favorite Securosis Posts
Other Securosis Posts
Favorite Outside Posts
Project Quant Posts
Research Reports and Presentations
Top News and Posts
Blog Comment of the Week
Remember, for every comment selected, Securosis makes a $25 donation to Hackers for Charity. This week’s best comment goes to Jesse Krembs, in response to Simple Ideas to Start Improving the Economics of Cybersecurity.
Having incident response costs borne by the business unit that is breached/responsible seems like a great idea. Tying it to performance bonuses seems like an idea worth exploring as well. Maybe a little $$$ motivation for stopping people in the hall who don’t have their badge, for example. It makes me think that security groups inside a company should act in a consultant/regulator role. Enforce a minimum rule set, that each department must live up to. Sell added security to departments as needed/affordable.
Figuring out how to tie the money to security performance without rolling a giant FUD ball is key and difficult.
Posted at Friday 16th July 2010 4:00 am
(0) Comments •
By Mike Rothman
I read Nassim Taleb’s “Black Swan” a few years ago and it was very instructive for me. I wrote about it a few times in a variety of old Incites (here and here), and the key message I took away was the futility of trying to build every scenario into a threat model, defensive posture, or security strategy.
In fact, I made that very point yesterday in the NSO Quant post on Defining Policies. You can’t model every threat, so don’t try. Focus on the highest perceived risk scenarios based on your knowledge, and work on what really represents that risk. What brings me back to this topic is Alex’s post: Forget trying to color the Swan, focus on what you know. Definitely read the post, as well as the comments.
Alex starts off by trying to clarify what a Black Swan is and what it isn’t, and whether our previous models and distributions apply to the new scenario. My own definition is that a Black Swan breaks the mold. We haven’t seen it before, therefore we don’t really understand its impact – not ahead of time anyway. But ultimately I don’t think it matters whether our previous distributions apply or not. Whether a Swan is Black, Creme, Plaid, or Gray is inconsequential when you have a situation and it needs to be dealt with. This gets back to approaches for dealing with incidents and what you can do when you don’t entirely understand the impact or the attack vector.
Dealing with the Swan involves doing pattern matching as part of your typical validation activity. You know something is funky, so the next step is to figure out what it is. Is it something you’ve seen before? If so, you have history for how to deal with it. That’s not brain surgery.
If it’s something you haven’t seen before, it gets interesting. Then you need to have some kind of advanced response team mobilized to figure out what it is and what needs to be done. Fun stuff like advanced forensics and reverse engineering could be involved. Cool. Most importantly, you need to assess whether your existing defenses are sufficient or if other (more dramatic) controls are required. Do you need to pull devices off the network? Shut down the entire thing?
Black Swans have been disruptive through history because folks mostly thought their existing contingencies and/or defenses were sufficient. They were wrong, and it was way too late by the time they realized. The idea is to make sure you assess your defenses early enough to make a difference.
It’s like those people who always play through the worst case scenario, regardless of how likely that scenario is. It makes me crazy because they get wrapped up in scenarios that aren’t going to happen, and they make everyone else around them miserable as they are doing it. But that doesn’t mean there isn’t a place for worst case scenario analysis. This is one of them. At the point you realize you are in uncharted territory, you must start running the scenarios to understand contingencies and go-forward plans in the absence of hard data and experience.
That’s the key message here. Once you know there is an issue, and it’s something you haven’t seen before, you have to start asking some very tough questions. Questions about survivability of devices, systems, applications, etc. Nothing can be out of bounds. You don’t hear too much about the companies/people that screw this up (except maybe in case studies) because they are no longer around. There, that’s your pleasant thought for today.
Posted at Thursday 15th July 2010 3:44 pm
(3) Comments •
By Adrian Lane
We have covered this before, but every now and again I run into a new slant on who bears responsibility for online transaction safety. Bank? Individual? If both, where do the responsibilities begin and end?
Over the last year a few friends, ejected from longtime professions due to the current economic depression, have started online businesses. A couple of these individuals did not even know what HTML was last year – but now they are building web sites, starting blogs and … taking credit cards online. It came as a surprise to several of these folks when their payment processors fined them, or disrupted service entirely because they had failed a remote security audit.
It seems that the web site itself passed its audit with a handful of cautionary notices that the auditor recommended they address. What failed was the management terminal – their home computer, used to dial into the account, had several severe issues. What made my friend aware that there was a problem at all was extra charges on his bill for, in essence, having crappy security. What a novel idea to raise awareness and motivate merchants! I applaud providing the resources to the merchants to help secure their environments. I also worry that this is a method for payment processors to “pass the buck” and lower their own security obligations. That’s probably because I am a cynic by nature, which is why I ended up in security, but that’s a different story.
Not having started a small business that takes credit cards online, I was ignorant of many measures payment processors are taking to raise the bar for security on end-user systems. They are sending out guidance on basic security measures, conducting assessments, providing results, and suggesting additional security measures. In fact, the list of security improvements that the processor – or the processor’s service provider – suggested looks a lot like what is covered in a PCI self-assessment questionnaire: firewall rules, use of admin accounts, egress filtering, and so on. I thought this was pretty cool! But on the other side of the equation, all the credit card billing happens on the web site, so the merchants never collect credit card numbers themselves. Good idea? Overkill?
These precautions are absolutely overwhelming for most people – especially one-person shops like the ones my friends operate. They have absolutely no idea what a TCP reset is, or why they failed the test for it. They have never heard of egress filtering. But they are looking into home office security measures just like large retail merchants. Part of me thinks they need to have this basic understanding if they are going to conduct commerce online. Another part of me thinks they are being set up for failure.
I spent about 40 minutes on the phone today, giving one friend some guidance. My first piece of advice was to get a virtual environment set up and make sure he used it for banking and banking only. Then I focused on how to pass the audit. My goals in this conversation were to:
- Not overwhelm him with technical jargon and concepts that he simply did not, and would not, understand.
- Get him to pass the next audit with minimum effort on his part, and without having to buy any new hardware or software.
- Call his ISP, bank, and payment processor and wring out of them any tools and assistance they could provide.
- Turn on the basic Windows firewall and basic router security.
Honestly, the second item was the most important. Despite this person being really smart, I did not have any faith that he could set things up correctly – certainly not the first time, and perhaps not ever. So I, like many, just got him to where he could “check the box”. I just advised someone to do the minimum to pass a pseudo-PCI audit. sigh I’ll be performing penance for the rest of the week.
Posted at Wednesday 14th July 2010 11:34 pm
(4) Comments •
By Mike Rothman
So many attacks, so little time. If you are like pretty much everyone else we talk to, you are under the gun to figure out what’s happening, understand what’s under attack, and fix it. Right?
As you look to engage your monitoring process, you’ll be making a ton of decisions once you figure out what you have and what’s in scope. These decisions are the policies that will govern your monitoring, set the thresholds, determine how data is analyzed and trigger the alerts.
In the high level monitoring process map, we described the define policies step as follows:
Define the depth and breadth of the monitoring process, what data will be collected from the devices, and frequency of collection.
The scope of this project is specifically firewalls, IDS/IPS, and servers, so any of our samples will be tailored to those device types, but there is nothing in this process that precludes monitoring other network, security, computing, application, and/or data capture devices.
There are five sub-processes in this step:
- Monitoring Policies
- Correlation Rules
- Alerting Policies
- Validation/Escalation Policies
- Document Policies
In terms of our standard disclaimer, we build these sub-processes for organizations that want to undertake a monitoring initiative. We don’t make any assumptions about the size of company or whether a tool set will be used. Obviously the process will change depending on your specific circumstances as you’ll do some steps and not others. And yes, building your own monitoring environment is pretty complicated, but we think it’s important to give you a feel for everything that is required, so you can compare apples to apples relative to building your own versus buying a product(s) or using a service.
Our first set of policies will be around monitoring, specifically which activities on which devices will be monitored. We’ve got to have policies on frequency of data collection and also data retention. We also have to give some thought to the risk to the organization based on each policy. For example, your firewall will be detecting all sorts of attacks, so those events have one priority. But if you find a new super-user account on the database server, that’s a different level of risk.
The more you think through what you are monitoring, what conclusions you can draw from the data, and ultimately what the downside is when you find something from that device, the more smoothly the rest of the monitoring process will go.
The issue with correlation is that you need to know what you are looking for in order to set rules to find it. That also requires that you understand the interactions between the different data types. How do you know that? Basically you need to do a threat modeling exercise based on the kinds of attack vectors you want to find. Sure, we’d love to tell you that your whiz-bang monitoring thingy will find all the attacks out of the box and you can get back to playing Angry Birds. But it doesn’t work that way.
Thus you get to spend some time on the white board mapping out the different kinds of attacks, suspect behavior, or exploits you’ve seen and expect to see. Then you map out how you’d detect that attack including specifics about the data types and series of events that need to happen for the attack to be successful. Finally you identify the correlation policy. You need to put yourself in the shoes of the hacker and think like them. Easy huh? Not so much, but at least it’s fun. This is more art than science.
Depending on the device, there may be some resources (such as open source tools with a set of default policies – think Snort, OSSEC, or OSSIM) that can get you started. Again, don’t just think you’ll be able to download some stuff and get moving. You need to actively think about what you are trying to detect and build your policies accordingly. In order to maintain sanity, understand that defining (and refining and tuning, and refining and tuning some more) correlation policies is an ongoing process.
Finally, it’s important to realize that you cannot possibly build a threat model for every kind of attack that may or may not be launched at your organization. So you are going to miss some stuff, and that’s part of the game. The objective of monitoring is to react faster, not react perfectly to every attack. The threat modeling exercise focuses you on watching for the most significant risks to your organization.
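To make the correlation discussion concrete, here is a toy rule in Python that flags a classic pattern: a successful login after a burst of failures from the same source. The threshold and window values are invented placeholders – in practice they would come out of your own threat modeling:

```python
from collections import defaultdict, deque

FAILED_THRESHOLD = 5   # assumed: tune from your threat model
WINDOW_SECONDS = 120   # assumed correlation window

def correlate(events):
    """events: iterable of (timestamp, source, outcome) tuples in time
    order per source, where outcome is 'fail' or 'success'. Returns
    (timestamp, source, failure_count) alerts for likely brute forcing."""
    failures = defaultdict(deque)
    alerts = []
    for ts, src, outcome in events:
        window = failures[src]
        # Drop failures that have aged out of the correlation window.
        while window and ts - window[0] > WINDOW_SECONDS:
            window.popleft()
        if outcome == "fail":
            window.append(ts)
        elif outcome == "success" and len(window) >= FAILED_THRESHOLD:
            alerts.append((ts, src, len(window)))
            window.clear()
    return alerts
```

Notice how much the rule already assumes: which event types matter, which field identifies the attacker, and what "too many" means. That is the threat modeling work – the code is the easy part.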
In this step, you define the scenario that triggers the alerts. You need to define what is an information-only alert, as well as a variety of alert severities, depending on the attack (and the risk it represents). The best way we know to do this is to go back to your threat models. For each attack vector, there are likely a set of situations that are low priority and a set that are higher. What are the thresholds that determine the difference? Those are part of your alerting policies, since you want to make sure that how you are doing things – and why – is documented clearly.
Next you need to define your notification strategy, by alert severity. Will you send a text message or an email or does a big red light flash in the SOC calling all hands on deck? We spent a lot of time during the scoping process getting consensus, but it’s not over at the end of that step. You also need to make sure you’ve got everyone on the same page relative to how you are going to notify them when an alert fires. You don’t want an admin to miss something important because they expected a text and don’t live in their email.
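Written down, the severity thresholds and notification routing might look something like this sketch. The levels, channels, and numbers are invented placeholders – yours come out of the consensus-building described above:

```python
# Assumed severity levels and notification channels; adjust to your policies.
NOTIFICATION_POLICY = {
    "info":     ["dashboard"],
    "low":      ["email"],
    "high":     ["email", "sms"],
    "critical": ["email", "sms", "pager"],
}

def classify(failed_logins: int) -> str:
    """Toy threshold mapping for one metric pulled from a threat model."""
    if failed_logins >= 50:
        return "critical"
    if failed_logins >= 20:
        return "high"
    if failed_logins >= 5:
        return "low"
    return "info"

def route_alert(failed_logins: int):
    """Return the severity and the channels to notify for it."""
    severity = classify(failed_logins)
    return severity, NOTIFICATION_POLICY[severity]
```

The value of writing it as a table is exactly the documentation point made above: the admin who expects a text, and the one who lives in email, can both read off what will actually happen.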
So as we go through our policy definition efforts, we’ve built threat models and the associated correlation and alerting policies. Next we need to think about how we prove whether the attack is legitimate or whether it’s just a false positive. So here you map out what you think an analyst can/should do to prove an attack. Maybe it’s looking at the log files or logging into the device and/or confirming with other devices. The specific validation activities will vary depending on the threat.
At the end of the validation step, you want to definitively be able to answer the question: Is this attack real? So your policies should define what real means in this context and set expectations for the level of substantiation you expect to validate the attack.
You also need to think about escalation, depending on the nature of the alert. Does this kind of alert go to network ops or security? Is it a legal counsel thing? What is the expected response time? How much data about the alert do you need to send along? Does the escalation happen via a trouble ticketing system? Who is responsible for closing the loop? All of these issues need to be worked through and agreed to by the powers that be. Yes, you’ve got to get consensus (or at least approval) for the escalation policies, since it involves sending information to other groups and expecting a specific response.
Finally, you’ve worked through all the issues. Well all the issues that you can model and know about, so the last step is to document all the policies and communicate responsibilities and expectations to the operations teams (or anyone else in the escalation chain). These policies are living documents, and will change frequently as new attacks are found in the wild and new devices/applications appear in your network.
You also need to think about whether you are going to build a baseline from the data you collect. This involves monitoring specific devices for a period of time, assuming what you find is normal, and then looking for behavior that is not normal. This is certainly an approach to streamline the process of getting your monitoring system online, but understand this involves making assumptions about what is good or bad, and those assumptions may not be valid.
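A minimal baselining sketch: collect observations during the "assume normal" window, then flag anything more than a few standard deviations out. The three-sigma cutoff is an arbitrary illustration, and the approach inherits exactly the assumption just described – that the training window really was normal:

```python
import statistics

def build_baseline(samples):
    """Mean and standard deviation of a 'normal' observation period.
    If the training window contained an attack, the attack becomes
    part of the baseline - the assumption the text warns about."""
    return statistics.mean(samples), statistics.stdev(samples)

def is_anomalous(value, baseline, sigma: float = 3.0) -> bool:
    """Flag values further than sigma standard deviations from the mean."""
    mean, stdev = baseline
    return abs(value - mean) > sigma * stdev
```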
We prefer both, in terms of doing the threat modeling exercise, as well as establishing a baseline. We’d say it provides the best of both worlds, but that would be corny.
Device Type Variance
For each device type (firewall, IDS/IPS, servers) you largely go through the same process. You’ve got to figure out what data you will collect, any specific correlation that makes sense, what you’ll alert on, and how you’ll validate the alert and who gets the joy of receiving the escalation. Obviously each device provides different data about different aspects of the attack. By relying on the threat models to guide your policies, you can focus your efforts without trying to boil the ocean.
Big vs. Small Company Considerations
The biggest difference we tend to see between how big companies and small companies do monitoring is the amount of data and the number of policies used to drive correlation and alerting. Obviously big companies have the economic resources to invest in tools and services to get the tools operational (and keep them operational).
So small companies need to compromise, which means you aren’t going to be able to find everything – so don’t try. Again, that’s where the threat models come into play. By focusing on the highest-risk models, you can go through the process and do a good job with a handful of threat models. Large companies, on the other hand, need to be aware of analysis paralysis, since there is an effectively infinite number of threat models, correlation policies, and alerts that can be defined. At some point the planning has to stop and the doing has to start.
And speaking of starting to do things, tomorrow we’ll go through the Collect and Store steps as we move into actually monitoring something.
Posted at Wednesday 14th July 2010 10:50 pm
(1) Comments •
Today Howard Schmidt meets with Secretary of Commerce Gary Locke and Department of Homeland Security Secretary Janet Napolitano to discuss ideas for changing the economics of cybersecurity. Howard knows his stuff, and recognizes that this isn’t a technology problem, nor something that can be improved with some new security standard or checklist. Crime is a function of economics, and electronic crime is no exception.
I spend a lot of time thinking about these issues, and here are a few simple suggestions to get us started:
- Eliminate the use of Social Security Numbers as the primary identifier for our credit history and financial accounts. Phase the change in over time. When the banks all scream, ask them how they do it in Europe and other regions.
- Enforce a shared-costs model for credit card brands. Right now, banks and merchants carry nearly all the financial costs associated with credit card fraud. Although PCI is helping, it doesn’t address the fundamental weaknesses of the current magnetic stripe based system. Having the card brands share in losses will increase their motivation to increase the pace of innovation for card security.
- Require banks to extend the window of protection for fraudulent transactions on consumer and business bank accounts. Rather than forcing some series of fraud detection or verification requirements, making them extend the window where consumers and businesses aren’t liable for losses will motivate them to make the structural changes themselves. For example, by requiring transaction confirmation for ACH transfers over a certain amount.
- Within the government, require agencies to pay for incident response costs associated with cybercrime at the business unit level, instead of allowing it to be a shared cost borne by IT and security. This will motivate individual units to better prioritize security, since the money will come out of their own budgets instead of being funded by IT, which doesn’t have operational control of business decisions.
Just a few quick ideas to get us started. All of them are focused on changing the economics, leaving the technical and process details to work themselves out.
There are two big gaps that aren’t addressed here:
- Critical infrastructure/SCADA: I think this is an area where we will need to require prescriptive controls (air gaps & virtual air gaps) in regulation, with penalties. Since that isn’t a pure economic incentive, I didn’t include it above.
- Corporate intellectual property: There isn’t much the government can do here, although companies can adopt the practice of having business units pay for incident response costs (no, I don’t think I’ll live to see that day).
Any other ideas?
Posted at Wednesday 14th July 2010 6:42 pm
(17) Comments •
By Mike Rothman
I’m discovering that you do mellow with age. I remember when I first met the Boss how mellow and laid back her Dad was. Part of it is because he doesn’t hear too well anymore, which makes him blissfully unaware of what’s going on. But he’s also mellowed, at least according to my mother in law. He was evidently quite a hothead 40 years ago, but not any more. She warned me I’d mellow too over time, but I just laughed. Yeah, yeah, sure I will.
But sure enough, it’s happening. Yes, the kids still push my buttons and make me nuts, but most other things just don’t get me too fired up anymore. A case in point: the Securosis team got together last week for another of our world domination strategy sessions. On the trip back to the airport, I heard strange music. We had rented a Kia Soul, with the dancing hamsters and all, so I figured it might be the car. But it was my iPad cranking music.
WTF? What gremlin turned on my iPad? Took me a few seconds, but I found the culprit. I carry an external keyboard with the iPad and evidently it turned on, connected to the Pad, and proceeded to try to log in a bunch of times with whatever random strings were typed on the keyboard in my case. Turns out the security on the iPad works – at least for a brute force attack. I was locked out and needed to sync to my computer in the office to get back in.
I had my laptop, so I wasn’t totally out of business. But I was about 80% of the way through Dexter: Season 2 and had planned to watch a few more episodes on the flight home. Crap – no iPad, no Dexter. Years ago, this would have made me crazy. Frackin’ security. Frackin’ iPad. Hate hate hate. But now it was all good. I didn’t give it another thought and queued up for an Angry Birds extravaganza on my phone.
Then I remembered that I had the Dexter episodes on my laptop. Hurray! And I got an unexpected upgrade, with my very own power outlet at my seat, so my mostly depleted battery wasn’t an issue. Double hurray!! I could have made myself crazy, but what’s the point of that?
Another situation arose lately when I had to defuse a pretty touchy situation between friends. It could have gotten physical, and therefore ugly with long-term ramifications. But diplomatic Mike got in, made peace, and positioned everyone to kiss and make up later. Not too long ago, I probably would have gotten caught up in the drama and made the situation worse.
As I was telling the Boss the story, she deadpanned that it must be the end of the world. When I shot her a puzzled look, she just commented that when I’m the voice of reason, armageddon can’t be too far behind.
Photo credits: “mello yello” originally uploaded by Xopher Smith
Recent Securosis Posts
- School’s out for Summer
- Taking the High Road
- Friday Summary: July 9 2010
- Top 3 Steps to Simplify DLP Without Compromise
- Preliminary Results from the Data Security Survey
- Tokenization Architecture – The Basics
- NSO Quant: Enumerate and Scope Sub-Processes
Incite 4 U
Since we provided an Incite-only mailing list option, we’ve started highlighting our other weekly posts above. One to definitely check out is the Preliminary Results from the Data Security Survey, since there is great data in there about what’s happening and what’s working. Rich will be doing a more detailed analysis in the short term, so stay tuned for that.
You can’t be half global… – Andy Grove (yeah, the Intel guy) started a good discussion about the US tech industry and job creation. Gunnar weighed in as well with some concerns about lost knowledge and chain of experience. I don’t get it. Is Intel a US company? Well, it’s headquartered in the US, but it’s a global company. So is GE. And Cisco and Apple and IBM and HP. Since when does a country have a scoreboard for manufacturing stuff? The scoreboard is on Wall Street and it’s measured in profit and loss. So big companies send commodity jobs wherever they find the best mix of cost, efficiency, and quality. We don’t have an innovation issue here in the US – we have a wage issue. The pay scales of some job functions in the US have gone way over their (international) value, so those jobs go somewhere else. Relative to job creation, free markets are unforgiving and skill sets need to evolve. If Apple could hire folks in the US to make iPhones for $10 a week, I suspect they would. But they can’t, so they don’t. If the point is that we miss out on the next wave of innovation because we don’t assemble the products in the US, I think that’s hogwash. These big companies have figured out sustainable advantage is moving out of commodity markets. Too bad a lot of workers don’t understand that yet. – MR
Tinfoil hats – Cyber Shield? Really? A giant monitoring project? I don’t really understand how a colossal systems monitoring project is going to shield critical IT infrastructure. It may detect cyber threats, but only if they know what they are looking for. The actual efforts are classified, so we can’t be sure what type of monitoring they are planning to do. Maybe it’s space alien technology we have never seen before, implemented in ways we could never have dreamed of. Or maybe it’s a couple hundred million dollars to collect log data and worry about analysis later. Seriously, if the goal here is to protect critical infrastructure, here’s some free advice: take critical systems off the freakin’ Internet! Yeah, putting these systems on the ‘Net many years ago was a mistake because these organizations are both naive and cheap. Admit the mistake and spend your $100M on private systems that are much easier to secure, audit, and monitor. The NSA has plenty of satellites – I am sure they can spare some bandwidth for power and other SCADA control systems. If it’s really a matter of national security to protect these systems, do that. Otherwise it’s just another forensic tool to record how they were hacked. – AL
Conflict of interest much? – Testing security tools is never easy, and rarely reflects how they would really work for you. Mike covered this one already, but it is, yet again, rearing its head. NSS Labs is making waves with its focus on “real world” antivirus software testing. Rather than running tools against a standard set of malware samples, they’ve been mixing things up and testing AV tools against live malware (social engineering based), and modifications of known malware. The live test gives you an idea of how well the tools will work in real life with actual users behind them. The modifications tests give you an idea of whether the tools will detect new variants of known attacks. Needless to say, the AV vendors aren’t happy and are backing their own set of “standards” for testing while disparaging NSS, except the ones who scored well. I realize this is how the world works, but it’s still depressing. – RM
Automating firewall ops – Speaking of product reviews, NetworkWorld published one this week on firewall operations tools. You know, those tools that suck in firewall configs, analyze them and maybe even allow you to change said firewalls without leaving a hole so big the Titanic could sail through? Anyhow, this still feels like a niche market even though there are 5 players in it, because you need to have a bunch of firewalls to take advantage of such a tool. Clearly these tools provide value but ultimately it comes back to pricing. At the right price the value equation adds up. Ultimately they need to be integrated with the other ops tools (like patch/config, SIEM/LM, etc.), since the swivel chair most admins use to switch between different management systems is worn out. – MR
Eternal breach – Although credit cards are time limited (they come with expiration dates), a lot of other personal information lives longer than you do. Take your Social Security Number or private communications… once these are lost in a breach, any breach, the data stays in circulation and remains sensitive. That’s why the single year of credit monitoring offered by most organizations in their breach response is a bad joke. The risk isn’t limited to a year, so this is a CYA gesture. Help Net Security digs into this often ignored problem. I don’t really expect things to get any better; our personal information is all over the darn place, and we are at risk as soon as it’s exposed once… from anywhere. I’m going to crawl back into my bunker now. – RM
Deals, Good ‘n’ Plenty – There is no stopping the ongoing consolidation in the security space. Last week the folks at Webroot bought a web filtering SaaS shop called BrightCloud. Clearly you need both email and web filtering (yeah, that old content thing), so it was a hole in Webroot’s move towards being a SaaS shop. Yesterday we also saw GFI acquire Sunbelt’s VIPRE AV technology. This seems like a decent fit because over time distribution leverage is key to ongoing sustainability. That means you need to pump more stuff into existing customers. And given the price set by Sophos’ private equity deal, now was probably a good time for Sunbelt to do a deal, especially if they were facing growing pains. Shavlik seems a bit at risk here, since they OEM Sunbelt and compete with GFI. – MR
E-I-eEye-Oh! – During the last economic downturn, the dot-com bust days of 2000, HR personnel used to love to call people ‘job hoppers’. “Gee, it seems you have had a new job every 24 months for the last 6 years. We are really looking for candidates with a more stable track record.” It was a lazy excuse to dismiss candidates, but some of them believed it. I think that mindset still persists, even though the average job tenure in development is shy of 21 months (much shorter for Mike!), and just slightly better for IT. Regardless, that was the first thing that popped into my head when I learned that Marc Maiffret has jumped ship from FireEye back to eEye. Dennis Fisher has a nice interview with Marc over at Threatpost. Feels like just a few weeks ago he joined FireEye, but as most hiring managers will tell you, team chemistry is as important as job skills when it comes to hiring. I was sad to see Marc leave eEye – was it four years ago? – to start Invenio. At the time eEye was floundering, and from my perspective product management was poorly orchestrated. I am sure the investors were unhappy, but Marc seemed to get a disproportionate amount of the heat, and eEye lost a talented researcher. The new management team over at eEye still has their hands full with this reclamation project, but Marc’s a good addition to their research team. If eEye seriously wants to compete with Qualys and Rapid7, they need all the help they can get, and this looks like a good fit for both the company and Marc. Good luck, guys! – AL
Low Hanging Fruit doesn’t need to be expensive – Fast, cheap, or secure. Pick two. Or so the saying goes, but that’s especially true for SMB folks trying to protect their critical data. It ain’t cheap doing this security stuff, or is it? The reality is that given the maturity of SaaS options, most SMB folks should be looking at outsourcing critical systems (CRM, ERP, etc.). And for those systems still in-house, as well as networks and endpoints, you don’t need to make it complicated. Dark Reading presents some ideas, but we have also written quite a bit on fundamentals and low hanging fruit. No, world class security is not low hanging fruit, but compared to most other SMB (and even enterprise-size) companies, covering the fundamentals should be good enough. And no, I’m not saying to settle for crap security, but focusing on the fundamentals, especially the stuff that doesn’t cost much money (like secure configurations and update/patch) can make a huge difference in security posture without breaking the bank. – MR
Posted at Wednesday 14th July 2010 7:00 am
(1) Comments •
By Mike Rothman
As we get back to the Network Security Operations Quant series, our next step is to take each of the higher level process maps and break each step down into a series of subprocesses. Once these subprocesses are all posted and vetted (with community involvement – thanks in advance!), we’ll survey you all to see how many folks actually perform these specific steps in day-to-day operations.
We will first go into the subprocesses around Monitoring firewalls, IDS/IPS, and servers. The high level process map is below and you can refer to the original post for a higher-level description of each step.
The first two steps are enumerate, which means finding all the security, network, and server devices in your environment, and scope, which means determining the devices to be covered by the monitoring activity. It took a bit of discussion between us analyst types to determine which came first, the chicken or the egg – or in this case, the enumeration or the scoping step.
Ultimately we decided that enumeration really comes first because far too many organizations don’t know what they have. Yes, you heard that right. There are rogue or even authorized devices that slipped through the cracks, creating potential exposure. It doesn’t make sense to try figuring out the scope of the monitoring initiative without knowing what is actually there.
So we believe you must start with enumeration and then figure out what is in scope.
The enumeration step has to do with finding all the security, network, and server devices in your environment.
There are four major subprocesses in this step:
- Plan: We believe you plan the work and then work the plan, so you’ll see a lot of planning subprocesses throughout this research. In this step you figure out how you will enumerate, including what kinds of tools and techniques you will use. You also need to identify the business units to search, mapping their specific network domains (assuming you have business units) and developing a schedule for the activities.
- Setup: Next you need to set up your enumeration by acquiring and installing tools (if you go the automated route) or assembling your kit of scripts and open source methods. You also need to inform each group that you will be in their networks, looking at their stuff. In highly distributed environments it may be problematic to do ping sweeps and the like without giving everybody a ‘heads up’ first. You also need to get credentials (where required) and configure the tools you’ll be using.
- Enumerate: Ah, yes, you actually need to do the work after planning and setting up. You may be running active scans or analyzing passive collection efforts. You also need to validate what your tools tell you, since we all know how precise that technology stuff can be. Once you have the data, you’ll be spending some time filtering and compiling the results to get a feel for what’s really out there.
- Document: Finally you need to prepare an artifact of your efforts, if only to use in the next step when you define your monitoring scope. Whether you generate PDFs or kill some trees is not relevant to this subprocess – it’s about making sure you’ve got a record of what exists (at this point in time), as well as having a mechanism to check for changes periodically.
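The four subprocesses above lend themselves to a quick sketch in code. This is a minimal illustration, not a real scanner: the RAW_FINDINGS list and the device types in it are made-up stand-ins for whatever your enumeration tools actually report, but the filter/compile and fingerprint steps mirror the Enumerate and Document subprocesses, including the periodic change check.

```python
import hashlib
import json
from datetime import datetime, timezone

# Hypothetical raw findings, as an active scanner or passive collector
# might report them: (address, open ports, device type hint).
RAW_FINDINGS = [
    ("10.1.1.1",  [22, 443],       "firewall"),
    ("10.1.1.1",  [22, 443],       "firewall"),   # duplicate report
    ("10.1.2.20", [80, 443, 3306], "server"),
    ("10.1.3.5",  [22],            "ids/ips"),
]

def compile_inventory(findings):
    """Filter duplicates and compile findings into a point-in-time record."""
    devices = {}
    for addr, ports, kind in findings:
        devices[addr] = {"ports": sorted(set(ports)), "type": kind}
    return {
        "generated": datetime.now(timezone.utc).isoformat(),
        "devices": devices,
    }

def fingerprint(inventory):
    """Stable hash of the device list, so later runs can detect drift."""
    canonical = json.dumps(inventory["devices"], sort_keys=True)
    return hashlib.sha256(canonical.encode()).hexdigest()

inventory = compile_inventory(RAW_FINDINGS)
print(len(inventory["devices"]))   # 3 unique devices after de-duplication
print(fingerprint(inventory)[:8])  # store this to spot changes next cycle
```

Persisting the fingerprint alongside the inventory artifact is one cheap way to implement the "mechanism to check for changes periodically" in the Document step.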
Device Type Variances
Is there any difference between enumerating a firewall, an IDS/IPS, and a server – the devices we are focused on with this research project? Not really. You are going to use the same tools and techniques to identify what’s out there because fundamentally they are all IP devices.
There will be some variation in what you do to validate what you’ve found. You may want to log into a server (with credentials) to verify it is actually a server. You may want to blast packets at a firewall or IDS/IPS. Depending on the size of your environment, you may need to statistically verify a subset of the found devices. It’s not like you can log into 10,000 server devices. Actually you could, but it’s probably not a good idea.
You are looking for the greatest degree of precision you can manage, but that must be balanced with common sense to figure out how much validation you can afford.
Large vs. Small Company Considerations
One of the downsides of trying to build generic process maps is you try to factor in every potential scenario and reflect that in the process. But in the real world, many of the steps in any process are built to support scaling for large enterprise environments. So for each subprocess we comment on how things change depending on whether you are trying to monitor 10 or 10,000 devices.
Regarding enumeration, the differences crop up both when planning and during the actual enumeration process – specifically when verifying what you found. Planning for a large enterprise needs to be pretty detailed to cover a lot of IP address space (likely several different spaces), and there may be severe ramifications to disruptions caused by the scanning. Not that smaller companies don’t care about disruption, but with fewer moving parts there is a smaller chance of unforeseen consequences.
Clearly the verification aspect of enumeration varies, depending on how deeply you verify. There is a lot of information you can gather here, so it’s a matter of balancing time to gather, against time to verify, against the need for the data.
Once we’ve finished enumeration it’s time to figure out what we are actually going to monitor on an ongoing basis. This can be driven by compliance (all devices handling protected data must be monitored) or a critical application. Of course this tends not to be a decision you can make arbitrarily by yourself, so a big part of the scoping process ensures you get buy-in on what you decide to monitor.
Here are the four steps in the Scope process:
- Identify Requirements: Monitoring is not free, though your management may think so. So we need a process to figure out why you monitor, and from that what to monitor. That means building a case for monitoring devices, potentially leveraging things like compliance mandates and/or best practices. You also need to meet with the business users, risk/compliance team, legal counsel, and other influencers to understand what needs to be monitored from their perspectives and why.
- Specify Devices: Based on those requirements, weigh each possible device type against the requirements and then figure out which devices of each type should be monitored. You may look to geographies, business units, or other means to segment your installed base into device groups. In a perfect world you’d monitor everything, but the world isn’t perfect. So it’s important to keep economic reality in mind when deciding how deeply to monitor what.
- Select Collection Method: For each device you’ll need to figure out how you will collect the data, and what data you want. This may involve research if you haven’t fully figured it out yet.
- Document: Finally you document the devices determined to be in scope, and then undertake the fun job of achieving consensus. Yes, you already asked folks what should be monitored when identifying requirements, but we remain fans of both asking ahead of time and then reminding them what you’ve heard, and also confirming they still agree when it comes time to start doing something. The consensus building can add time – which is why most folks skip it – but it minimizes the chance that you’ll be surprised down the road. Remember, security folks hate surprises.
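As a rough sketch of how Specify Devices, Select Collection Method, and Document fit together – the device names, tags, and collection methods below are all hypothetical, and the tags stand in for whatever requirements you gathered:

```python
# Hypothetical inventory carried over from the enumeration step.
DEVICES = [
    {"name": "fw-dmz-1", "type": "firewall", "tags": {"pci"}},
    {"name": "ids-core", "type": "ids/ips",  "tags": set()},
    {"name": "web-01",   "type": "server",   "tags": {"pci", "critical"}},
    {"name": "test-box", "type": "server",   "tags": set()},
]

# One collection method per device type -- the real choice depends on what
# each platform supports (syslog, SNMP, agents, vendor APIs, etc.).
COLLECTION = {"firewall": "syslog", "ids/ips": "syslog", "server": "agent"}

def scope(devices, required_tags):
    """Keep devices matching any requirement tag; attach a collection method."""
    in_scope = []
    for d in devices:
        if d["tags"] & required_tags:
            in_scope.append({**d, "collect_via": COLLECTION[d["type"]]})
    return in_scope

monitored = scope(DEVICES, {"pci", "critical"})
print([d["name"] for d in monitored])  # ['fw-dmz-1', 'web-01']
```

The output of something like `scope()` is the artifact you then circulate for consensus – the devices left out are just as important to agree on as the ones left in.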
Device Type Variance
The process is the same whether you are scoping a firewall, IDS/IPS, or server. Obviously the techniques used to collect data vary by device type, so you’ll need to research each type separately, but the general process is the same.
Large vs. Small Company Considerations
These are generally the same as in the Enumerate process. The bigger the company, the more moving pieces, the harder requirements gathering is, and the more difficulty in getting a consensus. Got it? OK, that was a bit tongue in cheek, but as a security professional trying to get things done, you’ll need to figure out the level of research and consensus to attempt with each of these steps. Some folks would rather ask for forgiveness, but as you can imagine there are risks with that.
The good news is there is a lot of leverage in figuring out how to collect data from the various device types. Doing the research on collecting data from Windows or Linux servers is the same whether you have 15 or 1,500. The same for firewalls and IDS/IPS devices. But you’ll spend the extra time gaining consensus, right?
Tomorrow we’ll talk about defining the policies, which is more detailed due to the number of policies you need to define.
Posted at Tuesday 13th July 2010 8:15 pm
(3) Comments •
Fundamentally, tokenization is fairly simple. You are merely substituting a marker of limited value for something of greater value. The token isn’t completely valueless – it is important within its application environment – but that value is limited to the environment, or even a subset of that environment.
Think of a subway token or a gift card. You use cash to purchase the token or card, which then has value in the subway system or a retail outlet. That token has a one to one relationship with the cash used to purchase it (usually), but it’s only usable on that subway or in that retail outlet. It still has value, we’ve just restricted where it has value.
Tokenization in applications and databases does the same thing. We take a generally useful piece of data, like a credit card or Social Security Number, and convert it to a local token that’s useless outside the application environment designed to accept it. Someone might be able to use the token within your environment if they completely exploit your application, but they can’t then use that token anywhere else. In practical terms, this not only significantly reduces risks, but also (potentially) the scope of any compliance requirements around the sensitive data.
Here’s how it works in the most basic architecture:
- Your application collects or generates a piece of sensitive data.
- The data is immediately sent to the tokenization server – it is not stored locally.
- The tokenization server generates the random (or semi-random) token. The sensitive value and the token are stored in a highly-secured and restricted database (usually encrypted).
- The tokenization server returns the token to your application.
- The application stores the token, rather than the original value. The token is used for most transactions with the application.
- When the sensitive value is needed, an authorized application or user can request it. The value is never stored in any local databases, and in most cases access is highly restricted. This dramatically limits potential exposure.
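The basic flow above can be sketched in a few lines. This is a toy illustration only – a real tokenization server is a hardened, access-controlled product, not an in-memory dictionary – but it shows the essential properties: the token is random, the application keeps only the token, and recovery requires the server.

```python
import secrets

class TokenVault:
    """Toy stand-in for a tokenization server. A real deployment keeps the
    value<->token map in a hardened, encrypted, restricted datastore."""

    def __init__(self):
        self._by_value = {}   # sensitive value -> token
        self._by_token = {}   # token -> sensitive value

    def tokenize(self, value):
        if value in self._by_value:      # reuse the existing token
            return self._by_value[value]
        token = secrets.token_hex(8)     # random: nothing derivable from it
        self._by_value[value] = token
        self._by_token[token] = value
        return token

    def detokenize(self, token):
        # In practice this path is restricted to authorized callers only.
        return self._by_token[token]

vault = TokenVault()
pan = "4111111111111111"
tok = vault.tokenize(pan)
assert tok != pan                    # the application stores only the token
assert vault.tokenize(pan) == tok    # same value maps to the same token
assert vault.detokenize(tok) == pan  # recovery requires the vault
```

Because the token is generated randomly rather than derived from the value, there is no key to steal – which is the property the first requirement below depends on.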
For this to work, you need to ensure a few things:
- That there is no way to reproduce the original data without the tokenization server. This is different than encryption, where you can use a key and the encryption algorithm to recover the value from anywhere.
- All communications are encrypted.
- The application never stores the sensitive value, only the token.
- Ideally your application never even touches the original value – as we will discuss later, there are architectures and deployment options to split responsibilities; for example, having a non-user-accessible transaction system with access to the sensitive data separate from the customer facing side. You can have one system collect the data and send it to the tokenization server, another handle day to day customer interactions, and a third for handling transactions where the real value is needed.
- The tokenization server and database are highly secure. Modern implementations are far more complex and effective than a locked down database with both values stored in a table.
In our next posts we will expand on this model to show the architectural options, and dig into the technology itself. We’ll show you how tokens are generated, applications connected, and data stored securely; and how to make this work in complex distributed application environments.
But in the end it all comes down to the basics – taking something of wide value and replacing it with a token of restricted value.
Understanding and Selecting a Tokenization Solution:
Posted at Tuesday 13th July 2010 7:48 pm
(0) Comments •
We’ve seen an absolutely tremendous response to the data security survey we launched last month. As I write this we are up to 1,154 responses, with over 70% of respondents completing the entire survey. Aside from the people who took the survey, we also received some great help building the survey in the first place (especially from the Security Metrics community). I’m really loving this entire open research thing.
We’re going to close the survey soon, and the analysis will probably take me a couple weeks (especially since my statistics skills are pretty rudimentary). But since we have so much good data, rather than waiting until I can complete the full analysis I thought it would be nice to get some preliminary results out there.
First, the caveats. Here’s what I mean by preliminary:
These are raw results right out of SurveyMonkey. I have not performed any deeper analysis on them, such as validating responses, statistical analysis, normalization, etc. Later analysis will certainly change the results, and don’t take these as anything more than an early peek.
Got it? I know this data is dirty, but it’s still interesting enough that I feel comfortable putting it out there.
And now to some of the results:
We had a pretty even spread of organization sizes:
[Chart: respondent counts by organization size, from fewer than 100 to more than 50,000, for both number of employees/users and number of managed desktops]
- 36% of respondents have 1-5 IT staff dedicated to data security, while 30% don’t have anyone assigned to the job (this is about what I expected, based on my client interactions).
- The top verticals represented were retail and commercial financial services, government, and technology.
- 54% of respondents identified themselves as being security management or professionals, with 44% identifying themselves as general IT management or practitioners.
- 53% of respondents need to comply with PCI, 48% with HIPAA/HITECH, and 38% with breach notification laws (seems low to me).
Overall it is a pretty broad spread of responses, and I’m looking forward to digging in and slicing some of these answers by vertical and organization size.
Before digging in, I need to note a major design flaw in the survey. I didn’t allow people to select “none” as an option for the number of incidents, so “none” and “don’t know” are combined together, based on the comments people left on the questions. Considering how many people reviewed this before we opened it, this shows how easy it is to miss something obvious.
- On average, across major and minor breaches and accidental disclosures, only 20-30% of respondents were aware of breaches.
- External breaches were only slightly higher than internal breaches, with accidental disclosures at the top of the list. The numbers are so close that they will likely be within the margin of error after I clean them. This is true for major and minor breaches.
- Accidental disclosures were more likely to be reported for regulated data and PII than IP loss.
- 54% of respondents reported they had “About the same” number of breaches year over year, but 14% reported “A few less” and 18% “Many less”! I can’t wait to cross-tabulate that with specific security controls.
Security Control Effectiveness
This is the meat of the survey. We asked about effectiveness for reducing number of breaches, severity of breaches, and costs of compliance.
- The most commonly deployed tools (of the ones we surveyed) are email filtering, access management, network segregation, and server/endpoint hardening.
- Of the data-security-specific technologies, web application firewalls, database activity monitoring, full drive encryption, backup tape encryption, and database encryption are most commonly deployed.
- The most common write-in security control was user awareness.
- The top 5 security controls for reducing the number of data breaches were DLP, Enterprise DRM, email filtering, a content discovery process, and entitlement management. I combined the three DLP options (network, endpoint, and storage) since all made the cut, although storage was at the bottom of the list by a large margin. EDRM rated highly, but was the least used technology.
- For reducing compliance costs, the top 5 rated security controls were Enterprise DRM, DLP, entitlement management, data masking, and a content discovery process.
What’s really interesting is that when we asked people to stack rank their top 3 most effective overall data security controls, the results don’t match our per-control questions. The list then becomes:
- Access Management
- Server/endpoint hardening
- Email filtering
My initial analysis is that in the first questions we focused on a set of data security controls that aren’t necessarily widely used and compared between them. In the top-3 question, participants were allowed to select any control on the list, and the mere act of limiting themselves to the ones they deployed skewed the results. Can’t wait to do the filtering on this one.
We also asked people to rank their single least effective data security control. The top (well, bottom) 3 were:
- Email filtering
- USB/portable media encryption or device control
- Content discovery process
Again, these correlate with what is most commonly being used, so no surprise. That’s why these are preliminary results – there is a lot of filtering/correlation I need to do.
Security Control Deployment
Aside from the most commonly deployed controls we mentioned above, we also asked why people deployed different tools/processes. Answers ranged from compliance, to breach response, to improving security, to reducing costs.
- No control was primarily deployed to reduce costs. The closest was email filtering, at 8.5% of responses.
- The top 5 controls most often reported as being implemented due to a direct compliance requirement were server/endpoint hardening, access management, full drive encryption, network segregation, and backup tape encryption.
- The top 5 controls most often reported as implemented due to an audit deficiency are access management, database activity monitoring, data masking, full drive encryption, and server/endpoint hardening.
- The top 5 controls implemented for cost savings were reported as email filtering, server/endpoint hardening, access management, DLP, and network segregation.
- The top 5 controls implemented primarily to respond to a breach or incident were email filtering, full drive encryption, USB/portable media encryption or device control, endpoint DLP, and server/endpoint hardening.
- The top 5 controls being considered for deployment in the next 12 months are USB/portable media encryption or device control (by a wide margin), DLP, full drive encryption, WAF, and database encryption.
Again, all this is very preliminary, but I think it hints at some very interesting conclusions once I do the full analysis.
Posted at Tuesday 13th July 2010 3:33 am
(7) Comments •