Pragmatic Key Management: Understanding Data Encryption Systems

One of the common problems in working with encryption is getting caught up with the intimate details of things like encryption algorithms, key lengths, cipher modes, and other minutiae. Not that these details aren’t important – depending on what you’re doing they might be critical – but in the larger scheme of things these aren’t the aspects most likely to trip up your implementation. Before we get into different key management strategies, let’s take a moment to look at crypto systems at the macro level. We will stick to data encryption for this paper, but these principles apply to other types of cryptosystems as well.

Note: For simplicity I will often use “encryption” instead of “cryptographic operation” through this series. If you’re a crypto geek, don’t get too hung up… I know the difference – it’s for readability.

The three components of a data encryption system

Three major components define the overall structure of an encryption system:

The data: The object or objects to encrypt. It might seem silly to break this out, but the security and complexity of the system are influenced by the nature of the payload, as well as by where it is located or collected.

The encryption engine: The component that handles the actual encryption (and decryption) operations.

The key manager: The component that handles keys and passes them to the encryption engine.

In a basic encryption system all three components are likely located on the same system. Take personal full disk encryption (the default you might use on your home Windows PC or Mac) – the encryption key, data, and engine are all kept and run on the same hardware. Lose that hardware and you lose the key and data – and the engine, but that isn’t normally relevant. But once we get into SMB and the enterprise we tend to split out the components for security, management, reliability, and compliance.

Building a data encryption system

Where you place these components defines the structure, security, and manageability of your encryption system:

Full Disk Encryption

Our full disk encryption example above isn’t the sort of approach you would want to take for an organization of any size greater than 1. All major FDE systems do a good job of protecting the key if the device is lost, so we aren’t too worried about security from that perspective, but managing the key on the local system makes each system much less manageable and reliable than storing all the FDE keys together. Enterprise-class FDE manages the keys centrally – even if they are also stored locally – to enable a host of more advanced functions, including better recovery options, audit and compliance, and the ability to manage hundreds of thousands of systems.

Database encryption

Let’s consider another example: database encryption. By default, all database management systems (DBMS) that support encryption do so with the data, the key, and the encryption engine all within the DBMS. But you can mix and match those components to satisfy different requirements. The most common alternative is to pull the key out of the DBMS and store it in an external key manager. This protects the key from a compromise of the DBMS itself, and increases separation of duties and security. It also reduces the likelihood of lost keys and enables extensive management capabilities – including easier key rotation, expiration, and auditing. But the key could still be exposed to someone on the DBMS host itself, because it must be stored in memory before it can be used to encrypt or decrypt.
One way to protect against this is to pull both the encryption engine and the key out of the DBMS. This could be handled through an external proxy, but more often custom code is developed to send the data to an external encryption server or appliance. Of course this adds complexity and latency…

Cloud encryption

Cloud computing has given rise to a couple of additional scenarios. To protect an Infrastructure as a Service (IaaS) storage volume running at an external cloud provider, you can place the encryption engine in a running instance, store the data in a separate volume, and use an external key manager – which could be a hardware appliance connected through a VPN and managed in your own data center. To protect enterprise files in an object storage service like Amazon S3 or Rackspace Cloud Files, you can encrypt them on a local system before storing them in the cloud – managing keys either on the local system or with a centralized key manager. While some of these services support built-in encryption, they typically store and manage the keys themselves, which means the provider has the (hopefully purely theoretical) ability to access your data. But if you control the key and the encryption engine, the provider cannot read your files.

Backup and storage encryption

Many backup systems today include some sort of encryption option, but the implementations typically offer only the most basic key management. Backing up in one location and restoring in another may be a difficult prospect if the key is stored only in the backup system. Additionally, backup and storage systems themselves might place the encryption engine in any of a wide variety of locations – from individual disk and tape drives, to backup controllers, to server software, to inline proxies. Some systems store the key with the data – sometimes in special hardware added to the tape or drive – while others place it with the engine, and still others keep it in an external key management server. Between all this complexity and poor vendor implementations, I tend to see external key management used for backup and storage more than for just about any other data encryption use.

Application encryption

Our last example is application encryption. One of the more secure ways to encrypt application data is to collect it in the application, send it to an encryption server or appliance, and then store the encrypted data in a separate database. The keys
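To make the three-component model above concrete, here is a minimal Python sketch of an encryption engine kept separate from a key manager. This is purely illustrative rather than any product’s design: it assumes the third-party cryptography package, and the class names, the AES-GCM choice, and the in-memory key store are simplifications of my own.

    import os
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    class KeyManager:
        """Stands in for an external key manager: creates, stores, and hands out keys."""
        def __init__(self):
            # key_id -> key bytes; a real key manager would persist, protect, and audit these
            self._keys = {}

        def create_key(self, key_id):
            self._keys[key_id] = AESGCM.generate_key(bit_length=256)
            return key_id

        def get_key(self, key_id):
            return self._keys[key_id]

    class EncryptionEngine:
        """The component that performs the actual encryption and decryption operations."""
        def encrypt(self, key, plaintext):
            nonce = os.urandom(12)  # 96-bit nonce, the usual size for AES-GCM
            return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

        def decrypt(self, key, blob):
            nonce, ciphertext = blob[:12], blob[12:]
            return AESGCM(key).decrypt(nonce, ciphertext, None)

    # The data (the third component) lives wherever it lives: a disk, a database column, a file.
    key_manager = KeyManager()
    engine = EncryptionEngine()
    key_id = key_manager.create_key("customer-db")
    token = engine.encrypt(key_manager.get_key(key_id), b"4111-1111-1111-1111")
    print(engine.decrypt(key_manager.get_key(key_id), token))

Even in a toy like this the separation matters: the engine never generates or stores keys, so moving the KeyManager onto separate hardware or behind a network interface changes where keys live without touching the encryption code.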


Pragmatic Key Management: Introduction

Few terms strike as much dread in the hearts of security professionals as key management. Those two simple words evoke painful memories of massive PKI failures, with millions spent to send encrypted email to the person in the adjacent cube. Or perhaps they recall the head-splitting migraine you got when assigned to reconcile incompatible proprietary implementations of a single encryption standard. Or memories of half-baked product implementations that worked fine in isolation on a single system, but were effectively impossible to manage at scale. And by scale, I mean “more than one”.

Over the years key management has mostly been a difficult and complex process. This has been aggravated by the recent resurgence in encryption – driven by regulatory compliance, cloud computing, mobility, and fundamental security needs. Fortunately, encryption today is not the encryption of yesteryear. New techniques and tools remove much of the historical pain of key management – while also supporting new and innovative uses. We also see a change in how organizations approach key management – a move toward practical and lightweight solutions. In this series we will explore the latest approaches for pragmatic key management. We will start with the fundamentals of crypto systems rather than encryption algorithms, what they mean for enterprise deployments, and how to select a strategy that suits your particular project requirements.

The historic pain of key management

Technically there is no reason key management needs to be as hard as it has been. A key is little more than a blob of text to store and exchange as needed. The problem is that everyone implements their own methods of storing, using, and exchanging keys. No two systems worked exactly alike, and many encryption implementations and products didn’t include the features needed to use encryption in the real world – and still don’t. Many products with encryption features supported only their own proprietary key management – which often failed to meet enterprise requirements in areas such as rotation, backup, separation of duties, and reporting. Encryption is featured in many different types of products, but developers who plug an encryption library into an existing tool have (historically) rarely had enough experience in key management to produce refined, easy to use, and effective systems. On the other hand, some security professionals remember early failed PKI deployments that cost millions and provided little value. This was the opposite end of the spectrum – key management deployed for its own sake, without thought given to how the keys and certificates would be used.

Why key management isn’t as hard as you think it is

As with most technologies, key management has advanced significantly since those days. Current tools and strategies offer a spectrum of possibilities, all far better standardized and with much more robust management capabilities. We no longer have to deploy key management with an all-or-nothing approach, relying either completely on local management or on an enterprise-wide deployment. Increased standardization (powered in large part by KMIP, the Key Management Interoperability Protocol) and improved, enterprise-class key management tools make it much easier to fit deployments to requirements. Products that implement encryption now tend to include better management features, with increased support for external key management systems when those features are insufficient.
We now have smoother migration paths which support a much broader range of scenarios. I am not saying life is now perfect. There are plenty of products that still rely on poorly implemented key management and don’t support KMIP or other ways of integrating with external key managers, but fortunately they are slowly dying off or being fixed under constant customer pressure. Additionally, dedicated key managers often support a range of non-standards-based integration options for those laggards. It isn’t always great, but it is much easier to manage keys now than even a few years ago.

The new business drivers for encryption and key management

These advances are driven by increasing customer use of, and demand for, encryption. We can trace this back to three primary drivers:

Expanding and sustained regulatory demand for encryption. Encryption has always been hinted at by a variety of regulations, but it is now mandated in industry compliance standards (most notably the Payment Card Industry Data Security Standard – PCI-DSS) and certain government regulations. Even when it isn’t mandated, most breach disclosure laws reduce or eliminate the need to publicly report the loss of client information if the lost data was encrypted.

Increasing use of cloud computing and external service providers. Customers of cloud and other hosting providers want to protect their data when they give up physical control of it. While the provider often has better security than the customer, this doesn’t reduce our visceral response to someone else handling our sensitive information.

The increase in public data exposures. While we can’t precisely quantify the growth of actual data loss, it is certainly far more public than it has ever been before. Executives who previously ignored data security concerns are now asking security managers how to stay out of the headlines.

More enforcement of more regulations, increasing use of outsiders to manage our data, and increasing awareness of data loss problems are all combining to produce the greatest growth the encryption market has seen in a long time.

Key management isn’t just about encryption (but that is our focus today)

Before we delve into how to manage keys, it is important to remember that cryptographic keys are used for more than just encryption, and that there are many different kinds of encryption. Our focus in this series is on data encryption – not digital signing, authentication, identity verification, or other crypto operations. We will not spend much time on digital certificates, certificate authorities, or other signature-based operations. Instead we will focus on data encryption, which is only one area of cryptography. Much of what we see is as much a philosophical change as an improvement in particular tools or techniques. I have long been bothered by people’s tendency to either indulge in encryption idealism at one end, or dive


White Paper: Understanding and Selecting a Database Security Platform

We are pleased to announce the availability of a new research paper, Understanding and Selecting Database Security Platforms. This paper covers most of the facets of database security today. We started to refresh our original Database Activity Monitoring paper in October 2011, but stopped short when our research showed that platform evolution had stopped converging – and has instead diverged again to embrace independent visions of database security and splintering customer requirements. We decided our original DAM research was becoming obsolete. Use cases have evolved and vendors have added dozens of new capabilities – they have covered the majority of database security requirements, and expanded out into other areas. These changes are so significant that we needed to seriously revisit our use cases and market drivers, and delve into the different ways preventative and detective data security technologies have been bundled with DAM to create far more comprehensive solutions. We have worked hard to fairly represent the different visions of how database security fits within enterprise IT, and to show the different value propositions offered by these variations. These fundamental changes have altered the technical makeup of products so much that we needed new vocabulary to describe them. The new paper is called “Understanding and Selecting Database Security Platforms” (DSP) to reflect these major product and market changes. We want to thank our sponsors for the Database Security Platform paper: Application Security Inc, GreenSQL, Imperva, and McAfee. Without sponsors we would not be able to provide our research for free, so we deeply appreciate that several vendors chose to participate in this effort and endorse our research positions. You can download the DSP paper.


Incite 5/30/2012: Low Hanging Fruit

As you might have noticed, there was no Incite last week. Turns out the Boss and I were in Barcelona to celebrate 15 years of wedded bliss. We usually run about 6 months late on everything, so the timing was perfect. We had 3 days to ourselves and then two other couples from ATL joined us for the rest of the week. We got to indulge our appreciation for art – hitting the Dali, Miro, and Picasso museums. We also saw some Gaudi structures that are just mind-boggling. Then we joked about how Americans are not patient enough to ever build anything like the Sagrada Familia.

Even though we were halfway around the world, we weren’t disconnected. Unless we wanted to be. I rented a MiFi, so when we checked in (mostly with the kids) we just fired up the MiFi and used Skype or FaceTime back home. Not cheap, but cheaper than paying for expensive WiFi and cellular roaming. And it was exceedingly cool to be walking around the Passion Facade of the Sagrada Familia, showing the kids the sculptures via FaceTime, connected via a MiFi on a broadband cellular network in a different country.

We took it slow and enjoyed exploring the city, tooling around the markets, and feasting on natural Catalan cooking – not the mixture of additives, preservatives, and otherwise engineered nutrition we call food in the US. And we did more walking in a day than we normally do in a week. We also relaxed. It’s been a pretty intense year so far, and this was our first opportunity to take a breath and enjoy the progress we have made.

But real life has a way of intruding on even the most idyllic situations. As we were enjoying a late lunch at a cafe off Las Ramblas, our friends mentioned how it had been a little while since they were online. We had already had the discussion about weak passwords on their webmail accounts as we enjoyed cervezas at Park Güell the day before. Their name and a single-digit number may be easy to remember, but it’s not really a good password. When my friend then told me how he checked email from a public computer in London, I braced for what I knew was likely to come next. So I started interrogating him about what he uses that email address for. Bank accounts? Brokerage sites? Utilities? Airlines? Commerce sites? No, no, and no. OK, I can breathe now. Then I proceeded to talk about how losing control of your email can result in a bad day. I thought we were in the clear. Then my buddy’s wife piped in, “Well, I checked my bank account from that computer also, was that bad?” Ugh. Well, yes, that was bad. Quite bad indeed.

Then I walked them through how a public computer usually has some kind of key logger, and accessing a sensitive account from that device isn’t something you want to do. Ever. She turned ashen and started to panic. To avoid borking the rest of my holiday, I had her log into her account via the bank’s iOS app and scrutinize the transactions. Nothing out of the ordinary, so we all breathed a sigh of relief. She couldn’t reset the password from that app and none of us had a laptop with us. But she promised to change the password immediately when she got back to the US.

It was a great reminder of the low-hanging fruit out there for attackers. It’s probably not you, but it’s likely to be plenty of folks you know. Which means things aren’t going to get better anytime soon, though you already knew that.

–Mike

Photo credits: “Low-hanging fruit explained” originally uploaded by Adam Fagen

Heavy Research

We’re back at work on a variety of blog series, so here is a list of the research currently underway.
Remember you can get our Heavy Feed via RSS, with all our content in its unabridged glory. And you can get all our research papers too.

Understanding and Selecting Data Masking: How It Works
Understanding and Selecting Data Masking: Defining Data Masking
Understanding and Selecting Data Masking: Introduction
Evolving Endpoint Malware Detection: Control Lost

Incite 4 U

Bear hunting for security professionals: Fascinating post by Chris Nickerson about Running from your Information Security Program. How else could you integrate bear hunting in Russia (yes, real bears), running, and security? He talks about how these Russian dudes take down bears with nothing more than a stick and a knife. Probably not how you’d plan to do it, right? Chris’ points are well taken, especially challenging the adage about not needing to be totally secure – just more secure than the other guys. That’s what I love about pen testers – they question everything, challenge assumptions, and spend a great deal of their lives proving those assumptions wrong. The answer? Plan for the inevitable attacks and make sure you can respond. Yes, it’s something lots of folks (including us) have been talking about for a long time. Though I do enjoy highlighting new and interesting ways to tell important stories. – MR

Job security: Say you’re the CISO of a retail chain. Do you think you’d be fired if 10% of your transactions were hacked and resulted in fraud? Maybe you should consider working for the IRS, because apparently gigantic fraud rates not only don’t get you fired there – you get sympathetic press. I bet the guys at Global Payments and Heartland are jealous! And someone at the IRS actually thought that anonymous Internet tax filings, with subsequent anonymous distribution of refunds, was a great idea. I’m willing to bet that not only is whoever created the program still working at the IRS (where else?), but they will keep the program as is. There are occasions where it’s better to ditch fundamentally flawed processes – and losing millions, if not hundreds of millions, of dollars is a good indicator that your process still has a few glitches – and start over. Most


Understanding and Selecting Data Masking: How It Works

In this post I want to show how masking works, focusing on how masking platforms move and manipulate data. I originally intended to start with the architectures and mechanics of masking systems, but it will be more helpful to start by describing the different masking models, how data flows through different systems, and the advantages and disadvantages of each. I will comment on common data sources and destinations, and the issues to consider when evaluating masking technology. There are many different types of data repositories and services which can be masked, so I will go into detail on these choices. For now we will stick to relational databases, to keep things simple. Let’s jump right in and discuss how the technology works.

ETL

When most people think about masking, they think about ETL. ‘ETL’ is short for Extraction-Transformation-Load – a concise description of the classic (and still most common) masking process. Sometimes referred to as ‘static’ masking, ETL works against a fixed export from the source repository. Each phase of ETL is typically performed on separate servers: a source data repository, a masking server that orchestrates the transformation, and a destination database. The masking server connects to the source, retrieves a copy of the data, applies the mask to specified columns of data, and then loads the result onto the target server. This process may be partially manual, fully driven by an administrator, or fully automated. Let’s examine the steps in greater detail:

Extract: The first step is to ‘extract’ the data from some storage repository – most often a relational database. The extracted data is often formatted to make it easier for the mask to be applied. For example, extraction can be performed with a simple SELECT query issued against a database, filtering out unwanted rows and formatting columns in the query. Results may be streamed directly to the masking application for processing or dumped into a file – such as a comma-separated .csv or tab-separated .tsv file. The extracted data is then securely transferred, as an encrypted file or over an encrypted SSL connection, to the masking platform.

Transform: The second step is to apply the data mask, transforming sensitive production data into a safe approximation of the original content. See Defining Masking for the available transformations. Masks are almost always applied to what database geeks call “columnar data” – which simply means data of the same type is grouped together. For example, a database may contain a ‘customer’ table, where each customer entry includes a Social Security number (SSN). These values are grouped together into a single column, in files and databases alike, making it easier for the masking application to identify which data to mask. The masking application parses through the data, and for each column to be masked, it replaces each entry in the column with a masked value.

Load: In the last step masked data is loaded into a destination database. The masked data is copied to one or more destination databases, where it is loaded back into tables. The destination database does not contain sensitive data, so it is not subject to the same security and audit requirements as the original database with the unmasked data.

ETL is the most generic and most flexible of masking approaches. The logical ETL process flow is implemented in everything from dedicated masking platforms, data management tools with integrated masking and encryption libraries, and embedded database tools – all the way down to home-grown scripts.
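As a rough illustration of these three steps, here is a minimal Python sketch of a home-grown ETL mask, the very bottom of that chain. It assumes SQLite for both the source and destination, and an illustrative customer table with name and ssn columns; a real masking platform adds data discovery, format handling, secure transfer, scheduling, and auditing on top of this.

    import sqlite3

    def mask_ssn(ssn):
        """Redaction-style mask: keep the familiar format, drop the sensitive digits."""
        return "XXX-XX-" + ssn[-4:]

    def etl_mask(source_path, dest_path):
        source = sqlite3.connect(source_path)
        dest = sqlite3.connect(dest_path)
        dest.execute("CREATE TABLE IF NOT EXISTS customer (name TEXT, ssn TEXT)")

        # Extract: pull only the rows and columns of interest from the source repository.
        rows = source.execute("SELECT name, ssn FROM customer").fetchall()

        # Transform: replace each entry in the sensitive column with a masked value.
        masked_rows = [(name, mask_ssn(ssn)) for name, ssn in rows]

        # Load: write the masked copy into the destination database.
        dest.executemany("INSERT INTO customer (name, ssn) VALUES (?, ?)", masked_rows)
        dest.commit()
        source.close()
        dest.close()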
I see all of these options used in production environments, with the level of skill and labor required increasing as you progress down the chain. While many masking platforms replicate the full process – performing extraction, masking, and loading on separate systems – that is not always the case. Here are some alternative masking models and processes.

In-place Masking

In some cases you need to create a masked copy within the source database – perhaps before moving it to another, less sensitive database. In other cases the production data is moved unchanged (securely!) into another system, and then masked at the destination. When production data is discovered on a test system, the data may be masked without being moved at all. All these variations are called “in-place masking” because they skip some or all of the movement steps. The masks are applied as before, but inside the database – which raises its own security and performance considerations.

There are very good reasons to mask in place. The first is to take advantage of databases’ facility with management and manipulation of data. They are incredibly adept at data transformation, and offer very high masking performance. Leveraging built-in functions and stored procedures can speed up the masking process because the database has already parsed the data. Masking data in place – replacing data rather than creating a new copy – protects database archives and data files from snooping, should someone access backup tapes or raw disk files. If the security of data after it leaves the production database is your principal concern, then ETL or in-place masking prior to moving data to another location should satisfy security and audit requirements. Many test environments have poor security, which may require masking prior to export, or use of a secure ETL exchange, to ensure sensitive data is never exposed on the network or in the destination data repository.

That said, among the enterprise customers we have interviewed, masking data at the source (in the production database) is not a popular option. The additional computational overhead of the masking operation, on top of the overhead required to read and write the data being transformed, may have an unacceptable impact on database performance. In many organizations legacy databases struggle to keep up with day-to-day operation, and cannot absorb the additional load. Masking in the target database (after the data has been moved) is not very popular either – masking solutions are generally purchased to avoid putting sensitive data on insecure test systems, and such customers prefer to avoid loading data into untrusted test systems prior to masking. In-place masking is typically
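For contrast, here is an equally minimal sketch of the in-place variant: the mask is applied inside the database with a single UPDATE, and there is no extract or load step. It reuses the illustrative SQLite customer table from the ETL sketch above; a production deployment would more likely lean on the DBMS’s own built-in functions or stored procedures, for the performance reasons discussed earlier.

    import sqlite3

    def mask_in_place(db_path):
        """Apply a redaction-style mask to the ssn column directly inside the database."""
        conn = sqlite3.connect(db_path)
        # Replace the sensitive digits while preserving the format, in the table itself.
        conn.execute("UPDATE customer SET ssn = 'XXX-XX-' || substr(ssn, -4)")
        conn.commit()
        conn.close()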


Security, Metrics, Martial Arts, and Triathlon: a Meandering Friday Summary

Rich here. One of the more fascinating – and unexpected – aspects of migrating from martial arts to triathlon as my primary sport has been the important role of metrics, and how they have changed my views on security. Both sports are pretty darn geeky. On the martial arts side we have intense history, technique, and strategy. Positional errors of a fraction of an inch can mean the difference between success, failure, and injury. But overall there is less emphasis on hard metrics. We use them for conditioning, but lack much of the instrumentation needed to collect the kinds of metrics that can make the difference between victory and defeat in competition. For example, very few martial artists could gather hard statistics on how an opponent reacts under specific circumstances, never mind translating that into a specific strategy. Nor do we measure things like speed and power in specific physical configurations. Some martial artists track some fraction of this at a macro level, but generally not with statistical depth.

I remember that when training for nationals I knew I would be up against one particular opponent, and I studied his strengths, weaknesses, and reactions in certain situations, but I certainly didn’t calculate anything. Besides, some 16 year old kid kicked my ass in the first round and I never went up against the person I planned for (major nutritional failure on my part). Oops. A lot of strategy. Sometimes metrics, but not often and not solid. And a lot of reliance on instinct and core training. Sounds a lot like security.

Triathlon is on the opposite end of the spectrum – as are most endurance sports. There is definitely strategy, but even that is defined mostly by raw numbers. I have been tracking my athletic performance metrics fairly intensely since I moved mostly to endurance sports (due to the kids). This started around 10 years ago, although only over the last 3 years have I really focused on it. Additionally, since getting sick last summer I have also started tracking all sorts of other metrics – mostly my daily movements (Jawbone Up, which isn’t available right now) and sleep (Zeo). For the past year I have kept most of this in TrainingPeaks.

I’m learning more about myself than I thought possible. I know what paces I can sustain, and at what distances, to within a handful of seconds. I know how those are affected by different weather conditions. I know exactly how what I eat affects how I perform different kinds of workouts. I know how food, exercise, and alcohol affect my sleep. I have learned things like how to dial in my diet (no carb no good, but mostly natural with a small amount of processed carbs hits the sweet spot). I know how many days I can go on reduced sleep before I am more likely to get sick. I even figured out just about exactly what will cause one of the stomach incidents that freaked me out so badly last year.

I pretty much track myself 24/7. The Jawbone counts how much I move during the day. The Zeo, how well I sleep. My Garmin 910XT, how well I swim, bike, and run. A Withings scale for weight and body fat. And TrainingPeaks for mood, illness, injury, training stress (mathematically calculated from my workouts), and whatever else I want to put in there. (I have toyed with diet, but don’t really track calories yet.) I measure, track over time, and then correlate to make training and lifestyle decisions. These are not theoretical – I use those metrics to change how I live, and then I track my outcomes.
I know, for example, that I can optimize my training for the amount of time I have for triathlon, but my single-sport performance drops to predictable degrees. All this for someone back-of-pack and over 40. The pros? The levels to which they can tune their lives and training are insane. And it all directly affects performance and their ability to win.

But, as with everything, the numbers don’t tell the full story. They can’t precisely predict who will win on race day. Maybe the leader will get caught behind a crash. Maybe they’ll miss just enough sleep, or hit a crosswind at the wrong time, or just have an off day. Maybe someone else will dig deep and blow past everything the numbers predict. But without those numbers, tracked and acted on for years on end, no pro would ever have a chance of being in the race.

Security today is a lot more like martial arts than triathlon, but I’m starting to think the ratio is skewed in the wrong direction. We can track a lot more than we do, and base far more decisions on data than on instinct. Yes, we are battling an opponent, but our race lasts years – not three five-minute rounds. And unlike professional martial artists, we don’t even know our ideal fighting weight, never mind our conditioning level. Believe it or not, I wasn’t always a metrics wonk. I used to think skill and instinct mattered more than anything else. The older I get, the more I realize how very wrong that is.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

Mike’s monthly Dark Reading blog: Time to deploy the FUD weapon?
Rich quoted by the Macalope: The Macalope Daily: Protesting too much (Subscription required).

Favorite Securosis Posts

Adrian Lane: Evolving Endpoint Malware Detection: Control Lost. New threats and redefining what ‘endpoint’ actually means are a couple good reasons to follow this series.
Mike Rothman: Understanding and Selecting Data Masking: Introduction. Masking is a truly under-appreciated function. Until your production data shows up in an Internet-accessible cloud instance, that is. Adrian’s series should shed some light on this topic.
Rich: Continuous Learning. I’m not sure my quote fit here, but I sure am a fan of people diversifying their knowledge.

Other Securosis Posts

Our posting volume is down a bit due to


Understanding and Selecting Data Masking: Defining Data Masking

Before I start today’s post, thank you for all the letters saying that people are looking forward to this series. We have put a lot of work into this research to ensure we capture the state of currently available technology, and we are eager to address this under-served market. As always, we encourage blog comments, because they help readers understand viewpoints that we may not reflect in the posts proper. And for the record, I’m not knocking Twitter debates – they are useful as well, but they’re more ephemeral and less accessible to folks outside the Twitter cliques – not everybody wants to follow security geeks like me. I also apologize for our slow start since the initial launch – between meeting with vendors, some medical issues, and client off-site meetings, I’m a bit behind. But I have collected all the data I think is needed to do justice to this subject, so let’s get rolling! In today’s post I will define masking and show the basics of how it works.

First, a couple of basic terms with their traditional definitions:

Mask: Similar to the traditional definition of a facade or a method of concealment, a data mask is a function that transforms data into something similar but new. It may or may not be reversible.

Obfuscation: Hiding the original value of data.

Data Masking Definition

Data masking platforms at minimum replace sensitive data elements in a data repository with similar values, and optionally move the masked data to another location. Masking effectively creates proxy data which retains part of the value of the original. The point is to provide data that looks and acts like the original data, but which lacks sensitivity and doesn’t pose a risk of exposure, enabling the use of reduced security controls for masked data repositories. This in turn reduces the scope and complexity of IT security efforts. The mask should make it impossible or impractical to reverse engineer masked values back to the original data without special additional information. We will cover additional deployment models and options later in this series, but the following graphic provides an overview:

Keep in mind that ‘masking’ is a generic term, and it encompasses several possible data masking processes. In a broader sense data masking – or just ‘masking’ for the remainder of this series – encompasses collection of data, obfuscation of data, storage of data, and possibly movement of the masked information. But ‘mask’ is also used in reference to the masking operation itself – how we change the original data into something else. There are many different ways to obfuscate data, depending on the type of data being stored, each embodied by a different function, and each suitable for different security and data use cases. It might be helpful to think of masking in terms of Halloween masks: the level of complexity and degree of concealment both vary, depending upon the effect desired by the wearer.

The following is a list of common data masks used to obfuscate data, and how their functions differ (a short code sketch of a few of these masks follows the list):

Substitution: Substitution is simply replacing one value with another. For example, the mask might substitute a person’s first and last names with names from a random phone book entry. The resulting data still constitutes a name, but has no logical relationship with the original real name unless you have access to the original substitution table.

Redaction/Nulling: This is a form of substitution where we simply replace sensitive data with a generic value, such as ‘X’.
For example, we could replace a phone number with (XXX) XXX-XXXX, or a Social Security Number (SSN) with XXX-XX-XXXX. This is the simplest and fastest form of masking, but the result retains very little (arguably none) of the information from the original.

Shuffling: Shuffling is a method of randomizing existing values vertically across a data set. For example, shuffling the individual values in a salary column from a table of employee data would make the table useless for learning what any particular employee earns, but it would not change the aggregate or average values for the table. Shuffling is a common randomization technique for disassociating sensitive data relationships (e.g., Bob makes $X per year) while retaining aggregate values.

Transposition: This means to swap one value with another, or a portion of one string with another. Transposition can be as complex as an encryption function (see below) or as simple as swapping the first four digits of a credit card number with the last four. There are many variations, but transposition usually refers to a mathematical function which moves existing data around in a consistent pattern.

Averaging: Averaging is an obfuscation technique where individual numeric values are replaced by a value derived by averaging some portion of the individual values. In our salary example above, we could substitute individual salaries with the average across a group or corporate division, hiding individual salary values while retaining an aggregate relationship to the real data.

De-identification: A generic term that applies to any process that strips identifying information, such as who produced the data set, or personal identities within the data set. De-identification is an important topic when dealing with complex, multi-column data sets that provide ample means for someone to reverse engineer masked data back into individual identities.

Tokenization: Tokenization is the substitution of data elements with random placeholder values, although vendors overuse the term ‘tokenization’ for a variety of other techniques. Tokens are non-reversible because the token bears no logical relationship to the original value.

Format Preserving Encryption: Encryption is the process of transforming data into an unreadable state. For any given value the process consistently produces the same result, and it can only be reversed with special knowledge (the key). While most encryption algorithms produce strings of arbitrary length, format preserving encryption transforms the data into an unreadable state while retaining the format (overall appearance) of the original values.

Each of these mask types excels in some use cases, and each of course incurs a certain amount of overhead due to its
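As promised above, here is a toy Python sketch of three of these masks: substitution, redaction/nulling, and shuffling. The tiny substitution table and the sample salaries are illustrative only; real products draw on much larger lookup sets and take care to preserve referential integrity across tables.

    import random

    # Stand-in for a phone-book-style substitution table.
    SUBSTITUTE_NAMES = ["Alice Adams", "Bob Baker", "Carol Cooper", "Dave Diaz"]

    def substitute_name(original_name):
        """Substitution: replace the real name with a plausible but unrelated one."""
        return random.choice(SUBSTITUTE_NAMES)

    def redact_ssn(original_ssn):
        """Redaction/Nulling: replace the value with a generic placeholder."""
        return "XXX-XX-XXXX"

    def shuffle_column(values):
        """Shuffling: keep the set of values (and its aggregates) but break the link
        between any single value and its original row."""
        shuffled = list(values)
        random.shuffle(shuffled)
        return shuffled

    salaries = [52000, 87000, 61000, 120000]
    masked_salaries = shuffle_column(salaries)
    print(masked_salaries, sum(masked_salaries) == sum(salaries))  # aggregate is preserved
    print(substitute_name("Jane Smith"), redact_ssn("123-45-6789"))

Note that the shuffled salary column still sums to the same total, which is exactly why shuffling is the mask of choice when aggregate analysis must survive the masking process.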


Evolving Endpoint Malware Detection: Control Lost

Today we start our latest blog series, which we are calling Evolving Endpoint Malware Detection: Dealing with Advanced and Targeted Attacks – a logical next step from much of the research we have already done on the evolution of malware and the emerging controls to deal with it. We started a few years back by documenting Endpoint Security Fundamentals, and more recently looked at network-based approaches to detecting malware at the perimeter. Finally, we undertook the Herculean task of decomposing the processes involved in confirming an infection, analyzing the malware, and tracking its proliferation with our Malware Analysis Quant research.

Since you were a wee lad in the security field, the importance of layered defense has been drummed into your head. No one control is sufficient. In fact, no set of controls is sufficient to stop the kinds of attacks we see every day. But by stacking as many complementary controls as you can (without totally screwing up the user experience), you can make it hard enough for the attackers that they go elsewhere, looking for lower hanging fruit. Regardless of how good defense in depth sounds, the reality is that with the advent of increased mobility we need to continue protecting the endpoint, as we generally can’t control the location or network being used. Obviously no one would say our current endpoint protection approaches work particularly well, so it’s time to critically evaluate how to do it better. But that’s jumping ahead a bit. First let’s look at the changing requirements before we vilify existing endpoint security controls.

Control Lost

Sensitive corporate data has never been more accessible. Between PCs, smartphones, and cloud-based services (Salesforce.com, Jive, Dropbox, etc.) designed to facilitate collaboration, you cannot assume any device – even those you own and control – isn’t accessing critical information. Just think about how your personal work environment has changed over the past couple of years. You store data somewhere in the cloud. You access corporate data on all sorts of devices. You connect through a variety of networks, some ‘borrowed’ from friends or local coffee shops. We once had control of our computing environments, but that’s no longer the case. You can’t assume anything nowadays. The device could be owned by the employee, and/or your CFO’s kid could surf anywhere on a corporate laptop. Folks connect through hotel networks and any number of other public avenues. Obviously this doesn’t mean you should (or can) just give up and stop worrying about controlling your internal networks. But you cannot assume your perimeter defenses, with their fancy egress filtering and content analysis, are in play. And just in case the lack of control over the infrastructure isn’t unsettling enough, you still need to consider the user factor. You know, the unfortunate tendency of employees to click pretty much anything that looks interesting. Potentially contracting all sorts of bad stuff, bringing it back into your corporate environment, and putting data at risk. Again, we have to fortify the endpoint to the greatest degree possible.

Advancing Adversaries

The attackers aren’t making things any easier. Today’s professional malware writers have gotten ahead of these trends by using advanced malware (remote access trojans [RATs] and other commercial malware techniques) to defeat traditional endpoint defenses.
It is well established that traditional file-matching approaches (on both endpoints and mail and web gateways) no longer effectively detect these attacks – due to techniques such as polymorphism, malware droppers, and code obfuscation. Even better, you cannot expect to see an attack before it hits you. Whether it’s a rapidly morphing malware attack or a targeted attempt, yesterday’s generic sample-gathering processes (honeynets, WildList, etc.) don’t help, because these malware files are unique and customized to the target. Vendors use the generic term “zero day” for malware you haven’t seen, but the sad reality is that you haven’t seen anything important that’s being launched at you. It’s all new to you.

When we said professional malware writers, we weren’t kidding. The bad guys now take an agile software approach to building their attacks. They have tools to develop and test the effectiveness of their malware, and are even able to determine whether existing malware protection tools will detect their attacks. Even coordinated with reputation systems and other mechanisms for detecting zero-day attacks, today’s solutions are just not effective enough. All this means security practitioners need new tactics for detecting and blocking malware which targets their users.

Evolving Endpoint Malware Detection

The good news is that endpoint security vendors realized a few years back that their traditional approaches were about as viable as dodo birds. They have been evolving their approaches – the resulting products have reduced footprints, require far fewer computing resources, and are generally decent at detecting simple attacks. But as we have described, simple attacks aren’t the ones to worry about. So in this series we will investigate how endpoint protection will evolve to better detect, and hopefully block, the current wave of attacks. We will start the next post by identifying the behavioral indicators of a malware attack. Like a poker player, every attack has its own ‘tells’ that enable you to recognize bad stuff happening. Then we will describe and evaluate a number of different techniques to identify these ‘tells’ at different points along the attack chain. Finally, we will wrap up with a candid discussion of the trade-offs involved in dealing with this advanced malware. You can stop these attacks, but the cure may be worse than the disease. So we will offer suggestions for finding the equilibrium point between detection, response, and user impact.

We would like to thank the folks at Trusteer for sponsoring this blog series. As we have mentioned before, you get to enjoy our work for a pretty good price because forward-thinking companies believe in educating the industry in a vendor-neutral and objective fashion.


Continuous Learning

I referred back to the Pragmatic CSO tips when I started the Vulnerability Management Evolution series (the paper hit yesterday, by the way), and there was some good stuff in there, so let me once again dust off those old concepts and highlight another one. This one deals with the reality that you are a business person, not a security person.

When I first meet a CSO, one of the first things I ask is whether they consider themselves a “security professional” or a “finance/health care/whatever other vertical professional.” 8 out of 10 times they respond “security professional” without even thinking. I will say that it’s closer to 10 out of 10 with folks who work in larger enterprises. These folks are so specialized they figure a firewall is a firewall is a firewall, and they could do it for any company. They are wrong. One of the things preached in the Pragmatic CSO is that security is not about firewalls, or any technology for that matter. It’s about protecting the systems (and therefore the information assets) of the business, and you can bet there is a difference between how you protect corporate assets in finance and in consumer products. In fact there are lots of differences between doing security in most major industries. They are different businesses, they have different problems, they tolerate different levels of pain, and they require different funding models.

To put it another way, a health care CSO said it best to me. When I asked him the question, his response was “I’m a health care IT professional that happens to do security.” That was exactly right. He spent years understanding the nuances of protecting private information and how HIPAA applies to what he does. He understood how claims information is sent electronically between providers and payers. He got the BUSINESS, and then was able to build a security strategy to protect the systems that are important to the business.

So let’s say you actually buy into this line of thinking. You spend a bunch of time learning about banking, since you work for a bank. Or manufacturing, since your employer makes widgets. It’s all good, right? Well, not so much. What happens when your business changes? Maybe not fundamentally, but partially? You have to change with it.

Let me give you an example that’s pretty close to home. My Dad’s wife is a candy importer. She sources product from a variety of places and sells under her own brand in the US, or under the manufacturer’s brand when that makes sense. We were talking recently and she said they had a good year in 2011. I figured that was the insatiable demand for sweets driving the business (fat Americans pay her bills), but in fact it was a couple of savvy currency hedges that drove the additional profits. That’s right: the candy importer is actually a currency trader. Obviously that means she has to deal with all sorts of other data types that don’t pertain to distributing candy, and that data needs to be protected differently.

That example is pretty simple, but what if you thought you were in the transportation business, and then your employer decided to buy a refinery? Yes, Delta is now in the refining business. So their security team, who know all about protecting credit cards and ensuring their commerce engines (web site and reservation systems) don’t fall over under attack, now get to learn all about the attack surface of critical infrastructure. Obviously huge conglomerates in unrelated businesses roamed the earth back in the 80s, fueled by Milken-generated junk bonds and hostile takeovers.
Then the barbarians at the gates were slain, and the pendulum swung back toward focus and scale for the past couple of decades. It should be no surprise when we inevitably swing back the other way – as we always do. It’s a good thing that security folks are naturally curious. As Rich posted in our internal chat room yesterday:

“I can’t remember a time in my life when I didn’t poke and prod. You can’t be good at security if you think any other way.” – Rich Mogull

If you aren’t comfortable with the realization that no matter how much you know, you don’t know jack, you won’t last very long in the security business. Or any business, for that matter.

Photo credit: “Learning by Doing” originally uploaded by BrianCSmith


Friday Summary: May 18, 2012

A friend told me this week that he was on Pinterest. I responded, “I’m sorry! How long does your employer allow you to take off?” I seriously thought this was something like paternity leave or one of those approved medical absence programs. I really wondered when he got sick, and what his prognosis was. He told me, “No, I’m on Pinterest to market my new idea.” WTF? Turns out it’s not a medical sabbatical, but another social media ‘tool’ for sharing photos and stuff.

When I Googled Pinterest to find out what the heck it actually was, I found a long blog post about the merits of using Pinterest for Engagement Marketing, which happened to be at the blog of an old friend’s company. Soon thereafter I fired up Skype and was chatting with him, finding out what he’d been up to, how the kids were, and which mutual friends he had seen. That led to a LinkedIn search to find those friends, and while looking I spotted a couple other people I had lost track of. Within minutes I’d emailed one and found the other on Twitter. My friend on Twitter told me to check her blog on marketing over social media, which referenced another mutual friend. I emailed him, and when I hit ‘send’ I received a LinkedIn update with a list of several friends who recently changed jobs. I messaged one and texted the other to congratulate them. The next thing I knew I was chatting on FaceTime with one of these friends, in a pub in London celebrating his new position. We talked for a while, and then he said he had run into a fraternity brother and texted me his email address. I emailed the fraternity brother, who sent back a LinkedIn invite telling me he’d Skype me later in the day, and included a funny YouTube video of Darth Vader riding a unicycle while playing bagpipes. As I watched the bagpiping maniac, a Skype message popped up from another friend telling me she’s changed jobs (and have you noticed all the people in tech changing jobs recently?). She invited me to speak at an event for her new company, listed on Meetup. I declined, sending her the GoToMeeting link to a conflicting event, but told her I’ll be in town later in the week and sent her a calendar invite for lunch. She sent back a list of Yelp recommendations for where to go. All in about an hour one morning.

For an asocial person, this whole social media thing seems to have permeated my life. It’s freakin’ everywhere. In case you hadn’t heard, Facebook is making its Initial Public Offering right about now. But love them or hate them, each social media site seems to do one thing really well! LinkedIn is a really great way to keep in touch with people. No more shoebox full of business cards for me! And it’s totally blending work and home, and combining groups of friends from different periods of my life into one ever-present pool. Twitter is an awesome way to casually chat in real time with a group of friends while getting work done. BeeJive lets me chat on my mobile phone with the guys at Securosis. Skype offers cheap calls of reasonable quality to anyone. Some companies actually do follow Twitter with live human beings and respond to customer complaints, which is great. And Facebook offers a great way to infect your browser with malware!

That said, every social media site still sucks hard. I’m not talking about users making asses of themselves, but about how every site tries too hard to be more than a one-trick pony, offering stuff you don’t want. I guess they are trying to increase shareholder value or some such nonsense rather than serve their audience.
Skype was trying to branch out with their ‘mood’ feature – who thought that crap was a good idea? And now Pinterest is copying that same bad idea? Facebook Social Cam? Or LinkedIn communities, which seem to be a cesspool of bad information and people “positioning themselves” for employment. Corporate Twitter spambots are bad, but they’re not the worst – not by a long shot. It’s the garbage from the social media companies who feel they must inform me that my “contacts are not very active”, or remind me that I have not responded to so-and-so’s request, or promote some new ‘feature’ they have just created which will likely interfere with what they actually do well. Who decided that social media must have nagware built in? And in spite of all the horrific missteps social media companies make trying to be more than they are, these sites are great because they provide value. And most of them provide the core product – the one that’s really useful – free! Much as I hate to admit it, social media has become as important as my phone, and I use it every day.

Oh, before I forget: If you have emailed us and we have failed to respond in the last couple weeks, please resend your email. We’ve got a triple spam filter going, and every once in a while the service changes its rule enforcement and suddenly (silently) blocks a bunch of legit email. Sorry for the inconvenience.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

Mike on the “Renaissance Information Security Professional”.
Rich quoted on Adobe’s fixes on c|net.
Mike’s Dark Reading post: Time To Deploy The FUD Weapon?

Favorite Securosis Posts

Mike Rothman: Understanding and Selecting Data Masking: Introduction. Masking is a truly under-appreciated function. Until your production data shows up in an Internet-accessible cloud instance, that is. Hopefully Adrian’s series sheds some light on the topic.
Adrian Lane: Write Third. Rich nails it – the rush to be first kills journalism/integrity/fact checking/perspective/etc. Most ‘writers’ become automated garbage relays, often with humorous results, such as one of my all-time favorite Securosis posts.

Other Securosis Posts

[New White Paper] Vulnerability Management Evolution.


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast, or to make a point (which is very, very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.