Understanding and Selecting Data Masking: How It Works

In this post I want to show how masking works, focusing on how masking platforms move and manipulate data. I originally intended to start with the architecture and mechanics of masking systems, but it seemed more helpful to start by describing the different masking models, how data flows through different systems, and the advantages and disadvantages of each. I will comment on common data sources and destinations, and the issues to weigh when evaluating masking technology. There are many different types of data repositories and services which can be masked, so I will go into detail on these choices. For now we will stick to relational databases, to keep things simple. Let’s jump right in and discuss how the technology works.

ETL

When most people think about masking, they think about ETL. ‘ETL’ is short for Extraction-Transformation-Load – a concise description of the classic (and still most common) masking process. Sometimes referred to as ‘static’ masking, ETL works against a fixed export from the source repository. Each phase of ETL is typically performed on a separate server: a source data repository, a masking server that orchestrates the transformation, and a destination database. The masking server connects to the source, retrieves a copy of the data, applies the mask to specified columns of data, and then loads the result onto the target server. This process may be partially manual, fully driven by an administrator, or fully automated. Let’s examine the steps in greater detail:

Extract: The first step is to ‘extract’ the data from some storage repository – most often a relational database. The extracted data is often formatted to make it easier for the mask to be applied. For example, extraction can be performed with a simple SELECT query issued against a database, filtering out unwanted rows and formatting columns in the query. Results may be streamed directly to the masking application for processing, or dumped into a file – such as a comma-separated .csv or tab-separated .tsv file. The extracted data is then securely transferred, as an encrypted file or over an encrypted SSL connection, to the masking platform.

Transform: The second step is to apply the data mask, transforming sensitive production data into a safe approximation of the original content. See Defining Masking for available transformations. Masks are almost always applied to what database geeks call “columnar data” – which simply means data of the same type is grouped together. For example, a database may contain a ‘customer’ table, where each customer entry includes a Social Security number (SSN). These values are grouped together into a single column, in files and databases alike, making it easier for the masking application to identify which data to mask. The masking application parses through the data, and for each column to be masked, replaces each entry with a masked value.

Load: In the last step masked data is loaded into a destination database. The masked data is copied to one or more destination databases, where it is loaded back into tables. The destination database does not contain sensitive data, so it is not subject to the same security and audit requirements as the original database with the unmasked data.

ETL is the most generic and most flexible of masking approaches. The logical ETL process flow is implemented everywhere from dedicated masking platforms, through data management tools with integrated masking and encryption libraries and embedded database tools, all the way down to home-grown scripts.
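To make the flow concrete, here is a minimal sketch – in Python, against SQLite, with hypothetical file, table, and column names – of the kind of home-grown script that implements ETL masking. A real platform would add secure transport, bulk loaders, and far more robust mask functions:

    import random
    import sqlite3

    FAKE_NAMES = ["Alice Smith", "Bob Jones", "Carol White"]  # stand-in phone book

    # Extract: pull only the needed rows and columns from the source database.
    src = sqlite3.connect("production.db")
    rows = src.execute("SELECT id, name, ssn FROM customers").fetchall()
    src.close()

    # Transform: apply a mask to each sensitive column.
    def mask_row(row):
        rid, name, ssn = row
        masked_name = random.choice(FAKE_NAMES)  # substitution
        masked_ssn = "XXX-XX-" + ssn[-4:]        # partial redaction
        return (rid, masked_name, masked_ssn)

    masked = [mask_row(r) for r in rows]

    # Load: write the masked copy into the destination database.
    dst = sqlite3.connect("test.db")
    dst.execute("CREATE TABLE IF NOT EXISTS customers (id, name, ssn)")
    dst.executemany("INSERT INTO customers VALUES (?, ?, ?)", masked)
    dst.commit()
    dst.close()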
I see all of these used in production environments, with the level of skill and labor required increasing as you progress down the chain. While many masking platforms replicate the full process – performing extraction, masking, and loading on separate systems – that is not always the case. Here are some alternative masking models and processes.

In-place Masking

In some cases you need to create a masked copy within the source database – perhaps before moving it to another, less sensitive database. In other cases the production data is moved unchanged (securely!) into another system, and then masked at the destination. When production data is discovered on a test system, the data may be masked without being moved at all. All these variations are called “in-place masking” because they skip both movement steps of the ETL flow. The masks are applied as before, but inside a database – which raises its own security and performance considerations.

There are very good reasons to mask in place. The first is to take advantage of databases’ facility with management and manipulation of data. They are incredibly adept at data transformation, and offer very high masking performance. Leveraging built-in functions and stored procedures can speed up the masking process because the database has already parsed the data. Masking data in place – replacing data rather than creating a new copy – protects database archives and data files from snooping, should someone access backup tapes or raw disk files. If the security of data after it leaves the production database is your principal concern, then ETL or in-place masking prior to moving data to another location should satisfy security and audit requirements. Many test environments have poor security, which may require masking prior to export, or use of a secure ETL exchange, to ensure sensitive data is never exposed on the network or in the destination data repository.

That said, among the enterprise customers we have interviewed, masking data at the source (in the production database) is not a popular option. The additional computational overhead of the masking operation, on top of the overhead required to read and write the data being transformed, may have an unacceptable impact on database performance. In many organizations legacy databases struggle to keep up with day-to-day operations, and cannot absorb the additional load. Masking in the target database (after the data has been moved) is not very popular either – masking solutions are generally purchased to avoid putting sensitive data on insecure test systems, and such customers prefer to avoid loading data into untrusted test systems prior to masking. In-place masking is typically
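As a rough illustration of the in-place model, the sketch below leans on the database engine itself – SQLite built-in functions here, purely as a stand-in for a production DBMS’s native functions and stored procedures – to overwrite a sensitive column directly, so the data never leaves the repository:

    import sqlite3

    conn = sqlite3.connect("test_copy.db")  # hypothetical copy of production
    # Replace every SSN in place with a random, format-preserving value,
    # computed entirely by the database engine.
    conn.execute("""
        UPDATE customers
        SET ssn = printf('%03d-%02d-%04d',
                         (random() & 2147483647) % 900 + 1,
                         (random() & 2147483647) % 100,
                         (random() & 2147483647) % 10000)
    """)
    conn.commit()
    conn.close()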


Understanding and Selecting Data Masking: Defining Data Masking

Before I start today’s post, thank you for all the letters saying that people are looking forward to this series. We have put a lot of work into this research to ensure we capture the state of currently available technology, and we are eager to address this under-served market. As always, we encourage blog comments, because they help readers understand other viewpoints that we may not reflect in the posts proper. And for the record, I’m not knocking Twitter debates – they are useful as well, but they’re more ephemeral and less accessible to folks outside the Twitter cliques – not everybody wants to follow security geeks like me. I also apologize for our slow start since the initial launch – between meeting with vendors, some medical issues, and client off-site meetings, I’m a bit behind. But I have collected all the data I think is needed to do justice to this subject, so let’s get rolling!

In today’s post I will define masking and show the basics of how it works. First, a couple of basic terms with their traditional definitions:

Mask: Similar to the traditional definition – a facade or a method of concealment – a data mask is a function that transforms data into something similar but new. It may or may not be reversible.

Obfuscation: Hiding the original value of data.

Data Masking Definition

Data masking platforms at minimum replace sensitive data elements in a data repository with similar values, and optionally move masked data to another location. Masking effectively creates proxy data which retains part of the value of the original. The point is to provide data that looks and acts like the original, but which lacks sensitivity and doesn’t pose a risk of exposure, enabling use of reduced security controls for masked data repositories. This in turn reduces the scope and complexity of IT security efforts. The mask should make it impossible or impractical to reverse engineer masked values back to the original data without special additional information. We will cover additional deployment models and options later in this series, but the following graphic provides an overview:

Keep in mind that ‘masking’ is a generic term, and it encompasses several possible data masking processes. In a broader sense data masking – or just ‘masking’ for the remainder of this series – encompasses collection of data, obfuscation of data, storage of data, and possibly movement of the masked information. But ‘mask’ is also used in reference to the masking operation itself – how we change the original data into something else. There are many different ways to obfuscate data depending on the type of data being stored, each embodied by a different function, and each suitable for different security and data use cases. It might be helpful to think of masking in terms of Halloween masks: the level of complexity and degree of concealment both vary, depending upon the effect desired by the wearer. The following is a list of common data masks used to obfuscate data, and how their functions differ:

Substitution: Substitution is simply replacing one value with another. For example, the mask might substitute a person’s first and last names with names from a random phone book entry. The resulting data still constitutes a name, but has no logical relationship with the original real name unless you have access to the original substitution table.

Redaction/Nulling: This is a form of substitution where we simply replace sensitive data with a generic value, such as ‘X’. For example, we could replace a phone number with “(XXX)XXX-XXXX”, or a Social Security Number (SSN) with XXX-XX-XXXX. This is the simplest and fastest form of masking, but it preserves very little information (arguably none) from the original.

Shuffling: Shuffling is a method of randomizing existing values vertically across a data set. For example, shuffling the individual values in a salary column from a table of employee data would make the table useless for learning what any particular employee earns, but it would not change aggregate or average values for the table. Shuffling is a common randomization technique for disassociating sensitive data relationships (e.g., Bob makes $X per year) while retaining aggregate values.

Transposition: This means to swap one value with another, or a portion of one string with another. Transposition can be as complex as an encryption function (see below) or as simple as swapping the first four digits of a credit card number with the last four. There are many variations, but transposition usually refers to a mathematical function which moves existing data around in a consistent pattern.

Averaging: Averaging is an obfuscation technique where individual numeric values are replaced by a value derived by averaging some portion of the individual values. In our salary example above, we could substitute individual salaries with the average across a group or corporate division, hiding individual salary values while retaining an aggregate relationship to the real data.

De-identification: A generic term that applies to any process that strips identifying information, such as who produced the data set, or personal identities within the data set. De-identification is an important topic when dealing with complex, multi-column data sets that provide ample means for someone to reverse engineer masked data back into individual identities.

Tokenization: Tokenization is substitution of data elements with random placeholder values, although vendors overuse the term ‘tokenization’ for a variety of other techniques. Tokens are non-reversible because the token bears no logical relationship with the original value.

Format Preserving Encryption: Encryption is the process of transforming data into an unreadable state. For any given value the process consistently produces the same result, and it can only be reversed with special knowledge (the key). While most encryption algorithms produce strings of arbitrary length, format preserving encryption transforms the data into an unreadable state while retaining the format (overall appearance) of the original values.

Each of these mask types excels in some use cases, and also of course incurs a certain amount of overhead due to its
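To illustrate a few of these mask types, here is a hedged sketch in Python – toy functions over made-up data, not any vendor’s implementation:

    import random
    import statistics

    def redact_ssn(ssn):
        """Redaction/nulling: replace the value with a generic placeholder."""
        return "XXX-XX-XXXX"

    def shuffle_column(values):
        """Shuffling: randomize values vertically; aggregates are preserved."""
        shuffled = list(values)
        random.shuffle(shuffled)
        return shuffled

    def transpose_card(pan):
        """Simple transposition: swap the first and last four digits."""
        return pan[-4:] + pan[4:-4] + pan[:4]

    def average_salaries(salaries):
        """Averaging: every individual value becomes the group average."""
        avg = statistics.mean(salaries)
        return [avg] * len(salaries)

    salaries = [50000, 72000, 98000]
    print(shuffle_column(salaries))             # individual associations broken
    print(average_salaries(salaries))           # aggregate relationship retained
    print(redact_ssn("123-45-6789"))            # XXX-XX-XXXX
    print(transpose_card("4111222233334444"))   # 4444222233334111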


Friday Summary: May 18, 2012

A friend told me this week they were on Pinterest. I responded, “I’m sorry! How long does your employer allow you to take off?” I seriously thought this was something like paternity leave or one of those approved medical absence programs. I really wondered when he got sick, and what his prognosis was. He told me, “No, I’m on Pinterest to market my new idea.” WTF? Turns out it’s not a medical sabbatical, but another social media ‘tool’ for sharing photos and stuff. When I Googled Pinterest to find out what the heck it actually was, I found a long blog post about the merits of using Pinterest for Engagement Marketing, which happened to be at the blog of an old friend’s company. Soon thereafter I fired up Skype and was chatting with him, finding out what he’d been up to, how the kids were, and which mutual friends he had seen. That led to a LinkedIn search to find those friends, and while looking I spotted a couple other people I had lost track of. Within minutes I’d emailed one and found the other on Twitter. My friend on Twitter told me to check her blog on marketing over social media, which referenced another mutual friend. I emailed him, and when I hit ‘send’, I received a LinkedIn update with a list of several friends who recently changed jobs. I messaged one and texted the other to congratulate them. The next thing I knew I was chatting on FaceTime with one of these friends, in a pub in London celebrating his new position. We talked for a while, and then he said he had run into a fraternity brother and texted me his email address. I emailed the fraternity brother, who sent back a LinkedIn invite telling me he’d Skype me later in the day, and included a funny YouTube video of Darth Vader riding a unicycle while playing bagpipes. As I watched the bagpiping maniac a Skype message popped up from another friend telling me she had changed jobs (and have you noticed all the people in tech changing jobs recently?). She invited me to speak at an event for her new company, listed on Meetup. I declined, sending her the GoToMeeting link to a conflicting event, but told her I’d be in town later in the week and sent her a calendar invite for lunch. She sent back a list of Yelp recommendations for where to go. All in about an hour one morning.

For an asocial person, this whole social media thing seems to have permeated my life. It’s freakin’ everywhere. In case you hadn’t heard, Facebook is making an Initial Public Offering right about now. But love them or hate them, each social media site seems to do one thing really well. LinkedIn is a really great way to keep in touch with people – no more shoebox full of business cards for me! And it’s totally blending work and home, combining groups of friends from different periods of my life into one ever-present pool. Twitter is an awesome way to casually chat in real time with a group of friends while getting work done. BeeJive lets me chat on my mobile phone with the guys at Securosis. Skype offers cheap calls of reasonable quality to anyone. Some companies actually do follow Twitter with live human beings and respond to customer complaints, which is great. And Facebook offers a great way to infect your browser with malware!

That said, every social media site still sucks hard. I’m not talking about users making asses of themselves, but about how every site tries too hard to be more than a one-trick pony, offering stuff you don’t want. I guess they are trying to increase shareholder value or some such nonsense rather than serve their audience.
Skype was trying to branch out with their ‘mood’ feature – who thought that crap was a good idea? And now Pinterest is copying that same bad idea? Facebook Social Cam? Or LinkedIn communities, which seem to be a cesspool of bad information and people ‘positioning themselves’ for employment. Corporate Twitter spambots are bad, but they’re not the worst – not by a long shot. It’s the garbage from the social media companies who feel they must inform me that my “contacts are not very active”, or remind me that I have not responded to so-and-so’s request, or promote some new ‘feature’ they have just created which will likely interfere with what they actually do well. Who decided that social media must have nagware built in? And in spite of all the horrific missteps social media companies make trying to be more than they are, these sites are great because they provide value. And most of them provide the core product – the one that’s really useful – free! Much as I hate to admit it, social media has become as important as my phone, and I use it every day.

Oh, before I forget: if you have emailed us and we have failed to respond in the last couple weeks, please resend your email. We’ve got a triple spam filter going, and every once in a while the service changes its rule enforcement and suddenly (silently) blocks a bunch of legit email. Sorry for the inconvenience.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

Mike on the “Renaissance Information Security Professional”.
Rich quoted on Adobe’s fixes on c|net.
Mike’s Dark Reading post: Time To Deploy The FUD Weapon?

Favorite Securosis Posts

Mike Rothman: Understanding and Selecting Data Masking: Introduction. Masking is a truly under-appreciated function. Until your production data shows up in an Internet-accessible cloud instance, that is. Hopefully Adrian’s series sheds some light on the topic.
Adrian Lane: Write Third. Rich nails it – the rush to be first kills journalism/integrity/fact checking/perspective/etc. Most ‘writers’ become automated garbage relays, often with humorous results, such as one of my all-time favorite Securosis posts.

Other Securosis Posts

[New White Paper] Vulnerability Management Evolution.


Understanding and Selecting Data Masking: Series Introduction

Data masking has been around a long time. I have been masking data since the early ’90s, creating test data from production copies of customer insurance records, and altering database columns before sending database exports out for “data cleansing”. At the time masking was little more than UNIX shell scripts or home-grown Perl scripts to alter particular columns in .csv files. A few years later I was giddy with excitement to have my first masking ‘program’, running on a paleolithic version of Windows, which actually had a ‘wizard’ for walking through the process. No, it did not help with extraction of information from a database, but it identified the columns to be altered, provided a list of masks to apply, and dumped an error file when it ran into trouble. That saved a lot of tweaking scripts and manually reviewing dump files. And all this was several years before I heard anyone mention ‘ETL’ (Extract, Transform, Load), because ODBC and JDBC drivers to connect to databases were just arriving on the scene, and nobody had automated bulk loads back into another database. That was still science fiction.

Masking products don’t look like that any longer – now they are full-blown data security and management platforms. It feels a bit nostalgic to review data masking technologies, and it is somewhat surprising to find how far they have evolved into full production-quality enterprise platforms. I have been following data masking for almost two decades, and have seen more evolution in the last couple years than over the first dozen. These advancements have come in two forms. First, the technology has evolved in recent years, adding the capability to handle just about any type of database or data source, full automation, workflow integration, and a dozen or so data obfuscation techniques. Second, in response to substantial market demand from IT security and compliance departments, the way these tools are used has changed. Increased demands from new buying centers have forced changes in workflow, user interfaces, and how core capabilities are packaged. It only took a couple of public breaches, where production data was easily exfiltrated from unsecured test databases, to drive masking into companies’ production data flows. Compliance requirements such as PCI-DSS cemented the need, and are now a principal driver for adoption. The upshot is that most of these tools have seen significant advancement, and now include multiple robust user interfaces to support both technical and non-technical users, as well as pre-packaged solutions for different compliance mandates. Somewhere along the way, masking grew up!

I started following this vertical again because we received a number of customer questions, specifically around compliance. We have been seeing steady growth in adoption of masking over the last four years – perhaps 20% year over year – as more customers use masking to reduce information risk. In some ways it’s a more elegant solution than encryption, and for several deployment models masking is cheaper and easier than surrounding sensitive data with layers of security controls such as user rights management, encryption, database security, and various firewall technologies. When you think about securing Big Data, data analytics systems, HIPAA compliance, and public cloud computing resources, there is plenty of reason to believe masking’s rapid adoption will continue.
I have written a lot about masking on the blog, but never a focused research paper; it seems to be time for a thorough explanation of what masking does and how it helps security. So I am excited to launch a new series: Understanding and Selecting Data Masking Solutions. I have designed this series to help would-be buyers understand what to look for in a product, and to show existing customers how to leverage their investments to solve emerging problems. I’ll delve into the technology, deployment models, data flow, and management capabilities. I will discuss the four principal use cases and how the technology solves certain compliance and security issues, and close out with a brief buyers’ guide on what features to look for based upon your criteria. The outline follows:

Core Features: We’ll define masking, introduce the basic technology, and discuss how it’s applied to data. We will also define the major masking options (shuffling, averaging, substitution, field nulling/redaction, and mathematical transposition) and de-identification methods. And we’ll explain the need for data type & format preservation, uniqueness, and semantic & referential integrity.

How It Works: We will examine how masking works, focusing on how data flows through it and how information is secured. We’ll describe different options for sources, destinations, extraction methods, and loading, and where & how masking is performed. We will contrast masking against encryption and tokenization to frame the advantages of particular techniques for specific use cases later.

Technical Architecture: Deployment models (ETL, in-place, and the various options for dynamic masking), and the issues and concerns with each. We will discuss support for files and databases, and how masking integrates with these platforms. We’ll include diagrams to compare and contrast the models.

Advanced Features: We’ll cover current trends in data discovery, risk & criticality assessment, and mask validation. We will talk about centralized policy management, data set management, and secure data transfer. We’ll discuss integration with other systems such as trouble ticketing, encryption, tokenization, and DLP for automated workflow.

Use Cases: We will outline both traditional and new use cases, bringing together evolving requirements and ongoing changes to masking technologies, along with how these use cases prompt new deployment models. This section will focus on specific customer requirements that have come up in our research; we’ll also evaluate specific masking alternatives to meet security and compliance mandates. We will cover automated workflows and scripting, as well as use of pre-defined templates for defining masks. We’ll discuss compliance masks and pre-built regulatory options, as well as control reporting.

Evaluate Your Needs: We’ll wrap up by mapping out evaluation criteria and a process to guide customer buying decisions. We will distinguish between “must-have” and “nice-to-have” requirements, and cover compliance, integration, setup, and management.

As with all Securosis research projects, we are focused on


Understanding and Selecting a Database Security Platform: Comments and Series Index

Rich and I – with help from Chris Pepper – compiled the Understanding and Selecting a Database Security Platform series into a research paper, and provided it to a number of people for initial review. We got a lot of valuable feedback and observations back. Commenters felt several topics were under-served, believed others were over-emphasized, and pointed out items we failed to mention entirely. We’re not too proud to admit when we’re wrong, or when we fail to capture the essence of customer buying decisions, so we are happy to revisit these topics. We believe their feedback improves the paper quite a bit. In keeping with our Totally Transparent Research process we want all discussions that affect the paper out in the open, so we are posting those comments here for review. If you have additional comments, or responses to anything here, we encourage you to chime in.

This series took longer to produce than most of our other research papers, and some readers had trouble following along from beginning to end. For the sake of continuity, here are all the blog posts:

Understanding and Selecting a Database Security Platform: Introduction
Defining DSP
Core components and the evolution of DAM to DSP
Event collection
Technical architecture
Core features
Extended features
Administration and management
Use cases

For reference, the original Understanding and Selecting a Database Activity Monitoring Solution research paper and the first DAM 2.0 posts offer additional insight. Once we have discussed all the comments and pulled all relevant feedback into the paper, we will release the final version.


FireStarter: Policy Wonks and Pests

I’ve spent more hours than I can count studying compliance and governance: reading and re-reading PCI requirements, Sarbanes-Oxley law, theory, and applied theory, and spending mind-numbing hours combing through BASEL and BASEL II docs. I’ve spent many long weeks with external auditors, internal auditors, assessors, risk management personnel, corporate governance officers, and government officials – trying to understand their jobs, their roles, and how the world functions from their perspectives. I’ve spent months mapping those ideas and processes into policy implementations, process modifications, and the rules that actually enforce policies. I’ve written audit reports for these various compliance and policy management frameworks to demonstrate policy compliance and efficacy. When you sell security and risk management software these efforts are necessary, because compliance drives your company’s revenue. So I feel I understand policy and compliance pretty darn well, but I am bothered by the trend toward policy becoming the focus – at the expense of the task it was originally designed to govern.

I got started on this thread during a review of an instructional ‘how-to’ on the secure software development lifecycle (SDLC). The more I read of this SDLC description, the more I realized that it was not about the SDLC at all. It was a risk and management process for gauging the effectiveness of an SDLC program. It contained next to nothing on the SDLC itself! There were very few instructions on tools, processes, or things you need to know to actually develop under an SDLC – just management and policy oversight. Don’t get me wrong – risk management and development management policies are very important for an SDLC. When we track and monitor, we get a better idea of whether what we are doing is having a positive effect, can weigh the relative merits of different types of security efforts, and over time learn whether we are getting better. But policy and management do not exist for their own sake – they only exist to ensure the core effort (in this case the SDLC) is actually working.

I find that a lot of this stems from people developing policy when they have never done whatever the policies are meant to govern. And sometimes that’s okay. It’s not a requirement that you have developed code, managed teams of developers, or been responsible for process development to comment on SDLC governance. But without experience in whatever practice you are trying to manage, efforts to improve it rarely work out well – the policy mindset does not mesh well with the development mindset. Agile programming even has a name for these people: chickens! From the parable of the Chickens and the Pigs, the chickens have lots of input but are not part of the actual process. And developers make this distinction because chickens can be detrimental to the process of developing software. This particular brand of chicken I usually call “policy wonks”, and I am convinced they do at least as much harm as good.

I’m pretty pragmatic. I prefer easy over hard, and when it comes down to it I just want to get my work done and move on. In fact all of us at Securosis are this way – Mike so much so that he authored the Pragmatic CSO guide, which remains in use and gets downloaded pretty much every week. Developers, if I can be so bold as to generalize about the culture as a whole, are usually anti-bureaucracy and anti-policy. It’s whatever works quickly and effectively. And I have this trait in a big way.
But after years spent with policy development and compliance, gathering metrics and measuring outcomes, I know they actually are critical. Yet I keep running into people who only do policy, who only give us the (to steal a phrase from David Mortman) Utopian Policy Ideal, without any consideration whatsoever for actually getting $#)^! done! Policy exists to help us avoid repeating mistakes and to guide how we get work done the way we want it done. But it’s not all about policy. Policy is not the work to be done. Are policy and governance important? Hell, yeah! But if we keep spending 50% of our time on this 5% of the picture, we will suck at the other 95% of the stuff that needs to happen in order to get things done. You know – real work.

Note from Rich: Adrian asked me to review this before posting, so I thought I’d insert a line. This is my single biggest pet peeve in security today, especially in cloud. Far too many people seem to want to be policy wonks and focus on GRC to the exclusion of actual security.


Friday Summary: May 4, 2012

My conversation started like this: “Do you know where the recorder is?” she asked. “The what?” I replied. “The tape recorder we bought you!” After a long pause, I replied: “You mean the Panasonic cassette tape recorder you bought me in 1974?” “Yes, that one! I want to record myself playing the piano.” My brain froze momentarily as I processed the many implications of this statement. After another long pause I asked: “Mom, did you really call me up to ask me about a cassette recorder? From the ’70s? And for the record, no, I’ve not seen it in – uh – three decades. I think we threw it out when the batteries corroded the insides. That would have been in the early ’80s.” “Oh, darn!” “If you don’t mind my asking, why not use the computer? Or one of those dictaphones you’ve got scattered around the house. Or your phone should – wait, don’t you have a smartphone?” “No, your father and I do not have cell phones.”

This conversation occurred last month. I literally put down the phone after that comment to think about what it meant. They didn’t go all Amish on me, did they? I consider myself a ‘late’ adopter because my first phone that was more than a basic phone was the iPhone 4. I still use email. I have only just started to appreciate Twitter, placed my entire music library on a computer, and started streaming television over WiFi. But I have owned cell phones for 15 years or so. This is a whole different universe of thought and perception. Other than their DVD player and the ‘recent’ upgrade to Windows XP, it seems my parents stopped advancing with technology a long time ago.

My wife has a theory that you can tell someone’s ‘heyday’ when you walk into their home, by looking at the period decor. I have lots of friends who are 10, 15, even 25 years older than me, and it seems to hold true. For my parents it’s velour, brass, and mauve – you do the math. Some people continue to modernize, but most just stop at some point. I think there is an economic component to the lack of change – it’s expensive to replace things just for the sake of modernization. But this is different. An old couch is a long way from not having a cellphone. I grew up hearing about the generation gap, and I mostly ignored the discussion about the digital divide as – in Berkeley at least – it came across as some socialist rant against what was perceived as a technological caste system. But I am starting to see the point, not in the “technological literacy” sense, but more about humans’ willingness to adapt, sample new things, or just try something different. But damn, this is still shocking. And I’m their offspring – could this happen to me too?

Is it because you own a device that already does something similar, so you figure “Why buy a new one?” Do you need a robot vacuum cleaner when the Hoover upright still functions? Do you need voicemail when your answering machine still works? If the Mr. Coffee still cranks out brown water, why invest in a single-cup espresso maker with those fancy foil packs? Why replace the refrigerator that’s been working great for 30 years? If IE6 still browses the Internet, why change? Do you need LED lights when you have an incandescent desk lamp? Mom was more comfortable with a cassette tape recorder than any other recording device invented in the last 40 years. She was headed to the store to see if she could find a new one. I told her that her best bet was [snark]Office Max[/snark].
The good news is that I have figured out the perfect Christmas gift – I’ll send them the Patrick Nagel prints I have stored in the garage.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

Rich with Nir Zuk on Coming to Grips with Consumerization.
Adrian at Dark Reading: Security Bugs And Proofs Of Concept. Written before the TNS poisoning disclosure.
Rich mentioned in Entrepreneur.com.
Adrian’s paper on User Activity Monitoring.
Mike’s PCI: Dead Man(date) Walking? at Dark Reading.

Favorite Securosis Posts

Mike Rothman: Friday Summary: TSA Edition. Rich nails the issue with airport security in his intro to last week’s Summary. He’s right – more security theater will be coming to an airport near you.
Adrian Lane: Stupid Human Tricks: Security Job Interviews. The LiquidMatrix guys are like family, so this is my favorite ‘inside’ post of the week. Guaranteed to make the most cynical security people laugh out loud!
Rich: FireStarter: Policy Wonks and Pests. Have I mentioned how little respect I have for people who want to govern things they don’t understand?

Other Securosis Posts

Incite 5/2/2012: Refi Madness.
Vulnerability Management Evolution: Evolution or Revolution?

Favorite Outside Posts

Mike Rothman: 8 Things To Expect Shopping At Microsoft’s Non-Apple Apple Store. Imitation is the sincerest form of flattery. And then there is copying. If you’ve never been to a Microsoft store, Conan has it nailed. Especially the Zune meet-ups. Should provide your LOL of the day.
Adrian Lane: TNS Poison – straight from the researcher. Fascinating tale of FAIL.
Rich: Don’t be an evangelist. Okay, I get mentioned in this one, but there really isn’t any place for religion in tech. You need to be able to adapt to the times.

Project Quant Posts

Malware Analysis Quant: Index of Posts.
Malware Analysis Quant: Metrics–Monitor for Reinfection.
Malware Analysis Quant: Metrics–Remediate.
Malware Analysis Quant: Metrics–Find Infected Devices.
Malware Analysis Quant: Metrics–Define Rules and Search Queries.
Malware Analysis Quant: Metrics–The Malware Profile.
Malware Analysis Quant: Metrics–Dynamic Analysis.

Research Reports and Presentations

Watching the Watchers: Guarding the Keys to the Kingdom.
Network-Based Malware Detection: Filling the Gaps of AV.
Tokenization Guidance Analysis: Jan 2012.
Applied Network Security Analysis: Moving from Data to Information.
Tokenization Guidance.
Security Management 2.0: Time to


Friday Summary: April 13th, 2012

Happy Friday the 13th! I was thinking about superstition and science today, so I was particularly amused to notice the date. Rich and I are both scientists of sorts; we both eschew superstition, but we occasionally argue about science. What’s real and what’s not. What’s science, what’s pseudoscience, and what’s just plain myth. It’s interesting to discuss root causes and what forces actually alter our surroundings. Do we have enough data to make an assertion about something, or is it just a statistical anomaly? I’m far more likely to jump to conclusions about stuff based on personal experience, and he’s more rigorous with the scientific method. And that’s true for work as well as life in general. For example, he still shuns my use of Vitamin C, while I’m convinced it has a positive effect. And Rich chides me when I make statements about things I don’t understand, or assertions that are complete ‘pseudoscience’ in his book. I’ll make an off-handed observation and he’ll respond with “Myth Busters proved that’s wrong in last week’s show”. And he’s usually right. We still have a fundamental disagreement about the probability of self-atomizing concrete, a story I’d rather not go into – but regardless, we are both serious tech geeks and proponents of science.

I regularly run across stuff that surprises me and challenges my fundamental perception of what’s possible. And I am fascinated by those things and the explanations ‘experts’ come up with for them – usually from people with a financial incentive, hawking anything from food to electronic devices by claiming benefits we cannot measure, or for which we have no science to prove or disprove their claims. To keep things from getting all political or religious, I restrict my examples to my favorite hobby: HiFi.

I offer power cords as an example. I’ve switched most of the power cords for my television, iMac, and stereo to versions that run $100 to $300. Sounds deranged, I know, to spend that much on a piece of wire. But you know what? The colors on the television are deeper, more saturated, and far less visually ‘noisy’. Same for the iMac. And I’m not the only one who has witnessed this. It’s not subtle, and it’s completely repeatable. But I am at a loss to understand how the last three feet of copper between the wall socket and the computer can dramatically improve the quality of the display. Or the sound from my stereo. I can see it, and I can hear it, but I know of no test to measure it, and I just don’t find the explanations of “electron alignment” plausible.

Sometimes it’s simply that nobody thought to measure stuff they should have, because theoretically it shouldn’t matter. In college I thought most music sounded terrible and figured I had simply outgrown the music of my childhood. Turns out that in the ’80s, when CDs were born, CD players introduced several new forms of distortion, and some of them were unmeasurable at the time. Listener fatigue became common, with many people getting headaches as a result of these poorly designed devices. Things like jitter, power supply noise, and noise created by different types of silicon gates and capacitors all produce sonic signatures audible to the human ear. Lots of this couldn’t be effectively measured, but would send you running from the room. Fortunately over the last 12 years or so audio designers have become aware of these new forms of distortion, and they now have devices that can measure them to one degree or another. I can even hear significant differences with various analog valves (i.e. ‘tubes’) where I cannot measure electrical differences.

Another oddity I have found is with vibration control devices. I went to a friend’s house and found his amplifiers and DVD players suspended high in the air atop maple butcher blocks, which sat on what looked like pairs of hockey pucks separated by ball bearings. The maple blocks are supposed to both absorb vibration and avoid electromagnetic interference between components. We did several A/B comparisons with and without each, and it was the little bearings that made a clear and noticeable difference in sound quality. The theory is that high frequency vibrations, which shake the electronic circuits of the amps and CD players, decrease resolution and introduce some form of distortion. Is that true? I have no clue. Do they work? Hell yes they do! I know that my mountain bike’s frame was designed with varying tube circumferences and wall thicknesses as a method of damping vibrations, and there is an improvement over previous generations of bike frames, albeit a subtle one. The reduction in vibrations on the bike can easily be measured, as can the vibrations and electromagnetic interference between A/V equipment. But the vibrational energy is so vanishingly small that it should never make a difference to audio quality.

Then there are the environmental factors that alter the user’s perception of events. Yeah, drugs and alcohol would be examples, but sticking to my HiFi theme: a creme that makes your iPod sound better – by creating a positive impression with the user. Which again borders on the absurd. An unknown phenomenon, or snake oil? Sometimes it’s tough to tell superstition from science.

On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences

Adrian’s Dark Reading paper on User Activity Monitoring.
Rich’s excellent Macworld article on the Flashback malware.
Adrian’s Dark Reading post on reverse database proxies.

Favorite Securosis Posts

Adrian Lane: The Myth of the Security-Smug Mac User. We get so many ‘news’ items – like how Android will capture the tablet market in 2015, or how Apple’s market share of smartphones is dwindling, or how smug Apple users will get their ‘comeuppance’ for rejecting AV solutions – that you wonder who’s coming up with this crap. Mac users may not have faith in AV to keep them secure, but they know eventually Macs will be targeted just as Windows has been. And I’m fairly certain most hackers run on


Pain Comes Instantly—Fixes Come Later

Mary Ann Davidson’s recent post Pain Comes Instantly has been generating a lot of press. It is being miscast by some media outlets as trashing the PCI Data Security Standard, but it’s really about the rules for vendors who want to certify commercial payment software and related products. The debate is worth considering, so I recommend giving it a read. It’s a long post, but I encourage you to read it all the way through before forming opinions, as she makes many arguments and provides some allegories along the way.

In essence she challenges the PCI Council on a particular requirement in the Payment Application Vendor Release Agreement (VRA), part of each vendor’s contractual agreement with the PCI Council to get their applications certified as PCI compliant. The issue is software vulnerability disclosure. Paraphrasing the issue at hand: let’s say Oracle becomes aware of a security bug. Under the terms of the agreement, Oracle must disseminate the information to the Council as part of the required disclosure process. Her complaint is that the PCI Council insists on its right to leak (‘share’) this information even when Oracle has not yet provided a fix. Mary Ann argues that in this case the PCI Council is harming Oracle’s customers (who are also PCI Council customers) by making the vulnerability public. Hackers will of course exploit the vulnerability and try to breach the payment systems. The real point of contention is that the PCI Council may decide to share this information with QSAs, partners, and other organizations, so those security experts can better protect themselves and PCI customers. Oracle’s position is that the QSAs and others who may receive information from the Council are not qualified to make use of the information – and second, that the more people who know about the vulnerability, the more likely it is to leak.

I don’t have a problem with those points. I totally agree that if you tell thousands of people about a vulnerability, it’s as good as public knowledge. And it’s probably safe to wager that only a small percentage of Oracle customers have the initiative or knowledge to take vulnerability information and craft it into effective protection. Even if a customer has Oracle’s database firewall, they won’t be able to create a rule to protect the database from this vulnerability information. So from that perspective, I agree. But it’s a limited perspective. Just because few Oracle customers can generate a fix or a workaround doesn’t mean a fix won’t or can’t be made available. Oracle customers have contributed workarounds in the past. Even if an individual customer can’t help themselves, others can – and have.

But here’s my real problem with the post: I am having trouble finding a substantial difference between her argument and the whole responsible disclosure debate. What’s the real difference from a security researcher finding an Oracle vulnerability? The information is outside Oracle’s control in both cases, and there is a likelihood of public disclosure. It’s something a determined hacker may discover, or may have already discovered. It’s in Oracle’s best interest to fix the problem fast, before the rest of the world finds out. Historically the problem is that vendors, unless they have been publicly shamed into action, don’t react quickly to security issues. Oracle, among other vendors, has often been accused of sitting on vulnerabilities for months – even years – before addressing them.
For years security researchers told basically the same story about Oracle flaws they found, which goes something like this: “We discovered a security flaw in Oracle. We told Oracle about it, and gave them details on how to reproduce it and some suggestions for how to fix it. Oracle a) never fixed it, b) produced a half-assed fix that caused other issues, or c) waited 9, 12, or 18 months before patching the issue – and that was only after I announced the bug to the world at the RSA/DefCon/Black Hat/OWASP conference. I gave Oracle information that anyone could discover, did not ask for any compensation, and Oracle tried to sue me when I disclosed the vulnerability after 12 months.”

I’m not Oracle bashing here – it’s an industry-wide issue – but my point is that with disclosure, timing matters… a lot. Since the Payment Application Vendor Release Agreement simply states that you will ‘promptly’ inform the PCI Council of vulnerabilities, Oracle has a bit of leeway. Maybe ‘prompt’ means 30 days. Heck, maybe 60. That should be enough time to get a patch to those customers using certified payment products – or whatever term the PCI Council uses for vetted but not guaranteed software. If a vendor is a bit tardy getting detailed information to the PCI Council while they code and test a fix, I don’t think the Council will complain too much, so long as they are protected from liability. But make no mistake – timing is a critical part of this whole issue. Timing – particularly the lack of ‘prompt’ responses from Oracle – is why the security research community remains pissed off and critical to this day.


Understanding and Selecting DSP: Administration

Today’s post focuses on administering Database Security Platforms. Conceptually DSP is pretty simple: collect data from databases, analyze it according to established rules, and react when a rule has been violated. The administrative component of every DSP platform supports these three basic tasks: data management, policy management, and workflow management. In addition to these three basic functions, we also need to administer the platform itself, as we do with any other application platform. As we described in our earlier post on DSP technical architecture, DSP sends all collected data to a central server. The DAM precursors evolved from single servers, to two-tiered architectures, and finally into a hierarchical model, in order to scale up to enterprise environments. The good news is that system maintenance, data storage, and policy management are all available from a single console. While administration is now usually through a browser, the web application server that performs the work is built into the central management server. Unlike some other security products, not much glue code or browser trickery is required to stitch things together.

System Management

User Management: With access to many different databases, and most filtering and reporting touching sensitive data, user management is critical for security. Establishing who can change policies, read collected data, or administer the platform are all specialized tasks, and these groups of users are typically kept separate. All DSP solutions offer methods for segregating users into different groups, each with differing granularity. Most of the platforms offer integration with directory services to aid in user provisioning and assignment of roles.

Collector/Sensor/Target Database Management: Agents and data collectors are managed from the central server. While data and policies are stored centrally, the collectors – which often enforce policy on the remote database – must periodically sync with the central server to update rules and settings. Some systems require the administrator to ‘push’ rules out to agents or remote servers, while others sync automatically.

Systems Management: DSP is, in and of itself, an application platform. It has web interfaces, automated services, and databases, like most enterprise applications. As such it requires some tweaking, patching, and configuration to perform its best. For example, the supporting database may need pruning to clear out older data, vendor assessment rules require updates, and the system may need additional resources for data storage and reports. The system management interface is provided via a web browser, but only available to authorized administrators.

Data Aggregation & Correlation

The one characteristic Database Activity Monitoring solutions share with log management – and even Security Information and Event Management (SIEM) – tools is their ability to collect disparate activity logs from a variety of database management systems. They tend to exceed the capabilities of related technologies in their ability to go “up the stack” to gather deeper database activity and application layer data, and in their ability to correlate information. Like SIEM, DSP aggregates, normalizes, and correlates events across many heterogeneous sources. Some platforms even provide an optional ‘enrichment’ capability by linking audit, identity, and assessment data to event records – for example, providing both ‘before’ and ‘after’ data values for a suspect query.
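As a simplified, hypothetical illustration of normalization and enrichment – the field names are illustrative, not any vendor’s schema – a centrally stored event record might look something like this:

    # Hypothetical normalized, enriched DSP event record; illustrative only.
    event = {
        "source_dbms": "oracle",               # heterogeneous sources are
        "event_type": "privilege_escalation",  # normalized to one vocabulary
        "db_user": "app_svc",
        "app_user": "jsmith",                  # enrichment: linked identity data
        "statement": "GRANT DBA TO app_svc",
        "before_value": None,                  # enrichment: before/after values
        "after_value": "DBA",
        "timestamp": "2012-05-18T09:14:03Z",
    }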
Despite central management and correlation features, the similarities with SIEM end there. By understanding the Structured Query Language (SQL) of each database platform, these platforms can interpret queries and understand their meaning. While a simple SELECT statement might mean the same thing across different database platforms, each database management system (DBMS) is full of its own particular syntax. DSP understands the SQL of each platform and is able to normalize events, so the user doesn’t need to know the ins and outs of each DBMS. For example, if you want to review all privilege escalations on all covered systems, a DSP solution will recognize those events, regardless of platform, and present a complete report – without you having to understand the SQL. A more advanced feature is to correlate activity across different transactions and platforms, rather than looking only at single events. For example, some platforms recognize a higher than normal transaction volume by a particular user, or (as we’ll consider in policies) can link a privilege escalation event with a large SELECT query on sensitive data, which could indicate an attack. All activity is also centrally collected in a secure repository to prevent tampering, or a breach of the repository itself. Since they collect massive amounts of data, DSPs must support automatic archiving. Archiving should support separate backups of system activity, configuration, policies, alerts, and case management, each encrypted under separate keys to support separation of duties.

Policy Management

All platforms come with sets of pre-packaged policies for security and compliance. For example, every product contains hundreds, if not thousands, of assessment policies that identify vulnerabilities. Most platforms come with pre-defined policies for monitoring standard deployments of databases behind major applications such as Oracle Financials and SAP. Built-in policies for PCI, SOX, and other generic compliance requirements are also available to help you jump-start the process and save many hours of policy building. Every policy has the built-in capability of generating an alert if the rule is violated – usually through email, instant message, or some other messaging capability. Note that every user needs to tune or customize a subset of the pre-existing policies to match their environment, and create others to address specific risks to their data – but pre-built policies are still far better than starting from scratch.

Activity monitoring policies include user/group, time of day, source/destination, and other important contextual options. These policies should offer different analysis techniques based on attributes, heuristics, context, and content analysis. They should also support advanced definitions, such as complex multi-level nesting and combinations. If a policy violation occurs you can specify any number of alerting, event handling, and reactive actions. Ideally the platform will include policy creation tools that limit the need to write everything out in SQL or some other definition language; it’s much better if your compliance team does not need to learn SQL programming to create policies. You can’t avoid having to do some things
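To make the correlation policies described above concrete, here is a hedged sketch of what such a policy definition might look like – attribute names, thresholds, and actions are all hypothetical:

    # Hypothetical correlation policy: alert when a privilege escalation is
    # followed by a large SELECT against sensitive data. Illustrative only.
    policy = {
        "name": "escalation_then_bulk_read",
        "conditions": [
            {"event_type": "privilege_escalation"},
            {"event_type": "select",
             "object_tag": "sensitive",     # data classification tag
             "rows_returned_gt": 10000,     # what counts as a 'large' read
             "within_minutes": 30},         # correlation window
        ],
        "actions": ["email_alert", "open_ticket"],
    }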


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.