Securosis

Research

Friday Summary: July 23, 2010

A couple weeks ago I was sitting on the edge of the hotel bed in Boulder, Colorado, watching the immaculate television. A US-made 30” CRT television in “standard definition”. That’s cathode ray tube for those who don’t remember, and ‘standard’ is the marketing term for ‘low’. This thing was freaking horrible, yet it was perfect. The color was correct. And while the contrast ratio was not great, it was not terrible either. Then it dawned on me that the problem was not the picture, as this is the quality we used to get from televisions. Viewing an old set, operating exactly the same way they always did, I knew the problem was me. High def has so much more information, but the experience of watching the game is the same now as it was then. It hit me just how much our brains were filling in missing information, and we did not mind this sort of performance 10 years ago because it was the best available. We did not really see the names on the backs of football jerseys during those Sunday games, we just thought we did. Heck, we probably did not often make out the numbers either, but somehow we knew who was who. We knew where our favorite players on the field were, and the red streak on the bottom of the screen pounding a blue colored blob must be number 42. Our brain filled in and sharpened the picture for us. Rich and I had been discussing experience bias, recency bias, and cognitive dissonance during out trip to Denver. We were talking about our recent survey and how to interpret the numbers without falling into bias traps. It was an interesting discussion of how people detect patterns, but like many of our conversations devolved into how political and religious convictions can cloud judgement. But not until I was sitting there, watching television in the hotel; did I realize how much our prior experiences and knowledge shape perception, derived value, and interpreted results. Mostly for the good, but unquestionably some bad. Rich also sent me a link to a Michael Shermer video just after that, in which Shermer discusses patterns and self deception. You can watch the video and say “sure, I see patterns, and sometimes what I see is not there”, but I don’t think videos like this demonstrate how pervasive this built in feature is, and how it applies to every situation we find ourself in. The television example of this phenomena was more shocking than some others that have popped into my head since. I have been investing in and listening to high-end audio products such as headphones for years. But I never think about the illusion of a ‘soundstage’ right in front of me, I just think of it as being there. I know the guitar player is on the right edge of the stage, and the drummer is in the back, slightly to the left. I can clearly hear the singer when she turns her head to look at fellow band members during the song. None of that is really in front of me, but there is something in the bits of the digital facsimile on my hard drive that lets my brain recognize all these things, placing the scene right there in front of me. I guess the hard part is recognizing when and how it alters our perception. On to the Summary: Webcasts, Podcasts, Outside Writing, and Conferences Rich quoted in “Apple in a bind over its DNS patch”. Adrian’s Dark Reading post on SIEM ain’t DAM. Rich and Martin on Network Security Podcast #206. Favorite Securosis Posts Rich: Pricing Cyber-Policies. As we used to say at Gartner, all a ‘cybersecurity’ policy buys you is a seat at the arbitration table. Mike Rothman: The Cancer within Evidence Based Research Methodologies. We all need to share data more frequently and effectively. This is why. Adrian Lane: FireStarter: an Encrypted Value Is Not a Token!. Bummer. Other Securosis Posts Tokenization: Token Servers. Incite 7/20/2010: Visiting Day. Tokenization: The Tokens. Comments on Visa’s Tokenization Best Practices. Friday Summary: July 15, 2010. Favorite Outside Posts Rich: Successful Evidence-Based Risk Management: The Value of a Great CSIRT. I realize I did an entire blog post based on this, but it really is a must read by Alex Hutton. We’re basically a bunch of blind mice building 2-lego high walls until we start gathering, and sharing, information on which of our security initiatives really work and when. Mike Rothman: Understanding the advanced persistent threat. Bejtlich’s piece on APT in SearchSecurity is a good overview of the term, and how it’s gotten fsked by security marketing. Adrian Lane: Security rule No. 1: Assume you’re hacked. Project Quant Posts NSO Quant: Monitor Process Revisited. NSO Quant: Monitoring Health Maintenance Subprocesses. NSO Quant: Validate and Escalate Subprocesses. NSO Quant: Analyze Subprocess. NSO Quant: Collect and Store Subprocesses. NSO Quant: Define Policies Subprocess. NSO Quant: Enumerate and Scope Subprocesses. Research Reports and Presentations White Paper: Endpoint Security Fundamentals. Understanding and Selecting a Database Encryption or Tokenization Solution. Low Hanging Fruit: Quick Wins with Data Loss Prevention. Report: Database Assessment. Database Audit Events. XML Security Overview Presentation. Project Quant Survey Results and Analysis. Project Quant Metrics Model Report. Top News and Posts Researchers: Authentication crack could affect millions. SCADA System’s Hard-Coded Password Circulated Online for Years. Microsoft Launches ‘Coordinated’ Vulnerability Disclosure Program. GSM Cracking Software Released. How Mass SQL Injection Attacks Became an Epidemic. Harsh Words for Professional Infosec Certification. Google re-ups the disclosure debate. A new policy – 60 days to fix critical bugs or they disclose. I wonder if anyone asked the end users what they want? Adobe reader enabling protected mode. This is a very major development… if it works. Also curious to see what they do for Macs. Oracle to release 59 critical patches in security update. Is it just me, or do they have more security patches than bug fixes nowdays? Connecticut AG reaches agreement with

Share:
Read Post

Death, Irrelevance, and a Pig Roast

There is nothing like a good old-fashioned mud-slinging battle. As long as you aren’t the one covered in mud, that is. I read about the Death of Snort and started laughing. The first thing they teach you in marketing school is when no one knows who you are, go up to the biggest guy in the room and kick them in the nuts. You’ll get your ass kicked, but at least everyone will know who you are. That’s exactly what the folks at OISF (who drive the Suricata project) did, and they got Ellen Messmer of NetworkWorld to bite on it. Then she got Marty Roesch to fuel the fire and the end result is much more airtime than Suricata deserves. Not that it isn’t interesting technology, but to say it’s going to displace Snort any time soon is crap. To go out with a story about Snort being dead is disingenuous. But given the need to drive page views, the folks at NWW were more than willing to provide airtime. Suricata uses Snort signatures (for the most part) to drive its rule base. They’d better hope it’s not dead. But it brings up a larger issue of when a technology really is dead. In reality, there are few examples of products really dying. If you ended up with some ConSentry gear, then you know the pain of product death. But most products are around around ad infinitum, even if they aren’t evolved. So those products aren’t really dead, they just become irrelevant. Take Cisco MARS as an example. Cisco isn’t killing it, it’s just not being used as a multi-purpose SIEM, which is how it was positioned for years. Irrelevant in the SIEM discussion, yes. Dead, no. Ultimately, competition is good. Suricata will likely push the Snort team to advance their technology faster than in the absence of an alternative. But it’s a bit early to load Snort onto the barbie – even if it is the other white meat. Yet, it usually gets back to the reality that you can’t believe everything you read. Actually you probably shouldn’t believe much that you read. Except our stuff, of course. Photo credit: “Roasted pig (large)” originally uploaded by vnoel Share:

Share:
Read Post

Tokenization: Token Servers, Part 2 (Architecture, Integration, and Management)

Our last post covered the core functions of the tokenization server. Today we’ll finish our discussion of token servers by covering the externals: the primary architectural models, how other applications communicate with the server(s), and supporting systems management functions. Architecture There are three basic ways to build a token server: Stand-alone token server with a supporting back-end database. Embedded/integrated within another software application. Fully implemented within a database. Most of the commercial tokenization solutions are stand-alone software applications that connect to a dedicated database for storage, with at least one vendor bundling their offering into an appliance. All the cryptographic processes are handled within the application (outside the database), and the database provides storage and supporting security functions. Token servers use standard Database Management Systems, such as Oracle and SQL Server, but locked down very tightly for security. These may be on the same physical (or virtual) system, on separate systems, or integrated into a load-balanced cluster. In this model (stand-alone server with DB back-end) the token server manages all the database tasks and communications with outside applications. Direct connections to the underlying database are restricted, and cryptographic operations occur within the tokenization server rather than the database. In an embedded configuration the tokenization software is embedded into the application and supporting database. Rather than introducing a token proxy into the workflow of credit card processing, existing application functions are modified to implement tokens. To users of the system there is very little difference in behavior between embedded token services and a stand-alone token server, but on the back end there are two significant differences. First, this deployment model usually involves some code changes to the host application to support storage and use of the tokens. Second, each token is only useful for one instance of the application. Token server code, key management, and storage of the sensitive data and tokens all occur within the application. The tightly coupled nature of this model makes it very efficient for small organizations, but does not support sharing tokens across multiple systems, and large distributed organizations may find performance inadequate. Finally, it’s technically possible to manage tokenization completely within the database without the need for external software. This option relies on stored procedures, native encryption, and carefully designed database security and access controls. Used this way, tokenization is very similar to most data masking technologies. The database automatically parses incoming queries to identify and encrypt sensitive data. The stored procedure creates a random token – usually from a sequence generator within the database – and returns the token as the result of the user query. Finally all the data is stored in a database row. Separate stored procedures are used to access encrypted data. This model was common before the advent of commercial third party tokenization tools, but has fallen into disuse due to its lack for advanced security features and failure to leverage external cryptographic libraries & key management services. There are a few more architectural considerations: External key management and cryptographic operations are typically an option with any of these architectural models. This allows you to use more-secure hardware security modules if desired. Large deployments may require synchronization of multiple token servers in different, physically dispersed data centers. This support must be a feature of the token server, and is not available in all products. We will discuss this more when we get to usage and deployment models. Even when using a stand-alone token server, you may also deploy software plug-ins to integrate and manage additional databases that connect to the token server. This doesn’t convert the database into a token server, as we described in our second option above, but supports communications for distributed systems that need access to either the token or the protected data. Integration Since tokenization must be integrated with a variety of databases and applications, there are three ways to communicate with the token server: Application API calls: Applications make direct calls to the tokenization server procedural interface. While at least one tokenization server requires applications to explicitly access the tokenization functions, this is now a rarity. Because of the complexity of the cryptographic processes and the need for precise use of the tokenization server; vendors now supply software agents, modules, or libraries to support the integration of token services. These reside on the same platform as the calling application. Rather than recoding applications to use the API directly, these supporting modules accept existing communication methods and data formats. This reduces code changes to existing applications, and provides better security – especially for application developers who are not security experts. These modules then format the data for the tokenization API calls and establish secure communications with the tokenization server. This is generally the most secure option, as the code includes any required local cryptographic functions – such as encrypting a new piece of data with the token server’s public key. Proxy Agents: Software agents that intercept database calls (for example, by replacing an ODBC or JDBC component). In this model the process or application that sends sensitive information may be entirely unaware of the token process. It sends data as it normally does, and the proxy agent intercepts the request. The agent replaces sensitive data with a token and then forwards the altered data stream. These reside on the token server or its supporting application server. This model minimizes application changes, as you only need to replace the application/database connection and the new software automatically manages tokenization. But it does create potential bottlenecks and failover issues, as it runs in-line with existing transaction processing systems. Standard database queries: The tokenization server intercepts and interprets the requests. This is potentially the least secure option, especially for ingesting content to be tokenized. While it sounds complex, there are really only two functions to implement: Send new data to be tokenized and retrieve the token. When authorized, exchange the token for the protected data. The server itself should handle pretty much everything else. Systems Management Finally, as with any

Share:
Read Post

Tokenization: Token Servers

In our previous post we covered token creation, a core feature of token servers. Now we’ll discuss the remaining behind-the-scenes features of token servers: securing data, validating users, and returning original content when necessary. Many of these services are completely invisible to end users of token systems, and for day to day use you don’t need to worry about the details. But how the token server works internally has significant effects on performance, scalability, and security. You need to assess these functions during selection to ensure you don’t run into problems down the road. For simplicity we will use credit card numbers as our primary example in this post, but any type of data can be tokenized. To better understand the functions performed by the token server, let’s recap the two basic service requests. The token server accepts sensitive data (e.g., credit card numbers) from authenticated applications and users, responds by returning a new or existing token, and stores the encrypted value when creating new tokens. This comprises 99% of all token server requests. The token server also returns decrypted information to approved applications when presented a token with acceptable authorization credentials. Authentication Authentication is core to the security of token servers, which need to authenticate connected applications as well as specific users. To rebuff potential attacks, token servers perform bidirectional authentication of all applications prior to servicing requests. The first step in this process is to set up a mutually authenticated SSL/TLS session, and validate that the connection is started with a trusted certificate from an approved application. Any strong authentication should be sufficient, and some implementations may layer additional requirements on top. The second phase of authentication is to validate the user who issues a request. In some cases this may be a system/application administrator using specific administrative privileges, or it may be one of many service accounts assigned privileges to request tokens or to request a given unencrypted credit card number. The token server provides separation of duties through these user roles – serving requests only from only approved users, through allowed applications, from authorized locations. The token server may further restrict transactions – perhaps only allowing a limited subset of database queries. Data Encryption Although technically the sensitive data doesn’t might not be encrypted by the token server in the token database, in practice every implementation we are aware of encrypts the content. That means that prior to being written to disk and stored in the database, the data must be encrypted with an industry-accepted ‘strong’ encryption cipher. After the token is generated, the token server encrypts the credit card with a specific encryption key used only by that server. The data is then stored in the database, and thus written to disk along with the token, for safekeeping. Every current tokenization server is built on a relational database. These servers logically group tokens, credit cards, and related information in a database row – storing these related items together. At this point, one of two encryption options is applied: either field level or transparent data encryption. In field level encryption just the row (or specific fields within it) are encrypted. This allows a token server to store data from different applications (e.g., credit cards from a specific merchant) in the same database, using different encryption keys. Some token systems leverage transparent database encryption (TDE), which encrypts the entire database under a single key. In these cases the database performs the encryption on all data prior to being written to disk. Both forms of encryption protect data from indirect exposure such as someone examining disks or backup media, but field level encryption enables greater granularity, with a potential performance cost. The token server will have bundles the encryption, hashing, and random number generation features – both to create tokens and to encrypt network sessions and stored data. Finally, some implementations use asymmetric encryption to protect the data as it is collected within the application (or on a point of sale device) and sent to the server. The data is encrypted with the server’s public key. The connection session will still typically be encrypted with SSL/TLS as well, but to support authentication rather than for any claimed security increase from double encryption. The token server becomes the back end point of decryption, using the private key to regenerate the plaintext prior to generating the proxy token. Key Management Any time you have encryption, you need key management. Key services may be provided directly from the vendor of the token services in a separate application, or by hardware security modules (HSM), if supported. Either way, keys are kept separate from the encrypted data and algorithms, providing security in case the token server is compromised, as well as helping enforce separation of duties between system administrators. Each token server will have one or more unique keys – not shared by other token servers – to encrypt credit card numbers and other sensitive data. Symmetric keys are used, meaning the same key is used for both encryption and decryption. Communication between the token and key servers is mutually authenticated and encrypted. Tokenization systems also need to manage any asymmetric keys for connected applications and devices. As with any encryption, the key management server/device/functions must support secure key storage, rotation, and backup/restore. Token Storage Token storage is one of the more complicated aspects of token servers. How tokens are used to reference sensitive data or previous transactions is a major performance concern. Some applications require additional security precautions around the generation and storage of tokens, so tokens are not stored in a directly reference-able format. Use cases such as financial transactions with either single-use or multi-use tokens can require convoluted storage strategies to balance security of the data against referential performance. Let’s dig into some of these issues: Multi-token environments: Some systems provide a single token to reference every instance of a particular piece of sensitive data. So a credit card used at a specific merchant site will be represented by a

Share:
Read Post

Incite 7/20/2010: Visiting Day

Back when I went to sleepaway camp as a kid I always looked forward to Visiting Day. Mostly for the food, because after a couple weeks of camp food anything my folks brought up was a big improvement. But I admit it was great to see the same families year after year (especially the family that brought enough KFC to feed the entire camp) and to enjoy a day of R&R with your own family before getting back to the serious business of camping. So I was really excited this past weekend when the shoe was on the other foot, and I got to be the parent visiting XX1 at her camp. First off I hadn’t seen the camp, so I had no context when I saw pictures of her doing this or that. But most of all, we were looking forward to seeing our oldest girl. She’s been gone 3 weeks now, and the Boss and I really missed her. I have to say I was very impressed with the camp. There were a ton of activities for pretty much everyone. Back in my day, we’d entertain ourselves with a ketchup cap playing a game called Skully. Now these kids have go-karts, an adventure course, a zipline (from a terrifying looking 50 foot perch), ATVs and dirt bikes, waterskiing, and a bunch of other stuff. In the arts center they had an iMac-based video production and editing rig (yes, XX1 starred in a short video with her group), ceramics (including their own wheels and kiln), digital photography, and tons of other stuff. For boys there was rocketry and woodworking (including tabletop lathes and jigsaws). Made me want to go back to camp. Don’t tell Rich and Adrian if I drop offline for couple weeks, okay? Everything was pretty clean and her bunk was well organized, as you can see from the picture. Just like her room at home…not! Obviously the counselors help out and make sure everything is tidy, but with the daily inspections and work wheel (to assign chores every day), she’s got to do her part of keeping things clean and orderly. Maybe we’ll even be able to keep that momentum when she returns home. Most of all, it was great to see our young girl maturing in front of our eyes. After only 3 weeks away, she is far more confident and sure of herself. It was great to see. Her counselors are from New Zealand and Mexico, so she’s gotten a view of other parts of the world and learned about other cultures, and is now excited to explore what the world has to offer. It’s been a transformative experience for her, and we couldn’t be happier. I really pushed to send her to camp as early as possible because I firmly believe kids have to learn to fend for themselves in the world without the ever-present influence of their folks. The only way to do that is away from home. Camp provides a safe environment for kids to figure out how to get along (in close quarters) with other kids, and to do activities they can’t at home. That was based on my experience, and I’m glad to see it’s happening for my daughter as well. In fact, XX2 will go next year (2 years younger than XX1 is now) and she couldn’t be more excited after visiting. But there’s more! An unforeseen benefit of camp accrues to us. Not just having one less kid to deal with over the summer – which definitely helps. But sending the kids to camp each summer will force us (well, really the Boss) to let go and get comfortable with the reality that at some point our kids will grow, leave the nest, and fly on their own. Many families don’t deal with this transition until college and it’s very disruptive and painful. In another 9 years we’ll be ready, because we are letting our kids fly every summer. And from where I sit, that’s a great thing. – Mike Photo credits: “XX1 bunk” originally uploaded by Mike Rothman Recent Securosis Posts Wow. Busy week on the blog. Nice. Pricing Cyber-Policies FireStarter: An Encrypted Value is Not a Token! Tokenization: The Tokens Comments on Visa’s Tokenization Best Practices Friday Summary: July 15, 2010 Tokenization Architecture – The Basics Color-blind Swans and Incident Response Home Business Payment Security Simple Ideas to Start Improving the Economics of Cybersecurity Various NSO Quant Posts on the Monitor Subprocesses: Define Policies Collect and Store Analyze Validate and Escalate Incite 4 U We have a failure to communicate! – Chris makes a great point on the How is that Assurance Evidence? blog about the biggest problem we security folks face on a daily basis. It ain’t mis-configured devices or other typical user stupidity. It’s our fundamental inability to communicate. He’s exactly right, and it manifests in the lack of having any funds in the credibility bank, obviously impacting our ability to drive our security agendas. Holding a senior level security job is no longer about the technology. Not by a long shot. It’s about evangelizing the security program and persuading colleagues to think security first and to do the right thing. Bravo, Chris. Always good to get a reminder that all the security kung-fu in the world doesn’t mean crap unless the business thinks it’s important to protect the data. – MR Cyber RF – I was reading Steven Bellovin’s post on Cyberwar, and the only thing that came to mind was Sun Tsu’s quote, “Victorious warriors win first and then go to war, while defeated warriors go to war first and then seek to win.” Don’t think I am one of those guys behind the ‘Cyberwar’ bandwagon, or who likes using war metaphors for football – this subject makes me want to gag. Like most posts on this subject, there is an interesting mixture of stuff I agree with, and an equal blend of stuff I totally disagree with. But the reason

Share:
Read Post

The Cancer within Evidence Based Research Methodologies

Alex Hutton has a wonderful must-read post on the Verizon security blog on Evidence Based Risk Management. Alex and I (along with others including Andrew Jaquith at Forrester, as well as Adam Shostack and Jeff Jones at Microsoft) are major proponents of improving security research and metrics to better inform the decisions we make on a day to day basis. Not just generic background data, but the kinds of numbers that can help answer questions like “Which security controls are most effective under XYZ circumstances?” You might think we already have a lot of that information, but once you dig in the scarcity of good data is shocking. For example we have theoretical models on password cracking – but absolutely no validated real-world data on how password lengths, strengths, and forced rotation correlate with the success of actual attacks. There’s a ton of anecdotal information and reports of password cracking times – especially within the penetration testing community – but I have yet to see a single large data set correlating password practices against actual exploits. I call this concept outcomes based security, which I now realize is just one aspect/subset of what Alex defines as Evidence Based Risk Management. We often compare the practice of security with the practice of medicine. Practitioners of both fields attempt to limit negative outcomes within complex systems where external agents are effectively impossible to completely control or predict. When you get down to it, doctors are biological risk managers. Both fields are also challenged by having to make critical decisions with often incomplete information. Finally, while science is technically the basis of both fields, the pace and scope of scientific information is often insufficient to completely inform decisions. My career in medicine started in 1990 when I first became certified as an EMT, and continued as I moved on to working as a full time paramedic. Because of this background, some of my early IT jobs also involved work in the medical field (including one involving Alex’s boss about 10 years ago). Early on I was introduced to the concepts of Evidence Based Medicine that Alex details in his post. The basic concept is that we should collect vast amounts of data on patients, treatments, and outcomes – and use that to feed large epidemiological studies to better inform physicians. We could, for example, see under which circumstances medication X resulted in outcome Y on a wide enough scale to account for variables such as patient age, gender, medical history, other illnesses, other medications, etc. You would probably be shocked at how little the practice of medicine is informed by hard data. For example if you ever meet a doctor who promotes holistic medicine, acupuncture, or chiropractic, they are making decisions based on anecdotes rather than scientific evidence – all those treatments have been discredited, with some minor exceptions for limited application of chiropractic… probably not what you used it for. Alex proposes an evidence-based approach – similar to the one medicine is in the midst of slowly adopting – for security. Thanks to the Verizon Data Breach Investigations Report, Trustwave’s data breach report, and little pockets of other similar information, we are slowly gaining more fundamental data to inform our security decisions. But EBRM faces the same near-crippling challenge as Evidence Based Medicine. In health care the biggest obstacle to EBM is the physicians themselves. Many rebel against the use of the electronic medical records systems needed to collect the data – sometimes for legitimate reasons like crappy software, and at other times due to a simple desire to retain direct control over information. The reason we have HIPAA isn’t to protect your health care data from a breach, but because the government had to step in and legislate that doctors must release and share your healthcare information – which they often considered their own intellectual property. Not only do many physicians oppose sharing information – at least using the required tools – but they oppose any restrictions on their personal practice of medicine. Some of this is a legitimate concern – such as insurance companies restricting treatments to save money – but in other cases they just don’t want anyone telling them what to do – even optional guidance. Medical professionals are just as subject to cognitive bias as the rest of us, and as a low-level medical provider myself I know that algorithms and checklists alone are never sufficient in managing patients – a lot of judgment is involved. But it is extremely difficult to balance personal experience and practices with evidence, especially when said evidence seems counterintuitive or conflicts with existing beliefs. We face these exact same challenges in security: Organizations and individual practitioners often oppose the collection and dissemination of the raw data (even anonymized) needed to learn from experience and advance based practices. Individual practitioners, regulatory and standards bodies, and business constituents need to be willing to adjust or override their personal beliefs in the face of hard evidence, and support evolution in security practices based on hard evidence rather than personal experience. Right now I consider the lack of data our biggest challenge, which is why we try to participate as much as possible in metrics projects, including our own. It’s also why I have an extremely strong bias towards outcome-based metrics rather than general risk/threat metrics. I’m much more interested in which controls work best under which circumstances, and how to make the implementation of said controls as effective and efficient as possible. We are at the very beginning of EBRM. Despite all our research on security tools, technologies, vulnerabilities, exploits, and processes, the practice of security cannot progress beyond the equivalent of witch doctors until we collectively unite behind information collection, sharing, and analysis as the primary sources informing our security decisions. Seriously, wouldn’t you really like to know when 90-day password rotation actually reduces risk vs. merely annoying users and wasting time? Share:

Share:
Read Post

FireStarter: an Encrypted Value Is *Not* a Token!

We’ve been writing a lot on tokenization as we build the content for our next white paper, and in Adrian’s response to the PCI Council’s guidance on tokenization. I want to address something that’s really been ticking me off… In our latest post in the series we described the details of token generation. One of the options, which we had to include since it’s built into many of the products, is encryption of the original value – then using the encrypted value as the token. Here’s the thing: If you encrypt the value, it’s encryption, not tokenization! Encryption obfuscates, but a token removes, the original data. Conceptually the major advantages of tokenization are: The token cannot be reversed back to the original value. The token maintains the same structure and data type as the original value. While format preserving encryption can retain the structure and data type, it’s still reversible back to the original if you have the key and algorithm. Yes, you can add per-organization salt, but this is still encryption. I can see some cases where using a hash might make sense, but only if it’s a format preserving hash. I worry that marketing is deliberately muddling the terms. Opinions? Otherwise, I declare here and now that if you are using an encrypted value and calling it a ‘token’, that is not tokenization. Share:

Share:
Read Post

Pricing Cyber-Policies

Every time I think I’m making progress on controlling my cynical gene, I see something that sets me back almost to square one. I’ve been in this game for a long time, and although I think subconsciously I know some things are going on, it’s still a bit shocking to see them in print. What set me off this time is Richard Bejtlich’s brief thoughts on the WEIS 2010 (Workshop on the Economics of Information Security) conference. His first thoughts are around a presentation on cyber insurance. The presenter admitted that the industry has no expected loss data and no financial impact data. Really? They actually admitted that. But it gets better. Your next question must be, “So how do they price the policies?” It certainly was mine. Yes! They have an answer for that: Price the policies high and see what happens. WHAT? Does Dr. Evil head their policy pricing committee? I can’t say I’m a big fan of insurance companies, and this is the reason why. They are basically making it up. Pulling the premiums out of their butts. Literally. And they would never err favor of folks buying the policies, so you see high prices. Clearly this is a chicken & egg situation. They don’t have data because no one shares it. So they write some policies to start collecting data, but they price the policies probably too high for most companies to actually buy. So they still have no data. And those looking for insurance don’t really have any options. I guess I need to ask why folks are looking for cyber-insurance anyway? I can see the idea of trying to get someone else to pay for disclosure – those are hard costs. Maybe you can throw clean-up into that, but how could you determine what is clean-up required from a specific attack, and what is just crappy security already in place? It’s not like you are insuring Sam Bradford’s shoulder here, so you aren’t going to get a policy to reimburse for brand damage. Back when I worked for TruSecure, the company had an “insurance” policy guaranteeing something in the event of a breach on a client certified using the company’s Risk Management Methodology. At some point the policy expired, and when trying to renew it, we ran across the same crap. We didn’t know how to model loss data – there was none because the process was perfect. LOL! And they didn’t either. So the quote came back off the charts. Then we had to discontinue the program because we couldn’t underwrite the risk. Seems almost 7 years later, we’re still in the same place. Actually we’re in a worse place because the folks writing these policies are now aggressively working the system to prevent payouts (see Colorado Casualty/University of Utah) when a breach occurs. I guess from my perspective cyber-insurance is a waste of time. But I could be missing something, so I’ll open it up to you folks – you’re collectively a lot smarter than me. Do you buy cyber-insurance? For what? Have you been able to collect on any claims? Is the policy just to make your board happy? To cover your ass and shuffle blame to the insurance company? Do tell. Please! Photo credit: “Dr Evil 700 Billion” originally uploaded by Radio_jct Share:

Share:
Read Post

Tokenization: The Tokens

In this post we’ll dig into the technical details of tokens. What they are and how they are created; as well as some of the options for security, formatting, and performance. For those of you who read our stuff and tend to skim the more technical posts, I recommend you stop and pay a bit more attention to this one. Token generation and structure affect the security of the data, the ability to use the tokens as surrogates in other applications, and the overall performance of the system. In order to differentiate the various solutions, it’s important to understand the basics of token creation. Let’s recap the process quickly. Each time sensitive data is sent to the token server three basic steps are performed. First, a token is created. Second, the token and the original data are stored together in the token database. Third, the token is returned to the calling application. The goal is not just to protect sensitive data without losing functionality within applications, so we cannot simply create any random blob of data. The format of the token needs to match the format of the original data, so it can be used exactly as if it were the original (sensitive) data. For example, a Social Security token needs to have at least the same size (if not data type) as a social security number. Supporting applications and databases can accept the substituted value as long as it matches the constraints of the original value. Let’s take a closer look at each of the steps. Token Creation There are three common methods for creating tokens: Random Number Generation: This method substitutes data with a random number or alphanumeric value, and is our recommended method. Completely random tokens offers the greatest security, as the content cannot be reverse engineered. Some vendors use sequence generators to create tokens, grabbing the next value in the series to use for the token – this is not nearly as secure as a fully randomized number, but is very fast and secure enough for most (non-PCI) use cases. A major benefit of random numbers is that they are easy to adapt to any format constraints (discussed in greater detail below), and the random numbers can be generated in advance to improve performance. Encryption: This method generates a ‘token’ by encrypting the data. Sensitive information is padded with a random salt to prevent reverse engineering, and then encrypted with the token server’s private key. The advantage is that the ‘token’ is reasonably secure from reverse engineering, but the original value can be retrieved as needed. The downsides, however are significant – performance is very poor, Format Preserving Encryption algorithms are required, and data can be exposed when keys are compromised or guessed. Further, the PCI Council has not officially accepted format preserving cryptographic algorithms, and is awaiting NIST certification. Regardless, many large and geographically disperse organizations that require access to original data favor the utility of encrypted ‘tokens’, even though this isn’t really tokenization. One-way Hash Function: Hashing functions create tokens by running the original value through a non-reversible mathematical operation. This offers reasonable performance, and tokens can be formatted to match any data type. Like encryption, hashes must be created with a cryptographic salt (some random bits of data) to thwart dictionary attacks. Unlike encryption, tokens created through hashing are not reversible. Security is not as strong as fully random tokens but security, performance, and formatting flexibility are all improved over encryption. Beware that some open source and commercial token servers use poor token generation methods of dubious value. Some use reversible masking, and others use unsalted encryption algorithms, and can thus be easily compromised and defeated. Token Properties We mentioned the importance of token formats earlier, and token solutions need to be flexible enough to handle multiple formats for the sensitive data they accept – such as personally identifiable information, Social Security numbers, and credit card numbers. In some cases, additional format constraints must be honored. As an example, a token representing a Social Security Number in a customer service application may need to retain the real last digits. This enables customer service representatives to verify user identities, without access to the rest of the SSN. When tokenizing credit cards, tokens are the same size as the original credit card number – most implementations even ensure that tokens pass the LUHN check. As the token still resembles a card number, systems that use the card numbers need not to be altered to accommodate tokens. But unlike the real credit card or Social Security numbers, tokens cannot be used as financial instruments, and have no value other than as a reference to the original transaction or real account. The relationship between a token and a card number is unique for any given payment system, so even if someone compromises the entire token database sufficiently that they can commit transactions in that system (a rare but real possibility), the numbers are worthless outside the single environment they were created for. And most important, real tokens cannot be decrypted or otherwise restored back into the original credit card number. Each data type has different use cases, and tokenization vendors offer various options to accomodate them. Token Datastore Tokens, along with the data they represent, are stored within heavily secured database with extremely limited access. The data is typically encrypted (per PCI recommendations), ensuring sensitive data is not lost in the event of a database compromises or stolen media. The token (database) server is the only point of contact with any transaction system, payment system, or collection point to reduce risk and compliance scope. Access to the database is highly restricted, with administrative personnel denied read access to the data, and even authorized access the original data limited to carefully controlled circumstances. As tokens are used to represent the same data for multiple events, possibly across multiple systems, most can issue different tokens for the same user data. A credit card number, for example, may get a

Share:
Read Post

Comments on Visa’s Tokenization Best Practices

If you are interested in tokenization, check out Visa’s Tokenization Best Practices guide, released this week. The document is a very short four pages. It highlights the basics and is helpful in understanding minimum standards for deployment. That said, I think some simple changes would make the recommendations much better and deployments more secure. From a security standpoint my issues are twofold: I think they fell far short with their recommendations on token generation, and that salting should be implemented differently than they suggest. I also believe that, given how prescriptive the advice is in several sections, Visa should clarify what they mean by encrypting the “Card Data Vault”, but that’s a subject for another day. First things first: let’s dig into the token generation issues. The principle behind tokenization is to substitute a token for a real (sensitive) value, so you cannot reverse engineer the token into PAN data. But when choosing a token creation strategy, you must decide whether you want to be able to retrieve the value or not. If you will want to convert the token back to the original value, use encryption. But if you don’t need to this, there are better ways to secure PAN data than encryption or hashing! My problem with the Visa recommendations is their first suggestion should have been simply to use a random number. If the output is not generated by a mathematical function applied to the input, it cannot be reversed to regenerate the original PAN data. The only way to discover PAN data from a real token is a (reverse) lookup in the token server database. Random tokens are simple to generate, and the size & data type constraints are trivial. This should be the default, as most firms should neither need or want PAN data retrievable from the token. As for encryption, rather than suggest a “strong encryption cipher”, why not take this a step further and recommend a one time pad? This is a perfect application for that kind of substitution cipher. And one time pads are as secure a method as anything else. I’m guessing Visa did not suggest this because a handful of very large payment processors, with distributed operations, actually want to retrieve the PAN data in multiple locations. That means they need encryption, and they need to distribute the keys. As for hashing, I think the method they prescribe is wrong. Remember that a hash is deterministic. You put in A, the hash digests the PAN data, and it produces B. Every time. Without fail. In order to avoid dictionary attacks you salt the input with a number. But the recommendations are ” … hashing of the cardholder data using a fixed but unique salt value per merchant”! If you use a static merchant ID as the salt, you are really not adding much in the way of computational complexity (or trying very hard to stop attacks). Odds are the value will be guessed or gathered at some point, as will the hashing algorithm – which subjects you to precomputed attacks against all the tokens. It seems to me that for PAN data, you can pick any salt you want, so why not make it different for each and every token? The token server can store the random salt with the token, and attacks become much tougher. Finally, Visa did not even discuss format preservation. I am unaware of any tokenization deployment that does not retain the format of the original credit card number/PAN. In many cases they preserve data types as well. Punting on this subject is not really appropriate, as format preservation is what allows token systems to slide into existing operations without entirely reworking the applications and databases. Visa should have stepped up to the plate with format preserving encryption and fully endorsed format-preserving strong cryptography. This was absent fromnot addressed in the Field Level Encryption Best Practices in 2009, and remains conspicuous by its absence. The odds are that if you are saddled with PCI-DSS responsibilities, you will not write your own ‘home-grown’ token servers. So keep in mind that these recommendations are open enough that vendors can easily provide botched implementations and still meet Visa’s guidelines. If you are only interested in getting systems out of scope, then any of these solutions is fine because QSAs will accept them as meeting the guidelines. But if you are going to the trouble of implementing a token server, it’s no more work to select one that offers strong security. Share:

Share:
Read Post

Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.