Tokenization: Use Cases, Part 3

Not every use case for tokenization involves PCI-DSS. There are equally compelling implementation options, several for personally identifiable information, that illustrate different ways to deploy token services. Here we will describe how tokens are used to replace Social Security numbers in human resources applications. These services must protect the SSN during normal use by employees and third party service providers, while still offering authorized access for Human Resources personnel, as well as payroll and benefits services.

In our example an employee uses an HR application to review benefits information and make adjustments to their own account. Employees using the system for the first time will establish system credentials and enter their personal information, potentially including their Social Security number. To understand how tokens work in this scenario, let’s map out the process:

  • The employee account creation process is started by entering the user’s credentials, and then adding personal information including the Social Security number. This is typically performed by HR staff, with review by the employee in question.
  • Over a secure connection, the presentation server passes employee data to the HR application. The HR application server examines the request, finds the Social Security number is present, and forwards the SSN to the tokenization server.
  • The tokenization server validates the HR application connection and request. It creates the token, storing the token/Social Security number pair in the token database. Then it returns the new token to the HR application server.
  • The HR application server stores the employee data along with the token, and returns the token to the presentation server. The temporary copy of the original SSN is overwritten so it does not persist in memory.
  • The presentation server displays the successful account creation page, including the tokenized value, back to the user.
  • The original SSN is overwritten so it does not persist in token server memory. The token is used for all other internal applications that may have previously relied on real SSNs.

Occasionally HR employees need to look up an employee by SSN, or access the SSN itself (typically for payroll and benefits). These personnel are authorized to see the real SSN within the application, under the right context (this needs to be coded into the application using the tokenization server’s API). Although the SSN shows up in their application screens when needed, it isn’t stored on the application or presentation server. Typically it isn’t difficult to keep the sensitive data out of logs, although it’s possible SSNs will be cached in memory. Sure, that’s a risk, but it’s a far smaller risk than before. The real SSN is used, as needed, for connections to payroll and benefits services/systems. Ideally you want to minimize usage, but realistically many (most?) major software tools and services still require the SSN – especially for payroll and taxes.

Applications that already contain Social Security numbers undergo a similar automated transformation process to replace the SSN with a token, and this occurs without user interaction. Many older applications used the SSN as the primary key to reference employee records, so referential key dependencies make replacement more difficult and may involve downtime and structural changes.
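
To make the flow above concrete, here is a minimal sketch of what the HR application’s side of the tokenization step might look like, assuming a hypothetical token server that exposes an HTTPS tokenize endpoint over mutually authenticated TLS. The URL, certificate paths, field names, and the `db` helper are illustrative, not any particular vendor’s API.

```python
import requests

TOKEN_SERVER_URL = "https://tokens.example.internal/v1/tokenize"   # placeholder endpoint
CLIENT_CERT = ("/etc/hr-app/client.crt", "/etc/hr-app/client.key")  # mutual TLS credentials

def create_employee_record(db, employee):
    """Store an employee record with a token in place of the real SSN."""
    ssn = employee.pop("ssn")                      # pull the SSN out of the record
    resp = requests.post(
        TOKEN_SERVER_URL,
        json={"type": "ssn", "value": ssn},        # the SSN goes only to the token server
        cert=CLIENT_CERT,
        timeout=5,
    )
    resp.raise_for_status()
    employee["ssn_token"] = resp.json()["token"]   # e.g. a format-preserving surrogate
    db.insert("employees", employee)               # only the token is persisted
    del ssn                                        # drop our reference to the plaintext copy
    return employee["ssn_token"]
```
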
Note that as surrogates for SSNs, tokens can be formatted to preserve the last four digits. Display of the original trailing four digits allows HR and customer service representatives to identify the employee, while preserving privacy by masking the first five digits. There is never any reason to show an employee their own SSN – they should already know it – and non-HR personnel should never see SSNs either. The HR application server and presentation layers will only display the tokenized values to the internal web applications for general employee use, never the original data.

But what’s really different about this use case is that HR applications need regular access to the original Social Security number. Unlike a PCI tokenization deployment – where requests for original PAN data are somewhat rare – accounting, benefits, and other HR services regularly require the original non-token data. Within our process, authorized HR personnel can use the same HR application server, through an HR-specific presentation layer, to access the original Social Security number. This is performed automatically by the HR application on behalf of validated and authorized HR staff, and limited to specific HR interfaces. After the HR application server has queried the employee information from the database, the application instructs the token server to get the Social Security number, and then sends it back to the presentation server.

Similarly, automated batch jobs such as payroll deposits and 401k contributions are performed by HR applications, which in turn instruct the token server to send the SSN to the appropriate payroll/benefits subsystem. Social Security numbers are accessed by the token server, and then passed to the supporting application over a secured and authenticated connection. In this case, only the token appears at the presentation layer, while third party providers receive the SSN via proxy on the back end.
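
A companion sketch for the authorized-access path described above: the HR application requests the real SSN only for validated HR staff or payroll batch jobs, and hands it to the downstream service without persisting it. The detokenize endpoint and role names are assumptions for illustration only.

```python
import requests

DETOKENIZE_URL = "https://tokens.example.internal/v1/detokenize"    # placeholder endpoint
CLIENT_CERT = ("/etc/hr-app/client.crt", "/etc/hr-app/client.key")

AUTHORIZED_ROLES = {"hr_admin", "payroll_batch"}   # roles allowed to see real SSNs

def ssn_for_authorized_request(token, requesting_role):
    if requesting_role not in AUTHORIZED_ROLES:    # application-level access check
        raise PermissionError("role not authorized to retrieve real SSNs")
    resp = requests.post(
        DETOKENIZE_URL,
        json={"token": token, "role": requesting_role},
        cert=CLIENT_CERT,
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()["value"]                    # real SSN; never logged or stored
```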


Tokenization Topic Roundup

Tokenization has been one of our more interesting research projects. Rich and I thoroughly understood tokenization server functions and requirements when we began this project, but we have been surprised by the depth of complexity underlying the different implementations. The variety of implementation choices and the issues that reside ‘under the covers’ really make each vendor unique. The more we dig, the more interesting tidbits we find. Every time we talk to a vendor we learn something new, and we are reminded how each development team must make design tradeoffs to get their products to market. It’s not that the products are flawed – more that we can see ripples from each vendor’s biggest customers in their choices, and this effect is amplified by how new the tokenization market still is. We have left most of these subtle details out of this series, as they do not help make buying decisions and/or are minutiae specific to PCI. But in a few cases – especially some of Visa’s recommendations, and omissions in the PCI guidelines – these details have generated a considerable amount of correspondence. I wanted to raise some of these discussions here to see if they are interesting and helpful, and whether they warrant inclusion in the white paper. We are an open research company, so I am going to ‘out’ the more interesting and relevant email.

Single Use vs. Multi-Use Tokens

I think Rich brought this up first, but a dozen others have emailed to ask for more about single use vs. multi-use tokens. A single use token (terrible name, by the way) represents not only a specific sensitive item – a credit card number – but is unique to a single transaction at a specific merchant. Such a token might represent your July 4th purchase of gasoline at Shell. A multi-use token, in contrast, would be used for all your credit card purchases at Shell – or in some models your credit card at every merchant serviced by that payment processor. We have heard varied concerns over this, but several have labeled multi-use tokens “an accident waiting to happen.” Some respondents feel that if the token becomes generic for a merchant-customer relationship, it takes on the value of the credit card – not at the point of sale, but for use in back-office fraud. I suggest that this issue also exists for medical information, and that there will be sufficient data points for accessing or interacting with multi-use tokens to guess the sensitive values they represent. A couple other emails complained that inattention to detail in the token generation process makes attacks realistic, and multi-use tokens are a very attractive target. Exploitable weaknesses might include lack of salting, using a known merchant ID as the salt, and poor or missing initialization vectors (IVs) for encryption-based tokens. As with the rest of security, a good tool can’t compensate for a fundamentally flawed implementation. I am curious what you all think about this.

Token Distinguishability

The Visa Best Practices guide for tokenization recommends making it possible to distinguish between a token and clear text PAN data. I recognize that during the process of migrating from storing credit card numbers to replacement with tokens, it might be difficult to tell the difference through manual review. But I have trouble finding a compelling customer reason for this recommendation. Ulf Mattsson of Protegrity emailed me a couple times on this topic and said:

This requirement is quite logical.
Real problems could arise if it were not possible to distinguish between real card data and tokens representing card data. It does however complicate systems that process card data. All systems would need to be modified to correctly identify real data and tokenised data. These systems might also need to properly take different actions depending on whether they are working with real or token data. So, although a logical requirement, also one that could cause real bother if real and token data were routinely mixed in day to day transactions. I would hope that systems would either be built for real data, or token data, and not be required to process both types of data concurrently. If built for real data, the system should flag token data as erroneous; if built for token data, the system should flag real data as erroneous.

Regardless, after the original PAN data has been replaced with tokens, is there really a need to distinguish a token from a real number? Is this a pure PCI issue, or will other applications of this technology require similar differentiation? Is the only reason this problem exists because people aren’t properly separating functions that require the token vs. the value?

Exhausting the Token Space

If a token format is designed to preserve the last four real digits of a credit card number, that only leaves 11-12 digits to differentiate one from another. If the token must also pass a LUHN check – as some customers require – only a relatively small set of numbers (which are not real credit card numbers) remains available, especially if you need a unique token for each transaction (a quick back-of-the-envelope calculation appears at the end of this post). I think Martin McKey or someone from RSA brought up the subject of exhausting the token space at the RSA conference. This is obviously more of an issue for payment processors than in-house token servers, but there are only so many numbers to go around, and at some point you will run out. Can you age and obsolete tokens? What’s the lifetime of a token? Can the token server reclaim and re-use them? How and when do you return a token to the pool available for (re-)use? Another related issue is token retention guidelines for merchants. A single use token should be discarded after some particular time, but this has implications for the rest of the token system, and adds an important difference from real credit card numbers, which (presumably) have longer lifetimes. Will merchants be able to disassociate the token used for
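
Returning to the token-space question above, here is the back-of-the-envelope calculation. With the trailing four digits of a 16-digit PAN preserved, 12 positions remain free; adding a LUHN requirement pins the weighted sum of those free digits to a single residue mod 10, cutting the space by another factor of ten (real card numbers and per-transaction uniqueness shrink it further). A small sketch:

```python
def luhn_ok(number: str) -> bool:
    """Standard LUHN check: double every second digit from the right."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

free_positions = 12
all_tokens = 10 ** free_positions        # 1,000,000,000,000 candidates per trailing-4 group
luhn_tokens = all_tokens // 10           # LUHN constraint leaves 100,000,000,000 of them
print(all_tokens, luhn_tokens)

# sanity check of the helper against a well-known test card number
assert luhn_ok("4111111111111111")
```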


Friday Summary: August 6th, 2010

I started running when I was 10. I started because my mom was taking a college PE class, so I used to tag along and no one seemed to care. We ran laps three nights a week. I loved doing it, and by twelve I was lapping the field in the 20 minutes allotted. I lived 6 miles from my junior high and high school, so I used to run home. I could have walked, ridden a bike, or taken rides from friends who offered, but I chose to run. I was on the track team and I ran cross country – the latter had us running 10 miles a day before I ran home. And until I discovered weight lifting, and added some 45 lbs of upper body weight, I was pretty fast. I used to run 6 days a week, every week. Run one evening, next day mid-afternoon, then morning; and repeat the cycle, taking the 7th day off. That way I ran with less than 24 hours rest on four days, but it still felt like I got two days off.

And I would play all sorts of mental games with myself to keep getting better, and to keep it interesting. Coming off a hill I would see how long I could hold the faster speed on the flat. Running uphill backwards. Going two miles doing that cross-over side step they teach you in martial arts. When I hit a plateau I would take a day and run wind sprints up the steepest local hill I could find. The sandy one. As fast as I could run up, then trot back down, repeating until my legs were too rubbery to feel. Or maybe run speed intervals, trying to get myself in and out of oxygen deprivation several times during the workout. If I was really dragging I would allow myself to go slower, but run with very heavy ‘cross-training’ shoes. That was the worst. I have no idea why, I just wanted to run, and I wanted to push myself. I used to train with guys who were way faster than me, which was another great way to motivate. We would put obscene amounts of weight on the leg press machine and see how many reps we could do, knee cartilage be damned, to get stronger. We used to jump picnic tables, lengthwise, just to gain explosion. One friend liked to heckle campus security and mall cops just to get them to chase us, because it was fun, but also because being pursued by a guy with a club is highly motivating. But I must admit I did it mainly because there are few things quite as funny as the “oomph-ugghh” sound rent-a-guards make when they hit the fence you just casually hopped over.

For many years after college, while I never really trained to run races or compete at any level, I continued to push myself as much as I could. I liked the way I felt after a run, and I liked the fact that I could eat whatever I wanted … as long as I got a good run in. Over the last couple years, due to a combination of age and the freakish Arizona summers, all that stopped. Now the battle is just getting out of the house: I play mental games just to get myself out the door to run in 112 degrees. I have one speed, which I affectionately call “granny gear”. I call it that because I go exactly the same speed up hill as I do on the flat: slow. Guys pushing baby strollers pass me. And in some form of karmic revenge I can just picture myself as the mall cop, getting toasted and slamming into a chain link fence because I lack the explosion and leg strength to hop much more than the curb. But I still love it, as it clears my head and I still feel great afterwards … gasping for air and blotchy red skin notwithstanding. Or at least that is what I am telling myself as I am lacing up my shoes, drinking a whole bunch of water, and looking at the thermometer that reads 112.
Sigh. Time to go … On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences
  • Adrian’s Dark Reading post on What You Should Know About Tokenization.
  • Rich’s The Five Things You Need to Know About Social Networking Security, on the Websense blog.
  • Chris’s Beware Bluetooth Keyboards with iOS Devices, starring Mike – belated, as we forgot to include it last time.

Favorite Securosis Posts
  • Rich: NSO Quant: Firewall Management Process Map (UPDATED).
  • Mike Rothman: What Do We Learn at Black Hat/DefCon?
  • Adrian Lane: Incite 8/4/2010: Letters for Everyone.

Other Securosis Posts
  • Tokenization: Use Cases, Part 1.
  • GSM Cell Phones to Be Intercepted in Defcon Demonstration.
  • Tokenization: Series Index.
  • Tokenization: Token Servers, Part 3, Deployment Models.
  • Tokenization: Token Servers, Part 2 (Architecture, Integration, and Management).
  • Death, Irrelevance, and a Pig Roast.

Favorite Outside Posts
  • Mike Rothman: Website Vulnerability Assessments: Good, Fast or Cheap – Pick Two. Great post from Jeremiah on the reality of trade-offs.
  • Adrian Lane: How Microsoft’s Team Approach Improves Security. What is it they say about two drunks holding each other up?
  • David Mortman: Taking Back the DNS. Vixie & ISC plan to build reputation APIs directly into BIND.
  • Rich Mogull: 2010 Data Breach Investigations Report Released. VZ Business continues to raise the bar for data and breach analysis. The 2010 version adds data from the US Secret Service. Cool stuff.
  • Chris Pepper: DefCon Ninja Badges Let Hackers Do Battle. I hope Rich is having fun at DefCon – this sounds pretty good, at least.

Project Quant Posts
  • NSO Quant: Manage Firewall Policy Review Sub-Processes.
  • NSO Quant: Firewall Management Process Map (UPDATED).
  • NSO Quant: Monitor Process Revisited.
  • NSO Quant: Monitoring Health Maintenance Subprocesses.
  • NSO Quant: Validate and Escalate Sub-Processes.
  • NSO Quant: Analyze Sub-Process.
  • NSO Quant: Collect and Store SubProcesses.

Research Reports and Presentations
  • White Paper: Endpoint Security Fundamentals.


Tokenization: Use Cases, Part 1

We have now discussed most of the relevant bits of technology for token server construction and deployment. Armed with that knowledge we can tackle the most important part of the tokenization discussion: use cases. Which model is right for your particular environment? What factors should be considered in the decision? The following use cases cover most of the customer situations we get calls asking for advice on. As PCI compliance is the overwhelming driver for tokenization at this time, our first two use cases will focus on different options for PCI-driven deployments.

Mid-sized Retail Merchant

Our first use case profiles a mid-sized retailer that needs to address PCI compliance requirements. The firm accepts credit cards but sells exclusively on the web, so it does not have to support point of sale terminals. The focus is meeting PCI compliance requirements, but how best to achieve that goal at reasonable cost is the question. As in many cases, most of the back office systems were designed before credit card storage was regulated, and use the CC# as part of the customer and order identification process. That means that order entry, billing, accounts receivable, customer care, and BI systems all store this number, in addition to the web site’s credit authorization and payment settlement systems.

Credit card information is scattered across many systems, so access control and tight authentication are not enough to address the problem. There are simply too many access points to restrict with any certainty of success, and there are far too many ways for attackers to compromise one or more systems. Further, some back office systems are accessible by partners for sales promotions and order fulfillment. The security effort will need to embrace almost every back office system, and affect almost every employee. Most of the back office transaction systems have no particular need for credit card numbers – they were simply designed to store and pass the number as a reference value. The handful of systems that employ encryption do so transparently, meaning they automatically return decrypted information, and only protect data when stored on disk or tape. Access controls and media encryption are not sufficient to protect the data or meet PCI compliance in this scenario. While the principal project goal is PCI compliance, as with any business there are strong secondary goals: minimizing total cost, integration challenges, and day to day management requirements.

Because the obligation is to protect cardholder data and limit the availability of credit cards in clear text, the merchant has a couple of choices: encryption and tokenization. They could implement encryption in each of the application platforms, or they could use a central token server to substitute tokens for PAN data at the time of purchase. Our recommendation for our theoretical merchant is in-house tokenization. An in-house token server will work with existing applications and provide tokens in lieu of credit card numbers. This removes PAN data from the servers entirely, with minimal changes to the few platforms that actually use credit cards: accepting them from customers, authorizing charges, clearing, and settlement – everything else will be fine with a non-sensitive token that matches the format of a real credit card number. We recommend a standalone server over one embedded within the applications, as the merchant will need to share tokens across multiple applications.
This makes it easier to segment users and services authorized to generate tokens from those that actually need real unencrypted credit card numbers. Diagram 1 lays out the architecture. Here’s the structure:

  • A customer makes a purchase request. If this is a new customer, they send their credit card information over an SSL connection (which should go without saying). For future purchases, only the transaction request need be submitted.
  • The application server processes the request. If the credit card is new, it uses the tokenization server’s API to send the value and request a new token.
  • The tokenization server creates the token and stores it with the encrypted credit card number.
  • The tokenization server returns the token, which is stored in the application database with the rest of the customer information.
  • The token is then used throughout the merchant’s environment, instead of the real credit card number.
  • To complete a payment transaction, the application server sends a request to the transaction server.
  • The transaction server sends the token to the tokenization server, which returns the credit card number.
  • The transaction information – including the real credit card number – is sent to the payment processor to complete the transaction.

While encryption could protect credit card data without tokenization, and could be implemented in a way that minimizes changes to the UI and database storage in supporting applications, it would require modification of every system that handles credit cards. And a pure encryption solution would require key management services to protect encryption keys. The deciding factor against encryption here is the cost of retrofitting systems with application layer encryption – especially because several rely on third-party code. The required application changes, changes to operations management and disaster recovery, and the broader key management services required would be far more costly and time-consuming. Recoding applications would become the single largest expenditure, outweighing the investment in encryption or token services.

Sure, the goal is compliance and data security, but ultimately any merchant’s buying decision is heavily affected by cost: for acquisition, maintenance, and management. And for any merchant handling credit cards, as the business grows so does the cost of compliance. Likely the ‘best’ choice will be the one that costs the least money, today and in the long term. In terms of relative security, encryption and tokenization are roughly equivalent. There is no significant cost difference between the two, either for acquisition or operation. But there is a significant difference in the costs of implementation and auditing for compliance. Next up we’ll look at another customer profile for PCI.
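
A toy, in-memory walk-through of the purchase and settlement steps above. The `FakeTokenClient` stands in for the tokenization server API so the example runs standalone; names, fields, and the token format are illustrative only.

```python
import secrets

class FakeTokenClient:
    """Stand-in for the token server: real deployments encrypt the stored PAN."""
    def __init__(self):
        self._vault = {}                     # token -> PAN
    def tokenize(self, pan):
        token = "9" + "".join(secrets.choice("0123456789") for _ in range(11)) + pan[-4:]
        self._vault[token] = pan
        return token
    def detokenize(self, token):
        return self._vault[token]

customers, orders = {}, []                   # stand-ins for the application database

def handle_purchase(token_client, customer_id, amount, card_number=None):
    if card_number is not None:                          # new customer or new card
        customers[customer_id] = token_client.tokenize(card_number)
    token = customers[customer_id]                       # token used everywhere else
    orders.append({"customer_id": customer_id, "card_token": token, "amount": amount})
    return orders[-1]

def settle(token_client, order):
    pan = token_client.detokenize(order["card_token"])   # PAN retrieved only at settlement
    return f"charged {order['amount']} to card ending {pan[-4:]}"

tc = FakeTokenClient()
order = handle_purchase(tc, "cust-1", 49.99, card_number="4111111111111111")
print(order["card_token"], settle(tc, order))
```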


Tokenization: Token Servers, Part 3, Deployment Models

We have covered the internals of token servers and talked about architecture and integration of token services. Now we need to look at some of the different deployment models and how they match up to different types of businesses. Protecting medical records in multi-company environments is a very different challenge than processing credit cards for thousands of merchants.

Central Token Server

The most common deployment model we see today is a single token server that sits between application servers and the back end transaction servers. The token server issues one or more tokens for each instance of sensitive information that it receives. For most applications it becomes a reference library, storing sensitive information within a repository and providing an index back to the real data as needed. The token service is placed in line with existing transaction systems, adding a new substitution step between business applications and back-end data processing. As mentioned in previous posts, this model is excellent for security as it consolidates all the credit card data into a single highly secure server; additionally, it is very simple to deploy as all services reside in a single location. And limiting the number of locations where sensitive data is stored and accessed both improves security and reduces auditing effort, as there are fewer systems to review.

A central token server works well for small businesses with consolidated operations, but does not scale well for larger distributed organizations. Nor does it provide the reliability and uptime demanded by always-on Internet businesses. For example:

  • Latency: The creation of a new token, lookup of existing customers, and data integrity checks are computationally complex. Most vendors have worked hard to alleviate this problem, but some still have latency issues that make them inappropriate for financial/point of sale usage.
  • Failover: If the central token server breaks down, or is unavailable because of a network outage, all processing of sensitive data (such as orders) stops. Back-end processes that require tokens halt.
  • Geography: Remote offices, especially those in remote geographic locations, suffer from network latency, routing issues, and Internet outages. Remote token lookups are slow, and both business applications and back-end processes suffer disproportionately in the event of disaster or prolonged network outages.

To overcome issues in performance, failover, and network communications, several other deployment variations are available from tokenization vendors.

Distributed Token Servers

With distributed token servers, the token databases are copied and shared among multiple sites. Each site has a copy of the tokens and encrypted data. In this model, each site is a peer of the others, with full functionality. This model solves some of the performance issues with network latency for token lookup, as well as failover concerns. Since each token server is a mirror, if any single token server goes down, the others can share its load. Token generation overhead is mitigated, as multiple servers assist in token generation and distribution of requests balances the load. Distributed servers are costly but appropriate for financial transaction processing. While this model offers the best option for uptime and performance, synchronization between servers requires careful consideration.
Multiple copies mean synchronization issues and carefully timed updates of data between locations, along with key management so encrypted credit card numbers can be accessed everywhere. Finally, with multiple databases all serving tokens, the number of repositories that must be secured, maintained, and audited increases substantially.

Partitioned Token Servers

In a partitioned deployment, a single token server is designated as ‘active’, and one or more additional token servers are ‘passive’ backups. In this model, if the active server crashes or is unavailable, a passive server becomes active until the primary connection can be re-established. The partitioned model improves on the central model by replicating the (single, primary) server configuration. These replicas are normally at the same location as the primary, but they may also be distributed to other locations. This differs from the distributed model in that only one server is active at a time, and the servers are not all peers of one another. Conceptually, partitioned servers support a hybrid model where each server is active and used by a particular subset of endpoints and transaction servers, as well as serving as a backup for other token servers. In this case each token server is assigned a primary responsibility, but can take on secondary roles if another token server goes down. While the option exists, we are unaware of any customers using it today.

The partitioned model solves failover issues: if a token server fails, the passive server takes over. Synchronization is easier with this model, as the passive server need only mirror the active server, and bi-directional synchronization is not required. Token servers leverage the mirroring capabilities built into the relational database engines they use as back ends to provide this capability. Next we will move on to use cases.
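
A small illustration of the active/passive behavior described for the partitioned model: the client prefers the designated active server and falls back to a passive replica only when the active one is unreachable. Hostnames and the endpoint path are placeholders, and real products handle promotion and synchronization server-side rather than in client code like this.

```python
import requests

TOKEN_SERVERS = [
    "https://tokens-a.example.internal",   # active server, tried first
    "https://tokens-b.example.internal",   # passive backup
]

def tokenize_with_failover(value):
    last_error = None
    for base_url in TOKEN_SERVERS:
        try:
            resp = requests.post(f"{base_url}/v1/tokenize",
                                 json={"value": value}, timeout=3)
            resp.raise_for_status()
            return resp.json()["token"]
        except requests.RequestException as exc:
            last_error = exc               # fall through to the next server
    raise RuntimeError("all token servers unavailable") from last_error
```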


Tokenization: Series Index

Understanding and Selecting a Tokenization Solution:
  • Introduction
  • Business Justification
  • Token System Basics
  • The Tokens
  • Token Servers, Part 1, Internal Functions
  • Token Servers, Part 2, Architecture and Integration
  • Token Servers, Part 3, Deployment Models
  • Tokenization: Use Cases, Part 1
  • Tokenization: Use Cases, Part 2
  • Tokenization: Use Cases, Part 3
  • Tokenization Topic Roundup
  • Tokenization: Selection Process


Friday Summary: July 23, 2010

A couple weeks ago I was sitting on the edge of the hotel bed in Boulder, Colorado, watching the immaculate television. A US-made 30” CRT television in “standard definition”. That’s cathode ray tube for those who don’t remember, and ‘standard’ is the marketing term for ‘low’. This thing was freaking horrible, yet it was perfect. The color was correct. And while the contrast ratio was not great, it was not terrible either. Then it dawned on me that the problem was not the picture, as this is the quality we used to get from televisions. Viewing an old set, operating exactly the way they always did, I knew the problem was me. High def has so much more information, but the experience of watching the game is the same now as it was then. It hit me just how much our brains were filling in missing information, and that we did not mind this sort of performance 10 years ago because it was the best available. We did not really see the names on the backs of football jerseys during those Sunday games, we just thought we did. Heck, we probably did not often make out the numbers either, but somehow we knew who was who. We knew where our favorite players on the field were, and the red streak on the bottom of the screen pounding a blue colored blob must be number 42. Our brain filled in and sharpened the picture for us.

Rich and I had been discussing experience bias, recency bias, and cognitive dissonance during our trip to Denver. We were talking about our recent survey and how to interpret the numbers without falling into bias traps. It was an interesting discussion of how people detect patterns, but like many of our conversations it devolved into how political and religious convictions can cloud judgement. But not until I was sitting there, watching television in the hotel, did I realize how much our prior experiences and knowledge shape perception, derived value, and interpreted results. Mostly for the good, but unquestionably some bad. Rich also sent me a link to a Michael Shermer video just after that, in which Shermer discusses patterns and self deception. You can watch the video and say “sure, I see patterns, and sometimes what I see is not there”, but I don’t think videos like this demonstrate how pervasive this built-in feature is, and how it applies to every situation we find ourselves in.

The television example of this phenomenon was more shocking than some others that have popped into my head since. I have been investing in and listening to high-end audio products such as headphones for years. But I never think about the illusion of a ‘soundstage’ right in front of me, I just think of it as being there. I know the guitar player is on the right edge of the stage, and the drummer is in the back, slightly to the left. I can clearly hear the singer when she turns her head to look at fellow band members during the song. None of that is really in front of me, but there is something in the bits of the digital facsimile on my hard drive that lets my brain recognize all these things, placing the scene right there in front of me. I guess the hard part is recognizing when and how it alters our perception. On to the Summary:

Webcasts, Podcasts, Outside Writing, and Conferences
  • Rich quoted in “Apple in a bind over its DNS patch”.
  • Adrian’s Dark Reading post on SIEM ain’t DAM.
  • Rich and Martin on Network Security Podcast #206.

Favorite Securosis Posts
  • Rich: Pricing Cyber-Policies. As we used to say at Gartner, all a ‘cybersecurity’ policy buys you is a seat at the arbitration table.
  • Mike Rothman: The Cancer within Evidence Based Research Methodologies. We all need to share data more frequently and effectively. This is why.
  • Adrian Lane: FireStarter: an Encrypted Value Is Not a Token! Bummer.

Other Securosis Posts
  • Tokenization: Token Servers.
  • Incite 7/20/2010: Visiting Day.
  • Tokenization: The Tokens.
  • Comments on Visa’s Tokenization Best Practices.
  • Friday Summary: July 15, 2010.

Favorite Outside Posts
  • Rich: Successful Evidence-Based Risk Management: The Value of a Great CSIRT. I realize I did an entire blog post based on this, but it really is a must read by Alex Hutton. We’re basically a bunch of blind mice building 2-lego-high walls until we start gathering, and sharing, information on which of our security initiatives really work and when.
  • Mike Rothman: Understanding the advanced persistent threat. Bejtlich’s piece on APT in SearchSecurity is a good overview of the term, and how it’s gotten fsked by security marketing.
  • Adrian Lane: Security rule No. 1: Assume you’re hacked.

Project Quant Posts
  • NSO Quant: Monitor Process Revisited.
  • NSO Quant: Monitoring Health Maintenance Subprocesses.
  • NSO Quant: Validate and Escalate Subprocesses.
  • NSO Quant: Analyze Subprocess.
  • NSO Quant: Collect and Store Subprocesses.
  • NSO Quant: Define Policies Subprocess.
  • NSO Quant: Enumerate and Scope Subprocesses.

Research Reports and Presentations
  • White Paper: Endpoint Security Fundamentals.
  • Understanding and Selecting a Database Encryption or Tokenization Solution.
  • Low Hanging Fruit: Quick Wins with Data Loss Prevention.
  • Report: Database Assessment.
  • Database Audit Events.
  • XML Security Overview Presentation.
  • Project Quant Survey Results and Analysis.
  • Project Quant Metrics Model Report.

Top News and Posts
  • Researchers: Authentication crack could affect millions.
  • SCADA System’s Hard-Coded Password Circulated Online for Years.
  • Microsoft Launches ‘Coordinated’ Vulnerability Disclosure Program.
  • GSM Cracking Software Released.
  • How Mass SQL Injection Attacks Became an Epidemic.
  • Harsh Words for Professional Infosec Certification.
  • Google re-ups the disclosure debate. A new policy – 60 days to fix critical bugs or they disclose. I wonder if anyone asked the end users what they want?
  • Adobe Reader enabling protected mode. This is a very major development… if it works. Also curious to see what they do for Macs.
  • Oracle to release 59 critical patches in security update. Is it just me, or do they have more security patches than bug fixes nowadays?
  • Connecticut AG reaches agreement with


Tokenization: Token Servers

In our previous post we covered token creation, a core feature of token servers. Now we’ll discuss the remaining behind-the-scenes features of token servers: securing data, validating users, and returning original content when necessary. Many of these services are completely invisible to end users of token systems, and for day to day use you don’t need to worry about the details. But how the token server works internally has significant effects on performance, scalability, and security. You need to assess these functions during selection to ensure you don’t run into problems down the road. For simplicity we will use credit card numbers as our primary example in this post, but any type of data can be tokenized.

To better understand the functions performed by the token server, let’s recap the two basic service requests. The token server accepts sensitive data (e.g., credit card numbers) from authenticated applications and users, responds by returning a new or existing token, and stores the encrypted value when creating new tokens. This comprises 99% of all token server requests. The token server also returns decrypted information to approved applications when presented with a token and acceptable authorization credentials.

Authentication

Authentication is core to the security of token servers, which need to authenticate connected applications as well as specific users. To rebuff potential attacks, token servers perform bidirectional authentication of all applications prior to servicing requests. The first step in this process is to set up a mutually authenticated SSL/TLS session, and validate that the connection is started with a trusted certificate from an approved application. Any strong authentication should be sufficient, and some implementations may layer additional requirements on top. The second phase of authentication is to validate the user who issues a request. In some cases this may be a system/application administrator using specific administrative privileges, or it may be one of many service accounts assigned privileges to request tokens or to request a given unencrypted credit card number. The token server provides separation of duties through these user roles – serving requests only from approved users, through allowed applications, from authorized locations. The token server may further restrict transactions – perhaps only allowing a limited subset of database queries.

Data Encryption

Although technically the sensitive data might not be encrypted by the token server before it lands in the token database, in practice every implementation we are aware of encrypts the content. That means that prior to being written to disk and stored in the database, the data must be encrypted with an industry-accepted ‘strong’ encryption cipher. After the token is generated, the token server encrypts the credit card with a specific encryption key used only by that server. The data is then stored in the database, and thus written to disk along with the token, for safekeeping. Every current tokenization server is built on a relational database. These servers logically group tokens, credit cards, and related information in a database row – storing these related items together. At this point, one of two encryption options is applied: either field level or transparent data encryption. In field level encryption, just the row (or specific fields within it) is encrypted.
This allows a token server to store data from different applications (e.g., credit cards from a specific merchant) in the same database, using different encryption keys. Some token systems instead leverage transparent database encryption (TDE), which encrypts the entire database under a single key. In these cases the database performs the encryption on all data prior to writing it to disk. Both forms of encryption protect data from indirect exposure such as someone examining disks or backup media, but field level encryption enables greater granularity, at a potential performance cost. The token server also bundles encryption, hashing, and random number generation features – both to create tokens and to encrypt network sessions and stored data. Finally, some implementations use asymmetric encryption to protect the data as it is collected within the application (or on a point of sale device) and sent to the server. The data is encrypted with the server’s public key. The connection session will still typically be encrypted with SSL/TLS as well, but to support authentication rather than for any claimed security increase from double encryption. The token server becomes the back end point of decryption, using the private key to regenerate the plaintext prior to generating the proxy token.

Key Management

Any time you have encryption, you need key management. Key services may be provided directly by the vendor of the token services in a separate application, or by hardware security modules (HSMs), if supported. Either way, keys are kept separate from the encrypted data and algorithms, providing security in case the token server is compromised, as well as helping enforce separation of duties between system administrators. Each token server will have one or more unique keys – not shared by other token servers – to encrypt credit card numbers and other sensitive data. Symmetric keys are used, meaning the same key is used for both encryption and decryption. Communication between the token and key servers is mutually authenticated and encrypted. Tokenization systems also need to manage any asymmetric keys for connected applications and devices. As with any encryption, the key management server/device/functions must support secure key storage, rotation, and backup/restore.

Token Storage

Token storage is one of the more complicated aspects of token servers. How tokens are used to reference sensitive data or previous transactions is a major performance concern. Some applications require additional security precautions around the generation and storage of tokens, so tokens are not stored in a directly referenceable format. Use cases such as financial transactions with either single-use or multi-use tokens can require convoluted storage strategies to balance security of the data against referential performance. Let’s dig into some of these issues: Multi-token environments: Some systems provide a single token to reference every instance of a particular piece of sensitive data. So a credit card used at a specific merchant site will be represented by a
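
To make the storage and field-level encryption discussion concrete, here is a minimal sketch using the third-party `cryptography` package. The in-memory `token_vault` dict stands in for the token database, and the locally generated key stands in for one that would really come from an HSM or external key manager.

```python
from cryptography.fernet import Fernet

SERVER_KEY = Fernet.generate_key()       # stand-in for an HSM- or KMS-managed key
cipher = Fernet(SERVER_KEY)
token_vault = {}                         # stand-in for the token database table

def store_token_row(token, pan, merchant_id):
    token_vault[token] = {
        "pan_ciphertext": cipher.encrypt(pan.encode()),   # field-level encryption of the PAN
        "merchant_id": merchant_id,                       # related data stored in the same row
    }

def recover_pan(token):
    return cipher.decrypt(token_vault[token]["pan_ciphertext"]).decode()

store_token_row("9000123412341111", "4111111111111111", "merchant-42")
print(recover_pan("9000123412341111"))
```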


Tokenization: The Tokens

In this post we’ll dig into the technical details of tokens: what they are, how they are created, and some of the options for security, formatting, and performance. For those of you who read our stuff and tend to skim the more technical posts, I recommend you stop and pay a bit more attention to this one. Token generation and structure affect the security of the data, the ability to use the tokens as surrogates in other applications, and the overall performance of the system. In order to differentiate the various solutions, it’s important to understand the basics of token creation.

Let’s recap the process quickly. Each time sensitive data is sent to the token server, three basic steps are performed. First, a token is created. Second, the token and the original data are stored together in the token database. Third, the token is returned to the calling application. The goal is to protect sensitive data without losing functionality within applications, so we cannot simply create any random blob of data. The format of the token needs to match the format of the original data, so it can be used exactly as if it were the original (sensitive) data. For example, a token for a Social Security number needs to be at least the same size (if not the same data type) as a Social Security number. Supporting applications and databases can accept the substituted value as long as it matches the constraints of the original value. Let’s take a closer look at each of the steps.

Token Creation

There are three common methods for creating tokens:

Random Number Generation: This method substitutes data with a random number or alphanumeric value, and is our recommended method. Completely random tokens offer the greatest security, as the content cannot be reverse engineered. Some vendors use sequence generators to create tokens, grabbing the next value in the series to use for the token – this is not nearly as secure as a fully randomized number, but is very fast and secure enough for most (non-PCI) use cases. A major benefit of random numbers is that they are easy to adapt to any format constraints (discussed in greater detail below), and the random numbers can be generated in advance to improve performance.

Encryption: This method generates a ‘token’ by encrypting the data. Sensitive information is padded with a random salt to prevent reverse engineering, and then encrypted with the token server’s private key. The advantage is that the ‘token’ is reasonably secure from reverse engineering, but the original value can be retrieved as needed. The downsides, however, are significant – performance is very poor, Format Preserving Encryption algorithms are required, and data can be exposed when keys are compromised or guessed. Further, the PCI Council has not officially accepted format preserving cryptographic algorithms, and is awaiting NIST certification. Regardless, many large and geographically dispersed organizations that require access to original data favor the utility of encrypted ‘tokens’, even though this isn’t really tokenization.

One-way Hash Function: Hashing functions create tokens by running the original value through a non-reversible mathematical operation. This offers reasonable performance, and tokens can be formatted to match any data type. Like encryption, hashes must be created with a cryptographic salt (some random bits of data) to thwart dictionary attacks. Unlike encryption, tokens created through hashing are not reversible.
Security is not as strong as with fully random tokens, but security, performance, and formatting flexibility are all improved over encryption. Beware that some open source and commercial token servers use poor token generation methods of dubious value: some use reversible masking, others unsalted encryption algorithms, and these can be easily compromised and defeated.

Token Properties

We mentioned the importance of token formats earlier, and token solutions need to be flexible enough to handle multiple formats for the sensitive data they accept – such as personally identifiable information, Social Security numbers, and credit card numbers. In some cases, additional format constraints must be honored. As an example, a token representing a Social Security number in a customer service application may need to retain the real last digits. This enables customer service representatives to verify user identities, without access to the rest of the SSN. When tokenizing credit cards, tokens are the same size as the original credit card number – most implementations even ensure that tokens pass the LUHN check. As the token still resembles a card number, systems that use card numbers need not be altered to accommodate tokens. But unlike real credit card or Social Security numbers, tokens cannot be used as financial instruments, and have no value other than as a reference to the original transaction or real account. The relationship between a token and a card number is unique for any given payment system, so even if someone compromises the entire token database sufficiently that they can commit transactions in that system (a rare but real possibility), the numbers are worthless outside the single environment they were created for. And most important, real tokens cannot be decrypted or otherwise restored back into the original credit card number. Each data type has different use cases, and tokenization vendors offer various options to accommodate them.

Token Datastore

Tokens, along with the data they represent, are stored within a heavily secured database with extremely limited access. The data is typically encrypted (per PCI recommendations), ensuring sensitive data is not lost in the event of a database compromise or stolen media. The token (database) server is the only point of contact with any transaction system, payment system, or collection point, which reduces risk and compliance scope. Access to the database is highly restricted, with administrative personnel denied read access to the data, and even authorized access to the original data limited to carefully controlled circumstances. As tokens are used to represent the same data for multiple events, possibly across multiple systems, most token servers can issue different tokens for the same user data. A credit card number, for example, may get a
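
A minimal sketch of the random, format-preserving approach recommended above: the token keeps the length and trailing four digits of the original value but has no mathematical relationship to it. Collision checks against existing tokens and real PANs, and any LUHN handling, are omitted for brevity.

```python
import secrets

def random_token(value: str, keep_last: int = 4) -> str:
    """Random, format-preserving token: same length and trailing digits,
    but no mathematical relationship to the original value."""
    body = "".join(secrets.choice("0123456789") for _ in range(len(value) - keep_last))
    return body + value[-keep_last:]

print(random_token("4111111111111111"))            # 16 digits, ends in 1111
print(random_token("123456789", keep_last=4))      # SSN-sized example, ends in 6789
```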


Comments on Visa’s Tokenization Best Practices

If you are interested in tokenization, check out Visa’s Tokenization Best Practices guide, released this week. The document is a very short four pages. It highlights the basics and is helpful in understanding minimum standards for deployment. That said, I think some simple changes would make the recommendations much better and deployments more secure. From a security standpoint my issues are twofold: I think they fell far short with their recommendations on token generation, and salting should be implemented differently than they suggest. I also believe that, given how prescriptive the advice is in several sections, Visa should clarify what they mean by encrypting the “Card Data Vault”, but that’s a subject for another day.

First things first: let’s dig into the token generation issues. The principle behind tokenization is to substitute a token for a real (sensitive) value, so you cannot reverse engineer the token into PAN data. But when choosing a token creation strategy, you must decide whether you want to be able to retrieve the value or not. If you will want to convert the token back to the original value, use encryption. But if you don’t need to do this, there are better ways to secure PAN data than encryption or hashing! My problem with the Visa recommendations is that their first suggestion should have been simply to use a random number. If the output is not generated by a mathematical function applied to the input, it cannot be reversed to regenerate the original PAN data. The only way to discover PAN data from a real token is a (reverse) lookup in the token server database. Random tokens are simple to generate, and the size & data type constraints are trivial to meet. This should be the default, as most firms should neither need nor want PAN data retrievable from the token.

As for encryption, rather than suggest a “strong encryption cipher”, why not take this a step further and recommend a one time pad? This is a perfect application for that kind of substitution cipher, and one time pads are as secure a method as anything else. I’m guessing Visa did not suggest this because a handful of very large payment processors, with distributed operations, actually want to retrieve the PAN data in multiple locations. That means they need encryption, and they need to distribute the keys.

As for hashing, I think the method they prescribe is wrong. Remember that a hash is deterministic. You put in A, the hash digests the PAN data, and it produces B. Every time. Without fail. In order to avoid dictionary attacks you salt the input with a number. But the recommendation is “… hashing of the cardholder data using a fixed but unique salt value per merchant”! If you use a static merchant ID as the salt, you are really not adding much in the way of computational complexity (or trying very hard to stop attacks). Odds are the value will be guessed or gathered at some point, as will the hashing algorithm – which subjects you to precomputed attacks against all the tokens. It seems to me that for PAN data you can pick any salt you want, so why not make it different for each and every token? The token server can store the random salt with the token, and attacks become much tougher.

Finally, Visa did not even discuss format preservation. I am unaware of any tokenization deployment that does not retain the format of the original credit card number/PAN. In many cases they preserve data types as well.
Punting on this subject is not really appropriate, as format preservation is what allows token systems to slide into existing operations without entirely reworking the applications and databases. Visa should have stepped up to the plate and fully endorsed format-preserving strong cryptography. This was not addressed in the Field Level Encryption Best Practices in 2009, and remains conspicuous by its absence.

The odds are that if you are saddled with PCI-DSS responsibilities, you will not write your own ‘home-grown’ token server. So keep in mind that these recommendations are open enough that vendors can easily provide botched implementations and still meet Visa’s guidelines. If you are only interested in getting systems out of scope, then any of these solutions is fine, because QSAs will accept them as meeting the guidelines. But if you are going to the trouble of implementing a token server, it’s no more work to select one that offers strong security.
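
A short illustration of the salting argument above: a fixed per-merchant salt produces the same digest for the same PAN every time, which is exactly what makes precomputed attacks workable, while a random per-token salt stored alongside the token does not. Function names and the choice of SHA-256 are illustrative, not anything the Visa guide prescribes.

```python
import hashlib
import secrets

def hash_with_merchant_salt(pan, merchant_id):
    # fixed salt: identical output on every call, so digests can be precomputed
    return hashlib.sha256((merchant_id + pan).encode()).hexdigest()

def hash_with_random_salt(pan):
    # random salt, stored with the token: same PAN yields a different digest each time
    salt = secrets.token_hex(16)
    digest = hashlib.sha256((salt + pan).encode()).hexdigest()
    return salt, digest

pan = "4111111111111111"
print(hash_with_merchant_salt(pan, "MERCHANT42"))   # identical on every run
print(hash_with_random_salt(pan))                   # different every run
```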


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
    Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factors into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
    Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
    For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
    Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.