The goal of tokenization is to reduce the scope of the PCI security assessment, which means a reduction in the time, cost, and complexity of compliance auditing. We want to remove, as much as possible, the need to inspect every system for security settings, encryption deployments, network security, and application security. For smaller merchants tokenization can make self-assessment much more manageable. For large merchants paying 3rd-party auditors to verify compliance, the cost savings are huge.
PCI DSS still applies to every system in the logical and physical network associated with payment transaction processing, de-tokenization, and storage of credit cards – what the payment industry calls the “primary account number”, or PAN. For many merchants this includes a major portion – if not an outright majority – of information systems under management. The PCI documentation refers to these systems as the “Cardholder Data Environment”, or CDE. Part of the goal is to shrink the number of systems encompassed by the CDE. The other goal is to reduce the number of relevant checks which must be made. Systems that store tokenized data, even if not fully isolated logically and/or physically from the token server or payment gateway, need fewer checks to ensure compliance with PCI DSS.
The ground rules
So how do we know when a server is in scope? Let’s lay out the ground rules, first for systems that always require a full security analysis:
- Token server: The token server is always in scope if it resides on premise. If the token server is hosted by a third party, the calling systems and the API are subject to inspection.
- Credit card/PAN data storage: Anywhere PAN data is stored, encrypted or not, is in scope.
- Tokenization applications: Any application platform that requests tokenized values, in exchange for the credit card number, is in scope.
- De-tokenization applications: Any application platform that can make de-tokenization requests is in scope.
In a nutshell, anything that touches credit cards or can request de-tokenized values is in scope. It is assumed that administration of the token server is limited to a single physical location, and not available through remote network services. Also note that PAN data storage is commonly part of the basic token server functionality, but they are separated in some cases. If PAN data storage and token generation server/services are separate but in-house (i.e., not provided as a service) then both are in scope. Always.
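To make the ground rules concrete, here is a minimal decision sketch in Python. It is purely illustrative – the dictionary keys and the function name are invented for this example, not part of any PCI tooling – but it captures the “always in scope” tests above.

```python
def always_in_scope(system):
    """Return True if the system needs a full PCI assessment under the
    ground rules above, regardless of tokens or segmentation."""
    return any([
        system.get("is_token_server") and system.get("on_premise"),  # on-premise token server
        system.get("stores_pan"),                                    # PAN storage, encrypted or not
        system.get("requests_tokens_for_pan"),                       # swaps card numbers for tokens
        system.get("can_detokenize"),                                # can request PAN back
    ])

# Example: a customer-service database that only holds tokens
print(always_in_scope({"stores_pan": False, "can_detokenize": False}))  # False
```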
Determining system scope
For the remaining systems, how can you tell whether tokenization will reduce scope, and by how much? Here is how to evaluate each one:
The first check to make for any system is for the capability to make requests to the token server. The focus is on de-tokenization, because it is assumed that every other system with access to the token server or its API is passing credit card numbers and fully in scope. If this capability exists – through a user interface, programmatic interface, or any other means – then PAN is accessible and the system is in scope. It is critical to minimize the number of people and programs that can access the token server or service, both for security and to reduce scope.
The second decision concerns use of random tokens. Suitable token generation methods include random number generators, sequence generators, one-time pads, and unique code books. Any of these methods can create tokens that cannot be reversed back to credit cards without access to the token server. I am leaving hash-based tokens off this list because they are relatively insecure (reversible): providers routinely fail to salt their tokens, or salt with ridiculously guessable values (e.g., the merchant ID).
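The difference is easy to see in code. Here is a minimal sketch (standard-library Python, purely illustrative) contrasting a random surrogate value with the weakly salted hash approach described above:

```python
import hashlib
import secrets

def random_token():
    """A 16-digit surrogate value from a cryptographic RNG. Without the
    token server's PAN-to-token mapping it cannot be reversed."""
    return "".join(secrets.choice("0123456789") for _ in range(16))

def weak_hash_token(pan, merchant_id):
    """Anti-pattern: a hash salted with a guessable value. An attacker who
    knows the BIN can brute-force the remaining digits and compare hashes."""
    return hashlib.sha256((merchant_id + pan).encode()).hexdigest()
```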
Vendors and payment security stakeholders are busy debating encrypted card data versus tokenization, so it’s worth comparing them again. Format Preserving Encryption (FPE) was designed to secure payment data without breaking applications and databases. Application platforms were programmed to accept credit card numbers, not huge binary strings, so FPE was adopted to improve security with minimal disruption. FPE is entrenched at many large merchants, who don’t want the additional expense of moving to tokenization, and so are pushing for acceptance of FPE as a form of tokenization. But the supporting encryption and key management systems are accessible – meaning PAN data is available to authorized users – so FPE cannot remove systems from the audit scope. Proponents of FPE claim they can segregate the encryption engine and key management, and that FPE is therefore just as secure as random numbers. But the premise is a fallacy. FPE advocates like to talk about logical separation between sensitive encryption/decryption systems and other systems which only process FPE-encoded data, but this is not sufficient. The PCI Council’s guidance does not exempt systems which contain PAN (even encrypted with FPE) from audit scope, and it is too easy for an attacker or employee to cross that logical separation – especially in virtual environments. This makes FPE riskier than tokenization.
Finally, strive to place systems containing tokenized data outside the “Cardholder Data Environment” using network segmentation. If they are in the CDE, they need to be in scope for PCI DSS – if for no other reason than because they give an attacker a point of access to other card storage, transaction processing, and token servers. Use firewalls, network configuration, and routing to separate CDE systems from non-CDE systems which don’t directly communicate with them. Systems that are physically and logically isolated from the CDE, provided they meet the ground rules and use random tokens, are completely removed from audit scope.
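Pulling the three checks together, a hedged sketch of the resulting decision tree might look like this – the attribute names are hypothetical, and your QSA has the final word on scope:

```python
def audit_scope(system):
    """Rough decision sketch for the three checks above; illustrative only."""
    if system.get("can_detokenize") or system.get("sends_pan_to_token_server"):
        return "full PCI DSS scope"        # check 1: PAN is reachable via the token service
    if not system.get("uses_random_tokens"):
        return "full PCI DSS scope"        # check 2: reversible (e.g. hash-based) tokens
    if not system.get("segmented_from_cde"):
        return "in scope, reduced checks"  # check 3: still reachable from the CDE
    return "out of audit scope"            # random tokens + full isolation
```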
Under these conditions tokenization is a big win, but there are additional advantages…
Determining control scope
As above, a fully isolated system with random tokens means you can remove the system from scope. Consider the platforms which have historically stored credit card data but do not need it: customer service databases, shipping & receiving, order entry, etc. This is where you can take advantage of tokenization. For all systems which can be removed from audit scope, you can save effort on: firewall configuration, identity management (and review), encryption, key management, patching, configuration assessment, etc. Security services such as anti-virus, monitoring, and auditing are no longer mandatory for PCI compliance. This saves time, reduces security licensing costs, and cuts audit expenses. You still need basic platform security to ensure tokens are not swapped, replaced, or altered – so monitoring, hash verification, and/or audit trails are recommended. But in this model the token is purely a reference to a transaction rather than PAN, so it’s much less sensitive and the danger of token theft is reduced or removed entirely.
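As a concrete example of that baseline integrity control, here is a minimal sketch – the key handling and record layout are assumptions, not a prescription – showing how a de-scoped system might detect tokens that have been swapped, replaced, or altered:

```python
import hashlib
import hmac

# Hypothetical token-integrity check: an HMAC over each stored token record.
# The key would come from a secrets store in practice; the record layout is
# illustrative only.
INTEGRITY_KEY = b"load-this-from-your-secrets-store"

def seal(record_id: str, token: str) -> str:
    """Compute an integrity tag to store alongside the token record."""
    msg = f"{record_id}:{token}".encode()
    return hmac.new(INTEGRITY_KEY, msg, hashlib.sha256).hexdigest()

def verify(record_id: str, token: str, stored_mac: str) -> bool:
    """Detect altered or swapped tokens by recomputing and comparing the tag."""
    return hmac.compare_digest(seal(record_id, token), stored_mac)
```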
This raises the question: can we further reduce scope on systems that use tokenization but are not isolated from credit card processing? The short answer is ‘yes’. Systems that are not fully isolated but only use random number tokens don’t require full checks on encryption or key management (you did choose real tokens, right?). Further, you don’t need to monitor access to tokens or enforce separation of duties as rigorously. Yes, the PCI guidance on tokenization states that the full scope of PCI DSS applies, but these controls become ridiculous when cardholder data is no longer present – as is the case with random number tokens, but not with FPE. There you have it: the basic reason the PCI Council waffled on their tokenization scoping guidance – to avoid political infighting and limit liability. But this additional scope reduction makes a huge difference to merchants in terms of cost reduction – and tokenization is for merchants, not the PCI Council. Merchants, as a rule, don’t really care about other organizations’ risk – only about maximizing their own savings.
Remember that other controls remain in place. You still need to ensure that access control, AV, firewalls, and the like remain on your checklist. You will still need to verify network segmentation and that de-tokenization interfaces are not accessible except under tightly controlled circumstances. And you still need to monitor system and network usage for anomalous activity. Everything else remains as it was – especially if it falls under the ground rules mentioned above. To be perfectly clear, this means systems that perform de-tokenization requests (whether they store PAN or not) must completely satisfy the PCI DSS standard. When reading the PCI’s tokenization supplement, keep in mind that tokenization does not modify the standard – instead it has the potential to nullify some required checks and/or remove systems from audit scope.
One final comment: Onsite file and database storage facilities provide a half dozen ways to access raw credit card data – each of which requires separate investigation – making the audit much more complex. Routing de-tokenization requests through a single API call to a single service location both makes it easier to secure the endpoints making those requests and helps validate the security of the (single) de-tokenization protocol. Notice that this decision tree does not distinguish between in-house tokenization platforms and 3rd-party tokenization services. If you can use a 3rd-party tokenization service the token server is automatically off-premise, further reducing audit complexity. This is the best way to ensure as much as possible is out of scope, and while the PCI Council says you are ultimately responsible, you can still place the burden of proof on the service provider to pass their PCI audits.
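To illustrate what “a single API call to a single service location” looks like in practice, here is a hedged sketch of a de-tokenization client – the endpoint URL, certificate paths, and response fields are hypothetical – where only the few audited systems holding the client certificate can ever retrieve PAN:

```python
import requests  # third-party: pip install requests

# Hypothetical de-tokenization service; everything below is illustrative.
DETOKENIZE_URL = "https://tokens.example.internal/v1/detokenize"

def detokenize(token: str) -> str:
    """Exchange a token for the original PAN via one authenticated call."""
    resp = requests.post(
        DETOKENIZE_URL,
        json={"token": token},
        cert=("/etc/pki/detok-client.crt", "/etc/pki/detok-client.key"),  # mutual TLS client identity
        verify="/etc/pki/token-service-ca.pem",                           # pin the service's CA
        timeout=5,
    )
    resp.raise_for_status()
    return resp.json()["pan"]
```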
Between scope and control reduction, a 50% reduction in compliance costs can really happen.
Next up in this series: guidance for auditors.
6 Replies to “Tokenization Guidance: Merchant Advice”
Adrian,
Thanks for your detailed response. Just to clarify on the risk question with regards to FPE: I absolutely agree that there are several “FPE”s on the market but few with the formal validation that is necessary for uses such as PCI demands. The FPE I refer to – AES FFX mode – has proofs: formal, validated, and independently verified proofs carrying into the NIST AES modes standard track. This is not a supposition of proof, but an axiomatic proof of the underlying structure – interested readers can look at Rogaway’s publications (UC Davis) on the topic – he’s a world renowned expert, well published and cited. We also recommend that interested parties looking at FFX FPE contact NIST directly, given the advanced state of standards development within AES modes.
I agree key management is king – but the same management burden applies to a PRNG: integrity and assurance that the system sending the token back hasn’t been modified (through a configuration or design flaw) to return tokens that can leak original data, assurance that the integrity of the mapping tables or other mapping method is intact on a transaction-by-transaction basis, and so on – operational and implementation integrity challenges in their own right, and ultimately responsible for security and risk as a whole. Another way to look at this is to establish the number of security functions in the system, sum the risk within each function, and accumulate net risk over all components as a combination – quantitative risk analysis (Soo Hoo et al). The more components in the chain which introduce risk, the higher the likelihood of a single event having an amplified effect. Tokenization is typically a multi-component environment.
Keep in mind too – if tokens were truly random, a PAN should tokenize to a different random value each time, and two different PANs could tokenize to the same random value, because the probability of any value appearing again in the sequence is the same when each draw is an independent event. But that’s not true in practice in tokenization – typically a PAN will tokenize to the same value for a given merchant, for referential integrity reasons. So this is in fact a “shuffle” of the data space, not a random token each time, which means there is a diminishing space of values from which to select the next “random” value. With code book approaches, the “shuffle” – the next value – is predefined based on the input, which itself is not actually random. So it is not in fact a truly random process overall, since forwards and backwards predictability comes from the codebook – which must then be protected just like a key. Given there’s no standard for codebooks, and even less so for a table/PRNG based approach, how do you really know whether the risk is higher or lower than, say, a proven method? That metric then comes down to the assessor, and in some cases guesswork, if the methods are not published, proven, and scrutinized openly. It’s not quite as simple as claiming it’s random and therefore we’re in good shape. Keep in mind, a strong cryptographic algorithm is by design indistinguishable from a random sequence: that is a requirement and a fundamental property – so it in fact converges to the “ideal random” method itself.
Portability and lock in is not really an issue as described. There’s no reason the token:PAN pairs can’t be moved into a database after being cryptographically generated if such a migration was desirable.
Scale however is a challenge if using a database-driven approach – especially in the service provider space. An acquirer with 1 million discrete merchants will need to manage 1 million database tables or instances for separation of token:PAN pairs across merchants. This will be necessary for the portability requirement, and to avoid cross-merchant token-driven fraud – the “high value” token scenario outlined in the PCI SSC’s document.
Cryptographic methods & combinations with random tokens – codebook or algorithmic methods reduce this problem to managing either a large codebook or a small amount of cryptographic material, versus a large table or sets of databases. It’s certainly simpler to manage a small, highly sensitive data set than a larger multidimensional database, or databases with a lot of real-time state. The CIA’s Ira Hunt often mentions that “in terms of security, humans are great at managing a small number of things well, and usually manage a lot of things less so”. Certainly databases are mature – no doubts there whatsoever – but costs of operation map directly to scale and complexity, and continuous backup may be a requirement in many cases – another cost and impact on operation and performance. In contrast, FFX-based FPE, being a mode of AES, can operate in native AES hardware very efficiently today, and with raw AES now running at gigabit-per-second speeds and higher, I would argue that such cryptographic methods will always be an order of magnitude faster or more than looking up records in a database/cache (a hash computation, search, lookup, and decrypt versus a single decrypt), and can scale by CPU rather than storage – especially if the transform from PAN to token is atomic and runs within a highly secure environment: on a secure CPU or inside an HSM, devices designed to provide the minimum possible surface for data exposure. It’s hard to make a database run in an HSM – keys yes, but not the whole operation for end-to-end integrity assurance.
In the end, each method has its advantages, disadvantages, and use cases – which is why we offer both, as I mentioned – and the disassociation of the token from the live data presents another useful security method that is perfectly valid, and a great tool for the merchant. However, I think some of the assumptions outlined above don’t play out in practice – especially the concern over FFX mode AES FPE strength, given its proofs.
Best Regards,
Mark
@Brian – sorry for the long post, but this is an interesting topic, deserves scrutiny and attention and I thank Adrian for keeping the fire alive, the insight and the opportunity to respond 🙂
@Mark –
Thanks for the thoughtful reply.
As you noticed, my comments focus on storage of tokens (or PAN), and purposely under-play the security of moving PAN data from point to point during the initial authorization phase. That’s intentional, as tokenization – for the merchant – is largely a back-end substitution, and the vast majority of merchants opt for off-site 3rd-party solutions. The deployment models will change in the coming years, given some of the mobile payment and tokenization strategies I am hearing about, but for now I am really covering tokenization of PAN data to solve storage security issues.
That said, I should reply to your P2PE comments as they are relevant, and some of the earlier posts could be construed as covering POS. I agree P2PE – where encryption happens right at the POS and the merchant never actually sees the PAN – should be out of scope. The merchant network carries an encrypted stream without access to the data – and there are vendors who sell systems that will do just that. But be honest – we know that’s not how it’s always done; it’s common for the POS to pass PAN to a Windows-based register that in turn sends it to a back office server that encrypts it prior to sending it to the payment processor. Once again, the concepts behind FPE & P2PE implementations are great; in practice we need to verify that the implementation meets the ideal. With canned hardware solutions it’s easy; for some legacy systems where FPE’s glued in, auditing is complex. My statements about the authorization phase should be modified to clarify this point, so I’ll modify the research to reflect your comment. Which also reminds me that I should make a comment on edge-tokenization like Akamai/Cybersource offer, as it delivers on what you are pointing out.
On the subject of accepting FPE tokens, given the token server is not on premise, and given there is _no_ direct access to key management – only API layer de-tokenization requests – I agree that parallels the P2PE guidance. But I would _never_ advise a merchant to accept FPE in this case. Why? Risk. PCI council stated the merchant was fully responsible for risks associated with tokenization, and should subsequent flaws in the encryption or key management scheme be discovered, you’re hosed. If the merchant has random numbers they don’t have the risk. And here’s a subtle one that few think about: Vendor lock-in. With random numbers I can transition to a new system by changing the API calls! With FPE I am tied to the _system_; it’s a full migration including re-issue & replace tokens, along with the API call switch-over. Similarly, when advising payment processors, I recommend random number solutions. With random numbers a) computational overhead is lower, b) format and data type preservation is easier, c) geographic distribution and scalability is at least as good, and has the potential to be vastly superior.
@Brian –
I totally agree – if I am looking at this from the PCI Council’s standpoint. Merchants want to get compliant as quickly and inexpensively as possible. I’ve yet to talk to a merchant who said security was their primary goal. Risk reduction – which is a slightly different way to look at compliance and security – is usually in the top three motivating factors, followed by reduced time, reduced management costs, reduced audit costs, and reduced impact on scalability. I’m not advocating less security, nor am I endorsing the order of priorities, but I am trying to put tokenization into the merchant’s perspective.
The overall purpose of tokenization is not scope reduction. That’s a by-product. The purpose is to remove sensitive data, which is the #1 method for protecting cardholder data. Aside from that, this is an excellent article. @MarkBower, can you limit your blog responses to the key points, and consider writing your own blog if you’re going to write this much?
This topic came up at the PCI EMEA meeting during the discussion session on P2PE and Tokenization – it was openly discussed, and the answers pointed directly back to good key management practices such as segregation, the use of HSMs, and so forth. It’s not always vendors talking about this, as you mention. Indeed, FPE came up too – brought up by the PCI SSC panel consisting of the Advisory/Management Board – clearly a topic of interest given its wide adoption already – and the panel made it clear that it was acceptable if there was the right level of cryptographic backing – proofs, standards tracks, etc.
So, to clarify – encrypted data can be taken out of scope of PCI. In the use case of P2PE (Point to Point Encryption), it’s absolutely clear that encrypted data can be out of scope when the implementation adheres to the P2PE requirements or a QSA evaluates the environment as such – this is to get the data from all those endpoints – secure devices capturing card/chip data and encrypting there and then for transmission up to the acquirer. There is very specific guidance on this in the current P2PE document of what can be de-scoped. Strong cryptography in the absence of keys is irreversible in any practical way. That is a fact and clearly stated.
Of course, P2PE should not be confused with Tokenization – though it often is – they are entirely complementary. P2PE is about sending pre-authorization cardholder data or live payment data to an acquirer. Tokenization is useful after authorization – batch files, transaction logs for settlement, and post-sales processes like chargebacks. Keep in mind that with an on-premise tokenization solution there will still be a need to de-tokenize prior to any PAN-driven process – and it’s absolutely critical, for the merchant’s sake, to get their money from transactions they’ve now tokenized – that the token processes have the highest integrity and reliability. If those tokens go missing, are corrupted, or are purged accidentally, it might have a serious financial impact if the batch settlement files can’t execute.
FPE really comes into its own for P2PE in the merchant’s case – when the path from the point of capture to the destination is complicated by legacy systems expecting PAN, SAD, or Track data, and where the system change impact needs to be minimal to none. Then, once the transaction data has been securely transferred to the acquirer – without the merchant being able to decrypt it in any shape or form – and a token returned in response, you have the ideal scope reduction scenario: the merchant isn’t managing a token database or vault/code book, responsibility for key management rests with the acquirer, and the exposure of PAN data is minimized.
With regards to your decision tree on scope for the tokenization use case – I think you’ve left out an important use case: when the token – or transaction IDs, for that matter, used as surrogates for the PAN in post-authorization uses – is cryptographically generated by a third party where the keys are managed entirely independently – e.g. in an HSM at an acquirer – then this use case falls under the PCI FAQ on encrypted data and scope – FAQ 10359 – which parallels the P2PE guidance.
With respect to the statement “The supporting encryption and key management systems are accessible – meaning PAN data is available to authorized users, so FPE cannot remove systems from the audit scope”: it’s not quite clear what you mean. Contrast this statement with a very common scenario comparable to an acquirer issuing tokens after a P2PE transaction – PIN debit transactions and transaction response codes. Here we have encrypted data going up and derived data coming back. Do merchants have access to PIN decryption keys just because they accept PIN debit at the POS? No. Do merchants have direct access to the systems which create the derived transaction response codes, post-authorization, for reversals from the acquirer? No. Perhaps a badly implemented on-premise solution that’s not even close to PCI compliant could create an exposure like this, but a tokenization approach of the same quality would be equally at risk of “accessibility”.
Of course, scope reduction is a major driver for PCI technology investment – but eliminating scope is not always possible. This is one of the reasons we provide a choice – randomly generated tokens disassociated from the PAN in every way, or Format Preserving Encryption – from one platform. The approach can be suited to risk, cost, scope, and implementation – on premise, at the acquirer, and so on.
I do agree with Stephen’s post – taking an “out of scope, out of mind” posture is financially attractive, but there’s more data in a merchant’s IT systems than just cardholder data – and most larger merchants already know this and do care – for their customers’ sake and brand reputation – and are taking steps which go beyond where PCI leaves off to protect data, and to ensure their systems are not scuppered by hackers walking in the front door of a de-scoped environment. De-scoped does not mean secure.
Regards,
Mark Bower
Disclaimer: I work for a firm that provides FPE, P2PE and Random token technology.
While the cost savings are inarguable, it worries me, as a customer, to hear that businesses can now get rid of “firewall configuration, identity management …, patching, configuration … anti-virus, monitoring, and auditing”, which should be due diligence for any business. Even if one does not have to pay QSA rates to have them reviewed, any business should be doing that sort of thing as part of a good risk management program, for compliance with various other regulations such as state PII laws, and to avoid creating externalities which can come back to haunt you later through legal action. And ya know, ethics and morals and stuff like that.