Token Vaults and Token Storage Tradeoffs
Use of tokenization continues to expand as customers look to simplify PCI-DSS compliance. With this increased adoption comes a lot of positioning and puffery from vendors as they attempt to differentiate their products in an increasingly competitive market. Unfortunately this competitive positioning often causes confusion among buyers, which is why I have spent the last couple of mornings answering questions on FPE vs. tokenization, and on the difference between a token vault and a database. Lately most questions center on differentiating tokenization data vaults, with the expected confusion caused by vendor hyperbole. In this post I will define a token vault and shed some light on the pros and cons. My goal is to help you, as a consumer, determine whether vaults are something to consider when selecting a tokenization solution.

A token vault is where you store issued tokens and the credit card numbers they represent. The vault typically contains other information, but for this discussion just think of the token vault as a long list of CC#/token pairs. A new type of solution called ‘stateless’ or ‘vault-less’ tokenization is now available. These systems use derived tokens, which can be recalculated from some secret value and so do not need to be stored in a database.

Recent press hype claims that token vaults are bad and you should stay away from them. The primary argument is “you don’t want a relational database as a token vault”, or more specifically, “an Oracle database makes a slow and expensive token vault, and customers don’t want that”.

Not so fast! The issue is not clear-cut. Token vaults are not simply good or bad; there are tradeoffs. They are fine for many types of customers, but not suitable for others. There are three issues at the heart of this debate: cost, scale, and performance. Let’s take a closer look at each of them.
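
Before getting into those, here is a minimal Python sketch of the difference between a vaulted token and a derived token. It is an illustration only, not any vendor's design: the names `TokenVault` and `derive_token` are hypothetical, the vault is an in-memory dictionary standing in for a database, and the derived token uses a keyed hash (HMAC-SHA256), which is one-way. Real vault-less products may use different, possibly reversible, derivation schemes.

```python
import hashlib
import hmac
import secrets


class TokenVault:
    """Vaulted tokenization: random tokens stored next to the card number."""

    def __init__(self):
        # Two in-memory maps stand in for the database table of CC#/token pairs.
        self._token_by_pan = {}
        self._pan_by_token = {}

    def tokenize(self, pan: str) -> str:
        # Multi-use token: the same card always gets the same token,
        # which means a vault lookup before every issuance.
        if pan in self._token_by_pan:
            return self._token_by_pan[pan]
        while True:
            token = "".join(secrets.choice("0123456789") for _ in range(len(pan)))
            # Retry on the rare collision with an existing token or the card itself.
            if token not in self._pan_by_token and token != pan:
                break
        self._token_by_pan[pan] = token
        self._pan_by_token[token] = pan
        return token

    def detokenize(self, token: str) -> str:
        # Recovering the card number is only possible with access to the vault.
        return self._pan_by_token[token]


def derive_token(pan: str, secret: bytes) -> str:
    """Vault-less sketch: recompute the token from the card number and a secret.

    Assumption: a keyed hash reduced to the same number of digits as the card.
    Nothing is stored, but this particular derivation cannot be reversed.
    """
    digest = hmac.new(secret, pan.encode("ascii"), hashlib.sha256).digest()
    number = int.from_bytes(digest, "big") % (10 ** len(pan))
    return str(number).zfill(len(pan))
```

In this sketch any server holding the same secret computes the same derived token with no shared storage, while two `TokenVault` instances would hand out different tokens for the same card unless their contents were kept in sync. That difference is what the scale discussion below turns on.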
Cost: If you are going to use an Oracle, IBM DB2, or Microsoft SQL Server database for your token vault, you will need a license for the database. Token vaults must also be redundant, so you will need at least a couple of licenses. If you want to ensure that your tokenization system can handle large bursts of transactions, such as holiday shopping periods, you will need hefty servers. Databases are priced based on server capacity, so these licenses can get very expensive. That said, many customers running in-house tokenization systems already have database site licenses, so for them this is not an issue.

Scale: If your token servers are dispersed across remote data centers that cannot guarantee highly reliable communications, synchronization of token vaults is a serious issue. You need to ensure that credit cards are not misused, that you have transactional consistency across all locations, and that no token is issued twice. With ‘vault-less’ tokenization, synchronization is a non-issue. If consistency across a scaled tokenization deployment is critical, derived tokens are incredibly attractive. But some non-derived token systems with token vaults get around this issue by pre-allocating token sequences to each server; this ensures tokens are unique, and synchronization latency is not a concern. Synchronization-free scaling is a critical advantage for very large credit card processors and merchants, but it is not a universal requirement.

Performance: Some token server designs require a check inside the token vault prior to completing every transaction, in order to avoid duplicate credit cards or tokens. This is especially true when a single token is used to represent multiple transactions or merchants (multi-use tokens). Unfortunately early tokenization solutions generally had poor database architectures. They did not provide efficient mechanisms for indexing token/CC# pairs for quick lookup.
This is not a flaw in the databases themselves; it was a mistake made by token vault designers as they laid out their data! As the number of tokens climbs into the tens or hundreds of millions, lookup operations can become unacceptably slow. Many customers have poor impressions of token vaults because their early implementations got this wrong. So very wrong. Today lookup speed is often not a problem, even for large databases, but customers need to verify that any given solution meets their requirements during peak loads.

For some customers a ‘vault-less’ tokenization solution is superior across all three axes. Other customers have a deep understanding of relational databases; for them security, performance, and scalability are just part of daily operations management. No vendor can credibly claim that databases or token vaults are universally the wrong choice, just as nobody can claim that any non-relational solution is always the right choice. The decision comes down to the customer’s environment and IT operations.

I am willing to bet that the vendors of these solutions will have some additional comments, so as always the comments section is open to anyone who wants to contribute.