This is the fifth post in a new series. If you want to track it through the entire editing process, you can follow along and contribute on GitHub. You can read the first post and find the other posts under “related posts” in full article view.
Additional Platform Features and Options
The encryption engine and the key store are the major functional pieces in any encryption platform, but there are supporting systems with any data center encryption solution that are important for both overall management, as well as tailoring the solution to fit within your application infrastructure. We frequently see the following major features and options to help support customer needs:
For enterprise-class data center encryption you need a central location to define both what data to secure and key management policies. So management tools provide a window onto what data is encrypted and a place to set usage policies for cryptographic keys. You can think of this as governance of the entire crypto ecosystem – including key rotation policies, integration with identity management, and IT administrator authorization. Some products even provide the ability to manage remote cryptographic engines and automatically apply encryption as data is discovered. Management interfaces have evolved to enable both security and IT management to set policy without needing cryptographic expertise. The larger and more complex your environment, the more critical central management becomes, to control your environment without making it a full-time job.
Format Preserving Encryption
Encryption protects data by scrambling it into an unreadable state. Format Preserving Encryption (FPE) also scrambles data into an unreadable state, but retains the format of the original data. For example if you use FPE to encrypt a 9-digit Social Security Number, the encrypted result would be 9 digits as well. All commercially available FPE tools use variations of AES encryption, which remains nearly impossible to break, so the original data cannot be recovered without the key. The principal reason to use FPE is to avoid re-coding applications and re-structuring databases to accommodate encrypted (binary) data. Both tokenization and FPE offer this advantage. But encryption obfuscates sensitive information, while tokenization removes it entirely to another location. Should you need to propagate copies of sensitive data while still controlling occasional access, FPE is a good option. Keep in mind that FPE is still encryption, so sensitive data is still present.
Tokenization is a method of replacing sensitive data with non-sensitive placeholders: tokens. Tokens are created to look exactly like the values they replace, retaining both format and data type. Tokens are typically ‘random’ values that look like the original data but lack intrinsic value. For example, a token that looks like a credit card number cannot be used as a credit card to submit financial transactions. Its only value is as a reference to the original value stored in the token server that created and issued the token. Tokens are usually swapped in for sensitive data stored in relational databases and files, allowing applications to continue to function without changes, while removing the risk of a data breach. Tokens may even include elements of the original value to facilitate processing. Tokens may be created from ‘codebooks’ or one time pads; these tokens are still random but retain a mathematical relationship to the original, blurring the line between random numbers and FPE. Tokenization has become a very popular, and effective, means of reducing the exposure of sensitive data.
Like tokenization, masking replaces sensitive data with similar non-sensitive values. And like tokenization masking produces data that looks and acts like the original data, but which doesn’t pose a risk of exposure. But masking solutions go one step further, protecting sensitive data elements while maintaining the value of the aggregate data set. For example we might replace real user names in a file with names randomly selected from a phone directory, skew a person’s date of birth by some number of days, or randomly shuffle employee salaries between employees in a database column. This means reports and analytics can continue to run and produce meaningful results, while the database as a whole is protected. Masking platforms commonly take a copy of production data, mask it, and then move the copy to another server. This is called static masking or “Extract, Transform, Load” (ETL for short).
A recent variation is called “dynamic masking”: masks are applied in real time, as data is read from a database or file. With dynamic masking the original files and databases remain untouched; only delivered results are changed, on-the-fly. For example, depending on the requestor’s credentials, a request might return the original (real, sensitive) data, or a masked copy. In the latter case data is dynamically replaced with a non-sensitive surrogate. Most dynamic masking platforms function as a ‘proxy’ something like firewall, using redaction to quickly return information without exposing sensitive data to unauthorized requesters. Select systems offer more intelligent randomization, tokenization, or even FPE.
Again, the lines between FPE, tokenization, and masking are blurring as new variants emerge. But tokenization and masking variants offer superior value when you don’t want sensitive data exposed but cannot risk application changes.