Our last post covered the core functions of the tokenization server. Today we’ll finish our discussion of token servers by covering the externals: the primary architectural models, how other applications communicate with the server(s), and supporting systems management functions.
There are three basic ways to build a token server:
- Stand-alone token server with a supporting back-end database.
- Embedded/integrated within another software application.
- Fully implemented within a database.
Most commercial tokenization solutions are stand-alone software applications that connect to a dedicated database for storage, and at least one vendor bundles its offering into an appliance. All the cryptographic processes are handled within the application (outside the database), and the database provides storage and supporting security functions. Token servers use standard database management systems, such as Oracle and SQL Server, locked down very tightly for security. The token server and its database may be on the same physical (or virtual) system, on separate systems, or integrated into a load-balanced cluster. In this model the token server manages all database tasks and all communications with outside applications, and direct connections to the underlying database are restricted.
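To make the division of labor concrete, here is a minimal sketch of the stand-alone model, assuming SQLite as a stand-in for the locked-down back-end database. The class name, table layout, and the injected `encrypt`/`decrypt` callables are our own illustrative assumptions, not any vendor's design. The point is only that the token and the ciphertext are produced in the application, and the database just stores what it is handed.

```python
import secrets
import sqlite3
from typing import Callable


class StandaloneTokenServer:
    """Illustrative token server that owns its back-end database (hypothetical design)."""

    def __init__(self, db_path: str,
                 encrypt: Callable[[bytes], bytes],
                 decrypt: Callable[[bytes], bytes]) -> None:
        # Only the token server opens this connection; calling applications never do.
        self._db = sqlite3.connect(db_path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS vault ("
            "token TEXT PRIMARY KEY, ciphertext BLOB NOT NULL)"
        )
        # Cryptography is supplied to the application, not delegated to the database.
        self._encrypt = encrypt
        self._decrypt = decrypt

    def tokenize(self, value: str) -> str:
        # A random token has no mathematical relationship to the original value.
        token = secrets.token_hex(16)
        ciphertext = self._encrypt(value.encode("utf-8"))
        with self._db:  # commits on success, rolls back on error
            self._db.execute(
                "INSERT INTO vault (token, ciphertext) VALUES (?, ?)",
                (token, ciphertext),
            )
        return token

    def detokenize(self, token: str) -> str:
        row = self._db.execute(
            "SELECT ciphertext FROM vault WHERE token = ?", (token,)
        ).fetchone()
        if row is None:
            raise KeyError("unknown token")
        return self._decrypt(row[0]).decode("utf-8")


# Stand-ins so the sketch runs with the standard library alone.
# Reversing bytes is NOT encryption; a real deployment would plug in a
# vetted cipher or a key management service here.
def demo_encrypt(plaintext: bytes) -> bytes:
    return plaintext[::-1]


def demo_decrypt(ciphertext: bytes) -> bytes:
    return ciphertext[::-1]
```

Calling `StandaloneTokenServer(":memory:", demo_encrypt, demo_decrypt)` gives a throwaway instance to experiment with. Because the cipher is passed in, the same class could sit in front of an external key manager without changing the storage code.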
In an embedded configuration the tokenization software is built into the application and its supporting database. Rather than introducing a token proxy into the credit card processing workflow, existing application functions are modified to implement tokens. To users of the system there is very little difference in behavior between embedded token services and a stand-alone token server, but on the back end there are two significant differences. First, this deployment model usually involves some code changes to the host application to support storage and use of the tokens. Second, each token is only useful for one instance of the application. Token server code, key management, and storage of the sensitive data and tokens all occur within the application. The tightly coupled nature of this model makes it very efficient for small organizations, but it does not support sharing tokens across multiple systems, and large distributed organizations may find performance inadequate.
Finally, it’s technically possible to manage tokenization completely within the database without the need for external software. This option relies on stored procedures, native encryption, and carefully designed database security and access controls. Used this way, tokenization is very similar to most data masking technologies. The database automatically parses incoming queries to identify and encrypt sensitive data. The stored procedure creates a random token – usually from a sequence generator within the database – and returns the token as the result of the user query. All the data is then stored in a database row. Separate stored procedures are used to access encrypted data. This model was common before the advent of commercial third-party tokenization tools, but has fallen into disuse due to its lack of advanced security features and its failure to leverage external cryptographic libraries and key management services.
There are a few more architectural considerations:
- External key management and cryptographic operations are typically an option with any of these architectural models. This allows you to use more-secure hardware security modules if desired.
- Large deployments may require synchronization of multiple token servers in different, physically dispersed data centers. This support must be a feature of the token server, and is not available in all products. We will discuss this more when we get to usage and deployment models.
- Even when using a stand-alone token server, you may also deploy software plug-ins to integrate and manage additional databases that connect to the token server. This doesn’t convert the database into a token server, as we described in our second option above, but supports communications for distributed systems that need access to either the token or the protected data.
Since tokenization must be integrated with a variety of databases and applications, there are three ways to communicate with the token server:
- Application API calls: Applications make direct calls to the tokenization server’s procedural interface. While at least one tokenization server requires applications to explicitly access the tokenization functions, this is now a rarity. Because of the complexity of the cryptographic processes and the need for precise use of the tokenization server, vendors now supply software agents, modules, or libraries to support the integration of token services. These reside on the same platform as the calling application. Rather than recoding applications to use the API directly, these supporting modules accept existing communication methods and data formats. This reduces code changes to existing applications, and provides better security – especially for application developers who are not security experts. These modules then format the data for the tokenization API calls and establish secure communications with the tokenization server. This is generally the most secure option, as the code includes any required local cryptographic functions – such as encrypting a new piece of data with the token server’s public key.
- Proxy agents: Software agents that intercept database calls (for example, by replacing an ODBC or JDBC component). In this model the process or application that sends sensitive information may be entirely unaware of the token process. It sends data as it normally does, and the proxy agent intercepts the request. The agent replaces sensitive data with a token and then forwards the altered data stream. The agents reside on the token server or its supporting application server. This model minimizes application changes, as you only need to replace the application/database connection and the new software automatically manages tokenization. But it does create potential bottlenecks and failover issues, as it runs in-line with existing transaction processing systems.
- Standard database queries: The tokenization server intercepts and interprets the requests. This is potentially the least secure option, especially for ingesting content to be tokenized.
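The proxy agent approach is probably the easiest of the three to picture in code. The sketch below is a simplified illustration, assuming Python's `sqlite3` connection as a stand-in for the ODBC/JDBC component being replaced. The `TokenizingConnection` wrapper, the card-number pattern, and the `tokenize` callable are all hypothetical names of ours, and a real agent would also have to handle named parameters, literals embedded in the SQL text, and failover.

```python
import re
import sqlite3
from typing import Any, Callable, Sequence, Tuple

# Candidate card numbers: 13 to 19 digits, optionally separated by spaces or hyphens.
PAN_PATTERN = re.compile(r"\d(?:[ -]?\d){12,18}")


def luhn_valid(digits: str) -> bool:
    """Luhn checksum, used to cut down false positives on long numeric strings."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def looks_like_pan(value: Any) -> bool:
    if not isinstance(value, str) or not PAN_PATTERN.fullmatch(value):
        return False
    return luhn_valid(re.sub(r"[ -]", "", value))


def tokenize_parameters(params: Sequence[Any],
                        tokenize: Callable[[str], str]) -> Tuple[Any, ...]:
    # Swap each card-like parameter for a token; pass everything else through.
    return tuple(tokenize(p) if looks_like_pan(p) else p for p in params)


class TokenizingConnection:
    """Drop-in wrapper: the application calls execute() exactly as before."""

    def __init__(self, conn: sqlite3.Connection,
                 tokenize: Callable[[str], str]) -> None:
        self._conn = conn
        self._tokenize = tokenize  # hypothetical call out to the token server

    def execute(self, sql: str, params: Sequence[Any] = ()) -> sqlite3.Cursor:
        safe_params = tokenize_parameters(params, self._tokenize)
        return self._conn.execute(sql, safe_params)
```

The application keeps issuing its usual parameterized `INSERT`, and only the token reaches the table. It also shows where the bottleneck concern comes from: every statement now waits on the `tokenize` call before it is forwarded.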
While it sounds complex, there are really only two functions to implement:
- Send new data to be tokenized and retrieve the token.
- When authorized, exchange the token for the protected data.
The server itself should handle pretty much everything else.
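Seen from the calling application, those two functions might look something like the sketch below. This is only an illustration under our own assumptions: the `TokenService` class, the `caller` argument, and the allow-list are hypothetical, and a plain dictionary stands in for the encrypted storage a real server would use.

```python
import secrets
from typing import Dict, Iterable


class NotAuthorized(Exception):
    """Raised when a caller may hold tokens but may not redeem them."""


class TokenService:
    def __init__(self, detokenize_allowed: Iterable[str]) -> None:
        # In-memory stand-in for the server's protected storage.
        self._vault: Dict[str, str] = {}
        # Hypothetical allow-list of callers permitted to recover original data.
        self._detokenize_allowed = frozenset(detokenize_allowed)

    def tokenize(self, value: str) -> str:
        """Function 1: send new data, get a token back."""
        token = secrets.token_urlsafe(16)
        self._vault[token] = value
        return token

    def detokenize(self, token: str, caller: str) -> str:
        """Function 2: when authorized, exchange the token for the protected data."""
        if caller not in self._detokenize_allowed:
            raise NotAuthorized(f"{caller} may not detokenize")
        return self._vault[token]
```

In this picture a web storefront would call `tokenize` and keep only the returned token in its order records, while something like a settlement job on the allow-list would be the only caller able to use `detokenize`.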
Finally, as with any major application, the token server includes various management functions. But due to security needs, these tend to have additional requirements:
- User management, including authentication, access, and authorization – for user, application, and database connections. Additionally, most tokenization solutions include extensive separation of duties controls to limit administrative access to the protected data.
- Backup and recovery for the stored data, system configuration and, if encryption is managed on the token server, encryption keys. The protected data is always kept encrypted for backup operations.
- Logging and reporting – especially logging of system changes, administrative access, and encryption key operations (such as key rotation). These reports are often required to meet compliance needs, especially for PCI.
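As a rough illustration of how separation of duties and audit logging fit together, the sketch below checks an administrative action against a role table and records the attempt either way. The role names, action names, and record fields are assumptions we made up for the example; they are not taken from any product or from the PCI requirements.

```python
import json
import logging
from datetime import datetime, timezone

# Attach handlers (file, syslog, SIEM forwarder) as your environment requires.
audit_log = logging.getLogger("tokenserver.audit")

# Hypothetical role table: no single role can both manage keys and read the audit trail.
ROLE_PERMISSIONS = {
    "key_custodian": {"rotate_key"},
    "server_admin": {"change_config", "manage_users"},
    "auditor": {"read_audit_log"},
}


def audit_event(actor: str, action: str, target: str, outcome: str) -> str:
    """Write one structured audit record; protected data never goes in here."""
    record = {
        "time": datetime.now(timezone.utc).isoformat(),
        "actor": actor,
        "action": action,
        "target": target,
        "outcome": outcome,
    }
    line = json.dumps(record, sort_keys=True)
    audit_log.info(line)
    return line


def authorize_admin_action(actor: str, role: str, action: str, target: str) -> bool:
    """Check the role table and log the attempt, whether allowed or denied."""
    allowed = action in ROLE_PERMISSIONS.get(role, set())
    audit_event(actor, action, target, "allowed" if allowed else "denied")
    return allowed
```

Logging denied attempts as well as allowed ones is deliberate in this sketch, since the reports mentioned above are usually expected to show who tried to touch keys and configuration, not just who succeeded.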
In our next post we’ll go into more detail on token server deployment models, which will provide more context for all of this.