
Column Encryption

Thursday, June 18, 2009

Database Encryption, Part 4: Credentialed User Protection

By Adrian Lane, Rich

In this post we will detail the other half of the decision tree for selecting a database encryption strategy: securing data from credentialed database users. Specifically, we are concerned with preventing misuse of data through individual or group accounts that provide access to data either directly or through another application. For the purpose of this discussion, we are most interested in differentiating between accounts assigned to users who use the data stored within the database and accounts assigned to users who administer the database system itself. These are the two primary types of credentialed database users, and each needs to be treated differently because their access to database functions is radically different. As administrative accounts have far more capabilities and tools at their disposal, those threats are more varied and complex, making it much more difficult to insulate sensitive data. Also keep in mind that a 'user' in the context of database accounts may be a single person, a group account associated with a number of users, or an account utilized by a service or program.

With User Encryption, we assign access rights to the data we want secured on a user by user basis, and provide decryption keys only to the specified users who own that information, typically through a secondary authentication and authorization process. We call this User Encryption because we are both protecting sensitive data associated with each user account, and also responding to threats by type of user. This differs from Transparent Encryption in two important ways. First, we are now protecting data accessed through the normal database communication protocols as opposed to methods that bypass the database engine. Second, we are no longer encrypting everything in the database; rather it's quite the opposite -- we want to encrypt as little as possible so non-sensitive information remains available to the rest of the database community. Conceptually this is very similar to the functionality provided by database groups, roles, and user authorization features. In practice it provides an additional layer of security and authentication where, in the event of a mistake or account compromise, exposed data remains encrypted and unreadable. As you can probably tell, since most regular users can be restricted using access controls, encryption at this level is mostly used to restrict administrative users.

They say if all you have is a hammer, everything begins to look like a nail. That statement is relevant to this discussion of database encryption because the database vendors begin the conversation with their capabilities for column, table, row, and even cell level encryption. But these are simply tools. In fact, for what we want to accomplish, they may be the wrong tools. We need to fully understand the threat first, in this case credentialed users, and build our tool set and deployment model based upon that and not the other way around. We will discuss these encryption options in our next post on Implementation and Deployment, but need to fully understand the threat to be mitigated before selecting a technology. Interestingly enough, in the case of credentialed user threat analysis, we are proceeding from the assumption that something will go wrong, and someone will attempt to leverage credentials in such a way that they gain access to sensitive information within the database. In Part 2 of this series, we posed the questions "What do you want to protect?" and "What threat do you want to protect the data from?" Here, we add one more question: "Who do you want to protect the data from?" General users of the data or administrators of the system? Let's look at these two user groups in detail:

  • Users: This is the general class of users who call upon the database to store, retrieve, report, and analyze data. They may do this directly through queries, but far more likely they connect to the database through another application. There are several common threats companies look to address for this class of user: protecting against inadvertent disclosure from sloppy privilege management or inherited trust relationships, meeting a basic compliance requirement for encrypting sensitive data, or even providing finer-grained access control than is otherwise available through the application or database engine. Applications commonly use service accounts to connect to the database; those accounts are shared by multiple users, so the permissions may not be sufficiently granular to protect sensitive data. Users do not have the same privileges and access to the underlying infrastructure that administrators do, so the threat is exploitation of laxity in access controls. If protecting against this is our goal, we need to identify the sensitive information, determine who may use it, and encrypt it so that only the appropriate users have access. In these cases deployment options are flexible, as you can choose key management that is internal or external to the database, leverage the internal database encryption engine, and gain some latitude as to how much of the encryption and authentication is performed outside the database. Keep in mind that access controls are highly effective with much less performance impact, and they should be your first choice. Only encrypt when encryption really buys you additional security.
  • Administrators: The most common concern we hear companies discuss is their desire to mitigate damage in the event that a database administrator (DBA) account is compromised or misused by an employee. This is the single most difficult database security challenge to solve. The DBA role has rights to perform just about every function in the database, but no legitimate need to examine or use most of the data stored there. For example, the DBA has no need to examine Social Security Numbers, credit card data, or any customer data to maintain the database itself. This threat model dictates many of the deployment options. When the requirement is to protect the data from highly privileged administrators, enforcing separation of duties and providing a last line of defense for breached DBA accounts, then at the very least external key management is required. By encrypting the data and removing key management functions from DBA control, that sensitive information can be kept secure. External Key Management provides separation of duties between the management of the database and use of the data therein. The orchestration of the encryption/decryption functions is typically performed at the application and not inside the database, requiring modification to the application code. Use of the database engine's built-in encryption capabilities may be possible, depending upon the vendor implementation, and most vendors provide API calls for third party encryption support while maintaining a single database 'conversation'. This design addresses user level threats as well, and should be considered a superset.
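To make the administrator case concrete, here is a minimal Python sketch of application-level encryption with keys held outside the database. Everything named here is hypothetical: `ExternalKeyManager`, `store_ssn`, `read_ssn`, and the `customers` table are illustrations, not any vendor's API. The cipher is a toy stand-in built from HMAC-SHA256 so the example runs on the standard library alone; a real deployment would use a vetted cipher and a dedicated key management service.

```python
import hashlib
import hmac
import os
import sqlite3


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Toy keystream (HMAC-SHA256 in counter mode). Illustration only, not for production."""
    blocks = (length + 31) // 32
    return b"".join(
        hmac.new(key, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest()
        for i in range(blocks)
    )[:length]


def toy_encrypt(key: bytes, plaintext: str) -> bytes:
    nonce = os.urandom(16)
    data = plaintext.encode("utf-8")
    body = bytes(a ^ b for a, b in zip(data, _keystream(key, nonce, len(data))))
    return nonce + body


def toy_decrypt(key: bytes, blob: bytes) -> str:
    nonce, body = blob[:16], blob[16:]
    return bytes(a ^ b for a, b in zip(body, _keystream(key, nonce, len(body)))).decode("utf-8")


class ExternalKeyManager:
    """Hypothetical key service that lives outside the database and outside DBA control."""

    def __init__(self, authorized_users):
        self._key = os.urandom(32)
        self._authorized = set(authorized_users)

    def key_for(self, user: str) -> bytes:
        if user not in self._authorized:
            raise PermissionError(f"{user} may not use the data key")
        return self._key


def store_ssn(conn, keys, user, name, ssn):
    # Encrypt in the application, before the value ever reaches the database.
    blob = toy_encrypt(keys.key_for(user), ssn)
    conn.execute("INSERT INTO customers (name, ssn_enc) VALUES (?, ?)", (name, blob))


def read_ssn(conn, keys, user, name):
    row = conn.execute("SELECT ssn_enc FROM customers WHERE name = ?", (name,)).fetchone()
    return toy_decrypt(keys.key_for(user), row[0])


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, ssn_enc BLOB)")
keys = ExternalKeyManager(authorized_users={"billing_app"})
store_ssn(conn, keys, "billing_app", "Alice", "123-45-6789")

# What a DBA querying the table directly would get back: ciphertext only.
dba_view = conn.execute("SELECT ssn_enc FROM customers").fetchone()[0]
```

In this sketch the DBA can still back up, index, and tune the `customers` table, but a call such as `read_ssn(conn, keys, "dba", "Alice")` raises `PermissionError` because the key manager, not the database, decides who gets the key.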

Once we've decided which of these threats to address, we select tools, technologies, and deployment options that support our goals. We have established that, at the very least, the two drivers will be distinguished by calling for internal vs. external key management. Depending upon your answers to the question "What data do you want to protect?", we can now decide at what level we need to encrypt (table, column, row, or cell), the type of key management we need, whether we will leverage the internal database encryption engine or use external services, and what changes need to be made to the business processing logic. We will cover these topics in our next post, and will follow that up with several common customer use cases.

One parting point we want to make with user encryption strategies, as it is a question that comes up over and over: "How is this different than access controls?" It's all in how you use it. Encryption's key value is in providing a level of granularity beyond what's possible with access controls. This almost always translates to restricting administrators, since access controls are very effective for all other kinds of users. Another, more complex, option with encryption is to tie it to digital certificates outside the database, adding (essentially) another authentication factor. This increases security, because simply compromising a username and password isn't sufficient to read the data, and so is particularly useful for protecting data utilized by service accounts.

For the most part, as you'll see in our use cases, we only recommend user level database encryption under extremely limited circumstances. It's a complex topic, and we haven't even dug into the technology yet, but please don't assume because we are spending so much time on it that it's your best option. Just because it's complicated and takes a long time to describe, doesn't mean that's what you should look at first.

–Adrian Lane, Rich

Thursday, May 14, 2009

Database Encryption: Option 2, Enforcing Separation of Duties

By Adrian Lane

This is the next installment in what is now officially the longest running blog series in Securosis history: Database Encryption. In case you have forgotten, Rich provided the Introduction and the first section on Media Protection, and I covered the threat analysis portion to help you determine which threats to consider when developing a database encryption strategy. You may want to peek back at those posts as a refresher if this is a subject that interests you, as we like to use our own terminology. It's for clarity, not because we're arrogant. Really!

For what we are calling "database media protection" as described in Part 1, we covered the automatic encryption of the data files or database objects through native encryption built into the database engine. Most of the major relational database platforms provide this option, which can be "seamlessly" deployed without modification to applications and infrastructure that use the database. This is a very effective way to prevent recovery of data stored on lost or stolen media. And it is handy when you have renegade IT personnel who hate managing separate encryption solutions. Simple. Effective. Invisible. And only a moderate performance penalty. What more could you want?

If you have to meet compliance requirements, probably a lot more. You need to secure credit card data within the database to comply with the PCI Data Security Standard. You are unable to catalog all of the applications that use sensitive data stored in your database, so you want to stop data leakage at the source. Your DBAs want to be 'helpful', but their ad-hoc adjustments break the accounting system. Your quality assurance team exports production data into unsecured test systems. Medical records need to be kept private. While database media protection is effective in addressing problems with data at rest, it does not help enforce proper data usage. Requirements to prevent misuse by credentialed users or compromised user accounts, or enforce separation of duties, are outside the scope of basic database encryption. For these reasons and many others, you decide that you need to protect the data within the database through more granular forms of database encryption: table, column, or row level security. This is where the fun starts! Encrypting for separation of duties is far more complex than encrypting for media protection; it involves protecting data from legitimate database users, requiring more changes to the database itself. It's still native database encryption, but this simple conceptual change creates exceptional implementation issues. It will be harder to configure, your performance will suffer, and you will break your applications along the way. Following our earlier analogy, this is where we transition from hanging picture hooks to a full home remodeling project. In this section we will examine how to employ granular encryption to support separation of duties within the database itself, and the problems this addresses. Then we will delve into the problems you will run into and what you need to consider before taking the plunge.

Before we jump in, note that each of these options is commonly referred to as a 'Level' of encryption; this does not mean they offer more or less security, but rather identifies where encryption is applied within the database storage hierarchy (element, row, column, table, tablespace, database, etc). There are three major encryption options that support separation of duties within the database. Not every database vendor supports all of these options, but most support at least two of the three, and that is enough to accomplish the goals above. The common options are:

  1. Column Level Encryption: As the name suggests, column level encryption applies to all data in a single, specific column in a table. This column is encrypted using a single key that supports one or more database users. Subsequent queries to examine or modify encrypted columns must possess the correct database privileges, but additionally must provide credentials to access the encryption/decryption key. This could be as simple as passing a different user ID and password to the key manager, or as sophisticated as a full cryptographic certificate exchange, depending upon the implementation. By instructing the database to encrypt all data stored in a column, you focus on specific data that needs to be protected. Column level encryption is the popular choice for compliance with PCI-DSS by restricting access to a very small group. The downside is that the column is encrypted as a whole, so every select requires the entire column to be decrypted, and every modification requires the entire column to be re-encrypted and certified. This is the most commonly available option in relational database platforms, but has the poorest performance.
  2. Table / Tablespace Encryption: Table level encryption is where the entire contents of a table or group of tables are encrypted as one element. Much like full database encryption, this method protects all the data within the table, and is a good option when more than one column in the table contains sensitive information. While it does not offer fine-grained access control to specific data elements, it is a more efficient option than column encryption when multiple columns contain sensitive data, and requires fewer application and query modifications. Examples of when to use this technique include personally identifiable information grouped together -- like medical records or financial transactions -- and this is an appropriate approach for HIPAA compliance. Performance is manageable, and is best when the sensitive tables can be fully segregated into their own tablespace or database.
  3. Field/Cell/Row Level Encryption, Label Security: Row level encryption is where a single row in a table is encrypted, and field or cell level encryption is where individual data elements within a database table are encrypted. They offer very fine-grained control over data access, but can be a management and performance nightmare. Depending upon the implementation, there might be one key used for all elements or a key for each row. The performance penalty is a sharp limitation, especially when selecting or modifying multiple rows. More commonly, separation of duties is supported by label security. This strategy involves structural modifications to the database to support "labeling" each row with attributes corresponding to access rights or user groups. Additionally, each user is assigned access rights that map to one or more of these labels. When a user makes a request, they are only allowed to retrieve/view a subset of the rows with matching label attributes. The query is only applied to a subset of the database, independent of any action on the user's part or query modifications. This offers much higher performance and works well with large databases. It can be used in conjunction with field/cell level encryption to provide high security, but as label security is often sufficient on its own to address separation of duties, it is usually paired with transparent forms of database encryption.
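The label security idea in option 3 can be sketched in a few lines of Python with SQLite. This is only an illustration under our own assumptions: the `records` and `user_labels` tables and the `visible_records` helper are hypothetical, and the filter is applied in application code here, whereas database products that offer label security enforce it inside the engine.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Each row carries a label; each user is granted one or more labels.
    CREATE TABLE records (id INTEGER PRIMARY KEY, patient TEXT, note TEXT, label TEXT);
    CREATE TABLE user_labels (username TEXT, label TEXT);
    INSERT INTO records (patient, note, label) VALUES
        ('Alice', 'cardiology follow-up', 'CARDIOLOGY'),
        ('Bob',   'billing dispute',      'BILLING'),
        ('Carol', 'stress test results',  'CARDIOLOGY');
    INSERT INTO user_labels VALUES
        ('dr_jones', 'CARDIOLOGY'),
        ('clerk_smith', 'BILLING');
""")


def visible_records(conn, username):
    """Return only the rows whose label matches a label granted to the user.

    The caller does not add any filter; the label join is applied on their behalf.
    """
    rows = conn.execute(
        """
        SELECT r.patient, r.note
        FROM records AS r
        JOIN user_labels AS u ON u.label = r.label
        WHERE u.username = ?
        ORDER BY r.id
        """,
        (username,),
    )
    return rows.fetchall()
```

With this data, `visible_records(conn, "dr_jones")` returns the two cardiology rows, `clerk_smith` sees only the billing row, and an account with no labels gets an empty result. Note that nothing is encrypted: the rows are filtered, which is why this approach tends to perform well and is typically paired with transparent encryption for data at rest.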

These advantages come at a cost, and one of these costs is the re-engineering effort required for the applications that rely upon the data that has been encrypted. Most database queries rely on functions to format the results or derive information, and fail when referencing encrypted data. For example, grouping functions like 'summation' or 'average', and more advanced comparisons such as 'like' and 'range', no longer work. Indices on encrypted columns fail as they are now trying to arrange randomized data. Foreign key relationships and compound keys cause errors and unintended side effects with both application and database functions. Reporting applications and batch jobs run under generic accounts lack permissions to perform their intended functions. The full effect of retrofitting tables and queries designed under a different set of assumptions cannot be adequately estimated, and requires complete regression testing and data verification.
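A small Python sketch shows why queries written for plaintext stop working. The `orders` table, `COLUMN_KEY`, and the `toy_encrypt`/`toy_decrypt` helpers are all hypothetical, and the cipher is a toy stand-in (HMAC-SHA256 used as a keystream, stored as hex text) chosen only so the example runs on the standard library; it is not a recommendation.

```python
import hashlib
import hmac
import os
import sqlite3

COLUMN_KEY = os.urandom(32)  # hypothetical key; a real one would come from a key manager


def _keystream(nonce: bytes, length: int) -> bytes:
    """Toy keystream for illustration only."""
    blocks = (length + 31) // 32
    return b"".join(
        hmac.new(COLUMN_KEY, nonce + i.to_bytes(8, "big"), hashlib.sha256).digest()
        for i in range(blocks)
    )[:length]


def toy_encrypt(text: str) -> str:
    nonce = os.urandom(16)
    data = text.encode("utf-8")
    body = bytes(a ^ b for a, b in zip(data, _keystream(nonce, len(data))))
    return (nonce + body).hex()  # stored as lowercase hex text


def toy_decrypt(stored: str) -> str:
    raw = bytes.fromhex(stored)
    nonce, body = raw[:16], raw[16:]
    return bytes(a ^ b for a, b in zip(body, _keystream(nonce, len(body)))).decode("utf-8")


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_enc TEXT, amount_enc TEXT)")
for customer, amount in [("Smith", 100), ("Smythe", 250), ("Jones", 75)]:
    conn.execute("INSERT INTO orders VALUES (?, ?)", (toy_encrypt(customer), toy_encrypt(str(amount))))

# A LIKE filter written for plaintext finds nothing: the column holds hex ciphertext.
like_hits = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_enc LIKE 'Sm%'"
).fetchone()[0]

# The application has to fetch every row, decrypt, and only then filter or aggregate.
rows = [(toy_decrypt(c), int(toy_decrypt(a)))
        for c, a in conn.execute("SELECT customer_enc, amount_enc FROM orders")]
sm_customers = sorted(name for name, _ in rows if name.startswith("Sm"))
total = sum(amount for _, amount in rows)
```

The same reasoning applies to ordering and indexing: because each value is encrypted with a fresh random nonce, the stored text has no relationship to the plaintext order, so the work moves from the database into the application.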

The single biggest complaint we hear from companies when implementing granular encryption regards the performance impact. Depending upon the specific vendor implementation, column level encryption may require anything from several blocks to the entire column of data being decrypted before the query results can be returned. In cases where there are millions of rows scattered across millions of data blocks, the processing overhead is staggering. Encryption also precludes use of several standard performance optimizations, further reducing performance and throughput. For example, establishing a database connection is a time consuming effort for the database, often far exceeding the time needed to execute the user's query. "Connection Pooling" is a common database feature where connections are pre-established under a generic application user account and remain idle until a user makes a request. But when access to encrypted data requires a complete user ID and credentials, generic service accounts cannot access the encrypted data. Each request needs to be established with a credentialed user account, or the connection modified such that the credentials are passed and authenticated. Another example is data caching, where the database fetches commonly accessed information and stores it in memory. With encryption and label security, what each user sees may be different, and caching is less effective.

Many of these issues can be mitigated or completely addressed, but only when designing encryption into the application and database structures from scratch. If you are moving forward with an encryption project, it is far better to implement these changes into new tables and functions rather than attempt to retrofit new functions into tables and applications designed under a different set of assumptions.

In our next post we will take a closer look at key management options. There are several variants available to support encryption functions, performance, and even separation of duties.

–Adrian Lane