Database benchmarking is hard to do. Any of you who followed the performance testing wars of the early '90s, or the adoption and abuse of TPC-C and TPC-D, know that the accuracy of database performance testing is a long-standing sore point. With database encryption the same questions of how to measure performance rear their heads, but in this case there are no standards. That's not to say the issue is unimportant to customers – it is important. Tell a customer encryption will reduce throughput by 10% or more, and your meeting is over. End of discussion. The mere fear of potential performance problems has hindered the adoption of database encryption.
This is why it is incredibly important for all vendors who offer encryption for databases (OS/file system encryption vendors, drive manufacturers, database vendors, etc.) to be able to say their impact is below 10% – or as far below that number as they can get. I am not sure where it came from, but 10% is a mythical number in database circles. Throughout my career I have done database performance testing of one variety or another. Sometimes I came up with a set of tests that I thought exercised the full extent of the system. Other times I created test cases to demonstrate the high and low water marks of system performance. Sometimes I captured network traffic at a customer site as a guide, so I could replay those sessions against the database to get a better understanding of real-world performance. But it is never a general use case – it's only an approximation for that customer, approximating their applications.
When testing database encryption performance several variables come into play:
Do users issue 50 select statements for every insert? Are the queries a balance of selects, inserts, and deletes? Do I run the queries back to back as fast as I can, like a batch job, or do I introduce latency between requests to simulate end users? Do I select queries that run only against encrypted data, or against all data? Generally when testing database performance, with or without encryption, we select a few different query profiles to see which types of queries cause problems. I know from experience I can create a test case that will drop throughput by 1%, and another that will drop it by 40%. (I once had a junior programmer unknowingly design and deploy a query with a Cartesian join that crashed our live server instantly, but that's another story.)
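To make the "query profile" idea concrete, here is a minimal sketch of a micro-benchmark harness that varies the select:insert ratio and optional think time between requests. The table, column names, and ratios are invented for illustration, and SQLite stands in for whatever database you actually run – the point is only that the same database yields very different throughput numbers depending on the mix you choose.

```python
import random
import sqlite3
import time

def run_profile(conn, selects_per_insert, total_ops=2000, think_time=0.0):
    """Run a synthetic workload with a given select:insert ratio.

    think_time simulates end-user latency between requests; leaving it
    at 0 mimics a back-to-back batch job. Returns operations per second.
    (Illustrative harness -- table and ratios are made up.)
    """
    cur = conn.cursor()
    start = time.perf_counter()
    for i in range(total_ops):
        if i % (selects_per_insert + 1) == 0:
            cur.execute("INSERT INTO orders (amount) VALUES (?)",
                        (random.random(),))
        else:
            cur.execute("SELECT amount FROM orders WHERE id = ?",
                        (random.randint(1, i + 1),))
            cur.fetchall()
        if think_time:
            time.sleep(think_time)
    conn.commit()
    return total_ops / (time.perf_counter() - start)

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")

# Compare a read-heavy profile (50 selects per insert) to a balanced one.
for ratio in (50, 1):
    print(f"{ratio}:1 select:insert -> {run_profile(conn, ratio):,.0f} ops/sec")
```

Run the same profiles with and without encryption enabled and the delta between runs – not the absolute ops/sec – is the number that matters.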
What type of encryption?
Full disk? Tablespace? Column level? Table level? Media? Each variant has advantages and disadvantages. Even if they all use AES-256, there will be differences in performance. Some have hardware acceleration; some limit the total amount of encrypted data by partitioning data sets; others must contend for thread, memory, and processor resources. Column level encryption encrypts far less data than full disk encryption, so comparing the two is hardly an apples to apples comparison. And if the encrypted column is used as an index, it can have a significant impact on query performance, mitigating the advantage of encrypting less data.
What percentage of queries are against encrypted data?
Many customers have the ability to partition sensitive data into a database or tablespace that is much smaller than the overall data set. This means that many queries never require encryption and decryption, and the percentage impact of encryption on these systems is generally quite small. It is not unreasonable to see encryption reduce throughput across the database as a whole by only 1% over a 24-hour period, while reducing throughput against the encrypted tablespace by 25%.
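The arithmetic behind that claim is simple enough to sketch. Assuming (as a hypothetical) that unencrypted queries are unaffected, the blended impact is just the fraction of work touching encrypted data multiplied by the slowdown on that data:

```python
def blended_impact(encrypted_fraction, encrypted_slowdown):
    """Overall throughput loss when only a fraction of the workload
    touches encrypted data. Assumes queries against unencrypted data
    are unaffected -- a simplification, since shared CPU is ignored."""
    return encrypted_fraction * encrypted_slowdown

# Illustrative numbers: if 4% of queries hit the encrypted tablespace,
# and those queries slow down by 25%, the blended impact is 1%.
print(f"{blended_impact(0.04, 0.25):.0%}")  # -> 1%
```

This is why partitioning sensitive data pays off: the 25% penalty is real, but it applies to a small enough slice of the workload to disappear into the aggregate numbers.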
What is the load on the system?
Encryption algorithms are CPU intensive. They take some memory as well, but it's CPU you need to pay attention to. With two identical databases, you can get different performance results if the load on one system is higher. Sounds obvious, I know, but this is not quite as apparent as you might think. For example, I have seen several cases where the impact of encryption on transactional throughput was a mere 5%, but CPU load grew by 40%. If the CPUs on the system do not have sufficient 'headroom', you will slow down encryption, overall read/write requests (both encrypted and unencrypted), and the system's transactional throughput over time. Encryption's direct slowdown of encrypted traffic is quite different from its impact on overall system load and general responsiveness.
How many cryptographic operations can be accomplished during a day is irrelevant. How many cryptographic operations can be accomplished under peak usage during the day is the more important indicator. Resource consumption is not linear, and as you get closer to the limits of the platform, performance and throughput degrade at a greater than linear rate.
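A standard queueing-theory result illustrates why degradation near the limit is worse than linear. This is a textbook M/M/1 approximation, not a model of any particular database, but it captures the shape of the curve: response time grows as 1/(1 - utilization), so the same per-request encryption cost that is invisible off-peak becomes painful at peak.

```python
def mm1_response_time(service_time, utilization):
    """Mean response time in an M/M/1 queue: service_time / (1 - utilization).

    A rough stand-in for why a fixed encryption CPU cost per request
    hurts far more when the system is near its limits than when idle.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in [0, 1)")
    return service_time / (1 - utilization)

# The same 10ms of work per request: response time doubles at 50%
# utilization, but is 10x at 90% and 20x at 95%.
for u in (0.5, 0.8, 0.9, 0.95):
    print(f"{u:.0%} busy -> {mm1_response_time(0.010, u) * 1000:.0f} ms")
```

This is why measuring encryption overhead on a lightly loaded test box tells you little about what happens when the same overhead lands on a production system already running near capacity.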
How do you know what impact database encryption will have on your database? You don't. Not until you test. There is simply no way for any vendor to provide a universal test case. You need to test with cases that represent your environment. You will see numbers published, and they may be accurate, but they seldom reflect your environment and so are generally useless.