Researchers at Microsoft are presenting a prototype of encrypted data which can be used without decrypting. Called homomorphic encryption, the idea is to keep data in a protected state (encrypted) yet still useful. It may sound like Star Trek technobabble, but this is a real working prototype. The set of operations you can perform on encrypted data is limited to a few things like addition and multiplication, but most analytics systems are limited as well. If this works, it would offer a new way to approach data security for publicly available systems.
The research team is looking for a way to reduce encryption operations, as they are computationally expensive – encryption and decryption demand a lot of processing cycles. Performing calculations and updates on large data sets becomes very expensive, as you must decrypt the data set, find the data you are interested in, make your changes, and then re-encrypt altered items. The ultimate performance impact varies with the storage system and method of encryption, but overhead and latency might typically range from 2x-10x compared to unencrypted operations. It would be a major advancement if they could dispense with the encryption and decryption operations, while still enabling reporting on secured data sets.
The promise of homomorphic encryption is predictable alteration without decryption. The possibility of being able to modify data without sacrificing security is compelling. Running basic operations on encrypted data might remove the threat of exposing data in the event of a system breach or user carelessness. And given that every company even thinking about cloud adoption is looking at data encryption and key management deployment options, there is plenty of interest in this type of encryption.
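To make "predictable alteration without decryption" concrete, here is a minimal sketch using textbook RSA, which happens to be homomorphic for multiplication. This is not the Microsoft prototype; the tiny primes, the function names, and the absence of padding are all assumptions chosen for illustration, and nothing here is secure.

```python
# Toy textbook RSA with tiny primes: illustration only, never use for real data.
# Requires Python 3.8+ for the modular inverse via pow(x, -1, m).
P, Q = 61, 53                      # assumed demo primes
N = P * Q                          # public modulus
E = 17                             # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent


def encrypt(m: int) -> int:
    """Encrypt an integer 0 <= m < N with the public key."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Decrypt a ciphertext with the private key."""
    return pow(c, D, N)


def multiply_encrypted(c1: int, c2: int) -> int:
    """Multiply two ciphertexts; the result decrypts to the product of the plaintexts (mod N)."""
    return (c1 * c2) % N


c1 = encrypt(6)
c2 = encrypt(7)
product = decrypt(multiply_encrypted(c1, c2))
print(product)  # 42
```

The party doing the multiplication only ever handles `c1` and `c2`; the private exponent is needed solely to read the result.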
But like a lot of theoretical lab work, practicality has an ugly way of pouring water on our security dreams. There are three very real problems for homomorphic encryption and computation systems:
- Data integrity: Homomorphic encryption does not protect data from alteration. If I can add, multiply, or change a data entry without access to the owner’s key, that becomes an avenue for an attacker to corrupt the database. Alteration of pricing tables, user attributes, stock prices, or other information stored in a database is just as damaging as leaking information. An attacker might not know what the original data values were, but that’s not enough to provide security.
- Data confidentiality: Homomorphic encryption can leak information. If I can add two values together and come up with a consistent value, it’s possible to reverse engineer the values. The beauty of encryption is that when you make a very minor change to the plaintext – the data you are encrypting – you get radically different output. With CBC mode and a fresh initialization vector, the same plaintext encrypts to different values each time. The question with homomorphic encryption is whether it can be used while still maintaining confidentiality – it might well leak data to determined attackers.
- Performance: Performance is poor and will likely remain no better than classical encryption. As homomorphic performance improves, so do more common forms of encryption. This is important when considering the cloud as a motivator for this technology, as acknowledged by the researchers. Many firms are looking to “The Cloud” not just for elastic pay-as-you-go services, but also as a cost-effective tool for handling very large databases. As databases grow, the performance impact grows in a super-linear way – layering on a security tool with poor performance is a non-starter.
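The first two concerns can be sketched with the same kind of toy, deterministic textbook RSA. Treat this as an assumption-laden illustration rather than an attack on any real homomorphic scheme: the parameters and helper names are hypothetical, and schemes that randomize each encryption are designed to resist the guessing step shown here.

```python
# Toy textbook RSA with tiny primes: illustration only.
# Requires Python 3.8+ for the modular inverse via pow(x, -1, m).
P, Q = 61, 53
N = P * Q
E = 17
D = pow(E, -1, (P - 1) * (Q - 1))


def encrypt(m: int) -> int:
    """Public-key operation: anyone can do this, including an attacker."""
    return pow(m, E, N)


def decrypt(c: int) -> int:
    """Private-key operation: only the data owner can do this."""
    return pow(c, D, N)


# Integrity: the attacker doubles a stored price without ever seeing it
# and without the private key, using only the homomorphic property.
stored_price = encrypt(25)
tampered_price = (stored_price * encrypt(2)) % N
tampered_plain = decrypt(tampered_price)  # what the owner later reads back


# Confidentiality: because this toy scheme is deterministic, the attacker
# can encrypt every plausible value and look the ciphertext up.
def guess_plaintext(ciphertext: int, candidates) -> "int | None":
    lookup = {encrypt(m): m for m in candidates}
    return lookup.get(ciphertext)


recovered = guess_plaintext(stored_price, range(100))
```

Here `tampered_plain` comes back as double the original price, and `recovered` matches the original value whenever it falls inside the guessed range.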
Not to be a total buzzkill, but I wanted to point out that there are practical alternatives that work today. For example, data masking obfuscates data but allows computational analytics. Masking can be done in such a way as to retain aggregate values while masking individual data elements. Masking – like encryption – can be poorly implemented, enabling the original data to be reverse engineered. But good masking implementations keep data secure, perform well, and facilitate reporting and analytics. Also consider the value of private clouds on public infrastructure. In one of the many possible deployment models, data is locked into a cloud as a black box, and only approved programmatic elements ever touch the data – not users. You import data and run reports, but do not allow direct access to the data. As long as you protect the management and programmatic interfaces, the data remains secure.
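As a rough sketch of masking that retains aggregate values, one simple approach is to shuffle a sensitive column across rows. The record layout and the `mask_by_shuffling` helper are hypothetical, and shuffling is only one of several masking techniques; on a small or skewed data set it can still be reverse engineered.

```python
import random


def mask_by_shuffling(records, field, seed=None):
    """Return copies of the records with one field's values shuffled across rows.

    Totals, averages, and the overall distribution of the field are preserved,
    but a value is no longer reliably tied to the row it came from.
    """
    rng = random.Random(seed)
    values = [r[field] for r in records]
    rng.shuffle(values)
    return [{**r, field: v} for r, v in zip(records, values)]


# Hypothetical sample data for the demo.
records = [
    {"name": "alice", "salary": 82000},
    {"name": "bob", "salary": 67000},
    {"name": "carol", "salary": 91000},
    {"name": "dave", "salary": 58000},
]

masked = mask_by_shuffling(records, "salary", seed=7)

original_total = sum(r["salary"] for r in records)
masked_total = sum(r["salary"] for r in masked)
print(original_total == masked_total)  # True: the aggregate survives masking
```

Reports that only need sums or averages run unchanged against `masked`, with no decryption step and no key to manage.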
There is no reason to look for isolinear plasma converters or quantum flux capacitors when a hammer and some duct tape will do.