Thanks to some dude who looks like a James Bond villain and rents rack space in a nuclear-bomb-resistant underground cavern, combined with a foreign nation running the equivalent of a Hoover mated with a Xerox over the entire country, “data leaks” are back in the headlines.

While most of us intuitively understand that preventing leaks completely is impossible, you wouldn’t know it from listening to various politicians/executives/pundits. And even those of us who accept the impossibility don’t often dig into why, especially when it comes to technology.

Lately I’ve been playing with aspects of quantum mechanics as metaphors for information-centric (data) security. When we start looking at problems like protecting data in the highly distributed and abstracted environments enabled by virtualization, decentralization, and cloud computing, they are eerily reminiscent of the transition from the standard physics models (which date back to Newton) to the quantum world that came with the atomic age.

My favorite new way to explain the impossibility of preventing data leaks is quantum tunneling.

Quantum tunneling is one of those insane aspects of quantum mechanics that defies our normal way of thinking about things. Essentially it tells us that elementary particles (like electrons) have a chance of moving across any physical barrier, regardless of size, even if the barrier clearly requires more energy to pass than the particle possesses. The chance shrinks rapidly as the barrier gets bigger, but it never reaches zero. This isn’t just a theory; it’s essential to the functioning of real-world devices like scanning tunneling microscopes, and it explains radioactive decay (alpha decay in particular).

Quantum tunneling is due to the wave-particle duality of these elementary particles. Without going too deeply into it, these particles express aspects of both particles and waves. One aspect is that we can’t ever really put our finger on both the absolute position and momentum of the particle; this means they live in a world defined by probabilities. Although the probability of a particle passing the barrier is low, it’s within the realm of the possible, and thus with enough particles and time it’s inevitable that some of them will cross the barrier.
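For the curious, the physics behind "low but inevitable" fits in two lines. This is the standard textbook approximation for a particle of mass m and energy E hitting a wide rectangular barrier of height V_0 and width L (with E < V_0), followed by the chance that at least one of N particles gets through. Treating the N attempts as independent is a simplifying assumption made here for illustration.

```latex
% Approximate tunneling (transmission) probability for one particle
T \approx e^{-2\kappa L}, \qquad \kappa = \frac{\sqrt{2m\,(V_0 - E)}}{\hbar}

% Probability that at least one of N independent particles crosses
P_{\text{at least one}} = 1 - (1 - T)^{N} \;\longrightarrow\; 1
\quad \text{as } N \to \infty \text{ for any } T > 0
```

Note that T falls off exponentially as either the barrier width or the particle's mass grows, but it never actually hits zero.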

Data loss is very similar conceptually. In our case we don’t have particles, we have the datum (for our purposes, the smallest unit of data with value). Instead of physical barriers we have security controls. For a datum, the properties we treat as probabilities are location and momentum (movement); for a security control, it’s effectiveness.

Combine these and we learn that for any datum, there is a probability of it escaping any security control. The total function takes in all the instances of that datum (the data) and the combined effectiveness of all the security controls across the various exit vectors. This is a simplification of the larger model, but I’ll save that for a future geekout (yes, I even made up some equations).

Since no set of security controls is ever 100% effective for all vectors, it’s impossible to prevent data leaks. Datum tunneling.
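A rough way to see this in numbers is the minimal sketch below. It is not the larger model mentioned above, just a toy built on loudly stated assumptions: each exit vector has a control with an effectiveness between 0 and 1, every copy of the datum gets some number of attempts at every vector, and all of those attempts are independent. The function name `leak_probability` and the sample effectiveness values are hypothetical.

```python
from math import prod
from typing import Sequence


def leak_probability(copies: int,
                     control_effectiveness: Sequence[float],
                     attempts: int = 1) -> float:
    """Chance that at least one copy escapes through at least one vector.

    Toy model (assumption): every (copy, vector, attempt) trial is
    independent, and a trial on a vector is blocked with probability
    equal to that vector's control effectiveness.
    """
    # Probability that one copy making one attempt on every vector is blocked everywhere.
    blocked_everywhere = prod(control_effectiveness)
    # Nothing leaks only if every copy is blocked on every attempt.
    return 1.0 - blocked_everywhere ** (copies * attempts)


# Hypothetical controls: email DLP, web proxy, USB blocking.
controls = [0.99, 0.999, 0.95]

for copies in (1, 10, 100, 1000):
    print(copies, leak_probability(copies, controls))
```

Unless every control is exactly 100% effective, the result creeps toward 1 as copies and attempts pile up, which is the datum tunneling argument in miniature.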

But this same metaphor also provides some answers. First of all, the fewer copies of the datum (the less data) and the fewer the vectors, the lower the probability of tunneling. The larger the data set (a collection of different datums), the lower the probability of the whole set tunneling, if you use the right control set. In other words, it’s a lot easier to get a single credit card number out the door despite DLP, but DLP can be very effective against larger data sets if it’s well positioned to block the right vectors. We’re basically increasing the ‘mass’ of what we’re trying to protect. In a different case, such as a movie file, the individual datum has more ‘mass’ and thus is easier to protect.
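The ‘mass’ effect can be sketched the same way. Assume, purely for illustration, that a DLP-style control inspects each record independently and flags it with some fixed per-record detection probability, and that flagging any one record stops the whole transfer. The name `escape_undetected` and the 30% detection rate are hypothetical and not taken from any real product.

```python
def escape_undetected(records: int, per_record_detection: float) -> float:
    """Chance that a whole set of records slips past a control unnoticed.

    Toy model (assumption): the control checks each record independently
    and catching any single record blocks the entire set.
    """
    # Every record must individually evade detection.
    return (1.0 - per_record_detection) ** records


# A weak control that catches any given record only 30% of the time.
for size in (1, 10, 50, 500):
    print(size, escape_undetected(size, 0.30))
```

Even a mediocre per-record control becomes very hard to beat once the data set gets large, while a single credit card number still has a decent shot at walking out the door.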

Distill this down and we get back to standard security principles: How much are we trying to protect? How accessible is it? What are the ways to access and distribute/exfiltrate it? I like thinking in terms of these probabilities to remind us that perfect protection is an impossibility, while still highlighting where to focus efforts in order to reduce overall risk.