
Virtual Private Storage

Tuesday, September 01, 2009

Musings on Data Security in the Cloud

By Rich

So I've written about data security, and I've written about cloud security, thus it's probably about time I wrote something about data security in the cloud.

To get started, I'm going to skip over defining the cloud. I recommend you take a look at the work of the Cloud Security Alliance, or skip on over to Hoff's cloud architecture post, which was the foundation of the architectural section of the CSA work. Today's post is going to be a bit scattershot, as I throw out some of the ideas rolling around my head from thinking about building a data security cycle/framework for the cloud.

We've previously published two different data/information-centric security cycles. The first, the Data Security Lifecycle (second on the Research Library page), is designed to be a comprehensive forward-looking model. The second, The Pragmatic Data Security Cycle, is designed to be more useful in limited-scope data security projects. Together they are designed to give you the big picture, as well as a pragmatic approach for securing data in today's resource-constrained environments. They differ from your typical Information Lifecycle Management cycles because they reflect the different needs of the security audience.

When evaluating data security in the context of the cloud, the issue isn't that we've suddenly blasted these cycles into oblivion, but that when and where you can implement controls shifts, sometimes dramatically. Keep in mind that moving to the cloud is every bit as much an opportunity as a risk. I'm serious -- when's the last time you had the chance to completely re-architect your data security from the ground up?

For example, one of the most common risks cited when considering cloud deployment is lack of control over your data; any remote admin can potentially see all your sensitive secrets. Then again, so can any local admin (with access to the system). What's the difference? In one case you have an employment agreement and their name, in the other you have a Service Level Agreement and contracts... which should include a way to get the admin's name.

The problems are far more similar than they are different. I'm not one of those people saying the cloud isn't anything new -- it is, and some of these subtle differences can have a big impact -- but we can definitely scope and manage the data security issues. And when we can't achieve our desired level of security... well, that's time to figure out what our risk tolerance is.

Let's take two specific examples:

Protecting Data on Amazon S3 -- Amazon S3 is one of the leading IaaS services for stored data, but it includes only minimal security controls compared to an internal storage repository. Access controls (which may not integrate with your internal access controls) and transit encryption (SSL) are available, but data is not encrypted in storage and may be accessible to Amazon staff or anyone who compromises your Amazon credentials. One option, which we've talked about here before, is Virtual Private Storage. You encrypt your data before sending it off to Amazon S3, giving you absolute control over keys and ACLs. You maintain complete control while still retaining the benefits of cloud-based storage. Many cloud backup solutions use this method.
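To make the encrypt-before-upload idea concrete, here is a minimal sketch under some loud assumptions. `SharedBucket` is a hypothetical in-memory stand-in for the remote store, not Amazon's API, and `seal`/`unseal` are illustrative names. The cipher is a stand-in built from SHA-256 and HMAC so the example runs with only the standard library; for real data you would use a vetted authenticated cipher such as AES-GCM from a maintained library.

```python
import hashlib
import hmac
import secrets


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Stand-in keystream (SHA-256 in counter mode). Illustration only."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def seal(enc_key: bytes, mac_key: bytes, plaintext: bytes) -> bytes:
    """Encrypt, then authenticate. Layout: nonce (16) | ciphertext | tag (32)."""
    nonce = secrets.token_bytes(16)
    stream = _keystream(enc_key, nonce, len(plaintext))
    ciphertext = bytes(p ^ s for p, s in zip(plaintext, stream))
    tag = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag


def unseal(enc_key: bytes, mac_key: bytes, blob: bytes) -> bytes:
    """Check the tag first, and only then decrypt."""
    nonce, ciphertext, tag = blob[:16], blob[16:-32], blob[-32:]
    expected = hmac.new(mac_key, nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("blob was modified or the wrong key was used")
    stream = _keystream(enc_key, nonce, len(ciphertext))
    return bytes(c ^ s for c, s in zip(ciphertext, stream))


class SharedBucket:
    """Hypothetical stand-in for a remote bucket; it only ever holds opaque bytes."""

    def __init__(self):
        self._objects = {}

    def put(self, name: str, blob: bytes) -> None:
        self._objects[name] = blob

    def get(self, name: str) -> bytes:
        return self._objects[name]


# Keys are generated and kept on the client side; the bucket never sees them.
enc_key = secrets.token_bytes(32)
mac_key = secrets.token_bytes(32)

bucket = SharedBucket()
secret = b"Q3 payroll export"
bucket.put("backups/payroll.bin", seal(enc_key, mac_key, secret))

stored = bucket.get("backups/payroll.bin")   # what the provider's staff could read
recovered = unseal(enc_key, mac_key, stored)  # what the key holder can read
```

The point of the sketch is the boundary: everything that crosses into the bucket is already sealed, so anyone with only storage access (or only your storage credentials) gets ciphertext, and a modified blob is rejected before decryption.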

Protecting Data at a SaaS Provider -- I'd be more specific and list a SaaS provider, but I can't remember which ones follow this architecture. With SaaS we have less control and are basically limited to the security controls built into the SaaS offering. That isn't necessarily bad -- the SaaS provider might be far more secure than you are -- but not all SaaS offerings are created equal. To secure SaaS data you need to rely more on your contracts and an understanding of how your provider manages your data.

One architectural option for your SaaS provider is to protect your data with individual client keys managed outside the application (this is actually a useful internal data security architectural choice). It's application-level encryption with external key management. All sensitive client data is encrypted in the SaaS provider's database. Keys are managed in a dedicated appliance/service, and provided temporarily to the application based on user credentials. Ideally the SaaS provider's admins are properly segregated -- where no single admin has database, key management, and application credentials. Since this potentially complicates support, it might be restricted to only the most sensitive data. (All your information might still be encrypted, but for support purposes could be accessible to the approved administrators/support staff.) The SaaS provider then also logs all access by internal and external users.
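The moving parts of that architecture can be sketched in a few lines. This is a rough illustration, not any vendor's design: `KeyService`, `checkout`, `violates_segregation`, the five-minute lifetime, and the role names are all hypothetical. It shows per-client keys held outside the application, released for a limited time based on the user's credentials, with every attempt logged, plus a simple check for admins who hold all three credential sets.

```python
import secrets
import time
from dataclasses import dataclass, field


@dataclass
class KeyService:
    """Hypothetical key manager that runs apart from the application and database."""
    ttl_seconds: float = 300.0
    _tenant_keys: dict = field(default_factory=dict)  # tenant -> key bytes
    _grants: dict = field(default_factory=dict)       # user -> tenant
    audit_log: list = field(default_factory=list)

    def enroll(self, tenant: str, user: str) -> None:
        """Create the tenant's key on first use and authorize the user for it."""
        self._tenant_keys.setdefault(tenant, secrets.token_bytes(32))
        self._grants[user] = tenant

    def checkout(self, user: str, tenant: str, now=None):
        """Release a tenant key with an expiry time; log the attempt either way."""
        now = time.time() if now is None else now
        allowed = self._grants.get(user) == tenant
        self.audit_log.append((now, user, tenant, "granted" if allowed else "denied"))
        if not allowed:
            raise PermissionError(f"{user} may not use keys for {tenant}")
        return self._tenant_keys[tenant], now + self.ttl_seconds


ROLES = {"database", "key_management", "application"}


def violates_segregation(admin_roles: dict) -> list:
    """Return the admins who hold all three credential sets at once."""
    return sorted(name for name, roles in admin_roles.items() if ROLES <= set(roles))


service = KeyService(ttl_seconds=300)
service.enroll("acme", "alice@acme")
key, expires_at = service.checkout("alice@acme", "acme", now=1000.0)

try:
    # A provider-side DBA with no grant for this client is refused, and logged.
    service.checkout("dba@provider", "acme", now=1001.0)
except PermissionError:
    pass

admins = {
    "dana": {"database"},
    "kim": {"key_management", "application"},
    "root": {"database", "key_management", "application"},
}
flagged = violates_segregation(admins)
```

A real deployment would enforce the expiry inside the key appliance rather than trusting the application to honor it; the sketch only shows where the decision and the log entry live.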

This is only one option, but your SaaS provider should be able to document their internal data security, and even provide you with external audit reports.

As you can see, just because you are in the cloud doesn't mean you completely give up any chance of data security. It's all about understanding security boundaries, control options, technology, and process controls.

In future posts we'll start walking through the Data Security Lifecycle and matching specific issues and control options in each phase against the SPI (SaaS, PaaS, IaaS) cloud models.

–Rich

Monday, May 18, 2009

Securing Cloud Data with Virtual Private Storage

By Rich

For a couple of weeks I've had a tickler on my to-do list to write up the concept of virtual private storage, since everyone seems all fascinated with virtualization and clouds these days. Lucky for me, Hoff unintentionally gave me a kick in the ass with his post today on EMC's ATMOS. Not that he mentioned me personally, but I've had "baby brain" for a couple of months now and sometimes need a little external motivation to write something up. (I've learned that "baby brain" isn't some sort of lovely obsession with your child, but a deep-seated combination of sleep deprivation and continuous distraction.)

Virtual Private Storage is a term/concept I started using about six years ago to describe the application of encryption to protect private data in shared storage. It's a really friggin' simple concept many of you either already know, or will instantly understand. I didn't invent the architecture or application, but, as foolish analysts are prone to, coined the term to help describe how it worked. (Note that since then I've seen the term used in other contexts, so I'll be specific in my meaning.)

Since then, shared storage is now called "the cloud", and internal shared storage an "internal private cloud", while outsourced storage is some variant of "external cloud", which may be public or private. See how much simpler things get over time?

The concept of Virtual Private Storage is pretty simple, and I like the name since it ties in well with Virtual Private Networks, which are well understood and part of our common lexicon. With a VPN we secure private communications over a public network by encrypting and encapsulating packets. The keys aren't ever stored in the packets, but on the end nodes.

With Virtual Private Storage we follow the same concept, but with stored data. We encrypt the data before it's placed into the shared repository, and only those who are authorized for access have the keys. The original idea was that if you had a shared SAN, you could buy a SAN encryption appliance and install it on your side of the connection, protecting all your data before it hits storage. You manage the keys and access, and not even the SAN administrator can peek inside your files. In some cases you can set it up so remote admins can still see and interact with the files, but not see the content (encrypt the file contents, but not the metadata).
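The "admins can see the files but not the content" variant might look something like the sketch below. Treat it as an illustration only: `StoredFile`, `protect`, `admin_listing`, and `owner_read` are hypothetical names, the shared volume is just a list, and the cipher is a stand-in SHA-256 keystream (with no integrity check, for brevity) so the example runs on the standard library alone.

```python
import hashlib
import secrets
from dataclasses import dataclass


def xor_stream(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """Stand-in cipher (SHA-256 counter keystream); the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        block = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out += bytes(d ^ b for d, b in zip(data[offset:offset + 32], block))
    return bytes(out)


@dataclass
class StoredFile:
    name: str     # cleartext metadata, visible to storage admins
    size: int     # cleartext metadata
    nonce: bytes  # per-file random value, not secret
    body: bytes   # encrypted contents, opaque without the key


def protect(key: bytes, name: str, contents: bytes) -> StoredFile:
    """Encrypt only the contents; leave the metadata readable."""
    nonce = secrets.token_bytes(16)
    return StoredFile(name, len(contents), nonce, xor_stream(key, nonce, contents))


def admin_listing(files):
    """What a storage admin can do with no key at all: list and sort by metadata."""
    return sorted((f.name, f.size) for f in files)


def owner_read(key: bytes, f: StoredFile) -> bytes:
    """Only a key holder can turn the body back into the original contents."""
    return xor_stream(key, f.nonce, f.body)


key = secrets.token_bytes(32)  # stays on your side of the connection
shared_volume = [
    protect(key, "plan.txt", b"acquire Initech in Q4"),
    protect(key, "notes.txt", b"call legal"),
]
listing = admin_listing(shared_volume)
```

Here `listing` contains only names and sizes, which is enough for an admin to manage space or move files around, while `owner_read` needs the key that never left your side.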

A SaaS provider that assigns you an encryption key for your data, then manages that key, is not providing Virtual Private Storage. In VPS, only the external end-nodes which access the data hold the keys. To be more specific, as with a VPN, it's only private if only you hold your own keys. It isn't something that's applicable in all cloud manifestations, but conceptually works well for shared storage (including cloud applications where you've separated the data storage from the application layer).

In terms of implementation there are a number of options, depending on exactly what you're storing. We've seen practical examples at the block level (e.g., a bunch of online backup solutions), inline appliances (a weak market now, but they do work well), software (file/folder), and application level.

Again, this is a pretty obvious application, but I like the term because it gets us thinking about properly encrypting our data in shared environments, and ties well with another core technology we all use and love.

And since it's Monday and I can't help myself, here's the obligatory double-entendre analogy. If you decide to... "share your keys" at some sort of... "key party", with a... "partner", the... "sanctity" of your relationship can't be guaranteed and your data is "open".

–Rich