Data Security
By Rich
Over the next 3 days, we'll be posting the content from the Securosis Guide to the RSA Conference 2010. We broke the market into 8 different topics: Network Security, Data Security, Application Security, Endpoint Security, Content (Web & Email) Security, Cloud and Virtualization Security, Security Management, and Compliance. For each section, we provide a little history and what we expect to see at the show. Next up is Data Security.
Data Security
Although technically nearly all of Information Security is directed at protecting corporate data and content, in practice our industry has historically focused on network and endpoint security. At Securosis we divide up the data security world into two major domains based on how users access data -- the data center and the desktop. This reflects how data is managed far more practically than "structured" and "unstructured". The data center includes access through enterprise applications, databases, and document management systems. The desktop includes productivity applications (the Office suite), email, and other desktop applications and communications.
What We Expect to See
There are four areas of interest at the show relative to data security:
- Content Analysis: This is the ability of security tools to dig inside files and packets to understand the content inside, not just the headers or other metadata. The most basic versions are generally derived from pattern matching (regular expressions), while advanced options include partial document matching and database fingerprinting. Content analysis techniques were pioneered by Data Loss Prevention (DLP) tools, and are starting to pop up in everything from firewalls, to portable device control agents, to SIEM systems.
The most important questions to ask identify the kind of content analysis being performed. Regular expressions alone can work, but result in more false positives and negatives than other options. Also find out if the feature can peer inside different file types, or only analyze plain text. Depending on your requirements, you may not need advanced techniques, but you do need to understand exactly what you're getting and determine if it will really help you protect your data, or just generate thousands of alerts every time someone buys a collectable shot glass from Amazon.
- DLP Everywhere: Here at Securosis we use a narrow definition for DLP that includes solutions designed to protect data with advanced content analysis capabilities and dedicated workflow, but not every vendor marketing department agrees with our approach. Given the customer interest around DLP, we expect you'll see a wide variety of security tools with DLP or "data protection" features, most of which are either basic content analysis or some form of context-based file or access blocking. These DLP features can be useful, especially in smaller organizations and those with only limited data protection needs, but they are a pale substitute if you need a dedicated data protection solution.
When talking with these vendors, start by digging into their content analysis capabilities and how they really work from a technical standpoint. If you get a technobabble response, just move on. Also ask to see a demo of the management interface -- if you expect a lot of data-related violations, you will likely need a dedicated workflow to manage incidents, so user experience is key. Finally, ask them about directory integration -- when it comes to data security, different rules apply to different users and groups.
- Encryption and Tokenization: Thanks to a combination of PCI requirements and recent data breaches, we are seeing a ton of interest in application and database encryption and tokenization. Tokenization replaces credit card numbers or other sensitive strings with random token values (which may match the credit card format) matched to real numbers only in a central highly secure database. Format Preserving Encryption encrypts the numbers so you can recover them in place, but the encrypted values share the credit card number format. Finally, newer application and database encryption options focus on improved ease of use and deployment compared to their predecessors.
You don't really need to worry about encryption algorithms, but it's important to understand platform support, management user experience (play around with the user interface), and deployment requirements. No matter what anyone tells you, there are always requirements for application and database changes, but some of these approaches can minimize the pain. Ask how long an average deployment takes for an organization of your size, and make sure they can provide real examples or references in your business, since data security is very industry specific.
- Database Security: Due partially to acquisitions and partially to customer demand, we are seeing a variety of tools add features to tie into database security. Latest in the hit parade are SIEM tools capable of monitoring database transactions and vulnerability assessment tools with database support. These parallel the dedicated Database Activity Monitoring and Database Assessment markets. As with any area of overlap and consolidation, you'll need to figure out if you need a dedicated tool, or if features in another type of product are good enough. We also expect to see a lot more talk about data masking, which is the conversion of production data into a pseudo-random but still usable format for development.
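To make the tokenization idea from the list above concrete, here is a minimal Python sketch, not any vendor's implementation. `TokenVault` and its methods are hypothetical names, and the in-memory dictionaries are only a stand-in for the central, highly secure database; a real deployment would add access control, auditing, and persistent storage.

```python
import secrets


class TokenVault:
    """Toy in-memory stand-in for the central token database (hypothetical)."""

    def __init__(self):
        self._token_to_number = {}
        self._number_to_token = {}

    def tokenize(self, card_number: str) -> str:
        """Return a random token that keeps the length and all-digit format."""
        if card_number in self._number_to_token:
            return self._number_to_token[card_number]
        while True:
            # Random digits: the token has no mathematical link to the real number.
            token = "".join(secrets.choice("0123456789") for _ in card_number)
            if token != card_number and token not in self._token_to_number:
                break
        self._token_to_number[token] = card_number
        self._number_to_token[card_number] = token
        return token

    def detokenize(self, token: str) -> str:
        """Only the vault can map a token back to the real number."""
        return self._token_to_number[token]


vault = TokenVault()
real_number = "4111111111111111"  # well-known test card number
token = vault.tokenize(real_number)
```

Applications store and pass around `token`; only systems allowed to call `detokenize` ever see the real number. That is the practical difference from Format Preserving Encryption, where anyone holding the key can recover the value in place.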
–Rich
Posted at Tuesday 23rd February 2010 3:16 pm
By Rich
When Mike was reviewing the latest Pragmatic Data Security post he nailed me on being too apologetic for telling people they need to spend money on data-security specific tools. (The line isn't in the published post.)
Just so you don't think Mike treats me any nicer in private than he does in public, here's what he said:
Don't apologize for the fact that data discovery needs tools. It is what it is. They can be like almost everyone else and do nothing, or they can get some tools to do the job. Now helping to determine which tools they need (which you do later in the post) is a good thing. I just don't like the apologetic tone.
As someone who is often a proponent for tools that aren't in the typical security arsenal, I've found myself apologizing for telling people to spend money. Partially, it's because it isn't my money... and I think analysts all too often forget that real people have budget constraints. Partially it's because certain users complain or look at me like I'm an idiot for recommending something like DLP.
I have a new answer next time someone asks me if there's a free tool to replace whatever data security tool I recommend:
Did you build your own Linux box running ipfw to protect your network, or did you buy a firewall?
The important part is that I only recommend these purchases when they will provide you with clear value in terms of improving your security over alternatives. Yep, this is going to stay a tough sell until some regulation or PCI-like standard requires them.
Thus I'm saying, here and now, that if you need to protect data you likely need DLP (the real thing, not merely a feature of some other product) and Database Activity Monitoring. I haven't found any reasonable alternatives that provide the same value.
There. I said it. No more apologies -- if you have the need, spend the money. Just make sure you really have the need, and the tool you are looking at really delivers the value, since not all solutions are created equal.
–Rich
Posted at Monday 1st February 2010 10:54 am
By Rich
Now that we've described the Pragmatic Data Security Cycle, it's time to dig into the phases. As we roll through each of these I'm going to break it into three parts: the process, the technologies, and a case study. For the case study we're going to follow a fictional organization through the entire process. Instead of showing you every single data protection option at each phase, we'll focus on a narrow project that better represents what you will likely experience.
Define: The Process
From a process standpoint, this is both the easiest and hardest of the phases. Easy, since there's only one thing you need to do and it isn't very technical or complex; hard, since it may involve coordination across multiple business units and the quest for executive sponsorship.
- Identify an executive sponsor to support your efforts. Without management support, the rest of the process will be extremely difficult.
- Identify the one piece of information/content/data you want to protect. The definition shouldn't be too broad. For example, "engineering plans" is too broad, but "engineering plans for project X" is acceptable. Using "PCI/NPI/HIPAA" is acceptable, assuming you narrow it down in the next step.
- Define and model the information you defined in the step above. For totally unstructured content like engineering plans, identify a repository to use for your definition, or any watermarking/labels you are certain will be available to identify and protect the information. For PCI/NPI/HIPAA determine the exact fields/pieces of data to protect. For PCI it might be only the credit card number, for NPI it might be names and addresses, and for HIPAA it might be ICD9 billing codes. If you are protecting data from a database, also identify the source repository.
- Identify key business units with a stake in the information, and contact them to verify the priority, structure, and repositories for this information. It's no fun if you think you're going to protect a database of customer data, only to find out halfway through that it's not really the important one from a business perspective.
That's it: find a sponsor, identify the category, identify the data/repository, and confirm with the business folks.
Define: Technologies
None. This is a manual business process and the only technology you need is something to take notes with... or maybe email to communicate.
Define: Case Study
Billy Bob's Bait Shop and Sushi Outlet is a mid-sized, multi-site retail organization that specializes in "The freshest seafood, for your family or aquatic friends". Billy Bob's consists of a corporate headquarters and a few dozen retail outlets in three states. There are about 1,000 employees, and a growing web business due to their capability to ship fresh bait or sushi to any location in the US overnight.
Billy Bob's is struggling with PCI compliance and wants to avoid a major security breach after seeing the damage a breach caused to their major competitor (John Boy's Worms and Grub).
They do not have a dedicated security team, but their CIO designated one of their top network administrators (the former firewall manager) to head up security operations. Frank has a solid history as a network administrator and is familiar with security (including some SANS training and a CISSP class). Due to problems with their first PCI assessment, Frank has the backing of the CIO.
The category of data is PCI. After some research, Frank decides to go with a multilevel definition -- at the top is credit card numbers. Since they are (supposedly) not storing them in a database they could feed to any data protection tools, Frank is starting with a regular expression to identify credit card numbers, and then plans on refining it using customer names (which are stored in the database). He is hoping that whatever tools he picks can use a generic credit card number definition for low-priority alerts, and a credit card (generic) tied with a customer name to trigger higher priority alerts. Frank also plans on using violation counts to help find real problem areas.
Frank now has a generic category (PCI), a specific definition (generic regex and customer name from a database) and the repository location (the customer database itself). From the heads of customer relations and billing, he learned that there are really two databases he needs to worry about: the main transaction processing/records system for the web outlet, and the point of sale transaction processing system for the retail outlets. The web outlet does not store unencrypted credit card numbers, but the retail outlets currently do, and they are working with the transaction processor to fix that. Thus he is adding credit card numbers from the retail database to his list of data sources. Fortunately, they are only stored in the central processing database, and not at the individual retail outlets.
That's the setup -- in our next post we will cover the Discovery process to figure out where the heck all that data is.
–Rich
Posted at Wednesday 27th January 2010 4:05 pm
By Rich
Back in Part 1 of our series on Pragmatic Data Security, we covered some guiding concepts. Before we actually dig in, there's some more groundwork we need to cover. There are two important fundamentals that provide context for the rest of the process.
The Data Breach Triangle
In May of 2009 I published a piece on the Data Breach Triangle, which is based on the fire triangle every Boy Scout and firefighter is intimately familiar with. For a fire to burn you need fuel, oxygen, and heat -- take any single element away and there's no combustion. Extending that idea: to experience a data breach you need an exploit, data, and an egress route. If you block the attacker from getting in, don't leave them data to steal, or block the stolen data's outbound path, you can't have a successful breach.

To date, the vast majority of information security spending is directed purely at preventing exploits -- including everything from vulnerability management, to firewalls, to antivirus. But when it comes to data security, in many cases it's far cheaper and easier to block the outbound path, or make the data harder to access in the first place. That's why, as we detail the process, you'll notice we spend a lot of time finding and removing data from where it shouldn't be, and locking down outbound egress channels.
The Two Domains of Data Security
We're going to be talking about a lot of technologies through this series. Data security is a pretty big area, and takes the right collection of tools to accomplish. Think about network security -- we use everything from firewalls, to IDS/IPS, to vulnerability assessment and monitoring tools. Data security is no different, but I like to divide both the technologies and the processes into two major buckets, based on how we access and use the information:
- The Data Center and Enterprise Applications -- When a user accesses content through an enterprise application (client/server or web), often backed by a database.
- Productivity Tools -- When a user works with information with their desktop tools, as opposed to connecting to something in the data center. This bucket also includes our communications applications. If you are creating or accessing the content in Microsoft Office, or exchanging it over email/IM, it's in this category.
To provide a little more context, our web application and database security tools fall into the first domain, while DLP and rights management generally fall into the second.
Now I bet some of you thought I was going to talk about structured and unstructured data, but I think that distinction isn't nearly as applicable as the data center vs. productivity applications. Not all structured data is in a database, and not all unstructured data is on a workstation or file server. Practically speaking, we need to focus on the business workflow of how users work with data, not where the data might have come from. You can have structured data in anything from a database to a spreadsheet or a PDF file, or unstructured data stored in a database, so that's no longer an effective division when it comes to the design and implementation of appropriate security controls.
The distinction is important since we need to take slightly different approaches based on how a user works with the information, taking into account its transitions between the two domains. We have a different set of potential controls when a user comes through a controlled application, vs. when a user is creating or manipulating content on their desktop and exchanging it through email.
As we introduce and explore the Pragmatic Data Security process, you'll see that we rely heavily on the concepts of the Data Breach Triangle and these two domains of data security to focus our efforts and design the right business processes and control schemes without introducing unneeded complexity.
–Rich
Posted at Wednesday 20th January 2010 12:09 pm
By Rich
Over the past 7 years or so I've talked with thousands of IT professionals working on various types of data security projects. If I were forced to pull out one single thread from all those discussions it would have to be the sheer intimidating potential of many of these projects. While there are plenty of self-constrained projects, in many cases the security folks are tasked with implementing technologies or changes that involve monitoring or managing on a pretty broad scale. That's just the nature of data security -- unless the information you're trying to protect is already in isolated use, you have to cast a pretty wide net.
But a parallel thread in these conversations is how successful and impactful well-defined data security projects can be. And usually these are the projects that start small, and grow over time.
Way back when I started the blog (long before Securosis was a company) I did a series on the Information-Centric Security Cycle (linked from the Research Library). It was my first attempt to pull the different threads of data security together into a comprehensive picture, and I think it still stands up pretty well.
But as great as my inspired work of data-security genius is (*snicker*), it's not overly useful when you have to actually go out and protect, you know, stuff. It shows the potential options for protecting data, but doesn't provide any guidance on how to pull it off.
Since I hate when analysts provide lofty frameworks that don't help you get your job done, it's time to get a little more pragmatic and provide specific guidance on implementing data security. This Pragmatic Data Security series will walk through a structured and realistic process for protecting your information, based on hundreds of conversations with security professionals working on data security projects.
Before starting, there's a bit of good news and bad news:
- Good news: there are a lot of things you can do without spending much money.
- Bad news: to do this well, you're going to have to buy the right tools. We buy firewalls because our routers aren't firewalls, and while there are a few free options, there's no free lunch.
I wish I could tell you none of this will cost anything and it won't impose any additional effort on your already strained resources, but that isn't the way the world works.
The concept of Pragmatic Data Security is that we start securing a single, well-defined data type, within a constrained scope. We then grow the scope until we reach our coverage objectives, before moving on to additional data types. Trying to protect, or even find, all of your sensitive information at once is just as unrealistic as thinking you can secure even one type of data everywhere it might be in your organization.
As with any pragmatic approach, we follow some simple principles:
- Keep it simple. Stick to the basics.
- Keep it practical. Don't try to start processes and programs that are unrealistic due to resources, scope, or political considerations.
- Go for the quick wins. Some techniques aren't perfect or ideal, but wipe out a huge chunk of the problem.
- Start small.
- Grow iteratively. Once something works, expand it in a controlled manner.
- Document everything. Makes life easier come audit time.
I don't mean to over-simplify the problem. There's a lot we need to put in place to protect our information, and many of you are starting from scratch with limited resources. But over the rest of this series we'll show you the process, and highlight the most effective techniques we've seen.
Tomorrow we'll start with the Pragmatic Data Security Cycle, which forms the basis of our process.
–Rich
Posted at Wednesday 13th January 2010 2:15 pm
By Rich
In our last post in this series, we covered the cloud implications of the Share phase of the Data Security Cycle. In this post we will move on to the Archive and Destroy phases.
Archive
Definition
Archiving is the process of transferring data from active use into long-term storage. This can include archived storage at your cloud provider, or migration back to internal archives.
From a security perspective we are concerned with two controls: encrypting the data, and tracking the assets when data moves to removable storage (tapes, or external drives for shipping transfers). Since many cloud providers are constantly backing up data, archiving often occurs outside customer control, and it's important to understand your provider's policies and procedures.
Steps and Controls
| Control | Structured/Application | Unstructured |
| Encryption | Database Encryption | Tape Encryption, Storage Encryption |
| Asset Management | Asset Management | Asset Management |
Encryption
In the Store phase we covered a variety of encryption options, and if content is kept encrypted as it moves into archived storage, no additional steps are needed. Make sure your archiving system takes the encryption keys into account, since restored data is useless if the corresponding decryption keys are unavailable. In cloud environments data is often kept live due to the elasticity of cloud storage, and might just be marked with some sort of archive tag or metadata.
- Database Encryption: We reviewed the major database encryption options in the Store phase. The only archive-specific issue is ensuring the database replication/archiving method supports maintenance of the existing encryption. Another option is to use file encryption to secure the database archives. For larger databases, tape or storage encryption is often used.
- Tape Encryption: Encryption of the backup tapes using either hardware or software. There are a number of tools on the market and this is a common practice. Hardware provides the best performance, and inline appliances can work with most existing tape systems, but we are increasingly seeing encryption integrated into backup software and even tape drives. If your cloud provider manages tape backups (which many do), it's important to understand how those tapes are protected -- is any existing encryption maintained, and if not, how are the tapes encrypted and keys managed?
- Storage Encryption: Encryption of data archived to disk, using a variety of techniques. Although some hardware tools such as inline appliances and encrypted drives exist, this is most commonly performed in software. We are using Storage Encryption as a generic term to cover any file or media encryption for data moved to long-term disk storage.
Asset Management
One common problem in both traditional and cloud environments is the difficulty of tracking the storage media containing archived data. Merely losing the location of unencrypted media may require a breach disclosure, even if the tape or drive is likely still located in a secure area -- if you can't prove it's there, it is effectively lost. From a security perspective, we aren't as concerned with asset management for encrypted content -- it's more of an issue for unencrypted sensitive data. Check with your cloud provider to understand their asset tracking for media, or implement an asset management system and procedures if you manage your own archives of cloud data.
Cloud SPI Tier Implications
Software as a Service (SaaS)
Archive security options in a SaaS deployment are completely dependent on your provider. Determine their backup procedures (especially backup rotations), any encryption, and asset management (especially for unencrypted data). Also determine if there are any differences between backups of live data and any long-term archiving for data moved off primary systems.
Platform as a Service (PaaS)
Archive security in PaaS deployments is similar to SaaS when you transition data to, or manage data with, the PaaS provider. You will need to understand the provider's archive mechanisms and security controls. If the data resides in your systems, archive security is no different than managing secure archives for your traditional data stores.
Infrastructure as a Service (IaaS)
For completely private cloud deployments, IaaS Archive security is no different than managing traditional archived storage. You'll use some form of media encryption and asset management for sensitive data. For cloud storage and databases, as with PaaS and SaaS you need to understand the archival controls used by your provider, although any data encrypted before moving to the cloud is clearly still secure.
Destroy
Definition
Destroy is the permanent destruction of data that's no longer needed, and the use of content discovery to validate that it is not lingering in active storage or archives.
Organizations commonly destroy unneeded data, especially sensitive data that may be under regulatory compliance requirements. The cloud may complicate this if your provider's data management infrastructure isn't compatible with your destruction requirements (e.g., the provider is unable to delete data from archived storage). Crypto-shredding may be the best option for many cloud deployments, since it relies less on complete access to all physical media, which may be difficult or impossible even in completely private/internal cloud deployments.
Steps and Controls
| Control | Structured/Application | Unstructured |
| Crypto-Shredding | Enterprise Key Management | Enterprise Key Management |
| Secure Deletion | Disk/Free Space Wiping | Disk/Free Space Wiping |
| Physical Destruction | Physical Destruction | Physical Destruction |
| Content Discovery | Database Discovery | DLP/CMP Discovery, Storage/Data Classification Tools, Electronic Discovery |
Crypto-Shredding
Crypto-shredding is the deliberate destruction of all encryption keys for the data, effectively destroying the data until the encryption protocol used is (theoretically, some day) broken or capable of being brute-forced. This is sufficient for nearly every use case in a private enterprise, but shouldn't be considered acceptable for highly sensitive government data. Encryption tools must have this as a specific feature to absolutely ensure that the keys are unrecoverable. Crypto-shredding is an effective technique for the cloud since it ensures that any data in archival storage that's outside your physical control is also destroyed once you make the keys unavailable. If all data is encrypted with a single key, to crypto-shred you'll need to rotate the key for active storage, then shred the "old" key, which will render archived data inaccessible.
We don't mean to oversimplify this option -- if your cloud provider can't rotate your keys or ensure key deletion, crypto-shredding isn't realistic. If you manage your own keys, it should be an important part of your strategy.
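The rotate-then-shred sequence described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: `KeyStore` is a hypothetical key manager, and `_toy_cipher` is a deliberately simple stand-in for real encryption (do not use it to protect anything). The point is only the order of operations.

```python
import hashlib
import secrets


def _toy_cipher(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """TOY stream cipher (SHA-256 in counter mode); a stand-in for real encryption."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + nonce + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)


class KeyStore:
    """Hypothetical key manager holding versioned keys."""

    def __init__(self):
        self._keys = {}
        self.active_version = 0
        self.rotate()

    def rotate(self) -> int:
        """Create a new key and make it the active one for new writes."""
        self.active_version += 1
        self._keys[self.active_version] = secrets.token_bytes(32)
        return self.active_version

    def shred(self, version: int) -> None:
        """Destroy a key. A real tool must also guarantee no copies survive."""
        del self._keys[version]

    def encrypt(self, plaintext: bytes):
        nonce = secrets.token_bytes(16)
        key = self._keys[self.active_version]
        return self.active_version, nonce, _toy_cipher(key, nonce, plaintext)

    def decrypt(self, version: int, nonce: bytes, ciphertext: bytes) -> bytes:
        # Raises KeyError once the key for this version has been shredded.
        return _toy_cipher(self._keys[version], nonce, ciphertext)


store = KeyStore()
record = b"customer records"
archived = store.encrypt(record)                # copy that lands in archives under key 1
old_version = store.active_version
store.rotate()                                  # key 2 becomes active
live = store.encrypt(store.decrypt(*archived))  # re-encrypt active storage under key 2
store.shred(old_version)                        # archived copy is now unreadable
```

After the last line, `store.decrypt(*live)` still returns the record, while `store.decrypt(*archived)` raises `KeyError`. Note that the archived ciphertext was never touched, which is why this works for media outside your physical control.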
Disk/Free Space Wiping and Physical Destruction
These options are only available when you have low-level administrative access to the physical storage. They include software or hardware designed to destroy data on hard drives and other media, or physical destruction of the drives. At a minimum the tool should overwrite all writable space on the media 1-3 times, and 7 times is recommended for sensitive data. Merely formatting over data is not sufficient. Secure wiping is highly recommended for any systems with sensitive data that are sold or reused, especially laptops and desktops. File-level secure deletion tools exist for when it's necessary to destroy just a portion of data in active storage, but are not as reliable as a full media wipe.
For physical destruction (again, assuming you have access to the drives), there are two options:
- Degaussing: Use of strong magnets to scramble magnetic media like hard drives and backup tapes. Dedicated solutions should be used to ensure data is unrecoverable, and it's highly recommended you confirm the effectiveness of a degaussing tool by randomly performing forensic analysis on wiped media.
- Physical Destruction: Complete physical destruction of storage devices, focusing on shredding the actual magnetic media (platters or tape).
Due to the abstraction involved in cloud computing, these will often not be available, although your provider may include them as part of their procedures for management of their drives. When managing a private/internal cloud, you can include physical media wiping or destruction as part of your procedures for managing drives removed from active service. In IaaS deployments, you may retain the low level access to overwrite data in individual virtual machines or storage.
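Where you do retain low-level access, a file-level overwrite might look like the sketch below. It is a simplified illustration, and the function name and defaults are our own. It also shows why file-level deletion is less reliable than a full media wipe: the code can only overwrite the blocks the file system exposes, so on SSDs, journaling or copy-on-write file systems, snapshots, and most cloud block storage, older copies of the data may survive.

```python
import os


def overwrite_and_delete(path: str, passes: int = 3) -> None:
    """Overwrite a file in place with random bytes, then remove it."""
    size = os.path.getsize(path)
    with open(path, "r+b") as f:
        for _ in range(passes):
            f.seek(0)
            remaining = size
            while remaining > 0:
                chunk = min(remaining, 1024 * 1024)  # write in 1 MB pieces
                f.write(os.urandom(chunk))
                remaining -= chunk
            f.flush()
            os.fsync(f.fileno())  # ask the OS to push this pass to the device
    os.remove(path)
```

Calling `overwrite_and_delete("old_export.csv", passes=7)` would follow the 7-pass guidance for sensitive data; the file name is just an example.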
Content Discovery
When truly sensitive data reaches end-of-life, you need to make sure that the destroyed data is really destroyed. Use of content discovery tools helps ensure that no copies or versions of the data remain accessible in the enterprise. Considering how complex our storage, archive, and backup strategies in the cloud are today, it is impossible to absolutely guarantee the data is unrecoverable, but content discovery does reduce the risk of retrieval.
As with content discovery in the Store phase, these tools are only effective if they have access to the storage infrastructure; they cannot work through an application interface unless they are built into the application.
For details on Database Discovery and DLP/CMP please see the Store phase. There are two additional technology categories we also see used for this purpose:
- Storage/Data Classification and Search: These are tools typically used and managed by enterprise storage teams. Their content analysis is generally less detailed than DLP/CMP tools, but can be helpful for broad searches for stored data. Storage/Data classification tools are third-party tools which crawl a storage environment and use rule sets (usually keywords and regular expressions) to apply metadata tags to files. If your cloud storage offers standard file access, they may be helpful. Search is either built into the application or a third-party tool that indexes stored data. While these are not ideal tools for content discovery to ensure data destruction, search may be your only option in some SaaS deployments.
- Electronic Discovery: Tools dedicated to the electronic discovery of data for legal proceedings. Likely the same tools that will be used to search for destroyed data if there's ever reason to attempt recovery in the future. As with most of the tools in this section, they are not cloud specific and may not be an option.
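As a rough picture of what a storage/data classification tool does when it crawls storage and applies rule sets, here is a minimal Python sketch. The rule names, patterns, and `classify_tree` function are hypothetical, and real products handle many file formats, far larger rule sets, and write tags back as metadata rather than returning a dictionary.

```python
import os
import re

# Hypothetical rule set: tag name -> keyword or regular expression.
RULES = {
    "pci": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "confidential": re.compile(r"\bconfidential\b", re.IGNORECASE),
}


def classify_tree(root: str, rules=RULES) -> dict:
    """Walk a directory tree and return {file path: [matching tags]}."""
    tags = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Plain-text read only; binary formats would need real parsers.
                with open(path, "r", encoding="utf-8", errors="ignore") as f:
                    text = f.read()
            except OSError:
                continue  # unreadable file: skip it rather than abort the crawl
            matched = sorted(tag for tag, rx in rules.items() if rx.search(text))
            if matched:
                tags[path] = matched
    return tags
```

Run after a destruction project, any path that still comes back tagged is a copy that was missed. This only works where the storage offers standard file access, which is the limitation noted above.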
Cloud SPI Tier Implications
Software as a Service (SaaS)
As with Archive, your data destruction options are completely dependent on your provider. Typically you will be limited to some level of deletion, although in some applications crypto-shredding may be an option. What's most important is to understand how your provider handles data destruction, and to obtain any documentation and service level agreements that are available. Search will usually be your best content discovery option.
Platform as a Service (PaaS)
For data stored with your PaaS provider, unless you have file system access of some sort you will face the same limitations as with SaaS providers. If you encrypt data on your side before sending it to the platform, crypto-shredding is a good option. Any data stored in your environment is obviously easier to destroy, since you have greater control of the infrastructure and physical media. Content discovery may be an option, but this depends completely on how your PaaS-based application is designed.
Infrastructure as a Service (IaaS)
For cloud data storage (database and file based), crypto-shredding is likely your best option. For other infrastructure deployments, particularly those with virtual machines and disks, you may be able to overwrite stored data. Content discovery using DLP/CMP will probably work, again depending on the details of your deployment.
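As a rough illustration of crypto-shredding, the sketch below keeps ciphertext (what the provider would hold) apart from the keys (what you hold), and "destroys" data by deleting only the key. The `ShreddableStore` class is hypothetical, and the SHA-256 keystream is a toy stand-in for a vetted cipher such as AES-GCM; it is here only so the example runs with the standard library.

```python
import hashlib
import secrets


def _keystream(key, nonce, length):
    """Toy keystream: SHA-256 over key, nonce, and a block counter.
    Illustration only; use a vetted cipher in practice."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


class ShreddableStore:
    """Hypothetical store separating remote ciphertext from locally held keys."""

    def __init__(self):
        self.ciphertexts = {}  # held by the provider: name -> (nonce, ciphertext)
        self.keys = {}         # held by you: name -> key

    def put(self, name, plaintext):
        key, nonce = secrets.token_bytes(32), secrets.token_bytes(16)
        stream = _keystream(key, nonce, len(plaintext))
        self.ciphertexts[name] = (nonce, bytes(a ^ b for a, b in zip(plaintext, stream)))
        self.keys[name] = key

    def get(self, name):
        key = self.keys[name]  # raises KeyError once the key has been shredded
        nonce, ciphertext = self.ciphertexts[name]
        stream = _keystream(key, nonce, len(ciphertext))
        return bytes(a ^ b for a, b in zip(ciphertext, stream))

    def shred(self, name):
        """Delete only the key; the remote ciphertext may linger but is unreadable."""
        del self.keys[name]
```

The point of the design is that destruction no longer depends on the provider overwriting anything: once every copy of the key is gone, whatever remains in cloud storage or backups cannot be decrypted.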
–Rich
Posted at Tuesday 22nd September 2009 11:58 am
By Rich
In our last post in this series, we covered the cloud implications of the Use phase of our Data Security Cycle. In this post we will move on to the Share phase. Please remember that we are only covering technologies at a high level in this series on the cycle; we will run a second series on detailed technical implementations of data security in the cloud a little later.
Definition
Share includes controls we use when exchanging data between users, customers, and partners. Where Use focuses on controls when a user interacts with the data as an individual, Share includes the controls once they start to exchange that data (or back-end data exchange). In cloud computing we see a major emphasis on application and logical controls, with encryption for secure data exchange, DLP/CMP to monitor communications and block policy violations, and activity monitoring to track back-end data exchanges.
Cloud computing introduces two new complexities in managing data sharing:
- Many data exchanges occur within the cloud, and are invisible to external security controls. Traditional network and endpoint monitoring probably won't be effective. For example, when you share a Google Docs document with another user, the only local interactions are through a secure browser connection. Email filtering, a traditional way of tracking electronic document exchanges, won't really help.
- For leading edge enterprises that build dynamic data security policies using tools like DLP/CMP, those tools may not work off a cloud-based data store. If you are building a filtering policy that matches account numbers from a customer database, and that database is hosted in the cloud as an application or platform, you may need to perform some kind of mass data extract and conversion to feed the data security tool.
Although the cloud adds some complexity, it can also improve data sharing security in a well-designed deployment. Especially in SaaS deployments, we gain new opportunities to employ logical controls that are often difficult or impossible to manage in our current environments.
Although our focus is on cloud-specific tools and technologies, we still review some of the major user-side options that should be part of any data security strategy.
Steps and Controls
| Control | Structured/Application | Unstructured |
| Activity Monitoring and Enforcement | Database Activity Monitoring; Cloud Activity Monitoring/Logs; Application Activity Monitoring | Network DLP/CMP; Endpoint DLP/CMP |
| Encryption | Network/Transport Encryption; Application-Level Encryption | Email Encryption; File Encryption/EDRM; Network/Transport Encryption |
| Logical Controls | Application Logic; Row Level Security | None |
| Application Security | see Application Security Domain section |
Activity Monitoring and Enforcement
We initially covered Activity Monitoring and Enforcement in the Use phase, and many of these controls are also used in the Share phase. Our focus now switches from watching how users interact with the data, to when and where they exchange it with others. We include technologies that track data exchanges at four levels:
- Individual users exchanging data with other internal users within the cloud or a managed environment.
- Individual users exchanging data with outside users, either via connections made from the cloud directly, or data transferred locally and then sent out.
- Back-end systems exchanging data to/from the cloud, or within multiple cloud-based systems.
- Back-end systems exchanging data to external systems/servers; for example, a cloud-based employee human resources system that exchanges healthcare insurance data with a third-party provider.
- Database Activity Monitoring (DAM): We initially covered DAM in the Use phase. In the Share phase we use DAM to track data exchanges to other back-end systems within or outside the cloud. Rather than focusing on tracking all activity in the database, the tool is tuned to focus on these exchanges and generate alerts on policy violations (such as a new query being run outside of expected behavior), or track the activity for auditing and forensics purposes. The challenge is to deploy a DAM tool in a cloud environment, but an advantage is greater visibility into data leaving the DBMS than might otherwise be possible.
- Application Activity Monitoring: Similar to DAM, we initially covered this in the Use phase. We again focus our efforts on tracking data sharing, both by users and back-end systems. While it's tougher to monitor individual pieces of data, it's not difficult to build in auditing and alerting for larger data exchanges, such as outputting from a cloud-based database to a spreadsheet.
- Cloud Activity Monitoring and Logs: Depending on your cloud service, you may have access to some level of activity monitoring and logging in the control plane (as opposed to building it into your specific application). To be considered a Share control, this monitoring needs to specify both the user/system involved and the data being exchanged.
- Network Data Loss Prevention/Content Monitoring and Protection: DLP/CMP uses advanced content analysis and deep packet inspection to monitor network communications traffic, alerting on (and sometimes enforcing) policy violations. DLP/CMP can play multiple roles in protecting cloud-based data. In managed environments, network DLP/CMP policies can track (and block) sensitive data exchanges to untrusted clouds. For example, policies might prevent users from attaching files with credit card numbers to a cloud email message, or block publishing of sensitive engineering plans to a cloud-based word processor. DLP can also work in the other direction: monitoring data pulled from a cloud deployment to the desktop or other non-cloud infrastructure. DLP/CMP tools aren't limited to user activities, and can monitor, alert, and enforce policies on other types of TCP data exchange, such as FTP, which might be used to transfer data from the traditional infrastructure to the cloud. DLP/CMP also has the potential to be deployed within the cloud itself, but this is only possible in a subset of IaaS deployments, considering the deployment models of current tools. (Note that some email SaaS providers may also offer DLP/CMP as a service).
- Endpoint DLP/CMP: We initially covered Endpoint DLP/CMP in the Use phase, where we discussed monitoring and blocking local activity. Many endpoint DLP/CMP tools also track network activity -- this is useful as a supplement when the endpoint is outside the corporate network's DLP/CMP coverage.
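The credit card example in the network DLP/CMP item can be sketched as a small content-analysis check: a regular expression finds candidate numbers, and a Luhn checksum filters out most random digit runs. The function names and the 13 to 16 digit pattern are assumptions for illustration; commercial DLP/CMP tools use considerably more elaborate analysis.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits):
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return the Luhn-valid card-like numbers found in the text, separators removed."""
    hits = []
    for match in CANDIDATE.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def allow_upload(text):
    """Hypothetical policy decision: block content containing card numbers."""
    return not find_card_numbers(text)
```

For instance, `allow_upload("card 4111 1111 1111 1111 attached")` returns `False` because the well-known test number passes the checksum, while the same text with the last digit changed to 2 is allowed through.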
Encryption
In the Store phase we covered encryption for protecting data at rest. Here we expand to cover data in motion. Keep in mind that additional encryption is only needed if the data would otherwise be exchanged as plain text -- there's no reason or need to redundantly re-encrypt already encrypted network traffic.
- Network/Transport Encryption: As data moves between applications, databases, the cloud, and other locations, the network connections should be encrypted using a standard network-layer protocol. For larger systems where this could affect performance, hardware acceleration is recommended. Virtual Private Networks are useful for encrypting data moving in and out of clouds in certain deployment models.
- Application Level Encryption: As we discussed in the Store phase, data encrypted by an application on collection is ideally protected as it moves throughout the rest of the application stack. Don't forget that at some point the data is probably decrypted to be used, so it's important to map the data flow and determine potential weak points.
- Email Encryption: Email encryption isn't cloud-specific, but since email is one of the most common ways of exchanging data, including reports and data dumps from cloud services, encryption is often relevant for cloud deployments -- especially when built into the cloud application/service.
- File Encryption and Enterprise Digital Rights Management: These technologies were discussed in detail in the Store phase. They also apply in the Share phase since encrypted files or DRM protected documents are still protected as they are moved, not just in storage. For cloud security purposes, encryption or EDRM may be built into various data exchange mechanisms -- with EDRM for user files, and encryption as a more general option.
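For the network/transport encryption item above, a client connecting to a cloud service over TLS might build its context along these lines. This is a sketch using Python's standard `ssl` module; requiring TLS 1.2 as the minimum version is an assumption chosen for the example, not a recommendation from the text.

```python
import ssl


def make_client_context():
    """Client-side TLS context with certificate and hostname verification enabled."""
    context = ssl.create_default_context()
    # Assumption for this sketch: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context
```

The context would then be passed to `context.wrap_socket(sock, server_hostname=...)` or to an HTTP client that accepts an `ssl.SSLContext`.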
Logical Controls
We discussed Logical Controls in the Use phase, and they can also be used to manage data exchange, not just transaction activity.
Application Security
As with logical controls, we discussed Application Security in the Use phase. Again, a full discussion of cloud application security issues is beyond the scope of this post, and we recommend you read the Cloud Security Alliance Guidance for more details.
Cloud SPI Tier Implications
Software as a Service (SaaS)
Data sharing in SaaS deployments is encapsulated within the application, is connected to back-end external applications, or involves generating data dumps to transfer the content to a local system. Application and logical controls are your best defense, combined with encryption to cover any data transfers. Once data leaves the SaaS application, DLP/CMP may be useful for tracking the content, or to protect it from leaving your managed environment. DLP/CMP is also useful to determine if the data should go to the cloud at all, and to ensure that any data transferred conforms to policy requirements. Since most SaaS solutions rely principally on HTTP for communications/access, most off-the-shelf DLP tools will work.
Platform as a Service (PaaS)
Depending on your PaaS deployment, it's again likely that application logic will be your best security option, followed by proper use of encryption to secure communications. You may also be able to deploy monitoring in your application that connects to the PaaS provider if they don't offer a desired level of monitoring/logging, but that will only track connections from your managed environment (someone trying to compromise the PaaS directly, without going through your application, won't appear in your application logs).
Infrastructure as a Service (IaaS)
VPNs are commonly used to protect communications to IaaS infrastructure, both internal and external. When VPNs aren't an option, such as with many types of cloud-based storage, SSL/TLS network encryption is usually available. Any additional Share controls rely completely on what you can deploy in the infrastructure. Any monitoring/auditing tool such as DLP requires some sort of network traffic to analyze, or an alternative hook, such as a local agent.
–Rich
Posted at Monday 21st September 2009 1:55 pm
By Rich
In our last post in this series, we covered the cloud implications of the Store phase of the Data Security Cycle (our first post was on the Create phase). In this post we'll move on to the Use phase. Please remember we are only covering technologies at a high level in this series -- we will run a second series on detailed technical implementations of data security in the cloud a little later.
Definition
Use includes the controls that apply when the user is interacting with the data -- either via a cloud-based application, or the endpoint accessing the cloud service (e.g., a client/cloud application, direct storage interaction, and so on). Although we primarily focus on cloud-specific controls, we also cover local data security controls that protect cloud data once it moves back into the enterprise. These are controls for the point of use -- we will cover additional network based controls in the next phase.
Users interact with cloud data in three ways:
- Web-based applications, such as most SaaS applications.
- Client applications, such as local backup tools that store data in the cloud.
- Direct/abstracted access, such as a local folder synchronized with cloud storage (e.g., Dropbox), or VPN access to a cloud-based server.
Cloud data may also be accessed by other back-end servers and applications, but the usage model is essentially the same (web, dedicated application, direct access, or an abstracted service).
Steps and Controls
| Control | Structured/Application | Unstructured |
| Activity Monitoring and Enforcement | Database Activity Monitoring; Application Activity Monitoring | Endpoint Activity Monitoring; File Activity Monitoring; Portable Device Control; Endpoint DLP/CMP; Cloud-Client Logs |
| Rights Management | Label Security | Enterprise DRM |
| Logical Controls | Application Logic; Row Level Security | None |
| Application Security | see Application Security Domain section |
Activity Monitoring and Enforcement
Activity Monitoring and Enforcement includes advanced techniques for capturing all data access and usage activity in real or near-real time, often with preventative capabilities to stop policy violations. Although activity monitoring controls may use log files, they typically include their own collection methods or agents for deeper activity details and more rapid monitoring. Activity monitoring tools also include policy-based alerting and blocking/enforcement that log management tools lack.
None of the controls in this category are cloud specific, but we have attempted to show how they can be adapted to the cloud. These first controls integrate directly with the cloud infrastructure:
- Database Activity Monitoring (DAM): Monitoring all database activity, including all SQL activity. Can be performed through network sniffing of database traffic, agents installed on the server, or external monitoring, typically of transaction logs. Many tools combine monitoring techniques, and network-only monitoring is generally not recommended. DAM tools are managed externally to the database to provide separation of duties from database administrators (DBAs). All DBA activity can be monitored without interfering with their ability to perform job functions. Tools can alert on policy violations, and some tools can block certain activity. Current DAM tools are not cloud specific, and thus are only compatible with environments where the tool can either sniff all network database access (possible in some IaaS deployments, or if provided by the cloud service), or where a compatible monitoring agent can be installed in the database instance.
- Application Activity Monitoring: Similar to Database Activity Monitoring, but at the application level. As with DAM, tools can use network monitoring or local agents, and can alert and sometimes block on policy violations. Web Application Firewalls are commonly used for monitoring web application activity, but cloud deployment options are limited. Some SaaS or PaaS providers may offer real time activity monitoring, but log files or dashboards are more common. If you have direct access to your cloud-based logs, you can use a near real-time log analysis tool and build your own alerting policies.
- File Activity Monitoring: Monitoring access and use of files in enterprise storage. Although there are no cloud specific tools available, these tools may be deployable for cloud storage that uses (or presents an abstracted version of) standard file access protocols. Gives an enterprise the ability to audit all file access and generate reports (which may sometimes aid compliance reporting). Capable of independently monitoring even administrator access and can alert on policy violations.
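One simple kind of policy a DAM-style monitor might alert on is a SQL statement whose shape was never seen during a baseline period. The sketch below is an assumption-laden illustration of that idea: the `fingerprint` normalization and the `QueryMonitor` class are hypothetical, and real DAM products collect statements through network sniffing, agents, or transaction logs rather than being handed strings.

```python
import re


def fingerprint(sql):
    """Normalize a statement so that only its shape remains."""
    shape = re.sub(r"'[^']*'", "?", sql)    # replace string literals
    shape = re.sub(r"\b\d+\b", "?", shape)   # replace numeric literals
    return re.sub(r"\s+", " ", shape).strip().lower()


class QueryMonitor:
    """Hypothetical monitor that flags query shapes outside a known baseline."""

    def __init__(self, baseline):
        self.known = {fingerprint(q) for q in baseline}

    def observe(self, user, sql):
        """Return an alert dict for an unseen query shape, otherwise None."""
        shape = fingerprint(sql)
        if shape in self.known:
            return None
        return {"user": user, "shape": shape}
```

Because literals are stripped before comparison, routine queries with different parameter values do not raise alerts, while a new statement such as a full-table export does.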
The next three tools are endpoint data security tools that are not cloud specific, but may still be useful in organizations that manage endpoints:
- Endpoint Activity Monitoring: Primarily a traditional data security tool, although it can be used to track user interactions with cloud services. Watching all user activity on a workstation or server. Includes monitoring of application activity; network activity; storage/file system activity; and system interactions such as cut and paste, mouse clicks, application launches, etc. Provides deeper monitoring than endpoint DLP/CMF tools that focus only on content that matches policies. Capable of blocking activities such as pasting content from a cloud storage repository into an instant message. Extremely useful for auditing administrator activity on servers, assuming you can install the agent. An example of cloud usage would be deploying activity monitoring agents on all endpoints in a customer call center that accesses a SaaS for user support.
- Portable Device Control: Another traditional data security tool with limited cloud applicability, used to restrict access of, or file transfers to, portable storage such as USB drives and DVD burners. For cloud security purposes, we only include tools that either track and enforce policies based on data originating from a cloud application or storage, or are capable of enforcing policies based on data labels provided by that cloud storage or application. Portable device control is also capable of allowing access but auditing file transfers and sending that information to a central management server. Some tools integrate with encryption to provide dynamic encryption of content passed to portable storage. Will eventually be integrated into endpoint DLP/CMF tools that can make more granular decisions based on the content, rather than blanket policies that apply to all data. Some DLP/CMF tools already include this capability.
- Endpoint DLP: Endpoint Data Loss Prevention/Content Monitoring and Filtering tools that monitor and restrict usage of data through content analysis and centrally administered policies. While current capabilities vary highly among products, tools should be able to monitor what content is being accessed by an endpoint, any file storage or network transmission of that content, and any transfer of that content between applications (cut/paste). For performance reasons endpoint DLP is currently limited to a subset of enforcement policies (compared to gateway products) and endpoint-only products should be used in conjunction with network protection in most cases (which we will discuss in the next phase of the lifecycle).
At this time, most activity monitoring and enforcement needs to be built into the cloud infrastructure to provide value. We often see some degree of application activity monitoring built into SaaS offerings, with some logging available for cloud databases and file storage. The exception is IaaS, where you may have full control to deploy any security tool you like, but will need to account for the additional complexities of deploying in virtual environments which impact the ability to route and monitor network traffic.
Rights Management
We covered the rights management options in the Create and Store sections. They are also a factor in this phase (Use), since this is another point where they can be actively enforced during user interaction.
In the Store phase rights are applied as data enters storage, and access limitations are enforced. In the Use phase, additional rights are controlled, such as data modification, export, or more-complex usage patterns (like printing or copying).
Logical Controls
Logical controls expand the brute-force restrictions of access controls or EDRM that are based completely on who you are and what you are accessing. Logical controls are implemented in applications and databases and add business logic and context to data usage and protection. Most data-security logic controls for cloud deployments are implemented in application logic (there are plenty of other logical controls available for other aspects of cloud computing, but we are focusing on data security).
- Application Logic: Enforcing security logic in the application through design, programming, or external enforcement. Logical controls are one of the best options for protecting data in any kind of cloud-based application.
- Object (Row) Level Security: Creating a ruleset restricting use of a database object based on multiple criteria. For example, limiting a sales executive to only updating account information for accounts assigned to his territory. Essentially, these are logical controls implemented at the database layer, as opposed to the application layer. Object level security is a feature of the Database Management System and may or may not be available in cloud deployments (it's available in some standard DBMSs, but is not currently a feature of any cloud-specific database system).
- Structural Controls: Using database design features to enforce security. For example, using the database schema to limit integrity attacks or restricting connection pooling to improve auditability. You can implement some level of structural controls in any database with a management system, but more advanced structural options may only be available in robust relational databases. Tools like SimpleDB are quite limited compared to a full hosted DBMS. Structural controls are more widely available than object level security, and since they don't rely on IP addresses or external monitoring they are a good option for most cloud deployments. They are particularly effective when designed in conjunction with application logic controls.
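The sales territory example under object (row) level security can be approximated as follows. SQLite has no native row-level security, so this sketch enforces the rule inside the update statement itself; the table layout, the `reps` lookup table, and the function names are all assumptions made for illustration.

```python
import sqlite3


def make_db():
    """Build a small in-memory database with hypothetical accounts and reps tables."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, territory TEXT, contact TEXT)")
    db.execute("CREATE TABLE reps (name TEXT PRIMARY KEY, territory TEXT)")
    db.executemany("INSERT INTO accounts VALUES (?, ?, ?)",
                   [(1, "west", "old"), (2, "east", "old")])
    db.executemany("INSERT INTO reps VALUES (?, ?)", [("alice", "west")])
    return db


def update_contact(db, rep, account_id, contact):
    """Update an account only if it belongs to the rep's territory.
    Returns the number of rows changed (0 means the rule blocked the update)."""
    cursor = db.execute(
        "UPDATE accounts SET contact = ? "
        "WHERE id = ? AND territory = (SELECT territory FROM reps WHERE name = ?)",
        (contact, account_id, rep),
    )
    return cursor.rowcount
```

In a DBMS that supports object level security natively, the same predicate would live in a database policy rather than in each statement, so it could not be bypassed by a query that forgets the extra condition.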
Application Security
Aside from raw storage or plain hosted database access, most cloud deployments involve enterprise applications. Effective application security is thus absolutely critical to protect data, and often far more important than any access controls or other protections. A full discussion of cloud application security issues is beyond the scope of this post, and we recommend you read the Cloud Security Alliance Guidance for more details.
Cloud SPI Tier Implications
Software as a Service (SaaS)
Most usage controls in SaaS deployments are enforced in the application layer, and depend on what's available from your cloud provider. The provider may also enforce additional usage controls on their internal users, and we recommend you ask for documentation if it's available. In particular, determine what kinds of activity monitoring they perform for internal users vs. cloud-based users, and if those logs are ever available (such as during the investigation of security incidents). We also often see label security in SaaS deployments.
Platform as a Service (PaaS)
Depending on your PaaS deployment, it's likely that application logic will be your best security option, followed by activity monitoring. If your PaaS provider doesn't provide the level of auditing you would like, you may be able to capture activity within your application before it makes a call to the platform, although this won't capture any potential direct calls to the PaaS that are outside your application.
Infrastructure as a Service (IaaS)
Although IaaS technically offers the most flexibility for deploying your own security controls, the design of the IaaS may inhibit deployment of many security controls. For example, monitoring tools that rely on network access or sniffing may not be deployable. On the other hand, your IaaS provider may include security controls as part of the service, especially some degree of logging and/or monitoring.
Database control availability will depend more on the nature of the infrastructure -- as we've mentioned, full hosted databases in the cloud can enforce many, if not all, of the traditional database security controls.
Endpoint-based usage controls are enforceable in managed environments, but are only useful in private cloud deployments where access to the cloud can be restricted to only managed endpoints.
–Rich
Posted at Friday 18th September 2009 2:50 pm
By Rich
In our last post in this series, we covered the cloud implications of the Create phase of the Data Security Cycle. In this post we're going to move on to the Store phase. Please remember that we are only covering technologies at a high level in this series on the cycle; we will run a second series on detailed technical implementations of data security in the cloud a little later.
Definition
Store is defined as the act of committing digital data to structured or unstructured storage (database vs. files). Here we map the classification and rights to security controls, including access controls, encryption and rights management. I include certain database and application controls, such as labeling, in rights management -- not just DRM. Controls at this stage also apply to managing content in storage repositories (cloud or traditional), such as using content discovery to ensure that data is in approved/appropriate repositories.
Steps and Controls
| Control | Structured/Application | Unstructured |
| Access Controls | DBMS Access Controls; Administrator Separation of Duties | File System Access Controls; Application/Document Management System Access Controls |
| Encryption | Field Level Encryption; Application Level Encryption; Transparent Database Encryption | Media Encryption; File/Folder Encryption; Virtual Private Storage; Distributed Encryption |
| Rights Management | Application Logic; Tagging/Labeling | Tagging/Labeling; Enterprise DRM |
| Content Discovery | Cloud-Provided Database Discovery Tool; Database Discovery/DAM; DLP/CMP Discovery | Cloud-Provided Content Discovery; DLP/CMP Content Discovery |
Access Controls
One of the most fundamental data security technologies, built into every file and management system, and one of the most poorly used. In cloud computing environments there are two layers of access controls to manage -- those presented by the cloud service, and the underlying access controls used by the cloud provider for their infrastructure. It's important to understand the relationship between the two when evaluating overall security -- in some cases the underlying infrastructure may be more secure (no direct back-end access) whereas in others the controls may be weaker (a database with multiple-tenant connection pooling).
- DBMS Access Controls: Access controls within a database management system (cloud or traditional), including proper use of views vs. direct table access. Use of these controls is often complicated by connection pooling, which tends to anonymize the user between the application and the database. A database/DBMS hosted in the cloud will likely use the normal access controls of the DBMS (e.g., hosted Oracle or MySQL). A cloud-based database such as Amazon's SimpleDB or Google's BigTable comes with its own access controls. Depending on your security requirements, it may be important to understand how the cloud-based DB stores information, so you can evaluate potential back-end security issues.
- Administrator Separation of Duties: Newer technologies implemented in databases to limit database administrator access. On Oracle this is called Database Vault, and on IBM DB2 I believe you use the Security Administrator role and Label Based Access Controls. When evaluating the security of a cloud offering, understand the capabilities to limit both front and back-end administrator access. Many cloud services support various administrator roles for clients, allowing you to define various administrative roles for your own staff. Some providers also implement technology controls to restrict their own back-end administrators, such as isolating their database access. You should ask your cloud provider for documentation on what controls they place on their own administrators (and super-admins), and what data they can potentially access.
- File System Access Controls: Normal file access controls, applied at the file or repository level. Again, it's important to understand the differences between the file access controls presented to you by the cloud service, vs. their access control implementation on the back end. There is an incredible variety of options across cloud providers, even within a single SPI tier -- many of them completely proprietary to a specific provider. For the purposes of this model, we only include access controls for cloud based file storage (IaaS), and the back-end access controls used by the cloud provider. Due to the increased abstraction, everything else falls into the Application and Document Management System category.
- Application and Document Management System Access Controls: This category includes any access control restrictions implemented above the file or DBMS storage layers. In non-cloud environments this includes access controls in tools like SharePoint or Documentum. In the cloud, this category includes any content restrictions managed through the cloud application or service abstracted from the back-end content storage. These are the access controls for any services that allow you to manage files, documents, and other 'unstructured' content. The back-end storage can consist of anything from a relational database to flat files to traditional storage, and should be evaluated separately.
When designing or evaluating access controls you are concerned first with what's available to you to control your own user/staff access, and then with the back end to understand who at your cloud provider can see what information. Don't assume that the back end is necessarily less secure -- some providers use techniques like bit splitting (combined with encryption) to ensure no single administrator can see your content at the file level, with strong separation of duties to protect data at the application layer.
Encryption
The most overhyped technology for protecting data, but still one of the most important. Encryption is far from a panacea for all your cloud data security issues, but when used properly and in combination with other controls, it provides effective security. In cloud implementations, encryption may help compensate for issues related to multi-tenancy, public clouds, and remote/external hosting.
- Application-Level Encryption: Collected data is encrypted by the application, before being sent into a database or file system for storage. For cloud-based applications (e.g., public or private SaaS) this is usually the recommended option because it protects the data from the user all the way down to storage. For added security, the encryption functions and keys can be separated from the application itself, which also limits the access of application administrators to sensitive data.
- Field-Level Encryption: The database management system encrypts fields within a database, normally at the column level. In cloud implementations you will generally want to encrypt data at the application layer, rather than within the database itself, due to the complexity.
- Transparent Encryption: Encryption of the database structures, files, or the media where the database is stored. For database structures this is managed by the DBMS, while for files it can be the DBMS or third-party file encryption. Media encryption is managed at the storage layer; never by the DBMS. Transparent encryption protects the database data from unauthorized direct access, but does not provide any internal security. For example, you can encrypt a remotely hosted database to prevent local administrators from accessing it, but it doesn't protect data from authorized database users.
- Media Encryption: Encryption of the physical storage media, such as hard drives or backup tapes. In a cloud environment, encryption of a complete virtual machine on IaaS could be considered media encryption. Media encryption is designed primarily to protect data in the event of physical loss/theft, such as a drive being removed from a SAN. It is often of limited usefulness in cloud deployments, although may be used by hosting providers on the back end in case of physical loss of media.
- File/Folder Encryption: Traditional encryption of specific files and folders in storage by the host platform.
- Virtual Private Storage: Encryption of files/folders in a shared storage environment, where the encryption/decryption is managed and performed outside the storage environment. This separates the keys and encryption from the storage platform itself, and allows them to be managed locally even when the storage is remote. Virtual Private Storage is an effective technique to protect remote data when you don't have complete control of the storage environment. Data is encrypted locally before being sent to the shared storage repository, providing complete control of user access and key management. You can read more about Virtual Private Storage in our post.
- Distributed Encryption: With distributed encryption we use a central key management solution, but distribute the encryption engines to any end-nodes that require access to the data. It is typically used for unstructured (file/folder) content. When a node needs access to an encrypted file it requests a key from the central server, which provides it if the access is authorized. Keys are usually user or group based, not specific to individual files. Distributed encryption helps with the main problem of file/folder encryption, which is ensuring that everyone who needs it gets access to the keys. Rather than being synchronized continually in the background, keys are provided on demand.
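The key-request flow described under Distributed Encryption can be sketched in a few lines. This is a minimal illustration, not a description of any product: `KeyServer`, `Node`, and the group and user names are all hypothetical, and the encryption engine itself is left out so the focus stays on central, group-based key release.

```python
import secrets


class KeyServer:
    """Hypothetical central key manager: one key per group, released on request."""

    def __init__(self):
        self._group_keys = {}  # group name -> key bytes
        self._members = {}     # group name -> set of authorized user names

    def create_group(self, group, members):
        # Keys are group based, not specific to individual files.
        self._group_keys[group] = secrets.token_bytes(32)
        self._members[group] = set(members)

    def request_key(self, user, group):
        """Return the group's key if the user is authorized, otherwise raise."""
        if user not in self._members.get(group, set()):
            raise PermissionError(f"{user} is not authorized for group {group}")
        return self._group_keys[group]


class Node:
    """Hypothetical end node: stores no keys, asks the server when a file is opened."""

    def __init__(self, user, server):
        self.user = user
        self.server = server

    def key_for(self, file_group):
        return self.server.request_key(self.user, file_group)


server = KeyServer()
server.create_group("finance", ["alice", "bob"])

alice = Node("alice", server)
mallory = Node("mallory", server)

print(len(alice.key_for("finance")))  # 32

try:
    mallory.key_for("finance")
except PermissionError as err:
    print("denied:", err)
```

Because nodes never cache keys in this sketch, removing a user from a group takes effect on the next request; a real deployment would also need authenticated transport and logging, which are omitted here.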
Rights Management
The actual enforcement of rights assigned during the Create phase.
For descriptions of the technologies, please see the post on the Create phase. In future posts we will discuss cloud implementations of each of these technologies in greater detail.
Content Discovery
Content Discovery is the process of using content or context-based tools to find sensitive data in content repositories. Content aware tools use advanced content analysis techniques, such as pattern matching, database fingerprinting, and partial document matching to identify sensitive data inside files and databases. Contextual tools rely more on location or specific metadata, such as tags, and are thus better suited to rigid environments with higher assurance that content is labeled appropriately.
Discovery allows you to scan storage repositories and identify the location of sensitive data based on central policies. It's extremely useful for ensuring that sensitive content is only located where the desired security controls are in place. Discovery is also very useful for supporting compliance initiatives, such as PCI, which restrict the usage and handling of specific types of data.
- Cloud-Provided Database Discovery Tool: Your cloud service provides features to locate sensitive data within your cloud database, such as locating credit card numbers. This is specific to the cloud provider, and we have no examples of current offerings.
- Database Discovery/DAM: Tools to crawl through database fields looking for data that matches content analysis policies. We most often see this as a feature of a Database Activity Monitoring (DAM) product. These tools are not cloud specific, and depending on your cloud deployment may not be deployable. IaaS environments running standard DBMS platforms (e.g., Oracle or MS SQL Server) may be supported, but we are unaware of any cloud-specific offerings at this time.
- Data Loss Prevention (DLP)/Content Monitoring and Protection (CMP) Database Discovery: Some DLP/CMP tools support content discovery within databases, either directly or through analysis of a replicated database or flat file dump. With full access to a database, such as through an ODBC connection, they can perform ongoing scanning for sensitive information.
- Cloud-Provided Content Discovery: A cloud-based feature to perform content discovery on files stored with the cloud provider.
- DLP/CMP Content Discovery: All DLP/CMP tools with content discovery features can scan accessible file shares, even if they are hosted remotely. This is effective for cloud implementations where the tool has access to stored files using common file sharing protocols, such as CIFS and WebDAV.
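To make the pattern-matching end of content analysis concrete, here is a rough sketch of a discovery pass that looks for credit card numbers in a set of documents. It assumes the repository has already been read into a mapping of location to text; `CARD_PATTERN`, `find_card_numbers`, and `scan_repository` are illustrative names, and real DLP/CMP tools use far richer analysis than a regular expression plus a Luhn checksum.

```python
import re

# Candidate card numbers: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits):
    """Luhn checksum, used to discard digit runs that cannot be card numbers."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text):
    """Return the normalized card-like numbers found in one piece of text."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


def scan_repository(documents):
    """documents maps a location (path, share, table) to its text content."""
    report = {}
    for location, text in documents.items():
        hits = find_card_numbers(text)
        if hits:
            report[location] = hits
    return report


documents = {
    "share/orders.txt": "Card on file: 4111-1111-1111-1111",
    "share/notes.txt": "Ticket 4111111111111112 closed",
}
print(scan_repository(documents))
```

The checksum step is what keeps ticket numbers and other long digit strings from flooding the report, which matters when the result is used to decide where security controls need to be in place.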
Cloud SPI Tier Implications
Software as a Service (SaaS)
As with most security aspects of SaaS, the security controls available depend completely on what's provided by your cloud service. Front-end access controls are common among SaaS offerings, and many allow you to define your own groups and roles. These may not map to back-end storage, especially for services that allow you to upload files, so you should ask your SaaS provider how they manage access controls for their internal users.
Many SaaS offerings state they encrypt your data, but it's important to understand just where and how it's encrypted. For some services, it's little more than basic file/folder or media encryption of their hosting platforms, with no restrictions on internal access. In other cases, data is encrypted using a unique key for every customer, which is managed externally to the application using a dedicated encryption/key management system. This segregates data between co-tenants on the service, and is also useful to restrict back-end administrative access. Application-level encryption is most common in SaaS offerings, and many provide some level of storage encryption on the back end.
Most rights management in SaaS uses some form of labeling or tagging, since we are generally dealing with applications, rather than raw data. This is the same reason we don't tend to see content discovery for SaaS offerings.
Platform as a Service (PaaS)
Implementation in a PaaS environment depends completely on the available APIs and development environment.
When designing your PaaS-based application, determine what access controls are available and how they map to the provider's storage infrastructure. In some cases application-level encryption will be an option, but make sure you understand the key management and where the data is encrypted. In some cases, you may be able to encrypt data on your side before sending it off to the cloud (for example, encrypting data within your application before making a call to store it in the PaaS).
As with SaaS, rights management and content discovery tend to be somewhat restricted in PaaS, unless the provider offers those features as part of the service.
Infrastructure as a Service (IaaS)
Your top priority for managing access controls in IaaS environments is to understand the mappings between the access controls you manage, and those enforced in the back-end infrastructure. For example, if you deploy a virtual machine into a public cloud, how are the access controls managed both for those accessing the machine from the Internet, and for the administrators that maintain the infrastructure? If another customer in the cloud is compromised, what prevents them from escalating privileges and accessing your content?
Virtual Private Storage is an excellent option to protect data that's remotely hosted, even in a multi-tenant environment. It requires a bit more management effort, but the end result is often more secure than traditional in-house storage.
Content discovery is possible in IaaS deployments where common network file access protocols/methods are available, and may be useful for preventing unapproved use of sensitive data (especially due to inadvertent disclosure in public clouds).
–Rich
Posted at Thursday 17th September 2009 12:59 pm
By Rich
Last week I started talking about data security in the cloud, and I referred back to our Data Security Lifecycle from back in 2007. Over the next couple of weeks I'm going to walk through the cycle and adapt the controls for cloud computing. After that, I will dig in deep on implementation options for each of the potential controls. I'm hoping this will give you a combination of practical advice you can implement today, along with a taste of potential options that may develop down the road.
We do face a bit of the chicken and egg problem with this series, since some of the technical details of controls implementation won't make sense without the cycle, but the cycle won't make sense without the details of the controls. I decided to start with the cycle, and will pepper in specific examples where I can to help it make sense. Hopefully it will all come together at the end.
In this post we're going to cover the Create phase:
Definition
Create is defined as generation of new digital content, either structured or unstructured, or significant modification of existing content. In this phase we classify the information and determine appropriate rights. This phase consists of two steps -- Classify and Assign Rights.
Steps and Controls
| Control | Structured/Application | Unstructured |
| Classify | Application Logic, Tag/Labeling | Tag/Labeling |
| Assign Rights | Label Security | Enterprise DRM |
Classify
Classification at the time of creation is currently either a manual process (most unstructured data), or handled through application logic. Although the potential exists for automated tools to assist with classification, most cloud and non-cloud environments today classify manually for unstructured or directly-entered database data, while application data is automatically classified by business logic. Bear in mind that these are controls applied at the time of creation; additional controls such as access control and encryption are managed in the Store phase. There are two potential controls:
- Application Logic: Data is classified based on business logic in the application. For example, credit card numbers are classified as such based on field definitions and program logic. Generally this logic is based on where data is entered, or on automated analysis (keyword or content analysis).
- Tagging/Labeling: The user manually applies tags or labels at the time of creation, e.g., manually tagging via drop-down lists or open fields, manual keyword entry, suggestion-assisted tagging, and so on.
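The two classification controls above can be combined in a few lines of application code. The sketch below is only an illustration: the field names, the class labels ("PCI", "PII"), and the `classify_field` function are assumptions, and the content check is a single pattern standing in for real keyword or content analysis.

```python
import re

# Hypothetical field definitions: where data is entered determines its class.
FIELD_CLASSES = {
    "card_number": "PCI",
    "ssn": "PII",
    "email": "PII",
}

# Stand-in for automated content analysis: a US Social Security number pattern.
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def classify_field(name, value, user_tags=None):
    """Return the set of labels for one field at creation time."""
    labels = set(user_tags or [])            # Tagging/Labeling: applied by the user
    if name in FIELD_CLASSES:                # Application Logic: field definition
        labels.add(FIELD_CLASSES[name])
    if isinstance(value, str) and SSN_PATTERN.search(value):
        labels.add("PII")                    # Application Logic: content analysis
    return labels or {"unclassified"}


print(classify_field("card_number", "4111111111111111"))
print(classify_field("notes", "Customer SSN is 123-45-6789"))
print(classify_field("notes", "Call back Tuesday", user_tags=["internal"]))
```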
Assign Rights
This is the process of converting the classification into rights applied to the data. Not all data necessarily has rights applied, in which case security is provided through additional controls during later phases of the cycle. (Technically rights are always applied, but in many cases they are so broad as to be effectively non-existent.) These are rights that follow the data, as opposed to access controls or encryption which, although they protect the data, are decoupled from its creation. There are two potential technical controls here:
- Label Security: A feature of some database management systems and applications that adds a label to a data element, such as a database row, column, or table, or file metadata, classifying the content in that object. The DBMS or application can then implement access and logical controls based on the data label. Labels may be applied at the application layer, but only count as assigning rights if they also follow the data into storage.
- Enterprise Digital Rights Management (EDRM): Content is encrypted, and access and use rights are controlled by metadata embedded with the content. The EDRM market has been somewhat self-limiting due to the complexity of enterprise integration and assigning and managing rights.
Cloud SPI Tier Implications
Software as a Service (SaaS)
Classification and rights assignment are completely controlled by the application logic implemented by your SaaS provider. Typically we see Application Logic, since that's a fundamental feature of any application -- SaaS or otherwise. When evaluating your SaaS provider you should ask how they classify sensitive information and then later apply security controls, or if all data is lumped together into a single monolithic database (or flat files) without additional labels or security controls to prevent leakage to administrators, attackers, or other SaaS customers.
In some cases, various labeling technologies may be available. You will, again, need to work with your potential SaaS provider to determine if these labels are used only for searching/sorting data, or if they also assist in the application of security controls.
Platform as a Service (PaaS)
Implementation in a PaaS environment depends completely on the available APIs and development environment. As with internal applications, you will maintain responsibility for how classification and rights assignment are managed.
When designing your PaaS-based application, identify potential labeling/classification APIs you can integrate into program logic. You will need to work with your PaaS provider to understand how they can implement security controls at both the application and storage layers -- for example, it's important to know if and how data is labeled in storage, and if this can be used to restrict access or usage (business logic).
Infrastructure as a Service (IaaS)
Classification and rights assignments depend completely on what is available from your IaaS provider. Here are some specific examples:
- Cloud-based database: Work with your provider to determine if data labels are available, and with what granularity. If they aren't provided, you can still implement them as a manual addition (e.g., a row field or segregated tables), but understand that the DBMS will not be enforcing the rights automatically, and you will need to program management into your application.
- Cloud-based storage: Determine what metadata is available. Many cloud storage providers don't modify files, so anything you define in an internal storage environment should work in the cloud. The limitation is that the cloud provider won't be able to tie access or other security controls to the label, which is sometimes an option with document management systems. Enterprise DRM, for example, should work fine with any cloud storage provider.
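For the cloud-based database case, the "manual addition" of a label as a row field might look something like the following. This is a sketch under stated assumptions: the `customers` table, the three label names, and the `fetch_for` helper are hypothetical, and SQLite stands in for whatever DBMS the provider offers. The point it illustrates is that the application, not the DBMS, enforces the label.

```python
import sqlite3

# Hypothetical clearance ordering; a higher rank means more sensitive data.
LEVELS = {"public": 0, "internal": 1, "restricted": 2}

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, card_last4 TEXT, label TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?, ?)",
    [
        ("Ann", "1111", "restricted"),
        ("Raj", None, "internal"),
        ("Site FAQ", None, "public"),
    ],
)


def fetch_for(conn, clearance):
    """Return rows whose label is at or below the caller's clearance.

    The DBMS does not enforce the label; every query path in the
    application has to go through a filter like this one.
    """
    allowed = [label for label, rank in LEVELS.items() if rank <= LEVELS[clearance]]
    placeholders = ",".join("?" for _ in allowed)
    query = (
        "SELECT name, label FROM customers "
        f"WHERE label IN ({placeholders}) ORDER BY name"
    )
    return conn.execute(query, allowed).fetchall()


print(fetch_for(conn, "internal"))  # [('Raj', 'internal'), ('Site FAQ', 'public')]
```

Anyone with direct database access bypasses this filter entirely, which is the practical difference between a hand-rolled label column and label security enforced by the DBMS.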
This should give you a good idea of how to manage classification and rights assignment in various cloud environments. One exciting aspect is that use of tags, including automatically generated tags, is a common concept in the Web 2.0 world, and we can potentially tie this into our security controls. Users are better "trained" to tag content during creation with web-based applications (e.g., photo sharing sites & blogs), and we can take advantage of these habits to improve security.
–Rich
Posted at Tuesday 8th September 2009 10:19 am
By Rich
So I've written about data security, and I've written about cloud security, thus it's probably about time I wrote something about data security in the cloud.
To get started, I'm going to skip over defining the cloud. I recommend you take a look at the work of the Cloud Security Alliance, or skip on over to Hoff's cloud architecture post, which was the foundation of the architectural section of the CSA work. Today's post is going to be a bit scattershot, as I throw out some of the ideas rolling around my head from thinking about building a data security cycle/framework for the cloud.
We've previously published two different data/information-centric security cycles. The first, the Data Security Lifecycle (second on the Research Library page) is designed to be a comprehensive forward-looking model. The second, The Pragmatic Data Security Cycle, is designed to be more useful in limited-scope data security projects. Together they are designed to give you the big picture, as well as a pragmatic approach for securing data in today's resource-constrained environments. These are different than your typical Information Lifecycle Management cycles to reflect the different needs of the security audience.

When evaluating data security in the context of the cloud, the issues aren't that we've suddenly blasted these cycles into oblivion, but that when and where you can implement controls is shifted, sometimes dramatically. Keep in mind that moving to the cloud is every bit as much an opportunity as a risk. I'm serious -- when's the last time you had the chance to completely re-architect your data security from the ground up?
For example, one of the most common risks cited when considering cloud deployment is lack of control over your data; any remote admin can potentially see all your sensitive secrets. Then again, so can any local admin (with access to the system). What's the difference? In one case you have an employment agreement and their name, in the other you have a Service Level Agreement and contracts... which should include a way to get the admin's name.
The problems are far more similar than they are different. I'm not one of those people saying the cloud isn't anything new -- it is, and some of these subtle differences can have a big impact -- but we can definitely scope and manage the data security issues. And when we can't achieve our desired level of security... well, that's time to figure out what our risk tolerance is.
Let's take two specific examples:
Protecting Data on Amazon S3 -- Amazon S3 is one of the leading IaaS services for stored data, but it includes only minimal security controls compared to an internal storage repository. Access controls (which may not integrate with your internal access controls) and transit encryption (SSL) are available, but data is not encrypted in storage and may be accessible to Amazon staff or anyone who compromises your Amazon credentials. One option, which we've talked about here before, is Virtual Private Storage. You encrypt your data before sending it off to Amazon S3, giving you absolute control over keys and ACLs. You maintain complete control while still retaining the benefits of cloud-based storage. Many cloud backup solutions use this method.
Protecting Data at a SaaS Provider -- I'd be more specific and list a SaaS provider, but I can't remember which ones follow this architecture. With SaaS we have less control and are basically limited to the security controls built into the SaaS offering. That isn't necessarily bad -- the SaaS provider might be far more secure than you are -- but not all SaaS offerings are created equal. To secure SaaS data you need to rely more on your contracts and an understanding of how your provider manages your data.
One architectural option for your SaaS provider is to protect your data with individual client keys managed outside the application (this is actually a useful internal data security architectural choice). It's application-level encryption with external key management. All sensitive client data is encrypted in the SaaS provider's database. Keys are managed in a dedicated appliance/service, and provided temporarily to the application based on user credentials. Ideally the SaaS provider's admins are properly segregated -- where no single admin has database, key management, and application credentials. Since this potentially complicates support, it might be restricted to only the most sensitive data. (All your information might still be encrypted, but for support purposes could be accessible to the approved administrators/support staff). The SaaS provider then also logs all access by internal and external users.
This is only one option, but your SaaS provider should be able to document their internal data security, and even provide you with external audit reports.
As you can see, just because you are in the cloud doesn't mean you completely give up any chance of data security. It's all about understanding security boundaries, control options, technology, and process controls.
In future posts we'll start walking through the Data Security Lifecycle and matching specific issues and control options in each phase against the SPI (SaaS, PaaS, IaaS) cloud models.
–Rich
Posted at Tuesday 1st September 2009 3:19 pm
By Adrian Lane
Updated June 4th to reflect terminology change.
This is the Re-Introduction to our Database Encryption series. Why are we re-introducing this series? I'm glad you asked. The more we worked on the separation of duties and key management sections, the more dissatisfied we became. Rich and I got some really good feedback from vendors and end users, and we felt we were missing the mark with this series. And not just because the stuff I drafted when I was sick completely lacked clarity of thought, but there are three specific reasons we were unhappy. The advice we were giving was not particularly pragmatic, the terminology we thought worked didn't, and we were doing a poor job of aligning end-user goals with available options. So yeah, this is an apology to our audience as the series was not up to our expectations and we failed to achieve some of our own Totally Transparent Research concepts. But we're 'fessing up to the problem and starting from scratch.
So we want to fix these things in two ways. First we want to change some of the terminology we have been using to describe database encryption. Using 'media encryption' and 'separation of duties' is confusing the issues, and we want to differentiate between the threat we are trying to protect against vs. what is being encrypted. And as we are talking to IT, developers, DBAs, and other audiences, we wanted to reduce confusion as much as possible. Second, we will create a simple guide for people to select a database encryption strategy that addresses their goals. Basically we are going to outline a decision tree of user requirements and map those to the available database encryption choices. Rich and I think that will aid end users to both clarify their goals and determine the correct implementation strategy.
In our original introduction we provided a clear idea of where we wanted to go with this series, but we did adopt our own terminology in order to better encapsulate the database encryption options vendors provide. We chose "Encryption for Separation of Duties" and "Encryption for Media Protection". This is a bit of an oversimplification, and mapped to the threat rather than to the feature. Plus, if you asked your RDBMS vendor for 'media encryption', they would not know what the heck you were talking about. We are going to change the terminology back to the following:
Database Transparent/External Encryption: Encryption of the entire database. This is provided by native encryption functions within the database. The goal is to prevent exposure of information due to loss of the physical media. This can also be done through drive or OS/file system encryption, although they lack some of the protections of native database encryption. The encryption is invisible to the application and does not require alterations to the code or schema.
Data User Encryption: Encrypting specific columns, tables, or even data elements in the database. The classic example is credit card numbers. The goal is to provide protection against inadvertent disclosure, or to enforce separation of duties. How this is accomplished depends on how key management is handled and on whether internal or external encryption services are used. It will affect the way the application uses the database, but provides more granular access control.
While we're confident we've described the two options accurately, we're not convinced the specific terms "database encryption" and "data encryption" are necessarily the best, so please suggest any better options.
Blanket encryption of all database content for media protection is much easier than encrypting specific columns & tables for separation of duties, but it doesn't offer the same security benefits. Knowing which to choose will depend upon three things:
- What do you want to protect?
- What do you want to protect it from?
- What application changes and management tasks will you tolerate?
Thus, the first thing we need to decide when looking at database encryption is what are we trying to protect and why. If we're just going after the 'PCI checkbox' or are worried about losing data from swapping out hard drives, someone stealing the files off the server, or misplacing backup tapes, then database encryption (for media protection) is our answer. If the goal is to protect data in the event of compromised accounts, rogue DBAs, or inadvertent disclosure, then things get a lot more complicated. We will go into the details of 'why' and 'how' in a future post, as well as the issues of application alterations, after we have introduced the decision tree overview. If you have any comments, good, bad, or indifferent, please share. As always, we want the discussion to be as open as possible.
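Until the full decision tree is posted, the first split described above can be written down as a very rough sketch. Everything here is an assumption made for illustration: the threat identifiers, the `choose_encryption` function, and the choice to resolve a mix of threats toward the more granular option. It captures only the threat question and the application-change consequence, not key management or the other branches still to come.

```python
# Threat groupings taken from the paragraph above; the identifiers are hypothetical.
MEDIA_THREATS = {"swapped_hard_drive", "stolen_server_files", "lost_backup_tape"}
ACCESS_THREATS = {"compromised_account", "rogue_dba", "inadvertent_disclosure"}


def choose_encryption(threats):
    """Map the threats you care about to one of the two encryption options."""
    threats = set(threats)
    unknown = threats - MEDIA_THREATS - ACCESS_THREATS
    if unknown:
        raise ValueError(f"unrecognized threats: {sorted(unknown)}")
    if threats & ACCESS_THREATS:
        # Column/table/element encryption; the application is affected.
        return {"strategy": "user encryption", "application_changes": True}
    if threats & MEDIA_THREATS:
        # Whole-database encryption; invisible to the application.
        return {"strategy": "transparent encryption", "application_changes": False}
    return {"strategy": "none identified", "application_changes": False}


print(choose_encryption(["lost_backup_tape"]))
print(choose_encryption(["lost_backup_tape", "rogue_dba"]))
```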
–Adrian Lane
Posted at Thursday 4th June 2009 1:06 am
By Rich
One of the more difficult aspects of the analyst gig is sorting through all the information you get, and isolating out any inherent biases. The kinds of inquiries we get from clients can all too easily skew our perceptions of the industry, since people tend to come to us for specific reasons, and those reasons don't necessarily represent the mean of the industry. Aside from all the vendor updates (and customer references), our end user conversations usually involve helping someone with a specific problem -- ranging from vendor selection, to basic technology education, to strategy development/problem solving. People call us when they need help, not when things are running well, so it's all too easy to assume a particular technology is being used more widely than it really is, or a problem is bigger or smaller than it really is, because everyone calling us is asking about it. Countering this takes a lot of outreach to find out what people are really doing even when they aren't calling us.
Over the past few weeks I've had a series of opportunities to work with end users outside the context of normal inbound inquiries, and it's been fairly enlightening. These included direct client calls, executive roundtables such as one I participated in recently with IANS (with a mix from Fortune 50 to mid-size enterprises), and some outreach on our part. They reinforced some of what we've been thinking, while breaking other assumptions. I thought it would be good to compile these together into a "state of the industry" summary. Since I spend most of my time focused on web application and data security, I'll only cover those areas:

When it comes to web application and data security, if there isn't a compliance requirement, there isn't budget -- Nearly all of the security professionals we've spoken with recognize the importance of web application and data security, but they consistently tell us that unless there is a compliance requirement it's very difficult for them to get budget. That's not to say it's impossible, but non-compliance projects (however important) are way down the priority list in most organizations. In a room of a dozen high-level security managers of (mostly) large enterprises, they all reinforced that compliance drove nearly all of their new projects, and there was little support for non-compliance-related web application or data security initiatives. I doubt this surprises any of you.
"Compliance" may mean more than compliance -- Activities that are positioned as helping with compliance, even if they aren't a direct requirement, are more likely to gain funding. This is especially true for projects that could reduce compliance costs. They will have a longer approval cycle, often 9 months or so, compared to the 3-6 months for directly-required compliance activities. Initiatives directly tied to limiting potential data breach notifications are the most cited driver. Two technology examples are full disk encryption and portable device control.
PCI is the single biggest compliance driver for web application and data security -- I may not be thrilled with PCI, but it's driving more web application and data security improvements than anything else.
The term Data Loss Prevention has lost meaning -- I discussed this in a post last week. Even those who have gone through a DLP tool selection process often use the term to encompass more than the narrow definition we prefer.
It's easier to get resources to do some things manually than to buy a tool -- Although tools would be much more efficient and effective for some projects, in terms of costs and results, manual projects using existing resources are easier to get approval for. As one manager put it, "I already have the bodies, and I won't get any more money for new tools." The most common example cited was content discovery (we'll talk more about this a few points down).
Most people use DLP for network (primarily email) monitoring, not content discovery or endpoint protection -- Even though we tend to think discovery offers equal or greater value, most organizations with DLP use it for network monitoring.
Interest in content discovery, especially DLP-based, is high, but resources are hard to get for discovery projects -- Most security managers I talk with are very interested in content discovery, but they are less educated on the options and don't have the resources. They tell me that finding the data is the easy part -- getting resources to do anything about it is the limiting factor.
The Web Application Firewall (WAF) market and Security Source Code Tools markets are nearly equal in size, with more clients on WAFs, and more money spent on source code tools per client -- While it's hard to fully quantify, we think the source code tools cost more per implementation, but WAFs are in slightly wider use.
WAFs are a quicker hit for PCI compliance -- Most organizations deploying WAFs do so for PCI compliance, and they're seen as a quicker fix than secure source code projects.
Most WAF deployments are out of band, and false positives are a major problem for default deployments -- Customers are installing WAFs for compliance, but are generally unable to deploy them inline (initially) due to the tuning requirements.
Full drive encryption is mature, and well deployed in the early mainstream -- Full drive encryption, while not perfect, is deployable in even large enterprises. It's now considered a level-setting best practice in financial services, and usage is growing in healthcare and insurance. Other asset recovery options, such as remote data destruction and phone home applications, are now seen as little more than snake oil. As one CISO told us, "I don't care about the laptop, we just encrypt it and don't worry about it when it goes missing".
File and folder encryption is not in wide use -- Very few organizations are performing any wide scale file/folder encryption, outside of some targeted encryption of PII for compliance requirements.
Database encryption is hard, and not widely used -- Most organizations are dissatisfied with database encryption options, and do not deploy it widely. Within a large organization there is likely some DB encryption, with preference given to file/folder/media protection over column level encryption, but most organizations prefer to avoid it. Performance and key management are cited as the primary obstacles, even when using native tools. Current versions of database encryption (primarily native encryption) do perform better than older versions, but key management is still unsatisfactory. Large encryption projects, when initiated, take an average of 12-18 months.
Large enterprises prefer application-level encryption of credit card numbers, and tokenization -- When it comes to credit card numbers, security managers prefer to encrypt them at the application level, or consolidate numbers into a central source, using representative "tokens" throughout the rest of the application stack. These projects take a minimum of 12-18 months, similar to database encryption projects (the two are often tied together, with encryption used in the source database).
Email encryption and DRM tend to be workgroup-specific deployments -- Email encryption and DRM use is scattered throughout the industry, but is still generally limited to workgroup-level projects due to the complexity of management, or lack of demand/compliance from users.
Database Activity Monitoring usage continues to grow slowly, mostly for compliance, but not quickly enough to save lagging vendors -- Many DAM deployments are still tied to SOX auditing, and it's not as widely used for other data security initiatives. Performance is reasonable when you can use endpoint agents, which some DBAs still resist. Network monitoring is not seen as effective, but may still be used when local monitoring isn't an option. Network requirements, depending on the tool, may also inhibit deployments.
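To make the tokenization finding above a little more concrete, here is a minimal sketch of the idea: card numbers are consolidated in one central source, and everything else in the application stack only handles tokens. The `TokenVault` class, its method names, and the `tok_` token format are all hypothetical and for illustration only. A real vault would be an encrypted, access-controlled database, not an in-memory dictionary.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault; stands in for the central card number source."""

    def __init__(self):
        self._token_to_pan = {}
        self._pan_to_token = {}

    def tokenize(self, pan: str) -> str:
        # Reuse the existing token so one card number always maps to one token.
        if pan in self._pan_to_token:
            return self._pan_to_token[pan]
        # The token is random, so it has no mathematical relationship to the
        # card number. Keeping the last four digits is an illustrative choice.
        while True:
            token = "tok_" + secrets.token_hex(8) + "_" + pan[-4:]
            if token not in self._token_to_pan:
                break
        self._token_to_pan[token] = pan
        self._pan_to_token[pan] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only the vault can map a token back to the real number.
        return self._token_to_pan[token]


vault = TokenVault()
token = vault.tokenize("4111111111111111")  # well-known test card number
print(token)
```

The point of the design is scope reduction: downstream systems that store only tokens never hold a real card number, so the encryption effort concentrates on the vault itself.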
My main takeaway is that security managers know what they need to do to protect information assets, but they lack the time, resources, and management support for many initiatives. There is also broad dissatisfaction with security tools and vendors in general, in large part due to poor expectation setting during the sales process, and deliberately confusing marketing. It's not that the tools don't work, but that they're never quite as easy as promised.
It's an interesting dilemma, since there is clear and broad recognition that data security (and by extension, web application security) is likely our most pressing overall issue in terms of security, but due to a variety of factors (many of which we covered in our Business Justification for Data Security paper), the resources just aren't there to really tackle it head-on.
–Rich
Posted at Monday 1st June 2009 8:18 am
By Rich
Way back when I started Securosis, I came up with something called the Data Security Lifecycle, which I later renamed the Information-Centric Security Cycle. While I think it does a good job of capturing all the components of data security, it's also somewhat dense. That lifecycle was designed to be a comprehensive outline of protective controls and information management, but I've since realized that if you have a specific data security problem, it isn't the best place to start.
In a couple weeks I'll be speaking at the TechTarget Financial Information Security Decisions conference in New York, where I'm presenting Pragmatic Data Security. By "pragmatic" I mean something you can implement as soon as you get home. Where the lifecycle answers the question, "How can I secure all my data throughout its entire lifecycle?" pragmatic data security answers, "How can I protect this specific data at this point in time, in my existing environment?"
It starts with a slimmed down cycle:

- Define what information you want to protect (specifically, not general data classification)
- Discover where it's located (various tools/techniques, preferably automated, like DLP, rather than manual)
- Secure the data where it's stored, and/or eliminate data where it shouldn't be (access controls, encryption)
- Monitor data usage (various tools, including DLP, DAM, logs, SIEM)
- Protect the data from exfiltration (DLP, USB control, email security, web gateways, etc.)
For example, if you want to protect credit card numbers you'd define them in step 1, use DLP content discovery in step 2 to locate where they are stored, remove them or lock the repositories down in step 3, use DAM and DLP to monitor where they're going in step 4, and use blocking technologies to keep them from leaving the organization in step 5.
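As a rough illustration of steps 1 and 2 for the credit card example, the sketch below defines a card number as a run of 13 to 16 digits that passes the Luhn checksum, then scans text for matches. This is only the most basic pattern-matching form of content discovery, not how any particular DLP product works. The names `CARD_PATTERN`, `luhn_valid`, and `find_card_numbers` are made up for this example.

```python
import re

# Step 1 (define): candidate runs of 13-16 digits, optionally separated
# by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Step 2 (discover): return Luhn-valid card number candidates in text."""
    hits = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


sample = "Order 77: card 4111 1111 1111 1111, ref 1234 5678 9012 3456"
print(find_card_numbers(sample))
```

The checksum matters because a bare regular expression flags every long number it sees. Here the reference number in the sample fails the Luhn check and is dropped, while the test card number is kept.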
All too often I'm seeing people get totally wrapped up in complex "boil the ocean" projects that never go anywhere, vs. defining and solving a specific problem. You don't need to start your entire data security program with some massive data classification program. Pick one defined type of data/information, and just go protect it. Find it, lock it down, watch how it's being used, and stop it from going where you don't want.
Yeah, parts are hard, but hard != impossible. If you keep your focus, any hard problem is just a series of smaller, defined steps.
–Rich
Posted at Thursday 21st May 2009 4:54 pm
By Rich
Although security is my chosen profession, I've been working in and around the healthcare industry for literally my entire life. My mother was (is) a nurse and I grew up in and around hospitals. I later became an EMT, then paramedic, and still work in emergency services on the side. Heck, even my wife works in a hospital, and one of my first security gigs was analyzing a medical benefits system, while another was as a contract CTO for an early stage startup in electronic medical records/transcription.

The value of moving to consistent electronic medical records is nearly incalculable. You would probably be shocked if you saw how we perform medical studies and analyze real-world medical treatments and outcomes. It's so bass-ackwards, considering all the tech tools available today, that the only excuse is insanity or hubris. I mean there are approved drugs used in Advanced Cardiac Life Support where the medical benefits aren't even close to proven. Sometimes it's almost as much guesswork as trying to come up with a security ROI. There's literally a category of drugs that's pretty much, "well, as long as they are really dead this probably won't hurt, but it probably won't help either".
With good electronic medical records, accessible on a national scale, we'll gain an incredible ability to analyze symptoms, illnesses, treatments, and outcomes on a massive scale. It's called evidence-based medicine, and despite what a certain political party is claiming, it has nothing to do with the government telling doctors what to do. Unless said doctors are idiots who prefer not to make decisions based on science, not that your doctor would ever do that.
The problem is while most of us personally don't have any interest in the x-rays of whatever object happened to embed itself in your posterior when you slipped and fell on it in the bathroom, odds are someone wouldn't mind uploading it... somewhere. Never mind insurance companies, potential employers, or that hot chick in the bar you've convinced those are just "love bumps", and you were born with them.
Securing electronic medical records is a nasty problem for a few reasons:
- They need to be accessible by any authorized medical provider in a clinical setting... quickly and easily. Even when you aren't able to manually authorize that particular provider (like me when I roll up in an ambulance).
- To be useful on a personal level, they need to be complete, portable, and standardized.
- To be useful on a national level, they need to be complete, standardized, and accessible, yet anonymized.
While delving into specific technologies is beyond the scope of this post, there are specific security requirements we need to include in records systems to protect patient privacy, while enabling all the advantages of moving off paper. Keep in mind these recommendations are specific to electronic medical records (EMR) systems (also called CPR, for Computerized Patient Records) -- not to every piece of IT that touches a record but doesn't have access to the main patient record.
- Secure Authentication: You might call this one a no-brainer, but despite HIPAA we still see rampant reuse of credentials, and weak credentials, in many different medical settings. This is often for legitimate reasons, since many EMR systems are programmed like crap and are hard to use in clinical settings. That said, we have options that work, and any time a patient record is viewed (as opposed to adding info like test results or images) we need stronger authentication tied to a specific, vetted individual.
- Secure Storage: We're tired of losing healthcare records on lost hard drives or via hacking compromises of the server. Make it stop. Please. (Read all our other data security posts for some ideas).
- Robust Logging and Activity Monitoring: When records are accessed, a full record of who did what, and when, needs to be recorded. Some systems on the market do this, but not all of them. Also, these monitoring controls are easily bypassed by direct database access, which is rampant in the healthcare industry. These guys run massive amounts of shitty applications and rely heavily on vendor support, with big contracts and direct database access. That might be okay for certain systems, but not for the EMR.
- Anomaly Detection: Unusual records access shouldn't just be recorded, but must generate a security alert (which is generally a manual review process today). An example alert might be when someone in radiology views a record, but no radiological order was recorded, or that individual wasn't assigned to the case.
- Secure Exchange: I doubt our records will reside on a magical RFID implanted in our chests (since arms are easy to lose, in my experience) so we always have them with us. They will reside in a series of systems, which hopefully don't involve Google. Our healthcare providers will exchange this information, and it's possible no complete master record will exist unless some additional service is set up. That's okay, since we'll have collections of fairly complete records, with the closest thing to a master record likely (and somewhat unfortunately) managed by our insurance company. While we have some consistent formats for exchanging this data (HL7), there isn't any secure exchange mechanism. We'll need some form of encryption/DRM... preferably a national/industry standard.
- De-Identification: Once we go to collect national records (or use the data for other kinds of evidence-based studies), the data needs to be de-identified. This isn't just masking a name and SSN, since other information could easily enable inference attacks. At a certain point, though, we may de-identify data so much that it blocks inference attacks but ruins the value of the data. It's a tough balance, which may result in tiers of data, depending on the situation.
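The radiology example in the anomaly detection item above can be sketched as a simple rule check over access events. This is a minimal illustration under assumed data shapes, not a description of any real EMR product: `AccessEvent`, `flag_anomalies`, and the `radiology_orders` and `assignments` inputs are all hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessEvent:
    user: str
    department: str
    patient_id: str


def flag_anomalies(events, radiology_orders, assignments):
    """Return (event, reason) pairs for record views that deserve review.

    radiology_orders: set of patient_ids with a recorded radiological order.
    assignments: set of (user, patient_id) pairs for staff assigned to a case.
    """
    alerts = []
    for event in events:
        if event.department == "radiology" and event.patient_id not in radiology_orders:
            alerts.append((event, "radiology access without a recorded order"))
        elif (event.user, event.patient_id) not in assignments:
            alerts.append((event, "user not assigned to this case"))
    return alerts


events = [
    AccessEvent("alice", "radiology", "p1"),
    AccessEvent("bob", "radiology", "p2"),
    AccessEvent("carol", "cardiology", "p1"),
]
orders = {"p1"}
assigned = {("alice", "p1"), ("bob", "p2")}

for event, reason in flag_anomalies(events, orders, assigned):
    print(event.user, event.patient_id, reason)
```

Note that the sketch only raises alerts and never blocks access. That fits the earlier requirement that records stay quickly available to providers who couldn't be authorized in advance, with unusual access reviewed after the fact.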
In terms of direct advice to those of you in healthcare, when evaluating an EMR system I recommend you focus on evaluating the authentication, secure storage, logging/monitoring, and anomaly detection/alerting first. Secure exchange and de-identification come into play when you start looking at sharing information.
–Rich
Posted at Tuesday 19th May 2009 6:00 pm