Login  |  Register  |  Contact

Data Security

Friday, September 18, 2009

Cloud Data Security: Use (Rough Cut)

By Rich

In our last post in this series, we covered the cloud implications of the Store phase of Data Security Cycle (our first post was on the Create phase). In this post we’ll move on to the Use phase. Please remember we are only covering technologies at a high level in this series – we will run a second series on detailed technical implementations of data security in the cloud a little later.


Use includes the controls that apply when the user is interacting with the data – either via a cloud-based application, or the endpoint accessing the cloud service (e.g., a client/cloud application, direct storage interaction, and so on). Although we primarily focus on cloud-specific controls, we also cover local data security controls that protect cloud data once it moves back into the enterprise. These are controls for the point of use – we will cover additional network based controls in the next phase.

Users interact with cloud data in three ways:

  • Web-based applications, such as most SaaS applications.
  • Client applications, such as local backup tools that store data in the cloud.
  • Direct/abstracted access, such as a local folder synchronized with cloud storage (e.g., Dropbox), or VPN access to a cloud-based server.

Cloud data may also be accessed by other back-end servers and applications, but the usage model is essentially the same (web, dedicated application, direct access, or an abstracted service).

Steps and Controls

Activity Monitoring and EnforcementDatabase Activity Monitoring
Application Activity Monitoring
Endpoint Activity Monitoring
File Activity Monitoring
Portable Device Control
Endpoint DLP/CMP
Cloud-Client Logs
Rights ManagementLabel Security
Enterprise DRM
Logical ControlsApplication Logic
Row Level Security
Application Securitysee Application Security Domain section

Activity Monitoring and Enforcement

Activity Monitoring and Enforcement includes advanced techniques for capturing all data access and usage activity in real or near-real time, often with preventative capabilities to stop policy violations. Although activity monitoring controls may use log files, they typically include their own collection methods or agents for deeper activity details and more rapid monitoring. Activity monitoring tools also include policy-based alerting and blocking/enforcement that log management tools lack.

None of the controls in this category are cloud specific, but we have attempted to show how they can be adapted to the cloud. These first controls integrate directly with the cloud infrastructure:

  1. Database Activity Monitoring (DAM): Monitoring all database activity, including all SQL activity. Can be performed through network sniffing of database traffic, agents installed on the server, or external monitoring, typically of transaction logs. Many tools combine monitoring techniques, and network-only monitoring is generally not recommended. DAM tools are managed externally to the database to provide separation of duties from database administrators (DBAs). All DBA activity can be monitored without interfering with their ability to perform job functions. Tools can alert on policy violations, and some tools can block certain activity. Current DAM tools are not cloud specific, and thus are only compatible with environments where the tool can either sniff all network database access (possible in some IaaS deployments, or if provided by the cloud service), or where a compatible monitoring agent can be installed in the database instance.
  2. Application Activity Monitoring: Similar to Database Activity Monitoring, but at the application level. As with DAM, tools can use network monitoring or local agents, and can alert and sometimes block on policy violations. Web Application Firewalls are commonly used for monitoring web application activity, but cloud deployment options are limited. Some SaaS or PaaS providers may offer real time activity monitoring, but log files or dashboards are more common. If you have direct access to your cloud-based logs, you can use a near real-time log analysis tool and build your own alerting policies.
  3. File Activity Monitoring: Monitoring access and use of files in enterprise storage. Although there are no cloud specific tools available, these tools may be deployable for cloud storage that uses (or presents an abstracted version of) standard file access protocols. Gives an enterprise the ability to audit all file access and generate reports (which may sometimes aid compliance reporting). Capable of independently monitoring even administrator access and can alert on policy violations.

The next three tools are endpoint data security tools that are not cloud specific, but may still be useful in organizations that manage endpoints:

  1. Endpoint Activity Monitoring: Primarily a traditional data security tool, although it can be used to track user interactions with cloud services. Watching all user activity on a workstation or server. Includes monitoring of application activity; network activity; storage/file system activity; and system interactions such as cut and paste, mouse clicks, application launches, etc. Provides deeper monitoring than endpoint DLP/CMF tools that focus only on content that matches policies. Capable of blocking activities such as pasting content from a cloud storage repository into an instant message. Extremely useful for auditing administrator activity on servers, assuming you can install the agent. An example of cloud usage would be deploying activity monitoring agents on all endpoints in a customer call center that accesses a SaaS for user support.
  2. Portable Device Control: Another traditional data security tool with limited cloud applicability, used to restrict access of, or file transfers to, portable storage such as USB drives and DVD burners. For cloud security purposes, we only include tools that either track and enforce policies based on data originating from a cloud application or storage, or are capable of enforcing policies based on data labels provided by that cloud storage or application. Portable device control is also capable of allowing access but auditing file transfers and sending that information to a central management server. Some tools integrate with encryption to provide dynamic encryption of content passed to portable storage. Will eventually be integrated into endpoint DLP/CMF tools that can make more granular decisions based on the content, rather than blanket policies that apply to all data. Some DLP/CMF tools already include this capability.
  3. Endpoint DLP: Endpoint Data Loss Prevention/Content Monitoring and Filtering tools that monitor and restrict usage of data through content analysis and centrally administered policies. While current capabilities vary highly among products, tools should be able to monitor what content is being accessed by an endpoint, any file storage or network transmission of that content, and any transfer of that content between applications (cut/paste). For performance reasons endpoint DLP is currently limited to a subset of enforcement policies (compared to gateway products) and endpoint-only products should be used in conjunction with network protection in most cases (which we will discuss in the next phase of the lifecycle).

At this time, most activity monitoring and enforcement needs to be built into the cloud infrastructure to provide value. We often see some degree of application activity monitoring built into SaaS offerings, with some logging available for cloud databases and file storage. The exception is IaaS, where you may have full control to deploy any security tool you like, but will need to account for the additional complexities of deploying in virtual environments which impact the ability to route and monitor network traffic.

Rights Management

We covered the rights management options in the Create and Store sections. They are also a factor in the this phase (Use), since this is another point where they can be actively enforced during user interaction

In the Store phase rights are applied as data enters storage, and access limitations are enforced. In the Use phase, additional rights are controlled, such as data modification, export, or more-complex usage patterns (like printing or copying).

Logical Controls

Logical controls expand the brute-force restrictions of access controls or EDRM that are based completely on who you are and what you are accessing. Logical controls are implemented in applications and databases and add business logic and context to data usage and protection. Most data-security logic controls for cloud deployments are implemented in application logic (there are plenty of other logical controls available for other aspects of cloud computing, but we are focusing on data security).

  1. Application Logic: Enforcing security logic in the application through design, programming, or external enforcement. Logical controls are one of the best options for protecting data in any kind of cloud-based application.
  2. Object (Row) Level Security: Creating a ruleset restricting use of a database object based on multiple criteria. For example, limiting a sales executive to only updating account information for accounts assigned to his territory. Essentially, these are logical controls implemented at the database layer, as opposed to the application layer. Object level security is a feature of the Database Management System and may or may not be available in cloud deployments (it’s available in some standard DBMSs, but is not currently a feature of any cloud-specific database system).
  3. Structural Controls: Using database design features to enforce security. For example, using the database schema to limit integrity attacks or restricting connection pooling to improve auditability. You can implement some level of structural controls in any database with a management system, but more advanced structural options may only be available in robust relational databases. Tools like SimpleDB are quite limited compared to a full hosted DBMS. Structural controls are more widely available than object level security, and since they don’t rely on IP addresses or external monitoring they are a good option for most cloud deployments. They are particularly effective when designed in conjunction with application logic controls.

Application Security

Aside from raw storage or plain hosted database access, most cloud deployments involve enterprise applications. Effective application security is thus absolutely critical to protect data, and often far more important than any access controls or other protections. A full discussion of cloud application security issues is beyond the scope of this post, and we recommend you read the Cloud Security Alliance Guidance for more details.

Cloud SPI Tier Implications

Software as a Service (SaaS)

Most usage controls in SaaS deployments are enforced in the application layer, and depend on what’s available from your cloud provider. The provider may also enforce additional usage controls on their internal users, and we recommend you ask for documentation if it’s available. In particular, determine what kinds of activity monitoring they perform for internal users vs. cloud-based users, and if those logs are ever available (such as during the investigation of security incidents). We also often see label security in SaaS deployments.

Platform as a Service (PaaS)

Depending on your PaaS deployment, it’s likely that application logic will be your best security option, followed by activity monitoring. If your PaaS provider doesn’t provide the level of auditing you would like, you may be able to capture activity within your application before it makes a call to the platform, although this won’t capture any potential direct calls to the PaaS that are outside your application.

Infrastructure as a Service (IaaS)

Although IaaS technically offers the most flexibility for deploying your own security controls, the design of the IaaS may inhibit deployment of many security controls. For example, monitoring tools that rely on network access or sniffing may not be deployable. On the other hand, your IaaS provider may include security controls as part of the service, especially some degree of logging and/or monitoring.

Database control availability will depend more on the nature of the infrastructure – as we’ve mentioned, full hosted databases in the cloud can enforce many, if not all, of the traditional database security controls.

Endpoint-based usage controls are enforceable in managed environments, but are only useful in private cloud deployments where access to the cloud can be restricted to only managed endpoints.


Thursday, September 17, 2009

Cloud Data Security: Store (Rough Cut)

By Rich

In our last post in this series, we covered the cloud implications of the Create phase of the Data Security Cycle. In this post we’re going to move on to the Store phase. Please remember that we are only covering technologies at a high level in this series on the cycle; we will run a second series on detailed technical implementations of data security in the cloud a little later.


Store is defined as the act of committing digital data to structured or unstructured storage (database vs. files). Here we map the classification and rights to security controls, including access controls, encryption and rights management. I include certain database and application controls, such as labeling, in rights management – not just DRM. Controls at this stage also apply to managing content in storage repositories (cloud or traditional), such as using content discovery to ensure that data is in approved/appropriate repositories.

Steps and Controls

Access ControlsDBMS Access Controls
Administrator Separation of Duties
File System Access Controls
Application/Document Management System Access Controls
EncryptionField Level Encryption
Application Level Encryption
Transparent Database Encryption
Media Encryption
File/Folder Encryption
Virtual Private Storage
Distributed Encryption
Rights ManagementApplication Logic
Enterprise DRM
Content DiscoveryCloud-Provided Database Discovery Tool
Database Discovery/DAM
DLP/CMP Discovery
Cloud-Provided Content Discovery DLP/CMP Content Discovery

Access Controls

One of the most fundamental data security technologies, built into every file and management system, and one of the most poorly used. In cloud computing environments there are two layers of access controls to manage – those presented by the cloud service, and the underlying access controls used by the cloud provider for their infrastructure. It’s important to understand the relationship between the two when evaluating overall security – in some cases the underlying infrastructure may be more secure (no direct back-end access) whereas in others the controls may be weaker (a database with multiple-tenant connection pooling).

  1. DBMS Access Controls: Access controls within a database management system (cloud or traditional), including proper use of views vs. direct table access. Use of these controls is often complicated by connection pooling, which tends to anonymize the user between the application and the database. A database/DBMS hosted in the cloud will likely use the normal access controls of the DBMS (e.g., hosted Oracle or MySQL). A cloud-based database such as Amazon’s SimpleDB or Google’s BigTable comes with its own access controls. Depending on your security requirements, it may be important to understand how the cloud-based DB stores information, so you can evaluate potential back-end security issues.
  2. Administrator Separation of Duties: Newer technologies implemented in databases to limit database administrator access. On Oracle this is called Database Vault, and on IBM DB2 I believe you use the Security Administrator role and Label Based Access Controls. When evaluating the security of a cloud offering, understand the capabilities to limit both front and back-end administrator access. Many cloud services support various administrator roles for clients, allowing you to define various administrative roles for your own staff. Some providers also implement technology controls to restrict their own back-end administrators, such as isolating their database access. You should ask your cloud provider for documentation on what controls they place on their own administrators (and super-admins), and what data they can potentially access.
  3. File System Access Controls: Normal file access controls, applied at the file or repository level. Again, it’s important to understand the differences between the file access controls presented to you by the cloud service, vs. their access control implementation on the back end. There is an incredible variety of options across cloud providers, even within a single SPI tier – many of them completely proprietary to a specific provider. For the purposes of this model, we only include access controls for cloud based file storage (IaaS), and the back-end access controls used by the cloud provider. Due to the increased abstraction, everything else falls into the Application and Document Management System category.
  4. Application and Document Management System Access Controls: This category includes any access control restrictions implemented above the file or DBMS storage layers. In non-cloud environments this includes access controls in tools like SharePoint or Documentum. In the cloud, this category includes any content restrictions managed through the cloud application or service abstracted from the back-end content storage. These are the access controls for any services that allow you to manage files, documents, and other ‘unstructured’ content. The back-end storage can consist of anything from a relational database to flat files to traditional storage, and should be evaluated separately.

When designing or evaluating access controls you are concerned first with what’s available to you to control your own user/staff access, and then with the back end to understand who at your cloud provider can see what information. Don’t assume that the back end is necessarily less secure – some providers use techniques like bit splitting (combined with encryption) to ensure no single administrator can see your content at the file level, with strong separation of duties to protect data at the application layer.


The most overhyped technology for protecting data, but still one of the most important. Encryption is far from a panacea for all your cloud data security issues, but when used properly and in combination with other controls, it provides effective security. In cloud implementations, encryption may help compensate for issues related to multi-tenancy, public clouds, and remote/external hosting.

  1. Application-Level Encryption: Collected data is encrypted by the application, before being sent into a database or file system for storage. For cloud-based applications (e.g., public or private SaaS) this is usually the recommended option because it protects the data from the user all the way down to storage. For added security, the encryption functions and keys can be separated from the application itself, which also limits the access of application administrators to sensitive data.
  2. Field-Level Encryption: The database management system encrypts fields within a database, normally at the column level. In cloud implementations you will generally want to encrypt data at the application layer, rather than within the database itself, due to the complexity.
  3. Transparent Encryption: Encryption of the database structures, files, or the media where the database is stored. For database structures this is managed by the DBMS, while for files it can be the DBMS or third-party file encryption. Media encryption is managed at the storage layer; never by the DBMS. Transparent encryption protects the database data from unauthorized direct access, but does not provide any internal security. For example, you can encrypt a remotely hosted database to prevent local administrators from accessing it, but it doesn’t protect data from authorized database users.
  4. Media Encryption: Encryption of the physical storage media, such as hard drives or backup tapes. In a cloud environment, encryption of a complete virtual machine on IaaS could be considered media encryption. Media encryption is designed primarily to protect data in the event of physical loss/theft, such as a drive being removed from a SAN. It is often of limited usefulness in cloud deployments, although may be used by hosting providers on the back end in case of physical loss of media.
  5. File/Folder Encryption: Traditional encryption of specific files and folders in storage by the host platform.
  6. Virtual Private Storage: Encryption of files/folders in a shared storage environment, where the encryption/decryption is managed and performed outside the storage environment. This separates the keys and encryption from the storage platform itself, and allows them to be managed locally even when the storage is remote. Virtual Private Storage is an effective technique to protect remote data when you don’t have complete control of the storage environment. Data is encrypted locally before being sent to the shared storage repository, providing complete control of user access and key management. You can read more about Virtual Private Storage in our post.
  7. Distributed Encryption: With distributed encryption we use a central key management solution, but distribute the encryption engines to any end-nodes that require access to the data. It is typically used for unstructured (file/folder) content. When a node needs access to an encrypted file it requests a key from the central server, which provides it if the access is authorized. Keys are usually user or group based, not specific to individual files. Distributed encryption helps with the main problem of file/folder encryption, which is ensuring that everyone who needs it gets access to the keys. Rather than trying to synchronize keys continually in the background, they are provide at need.

Rights Management

The actual enforcement of rights assigned during the Create phase.

For descriptions of the technologies, please see the post on the Create phase. In future posts we will discuss cloud implementations of each of these technologies in greater detail.

Content Discovery

Content Discovery is the process of using content or context-based tools to find sensitive data in content repositories. Content aware tools use advanced content analysis techniques, such as pattern matching, database fingerprinting, and partial document matching to identify sensitive data inside files and databases. Contextual tools rely more on location or specific metadata, such as tags, and are thus better suited to rigid environments with higher assurance that content is labeled appropriately.

Discovery allows you to scan storage repositories and identify the location of sensitive data based on central policies. It’s extremely useful for ensuring that sensitive content is only located where the desired security controls are in place. Discovery is also very useful for supporting compliance initiatives, such as PCI, which restrict the usage and handling of specific types of data.

  1. Cloud-Provided Database Discovery Tool: Your cloud service provides features to locate sensitive data within your cloud database, such as locating credit card numbers. This is specific to the cloud provider, and we have no examples of current offerings.
  2. Database Discovery/DAM: Tools to crawl through database fields looking for data that matches content analysis policies. We most often see this as a feature of a Database Activity Monitoring (DAM) product. These tools are not cloud specific, and depending on your cloud deployment may not be deployable. IaaS environments running standard DBMS platforms (e.g., Oracle or MS SQL Server) may be supported, but we are unaware of any cloud-specific offerings at this time.
  3. Data Loss Prevention (DLP)/Content Monitoring and Protection (CMP) Database Discovery: Some DLP/CMP tools support content discovery within databases; either directly or through analysis of a replicated database or flat file dump. With full access to a database, such as through an ODBC connection, they can perform ongoing scanning for sensitive information.
  4. Cloud-Provided Content Discovery: A cloud-based feature to perform content discovery on files stored with the cloud provider.
  5. DLP/CMP Content Discovery: All DLP/CMP tools with content discovery features can scan accessible file shares, even if they are hosted remotely. This is effective for cloud implementations where the tool has access to stored files using common file sharing protocols, such as CIFS and WebDAV.

Cloud SPI Tier Implications

Software as a Service (SaaS)

As with most security aspects of SaaS, the security controls available depend completely on what’s provided by your cloud service. Front-end access controls are common among SaaS offerings, and many allow you to define your own groups and roles. These may not map to back-end storage, especially for services that allow you to upload files, so you should ask your SaaS provider how they manage access controls for their internal users.

Many SaaS offerings state they encrypt your data, but it’s important to understand just where and how it’s encrypted. For some services, it’s little more than basic file/folder or media encryption of their hosting platforms, with no restrictions on internal access. In other cases, data is encrypted using a unique key for every customer, which is managed externally to the application using a dedicated encryption/key management system. This segregates data between co-tenants on the service, and is also useful to restrict back-end administrative access. Application-level encryption is most common in SaaS offerings, and many provide some level of storage encryption on the back end.

Most rights management in SaaS uses some form of labeling or tagging, since we are generally dealing with applications, rather than raw data. This is the same reason we don’t tend to see content discovery for SaaS offerings.

Platform as a Service (PaaS)

Implementation in a PaaS environment depends completely on the available APIs and development environment.

When designing your PaaS-based application, determine what access controls are available and how they map to the provider’s storage infrastructure. In some cases application-level encryption will be an option, but make sure you understand the key management and where the data is encrypted. In some cases, you may be able to encrypt data on your side before sending it off to the cloud (for example, encrypting data within your application before making a call to store it in the PaaS).

As with SaaS, rights management and content discovery tend to be somewhat restricted in PaaS, unless the provider offers those features as part of the service.

Infrastructure as a Service (IaaS)

Your top priority for managing access controls in IaaS environments is to understand the mappings between the access controls you manage, and those enforced in the back-end infrastructure. For example, if you deploy a virtual machine into a public cloud, how are the access controls managed both for those accessing the machine from the Internet, and for the administrators that maintain the infrastructure? If another customer in the cloud is compromised, what prevents them from escalating privileges and accessing your content?

Virtual Private Storage is an excellent option to protect data that’s remotely hosted, even in a multi-tenant environment. It requires a bit more management effort, but the end result is often more secure than traditional in-house storage.

Content discovery is possible in IaaS deployments where common network file access protocols/methods are available, and may be useful for preventing unapproved use of sensitive data (especially due to inadvertent disclosure in public clouds).


Tuesday, September 08, 2009

Cloud Data Security Cycle: Create (Rough Cut)

By Rich

Last week I started talking about data security in the cloud, and I referred back to our Data Security Lifecycle from back in 2007. Over the next couple of weeks I’m going to walk through the cycle and adapt the controls for cloud computing. After that, I will dig in deep on implementation options for each of the potential controls. I’m hoping this will give you a combination of practical advice you can implement today, along with a taste of potential options that may develop down the road.

We do face a bit of the chicken and egg problem with this series, since some of the technical details of controls implementation won’t make sense without the cycle, but the cycle won’t make sense without the details of the controls. I decided to start with the cycle, and will pepper in specific examples where I can to help it make sense. Hopefully it will all come together at the end.

In this post we’re going to cover the Create phase:


Create is defined as generation of new digital content, either structured or unstructured, or significant modification of existing content. In this phase we classify the information and determine appropriate rights. This phase consists of two steps – Classify and Assign Rights.

Steps and Controls


div class=”bodyTable”>

ClassifyApplication Logic
Assign RightsLabel SecurityEnterprise DRM


Classification at the time of creation is currently either a manual process (most unstructured data), or handled through application logic. Although the potential exists for automated tools to assist with classification, most cloud and non-cloud environments today classify manually for unstructured or directly-entered database data, while application data is automatically classified by business logic. Bear in mind that these are controls applied at the time of creation; additional controls such as access control and encryption are managed in the Store phase. There are two potential controls:

  1. Application Logic: Data is classified based on business logic in the application. For example, credit card numbers are classified as such based on on field definitions and program logic. Generally this logic is based on where data is entered, or via automated analysis (keyword or content analysis)
  2. Tagging/Labeling: The user manually applies tags or labels at the time of creation e.g., manually tagging via drop-down lists or open fields, manual keyword entry, suggestion-assisted tagging, and so on.

Assign Rights

This is the process of converting the classification into rights applied to the data. Not all data necessarily has rights applied, in which cases security is provided through additional controls during later phases of the cycle. (Technically rights are always applied, but in many cases they are so broad as to be effectively non-existent). These are rights that follow the data, as opposed to access controls or encryption which, although they protect the data, are decoupled from its creation. There are two potential technical controls here:

  1. Label Security: A feature of some database management systems and applications that adds a label to a data element, such as a database row, column, or table, or file metadata, classifying the content in that object. The DBMS or application can then implement access and logical controls based on the data label. Labels may be applied at the application layer, but only count as assigning rights if they also follow the data into storage.
  2. Enterprise Digital Rights Management (EDRM): Content is encrypted, and access and use rights are controlled by metadata embedded with the content. The EDRM market has been somewhat self-limiting due to the complexity of enterprise integration and assigning and managing rights.

Cloud SPI Tier Implications

Software as a Service (SaaS)

Classification and rights assignment are completely controlled by the application logic implemented by your SaaS provider. Typically we see Application Logic, since that’s a fundamental feature of any application – SaaS or otherwise. When evaluating your SaaS provider you should ask how they classify sensitive information and then later apply security controls, or if all data is lumped together into a single monolithic database (or flat files) without additional labels or security controls to prevent leakage to administrators, attackers, or other SaaS customers.

In some cases, various labeling technologies may be available. You will, again, need to work with your potential SaaS provider to determine if these labels are used only for searching/sorting data, or if they also assist in the application of security controls.

Platform as a Service (PaaS)

Implementation in a PaaS environment depends completely on the available APIs and development environment. As with internal applications, you will maintain responsibility for how classification and rights assignment are managed.

When designing your PaaS-based application, identify potential labeling/classification APIs you can integrate into program logic. You will need to work with your PaaS provider to understand how they can implement security controls at both the application and storage layers – for example, it’s important to know if and how data is labeled in storage, and if this can be used to restrict access or usage (business logic).

Infrastructure as a Service (IaaS)

Classification and rights assignments depend completely on what is available from your IaaS provider. Here are some specific examples:

  • Cloud-based database: Work with your provider to determine if data labels are available, and with what granularity. If they aren’t provided, you can still implement them as a manual addition (e.g., a row field or segregated tables), but understand that the DBMS will not be enforcing the rights automatically, and you will need to program management into your application.
  • Cloud-based storage: Determine what metadata is available. Many cloud storage providers don’t modify files, so anything you define in an internal storage environment should work in the cloud. The limitation is that the cloud provider won’t be able to tie access or other security controls to the label, which is sometimes an option with document management systems. Enterprise DRM, for example, should work fine with any cloud storage provider.

This should give you a good idea of how to manage classification and rights assignment in various cloud environments. One exciting aspect is that use of tags, including automatically generated tags, is a common concept in the Web 2.0 world, and we can potentially tie this into our security controls. Users are better “trained” to tag content during creation with web-based applications (e.g., photo sharing sites & blogs), and we can take advantage of these habits to improve security.


Tuesday, September 01, 2009

Musings on Data Security in the Cloud

By Rich

So I’ve written about data security, and I’ve written about cloud security, thus it’s probably about time I wrote something about data security in the cloud.

To get started, I’m going to skip over defining the cloud. I recommend you take a look at the work of the Cloud Security Alliance, or skip on over to Hoff’s cloud architecture post, which was the foundation of the architectural section of the CSA work. Today’s post is going to be a bit scattershot, as I throw out some of the ideas rolling around my head from I thinking about building a data security cycle/framework for the cloud.

We’ve previously published two different data/information-centric security cycles. The first, the Data Security Lifecycle (second on the Research Library page) is designed to be a comprehensive forward-looking model. The second, The Pragmatic Data Security Cycle, is designed to be more useful in limited-scope data security projects. Together they are designed to give you the big picture, as well as a pragmatic approach for securing data in today’s resource-constrained environments. These are different than your typical Information Lifecycle Management cycles to reflect the different needs of the security audience.

When evaluating data security in the context of the cloud, the issues aren’t that we’ve suddenly blasted these cycles into oblivion, but that when and where you can implement controls is shifted, sometimes dramatically. Keep in mind that moving to the cloud is every bit as much an opportunity as a risk. I’m serious – when’s the last time you had the chance to completely re-architect your data security from the ground up?

For example, one of the most common risks cited when considering cloud deployment is lack of control over your data; any remote admin can potentially see all your sensitive secrets. Then again, so can any local admin (with access to the system). What’s the difference? In one case you have an employment agreement and their name, in the other you have a Service Level Agreement and contracts… which should include a way to get the admin’s name.

The problems are far more similar than they are different. I’m not one of those people saying the cloud isn’t anything new – it is, and some of these subtle differences can have a big impact – but we can definitely scope and manage the data security issues. And when we can’t achieve our desired level of security… well, that’s time to figure out what our risk tolerance is.

Let’s take two specific examples:

Protecting Data on Amazon S3 – Amazon S3 is one of the leading IaaS services for stored data, but it includes only minimal security controls compared to an internal storage repository. Access controls (which may not integrate with your internal access controls) and transit encryption (SSL) are available, but data is not encrypted in storage and may be accessible to Amazon staff or anyone who compromises your Amazon credentials. One option, which we’ve talked about here before, is Virtual Private Storage. You encrypt your data before sending it off to Amazon S3, giving you absolute control over keys and ACLs. You maintain complete control while still retaining the benefits of cloud-based storage. Many cloud backup solutions use this method.

Protecting Data at a SaaS Provider – I’d be more specific and list a SaaS provider, but I can’t remember which ones follow this architecture. With SaaS we have less control and are basically limited to the security controls built into the SaaS offering. That isn’t necessarily bad – the SaaS provider might be far more secure than you are – but not all SaaS offerings are created equal. To secure SaaS data you need to rely more on your contracts and an understanding of how your provider manages your data.

One architectural option for your SaaS provider is to protect your data with individual client keys managed outside the application (this is actually a useful internal data security architectural choice). It’s application-level encryption with external key management. All sensitive client data is encrypted in the SaaS provider’s database. Keys are managed in a dedicated appliance/service, and provided temporally to the application based on user credentials. Ideally the SaaS prover’s admins are properly segregated – where no single admin has database, key management, and application credentials. Since this potentially complicates support, it might be restricted to only the most sensitive data. (All your information might still be encrypted, but for support purposes could be accessible to the approved administrators/support staff). The SaaS provider then also logs all access by internal and external users.

This is only one option, but your SaaS provider should be able to document their internal data security, and even provide you with external audit reports.

As you can see, just because you are in the cloud doesn’t mean you completely give up any chance of data security. It’s all about understanding security boundaries, control options, technology, and process controls.

In future posts we’ll start walking through the Data Security Lifecycle and matching specific issues and control options in each phase against the SPI (SaaS, PaaS, IaaS) cloud models.


Thursday, June 04, 2009

Introduction To Database Encryption - The Reboot!

By Adrian Lane

Updated June 4th to reflect terminology change.

This is the Re-Introduction to our Database Encryption series. Why are we re-introducing this series? I’m glad you asked. The more we worked on the separation of duties and key management sections, the more dissatisfied we became. Rich and I got some really good feedback from vendors and end users, and we felt we were missing the mark with this series. And not just because the stuff I drafted when I was sick completely lacked clarity of thought, but there are three specific reasons we were unhappy. The advice we were giving was not particularly pragmatic, the terminology we thought worked didn’t, and we were doing a poor job of aligning end-user goals with available options. So yeah, this is an apology to our audience as the series was not up to our expectations and we failed to achieve some of our own Totally Transparent Research concepts. But we’re ‘fessing up to the problem and starting from scratch.

So we want to fix these things in two ways. First we want to change some of the terminology we have been using to describe database encryption. Using ‘media encryption’ and ‘separation of duties’ is confusing the issues, and we want to differentiate between the threat we are trying to protect against vs. what is being encrypted. And as we are talking to IT, developers, DBAs, and other audiences, we wanted to reduce confusion as much as possible. Second, we will create a simple guide for people to select a database encryption strategy that addresses their goals. Basically we are going to outline a decision tree of user requirements and map those to the available database encryption choices. Rich and I think that will aid end users to both clarify their goals and determine the correct implementation strategy.

In our original introduction we provided a clear idea of where we wanted to go with this series, but we did adopt our own terminology in order to better encapsulate the database encryption options vendors provide. We chose “Encryption for Separation of Duties” and “Encryption for Media Protection”. This is a bit of an oversimplification, and mapped to the threat rather than to the feature. Plus, if you asked your RDBMS vendor for ‘media encryption’, they would not know what they heck you were talking about. We are going to change the terminology back to the following:

  1. Database Transparent/External Encryption: Encryption of the entire database. This is provided by native encryption functions within the database. The goal is to prevent exposure of information due to loss of the physical media. This can also be done through drive or OS/file system encryption, although they lack some of the protections of native database encryption. The encryption is invisible to the application and does not require alterations to the code or schema.

  2. Data User Encryption: Encrypting specific columns, tables, or even data elements in the database. The classic example is credit card numbers. The goal is to provide protection against inadvertent disclosure, or to enforce separation of duties. How this is accomplished will depend upon how key management is utilized and (internal/external) encryption services, and will affect the way the application uses the database, but provides more granular access control.

While we’re confident we’ve described the two options accurately, we’re not convinced the specific terms “database encryption” and “data encryption” are necessarily the best, so please suggest any better options.

Blanket encryption of all database content for media protection is much easier than encrypting specific columns & tables for separation of duties, but it doesn’t offer the same security benefits. Knowing which to choose will depend upon three things:

  • What do you want to protect?
  • What do you want to protect it from?
  • What application changes and management tasks will you tolerate?

Thus, the first thing we need to decide when looking at database encryption is what are we trying to protect and why. If we’re just going after the ‘PCI checkbox’ or are worried about losing data from swapping out hard drives, someone stealing the files off the server, or misplacing backup tapes, then database encryption (for media protection) is our answer. If the goal is to protect data in the event of compromised accounts, rogue DBAs, or inadvertent disclosure; then things get a lot more complicated. We will go into the details of ‘why’ and ‘how’ in a future post, as well as the issues of application alterations, after we have introduced the decision tree overview. If you have any comments, good, bad, or indifferent, please share. As always, we want the discussion to be as open as possible.

–Adrian Lane

Monday, June 01, 2009

The State of Web Application and Data Security—Mid 2009

By Rich

One of the more difficult aspects of the analyst gig is sorting through all the information you get, and isolating out any inherent biases. The kinds of inquiries we get from clients can all too easily skew our perceptions of the industry, since people tend to come to us for specific reasons, and those reasons don’t necessarily represent the mean of the industry. Aside from all the vendor updates (and customer references), our end user conversations usually involve helping someone with a specific problem – ranging from vendor selection, to basic technology education, to strategy development/problem solving. People call us when they need help, not when things are running well, so it’s all too easy to assume a particular technology is being used more widely than it really is, or a problem is bigger or smaller than it really is, because everyone calling us is asking about it. Countering this takes a lot of outreach to find out what people are really doing even when they aren’t calling us.

Over the past few weeks I’ve had a series of opportunities to work with end users outside the context of normal inbound inquiries, and it’s been fairly enlightening. These included direct client calls, executive roundtables such as one I participated in recently with IANS (with a mix from Fortune 50 to mid-size enterprises), and some outreach on our part. They reinforced some of what we’ve been thinking, while breaking other assumptions. I thought it would be good to compile these together into a “state of the industry” summary. Since I spend most of my time focused on web application and data security, I’ll only cover those areas:


When it comes to web application and data security, if there isn’t a compliance requirement, there isn’t budget – Nearly all of the security professionals we’ve spoken with recognize the importance of web application and data security, but they consistently tell us that unless there is a compliance requirement it’s very difficult for them to get budget. That’s not to say it’s impossible, but non-compliance projects (however important) are way down the priority list in most organizations. In a room of a dozen high-level security managers of (mostly) large enterprises, they all reinforced that compliance drove nearly all of their new projects, and there was little support for non-compliance-related web application or data security initiatives. I doubt this surprises any of you.

“Compliance” may mean more than compliance – Activities that are positioned as helping with compliance, even if they aren’t a direct requirement, are more likely to gain funding. This is especially true for projects that could reduce compliance costs. They will have a longer approval cycle, often 9 months or so, compared to the 3-6 months for directly-required compliance activities. Initiatives directly tied to limiting potential data breach notifications are the most cited driver. Two technology examples are full disk encryption and portable device control.

PCI is the single biggest compliance driver for web application and data security – I may not be thrilled with PCI, but it’s driving more web application and data security improvements than anything else.

The term Data Loss Prevention has lost meaningI discussed this in a post last week. Even those who have gone through a DLP tool selection process often use the term to encompass more than the narrow definition we prefer.

It’s easier to get resources to do some things manually than to buy a tool – Although tools would be much more efficient and effective for some projects, in terms of costs and results, manual projects using existing resources are easier to get approval for. As one manager put it, “I already have the bodies, and I won’t get any more money for new tools.” The most common example cited was content discovery (we’ll talk more about this a few points down).

Most people use DLP for network (primarily email) monitoring, not content discovery or endpoint protection – Even though we tend to think discovery offers equal or greater value, most organizations with DLP use it for network monitoring.

Interest in content discovery, especially DLP-based, is high, but resources are hard to get for discovery projects – Most security managers I talk with are very interested in content discovery, but they are less educated on the options and don’t have the resources. They tell me that finding the data is the easy part – getting resources to do anything about it is the limiting factor.

The Web Application Firewall (WAF) market and Security Source Code Tools markets are nearly equal in size, with more clients on WAFs, and more money spent on source code tools per client – While it’s hard to fully quantify, we think the source code tools cost more per implementation, but WAFs are in slightly wider use.

WAFs are a quicker hit for PCI compliance – Most organizations deploying WAFs do so for PCI compliance, and they’re seen as a quicker fix than secure source code projects.

Most WAF deployments are out of band, and false positives are a major problem for default deployments – Customers are installing WAFs for compliance, but are generally unable to deploy them inline (initially) due to the tuning requirements.

Full drive encryption is mature, and well deployed in the early mainstream – Full drive encryption, while not perfect, is deployable in even large enterprises. It’s now considered a level-setting best practice in financial services, and usage is growing in healthcare and insurance. Other asset recovery options, such as remote data destruction and phone home applications, are now seen as little more than snake oil. As one CISO told us, “I don’t care about the laptop, we just encrypt it and don’t worry about it when it goes missing”.

File and folder encryption is not in wide use – Very few organizations are performing any wide scale file/folder encryption, outside of some targeted encryption of PII for compliance requirements.

Database encryption is hard, and not widely used – Most organizations are dissatisfied with database encryption options, and do not deploy it widely. Within a large organization there is likely some DB encryption, with preference given to file/folder/media protection over column level encryption, but most organizations prefer to avoid it. Performance and key management are cited as the primary obstacles, even when using native tools. Current versions of database encryption (primarily native encryption) do perform better than older versions, but key management is still unsatisfactory. Large encryption projects, when initiated, take an average of 12-18 months.

Large enterprises prefer application-level encryption of credit card numbers, and tokenization – When it comes to credit card numbers, security managers prefer to encrypt it at the application level, or consolidate numbers into a central source, using representative “tokens” throughout the rest of the application stack. These projects take a minimum of 12-18 months, similar to database encryption projects (the two are often tied together, with encryption used in the source database).

Email encryption and DRM tend to be workgroup-specific deployments – Email encryption and DRM use is scattered throughout the industry, but is still generally limited to workgroup-level projects due to the complexity of management, or lack of demand/compliance from users.

Database Activity Monitoring usage continues to grow slowly, mostly for compliance, but not quickly enough to save lagging vendors – Many DAM deployments are still tied to SOX auditing, and it’s not as widely used for other data security initiatives. Performance is reasonable when you can use endpoint agents, which some DBAs still resist. Network monitoring is not seen as effective, but may still be used when local monitoring isn’t an option. Network requirements, depending on the tool, may also inhibit deployments.

My main takeaway is that security managers know what they need to do to protect information assets, but they lack the time, resources, and management support for many initiatives. There is also broad dissatisfaction with security tools and vendors in general, in large part due to poor expectation setting during the sales process, and deliberately confusing marketing. It’s not that the tools don’t work, but that they’re never quite as easy as promised.

It’s an interesting dilemma, since there is clear and broad recognition that data security (and by extension, web application security) is likely our most pressing overall issue in terms of security, but due to a variety of factors (many of which we covered in our Business Justification for Data Security paper), the resources just aren’t there to really tackle it head-on.


Thursday, May 21, 2009

The Pragmatic Data (Information-Centric) Security Cycle

By Rich

Way back when I started Securosis, I came up with something called the Data Security Lifecycle, which I later renamed the Information-Centric Security Cycle. While I think it does a good job of capturing all the components of data security, it’s also somewhat dense. That lifecycle was designed to be a comprehensive outline of protective controls and information management, but I’ve since realized that if you have a specific data security problem, it isn’t the best place to start.

In a couple weeks I’ll be speaking at the TechTarget Financial Information Security Decisions conference in New York, where I’m presenting Pragmatic Data Security. By “pragmatic” I mean something you can implement as soon as you get home. Where the lifecycle answers the question, “How can I secure all my data throughout its entire lifecycle?” pragmatic data security answers, “How can I protect this specific data at this point in time, in my existing environment?”

It starts with a slimmed down cycle:


  1. Define what information you want to protect (specifically, not general data classification)
  2. Discover where it’s located (various tools/techniques, preferably automated, like DLP, rather than manual)
  3. Secure the data where it’s stored, and/or eliminate data where it shouldn’t be (access controls, encryption)
  4. Monitor data usage (various tools, including DLP, DAM, logs, SIEM)
  5. Protect the data from exfiltration (DLP, USB control, email security, web gateways, etc.)

For example, if you want to protect credit card numbers you’d define them in step 1, use DLP content discovery in step 2 to locate where they are stored, remove it or lock the repositories down in step 3, use DAM and DLP to monitor where they’re going in step 4, and use blocking technologies to keep them from leaving the organization in step 5.

All too often I’m seeing people get totally wrapped up in complex “boil the ocean” projects that never go anywhere, vs. defining and solving a specific problem. You don’t need to start your entire data security program with some massive data classification program. Pick one defined type of data/information, and just go protect it. Find it, lock it down, watch how it’s being used, and stop it from going where you don’t want.

Yeah, parts are hard, but hard != impossible. If you keep your focus, any hard problem is just a series of smaller, defined steps.


Wednesday, May 20, 2009

Security Requirements for Electronic Medical Records

By Rich

Although security is my chosen profession, I’ve been working in and around the healthcare industry for literally my entire life. My mother was (is) a nurse and I grew up in and around hospitals. I later became an EMT, then paramedic, and still work in emergency services on the side. Heck, even my wife works in a hospital, and one of my first security gigs was analyzing a medical benefits system, while another was as a contract CTO for an early stage startup in electronic medical records/transcription.


The value of moving to consistent electronic medical records is nearly incalculable. You would probably be shocked if you saw how we perform medical studies and analyze real-world medical treatments and outcomes. It’s so bass-ackwards, considering all the tech tools available today, that the only excuse is insanity or hubris. I mean there are approved drugs used in Advanced Cardiac Life Support where the medical benefits aren’t even close to proven. Sometimes it’s almost as much guesswork as trying to come up with a security ROI. There’s literally a category of drugs that’s pretty much, “well, as long as they are really dead this probably won’t hurt, but it probably won’t help either”.

With good electronic medical records, accessible on a national scale, we’ll gain an incredible ability to analyze symptoms, illnesses, treatments, and outcomes on a massive scale. It’s called evidence-based medicine, and despite what a certain political party is claiming, it has nothing to do with the government telling doctors what to do. Unless said doctors are idiots who prefer not to make decisions based on science, not that your doctor would ever do that.

The problem is while most of us personally don’t have any interest in the x-rays of whatever object happened to embed itself in your posterior when you slipped and fell on it in the bathroom, odds are someone wouldn’t mind uploading it… somewhere. Never mind insurance companies, potential employers, or that hot chick in the bar you’ve convinced those are just “love bumps”, and you were born with them.

Securing electronic medical records is a nasty problem for a few reasons:

  • They need to be accessible by any authorized medical provider in a clinical setting… quickly and easily. Even when you aren’t able to manually authorize that particular provider (like me when I roll up in an ambulance).
  • To be useful on a personal level, they need to be complete, portable, and standardized.
  • To be useful on a national level, they need to be complete, standardized, and accessible, yet anonymized.

While delving into specific technologies is beyond the scope of this post, there are specific security requirements we need to include in records systems to protect patient privacy, while enabling all the advantages of moving off paper. Keep in mind these recommendations are specific to electronic medical records systems (EMR) (also called CPR for Computerized Patient Records) – not every piece of IT that touches a record, but doesn’t have access to the main patient record.

  1. Secure Authentication: You might call this one a no-brainer, but despite HIPAA we still see rampant reuse of credentials, and weak credentials, in many different medical settings. This is often for legitimate reasons, since many EMR systems are programmed like crap and are hard to use in clinical settings. That said, we have options that work, and any time a patient record is viewed (as opposed to adding info like test results or images) we need stronger authentication tied to a specific, vetted individual.
  2. Secure Storage: We’re tired of losing healthcare records on lost hard drives or via hacking compromises of the server. Make it stop. Please. (Read all our other data security posts for some ideas).
  3. Robust Logging and Activity Monitoring: When records are accessed, a full record of who did what, and when, needs to be recorded. Some systems on the market do this, but not all of them. Also, these monitoring controls are easily bypassed by direct database access, which is rampant in the healthcare industry. These guys run massive amounts of shitty applications and rely heavily on vendor support, with big contracts and direct database access. That might be okay for certain systems, but not for the EMR.
  4. Anomaly Detection: Unusual records access shouldn’t just be recorded, but must generate a security alert (which is generally a manual review process today). An example alert might be when someone in radiology views a record, but no radiological order was recorded, or that individual wasn’t assigned to the case.
  5. Secure Exchange: I doubt our records will reside on a magical RFID implanted in our chests (since arms are easy to lose, in my experience) so we always have them with us. They will reside in a series of systems, which hopefully don’t involve Google. Our healthcare providers will exchange this information, and it’s possible no complete master record will exist unless some additional service is set up. That’s okay, since we’ll have collections of fairly complete records, with the closest thing to a master record likely (and somewhat unfortunately) managed by our insurance company. While we have some consistent formats for exchanging this data (HL7), there isn’t any secure exchange mechanism. We’ll need some form of encryption/DRM… preferably a national/industry standard.
  6. De-Identification: Once we go to collect national records (or use the data for other kinds of evidence-based studies) it needs to be de-identified. This isn’t just masking a name and SSN, since other information could easily enable inference attacks. But at a certain point, we may de-identify data so much that it blocks inference attacks, but ruins the value of the data. It’s a tough balance, which may result in tiers of data, depending on the situation.

In terms of direct advice to those of you in healthcare, when evaluating an EMR system I recommend you focus on evaluating the authentication, secure storage, logging/monitoring, and anomaly detection/alerting first. Secure exchange and de-identification come into play when you start looking at sharing information.


Tuesday, May 12, 2009

The Data Breach Triangle

By Rich

I’d like to say I first became familiar with fire science back when I was in the Boulder County Fire Academy, but it really all started back in the Boy Scouts. One of the first things you learn when you’re tasked with starting, or stopping, fires is something known as the fire triangle. Fire is a pretty fascinating process when you dig into it. It demonstrates many of the characteristics of life (consumption, reproduction, waste production, movement), but is just a nifty chemical reaction that’s all sorts of fun when you’re a kid with white gas and a lighter (sorry Mom). The fire triangle is a simple model used to describe the elements required for fire to exist: heat, fuel, and oxygen. Take away any of the three, and fire can’t exist. (In recent years the triangle was updated to a tetrahedron, but since that would ruin my point, I’m ignoring it). In wildland fires we create backburns to remove fuel, in structure fires we use water to remove heat, and with fuel fires we use chemical agents to remove oxygen.

With all the recent breaches, I came up with the idea of a Data Breach Triangle to help prioritize security controls. The idea is that, just like fire, a breach needs three elements. Remove any of them and the breach is prevented. It consists of:


  • Data: The equivalent of fuel – information to steal or misuse.
  • Exploit: The combination of a vulnerability and/or an exploit path to allow an attacker unapproved access to the data.
  • Egress: A path for the data to leave the organization. It could be digital, such as a network egress, or physical, such as portable storage or a stolen hard drive.

Our security controls should map to the triangle, and technically only one side needs to be broken to prevent a breach. For example, encryption or data masking removes the data (depending a lot on the encryption implementation). Patch management and proactive controls prevent exploits. Egress filtering or portable device control prevents egress. This assumes, of course, that these controls actually work – which we all know isn’t always the case.

When evaluating data security I like to look for the triangle – will the controls in question really prevent the breach? That’s why, for example, I’m a huge fan of DLP content discovery for data cleansing – you get to ignore a whole big chunk of expensive security controls if there’s no data to steal. For high-value networks, egress filtering is a key control if you can’t remove the data or absolutely prevent exploits (exploits being the toughest part of the triangle to manage).

The nice bit is that exploit management is usually our main focus, but breaking the other two sides is often cheaper and easier.


Saturday, March 07, 2009

Director of National Cyber-Security Center Resigns

By Adrian Lane

A couple days ago I posted some thoughts on Data Security and the US Government, how I perceive the role of Cybersecurity, and what I suspected would be a difficult challenge as the Cybersecurity team was set up at cross-purposes with the intelligence community. Today the Wall Street Journal released an article on the resignation of National Cybersecurity Chief Rod Beckstrom. In a case of “even a blind squirrel occasionally finds a nut”, my estimate of internal conflict appears to already be going on. In his resignation letter, Mr. Beckstrom stated that the “NSA currently dominates most national cyber efforts” and “The intelligence culture is very different than a network operations or security culture”. The WSJ focuses on privacy and separation of power issues with additional comments from Mr. Beckstrom: “the threats to our democratic process … if all top level network security and monitoring are handled by any one organization”.

The resignation letter has a different feel and focus, pointing out that there was a general lack of support for the NCSC, and the specific ways Beckstrom feels his organizations was subjugated. If you have interest in this subject, you will want to read his resignation letter, as it contains more information. It also lists a couple methods by which the NSA can subtly (sneakily?) affect the effectiveness of Cybersecurity efforts that I did not mention in my post. Quite frankly I am surprised that the National Cybersecurity Center could somehow manage to only get 5 fully funded days of operation, but if true, this demonstrates the challenges faced by NCSC.

This could get ugly unless both sides understand that each organization can benefit the other, and realize the goals and agendas do not necessarily need to be at the expense of each other. Concessions have to be made, otherwise this is an expensive and ugly turf war and the entire security problem- which is quickly becoming a US government security problem- continues to fester.

–Adrian Lane

Wednesday, February 25, 2009

Is There Any DLP or Data Security On Mac/Linux?

By Rich

Had a very interesting call today with a client in the pharma research space. They would like to protect clinical study data as it moves to researcher’s computers, but are struggling with the best approach. On the call, I quickly realized that DLP, or a content tracking tool like Verdasys (who also does endpoint DLP) would be ideal. The only problem? They need Windows, Mac, and Linux support.200902241153.jpg

I couldn’t remember offhand of any DLP/tracking tool (or even DRM) that will work on all 3 platforms. This is an open call for you vendors to hit me up if you can help.

For you end users, where we ended up was with a few potential approaches:

  1. Switch to a remote virtual/hosted desktop for handling the sensitive data… such as Citrix or VMWare.
  2. Use Database Activity Monitoring to track who pulls the data.
  3. Endpoint encryption to protect the data from loss, but it won’t help when it’s moved to inappropriate locations.
  4. Network DLP to track it in email, but without the endpoint coverage it leaves a really big hole.
  5. Content discovery to keep some minimal tracking where it ends up (for managed systems), but that means opening up SMB/CIFS file sharing on the endpoint for admin access, which is in itself a security risk.
  6. Distributed encryption, which *does* have cross platform support, but still doesn’t stop the researcher from putting the data someplace it shouldn’t be, which is their main concern.

While this is one of those industries (research) with higher Mac/cross platform use than the average business, this is clearly a growing problem thanks to the consumerization of IT.

This situation also highlights how no single-channel solution can really protect data well. It’s the mix of network, endpoint, and discovery that really allows you to reduce risk without killing business process.


Saturday, February 21, 2009

Will This Be The Next PCI Requirement Addition?

By Rich

I’m almost willing to bet money on this one…

Due to the nature of the recent breaches, such as Hannaford, where data was exfiltrated over the network, I highly suspect we will see outbound monitoring and/or filtering in the next revision of the PCI DSS. For more details on what I mean, refer back to this post.

Consider this your first warning.


Thursday, February 12, 2009

Recent Data Breaches- How To Limit Malicious Outbound Connections

By Rich

Word is slowly coming through industry channels that the attackers in the Heartland breach exfiltrated sniffed data via an outbound network connection. While not surprising, I did hear that the connection wasn’t encrypted- the bad guys sent the data out in cleartext (I’ll leave it to the person who passed this on to identify themselves if they want). Rumor from 2 independent sources is the bad guys are an organized group out of St. Petersburg (yes, Russia, as cliche as that is).

This is similar to a whole host of breaches- including (probably) TJX. While I’m not so naive as to think you can stop all malicious outbound connections, I do think there’s a lot we can do to make life harder on the bad guys. Endless Hole, Alaskan Glacier

First, you need to lock down your outbound connections using a combination of current and next-generation firewalls. You should isolate out your transaction network to enforce tighter controls on it than on the rest of your business network. Traditional firewalls can lock down most outbound port/protocols, but struggle with nested/stealth channels or all the stuff shoveled over port 80. Next-gen firewalls and web gateways (I hate the name, but don’t have a better one) like Palo Alto Networks or Mi5 Networks can help. Regular web gateways (Websense and McAfee/Secure Computing) are also good, but vary more on their outbound control capabilities and tend to be more focused on malware prevention (not counting their DLP products, which we’ll talk about in a second).

The web gateway and next gen firewalls will focus on your overall network, while you can lock of the transaction side with tighter traditional firewall rules and segmenting that thing off.

Next, use DLP to sniff for outbound cardholder data. The bad guys don’t seem to be encrypting, and DLP will alert on that in a heartbeat (and maybe block it, depending on the channel). You’ll want to proxy with your web gateway to sniff SSL (and only some web gateways can do this) and set the DLP to alert on unauthorized encryption usage. That might be a real pain in the ass, if you have a lot of unmanaged encryption outside of SSL. Also, to do the outbound SSL proxy you need to roll out a gateway certificate to all your endpoints and suppress browser alerts via group policies.

I also recommend DLP content discovery to reduce where you have unencrypted stored data (yes, you do have it, even if you think you don’t).

As you’ve probably figured out by now, if you are starting from scratch some of this will be very difficult to implement on an existing network, especially one that hasn’t been managed tightly. Thus I suggest you focus on any of your processing/transaction paths and start walling those off first. In the long run, that will reduce both your risks and your compliance and audit costs.


Friday, February 06, 2009

The Business Justification for Data Security- Version 1.0

By Rich

We’ve been teasing you with previews, but rather than handing out more bits and pieces, we are excited to release the complete version of the Business Justification for Data Security.

This is version 1.0 of the report, and we expect it to continue to evolve as we get more public feedback. Based on some of that initial feedback, we’d like to emphasize something before you dig in. Keep in mind that this is a business justification tool, designed to help you align potential data security investments with business needs, and to document the justification to make a case with those holding the purse strings. It’s not meant to be a complete risk assessment model, although it does share many traits with risk management tools.

We’ve also designed this to be both pragmatic and flexible- you shouldn’t need to spend months with consultants to build your business justification. For some projects, you might complete it in an hour. For others, maybe a few days or weeks as you wrangle business unit heads together to force them to help value different types of information.

For those of you that don’t want to read a 38 page paper we’re going to continue to post the guts of the model as blog posts, and we also plan on blogging additional content, such as more examples and use cases.

We’d like to especially thank our exclusive sponsor, McAfee, who also set up a landing page here with some of their own additional whitepapers and content. As usual, we developed the content completely independently, and it’s only thanks to our sponsors that we can release it for free (and still feed our families). This paper is also released in cooperation with the SANS Institute, will be available in the SANS Reading Room, and we will be delivering a SANS webcast on the topic on March 17th.

This was one of our toughest projects, and we’re excited to finally get it out there. Please post your feedback in the comments, and we will be crediting reviewers that advance the model when we release the next version.

And once again, thanks to McAfee, SANS, and (as usual) Chris Pepper, our fearless editor.


Wednesday, January 28, 2009

The Business Justification For Data Security: Data Valuation

By Rich

Man, nothing feels better than finishing off a few major projects. Yesterday we finalized the first draft of the Business Justification paper this series is based on, and I also squeezed out my presentation for IT Security World (in March) where I’m talking about major enterprise software security. Ah, the thrills and spills of SAP R/3 vs. Netweaver security!

In our first post we provided an overview of the model. Today we’re going to dig into the first step- data valuation. For the record, we’re skipping huge chunks of the paper in these posts to focus on the meat of the model- and our invitation for reviewers is still open (official release date should be within 2 weeks).

We know our data has value, but we can”t assign a definitive or fixed monetary value to it. We want to use the value to justify spending on security, but trying to tie it to purely quantitative models for investment justification is impossible. We can use educated guesses but they”re still guesses, and if we pretend they are solid metrics we”re likely to make bad risk decisions. Rather than focusing on difficult (or impossible) to measure quantitative value, let”s start our business justification framework with qualitative assessments. Keep in mind that just because we aren”t quantifying the value of the data doesn’t mean we won”t use other quantifiable metrics later in the model. Just because you cannot completely quantify the value of data, that doesn’t mean you should throw all metrics out the window.

To keep things practical, let”s select a data type and assign an arbitrary value to it. To keep things simple you might use a range of numbers from 1 to 3, or “Low”, “Medium”, and “High” to represent the value of the data. For our system we will use a range of 1-5 to give us more granularity, with 1 being a low value and 5 being a high value.

Another two metrics help account for business context in our valuation: frequency of use and audiences. The more often the data is used, the higher its value (generally). The audience may be a handful of people at the company, or may be partners & customers as well as internal staff. More use by more people often indicates higher value, as well as higher exposure to risk. These factors are important not only for understanding the value of information, but also the threats and risks associated with it – and so our justification for expenditures. These two items will not be used as primary indicators of value, but will modify an “intrinsic” value we will discuss more thoroughly below. As before, we will assign each metric a number from 1 to 5 , and we suggest you at least loosely define the scope of those ranges. Finally, we will examine three audiences that use the data: employees, customers, and partners; and derive a 1-5 score.

The value of some data changes based on time or context, and for those cases we suggest you define and rate it differently for the different contexts. For example, product information before product release is more sensitive than the same information after release.

As an example, consider student records at a university. The value of these records is considered high, and so we would assign a value of five. While the value of this data is considered “High” as it affects students financially, the frequency of use may be moderate because these records are accessed and updated mostly during a predictable window – at the beginning and end of each semester. The number of audiences for this data is two, as the records are used by various university staff (financial services and the registrar”s office), and the student (customer). Our tabular representation looks like this:


p style=”font: 12.0px Helvetica; min-height: 14.0px”>





Student Record




In our next post (later today) we’ll give you more examples of how this works.