
Data Loss Prevention

Tuesday, June 19, 2012

New Paper: Implementing and Managing a DLP Solution

By Rich

Yes, folks, at long last, here is my follow-up to Understanding and Selecting a DLP Solution.

As you might guess from the title, this one is focused on implementation and management. After you have picked a tool, this will help you get up and running, and then keep it running, with as little overhead as possible.

I would like to thank McAfee for licensing the paper and making it possible for us to give this stuff out for free (and by now we hope you’ve figured out that all the content is developed independently and objectively). McAfee is hosting the paper and you can download it from us:


Monday, March 19, 2012

iOS Data Security: Protecting Data on Unmanaged Devices

By Rich

There are a whole spectrum of options available for securing enterprise data on iOS, depending on how much you want to manage the device and the data. ‘Spectrum’ isn’t quite the right word, though, because these options aren’t on a linear continuum – instead they fall into three major buckets:

  1. Options for unmanaged devices
  2. Options for partially managed devices
  3. Options for fully managed devices

Here’s how we define these categories:

  • Unmanaged devices are fully in the control of the end user. No enterprise policies are enforced, and the user can install anything and otherwise use the device as they please.
  • Partially managed devices use a configuration profile or Exchange ActiveSync policies to manage certain settings, but the user is otherwise still in control of the device. The device is the user’s, but they agreed to some level of corporate management. They can install arbitrary applications and change most settings. Typical policies require them to use a strong passcode and enable remote wipe by the enterprise. They may also need to use an on-demand VPN for at least some network traffic (e.g., to the enterprise mail server and intranet web services), but the user’s other traffic goes unmonitored through whatever network connection they are currently using.
  • Fully managed devices also use a configuration profile, but are effectively enterprise-owned. The enterprise controls what apps can be installed, enforces an always-on VPN that the user can’t disable, and has the ability to monitor and manage all traffic to and from the device.
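As an aside, here is roughly how the passcode requirements described for partially managed devices translate into a configuration profile payload. This is an illustrative sketch using Python's plistlib; the payload keys follow Apple's passcode-policy payload, but the identifiers, UUIDs, and values are made up, so treat it as a sketch rather than a production profile.

```python
# Sketch: building a minimal passcode-policy payload for an iOS
# configuration profile with Python's plistlib. Key names follow
# Apple's passcode-policy payload; identifiers and values here are
# hypothetical.
import plistlib

passcode_payload = {
    "PayloadType": "com.apple.mobiledevice.passwordpolicy",
    "PayloadVersion": 1,
    "PayloadIdentifier": "com.example.passcode",  # hypothetical identifier
    "PayloadUUID": "00000000-0000-0000-0000-000000000001",
    "forcePIN": True,         # require a passcode
    "allowSimple": False,     # simple (4-digit) passcode: off
    "maxInactivity": 5,       # lock after 5 minutes of inactivity
    "maxFailedAttempts": 10,  # wipe after 10 failed attempts
}

profile = {
    "PayloadType": "Configuration",
    "PayloadVersion": 1,
    "PayloadIdentifier": "com.example.profile",
    "PayloadUUID": "00000000-0000-0000-0000-000000000002",
    "PayloadContent": [passcode_payload],
}

# Serialized XML plist, ready to write out as a .mobileconfig file
mobileconfig = plistlib.dumps(profile)
```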

Some options fall into multiple categories, so we will start with the least protected and work our way up the hierarchy. We will indicate which options carry forward and will work in the higher (tighter) buckets.

Note: This series is focused exclusively on data security. We will not discuss mobile device management in general, or the myriad of other device management options!

With that reminder, let’s start with a brief discussion of your data protection options for the first bucket:

Unmanaged Devices

Unmanaged devices are completely under the user’s control, and the enterprise is unable to enforce any device policies. This means no configuration profiles and no Exchange ActiveSync policies to enforce device settings such as passcode requirements.

User managed security with written policies

Under this model you don’t restrict data or devices in any way, but institute written policies requiring users to protect data on the devices themselves. It isn’t the most secure option, but we are nothing if not comprehensive.

Basic policies should include the following:

  • Require Passcode: After n minutes
  • Simple Passcode: OFF
  • Erase Data: ON

Additionally we highly recommend you enable some form of remote wipe – either the free Find My iPhone, Exchange ActiveSync, or a third-party app.

These settings enable data protection and offer the highest level of device security possible without additional tools, but they aren’t generally sufficient for an enterprise or anything other than the smallest businesses.

We will discuss policies in more detail later, but make sure the user signs a mobile device policy saying they agree to these settings, then help them get the device configured. But, if you are reading this paper, this is not a good option for you.

No access to enterprise data

While it might seem obvious, your first choice is to completely exclude iOS devices. Depending on how your environment is set up, this might actually be difficult. There are a few key areas you need to check to ensure an iOS device won’t slip through:

  • Email server: if you support IMAP/POP or even Microsoft Exchange mailboxes, and the user knows the right server settings and you haven’t implemented any preventative controls, they will be able to access email from their iPhone or iPad. There are numerous ways to prevent this (too many to cover in this post), but as a rule of thumb, if the device can access the server and you don’t have per-device restrictions, there is usually nothing to prevent them from getting email on the iDevice.
  • File servers: like email servers, if you allow the device to connect to the corporate network and have open file shares, the user can access the content. There are plenty of file access clients in the App Store capable of accessing most server types. If you rely on username and password protection (as opposed to network credentials) then the user can fetch content to their device.
  • Remote access: iOS includes decent support for a variety of VPNs. Unless you use certificate or other device restrictions, and especially if your VPN is based on a standard like IPSec, there is nothing to prevent the end user from configuring the VPN on their device. Don’t assume users won’t figure out how to VPN in, even if you don’t provide direct support.

To put this in perspective, in the Securosis environment we allow extensive use of iOS. We didn’t have to configure anything special to support iOS devices – we simply had to not configure anything to block them.

Email access with server-side data loss prevention (DLP)

With this option you allow users access to their enterprise email, but you enforce content-based restrictions using DLP to filter messages and attachments before they reach the devices.

Most DLP tools filter at the mail gateway (MTA) – not at the mail server (e.g., Exchange). Unless your DLP tool offers explicit support for filtering based on content and device, you won’t be able to use this option.

If your DLP tool is sufficiently flexible, though, you can use the DLP tool to prevent sensitive content from going to the device, while allowing normal communications. You can either build this off existing DLP policies or create completely new device-specific ones.
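To make that concrete, here is a toy sketch (in Python) of the kind of device-aware, content-based rule a DLP gateway applies. The SSN regex and the user-agent check are illustrative stand-ins we made up for this example; a real DLP tool uses far more robust content analysis and device identification.

```python
# Sketch of a device-aware, content-based mail filter of the sort a
# DLP gateway applies before mail reaches a mobile device. The regex
# and is_mobile_device() check are hypothetical stand-ins.
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # naive US SSN pattern

def is_mobile_device(user_agent: str) -> bool:
    """Hypothetical device check, e.g. against the ActiveSync user agent."""
    return "iPhone" in user_agent or "iPad" in user_agent

def filter_message(body: str, user_agent: str) -> str:
    """Block delivery to mobile devices when sensitive content is found."""
    if is_mobile_device(user_agent) and SSN_PATTERN.search(body):
        return "block"   # or quarantine / strip the attachment, per policy
    return "deliver"
```

The same message is delivered normally to a desktop client but blocked for the iPhone, which is exactly the device-specific behavior described above.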

Sandboxed messaging app / walled garden

One of the more popular options today is to install a sandboxed app for messaging and file access, to isolate and control enterprise data. These apps do not use the iOS mail client, and handle all enterprise emails and attachments internally. They also typically manage calendars and contacts, and some include access to intranet web pages.

The app may use iOS Data Protection, implement its own encryption and hardening, or use both. Some of these apps can be installed without requiring a configuration profile to enforce a passcode, remote wipe, client certificate, and other settings, but in practice these are nearly universally required (placing these apps more in the Partially Managed category). Since you don’t necessarily have to enforce settings, we include these in the Unmanaged Devices category, but they will show up again in the Partially Managed section.

A sandboxed messaging app may support one or all of the following, depending on the product and how you have it configured:

  • Isolated and encrypted enterprise email, calendars, and contacts.
  • Encrypted network connection to the enterprise without requiring a separate VPN client (end-to-end encryption).
  • In-app document viewing for common document types (usually using the built-in iOS document viewer, which runs within the sandbox).
  • Document isolation. Documents can be viewed within the app, but “Open In…” is restricted for all or some document types.
  • Remote wipe of the app (and data store), the device, or both.
  • Intranet web site and/or file access.
  • Detection of jailbroken iOS devices to block use.

The app becomes the approved portal to enterprise data, while the user is free to otherwise do whatever they want on the device (albeit often with a few minor security policies enforced).

This post is already a little long so I will cut myself off here. Next post I will cover document (as opposed to messaging) sandboxed apps, DRM, and our last data security options for unmanaged devices.


Thursday, February 23, 2012

Implementing DLP: Ongoing Management

By Rich

Managing DLP tends not to be overly time-consuming unless you are running off badly defined policies. Most of your time in the system is spent on incident handling, followed by policy management.

To give you some numbers, the average organization can expect to need about the equivalent of one full time person for every 10,000 monitored employees. This is really just a rough starting point – we’ve seen ratios as low as 1/25,000 and as high as 1/1000 depending on the nature and number of policies.

Managing Incidents

After deployment of the product and your initial policy set you will likely need fewer people to manage incidents. Even as you add policies you might not need additional people since just having a DLP tool and managing incidents improves user education and reduces the number of incidents.

Here is a typical process:

Manage incident handling queue

The incident handling queue is the user interface for managing incidents. This is where the incident handlers start their day, and it should have some key features:

  • Ability to customize the incident for the individual handler. Some are more technical and want to see detailed IP addresses or machine names, while others focus on users and policies.
  • Incidents should be pre-filtered based on the handler. In a larger organization this allows you to automatically assign incidents based on the type of policy, business unit involved, and so on.
  • The handler should be able to sort and filter at will; especially to sort based on the type of policy or the severity of the incident (usually the number of violations – e.g. a million account numbers in a file versus 5 numbers).
  • Support for one-click dispositions to close, assign, or escalate incidents right from the queue as opposed to having to open them individually.
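If you want to picture the mechanics, here is a minimal sketch of those queue behaviors – pre-filtering by handler, sorting by severity, and one-click dispositions. The Incident structure is hypothetical; real products obviously track far more.

```python
# Sketch of the incident queue behaviors described above: a handler's
# pre-filtered view, sorted by severity (violation count), with a
# one-click close. The Incident fields are hypothetical.
from dataclasses import dataclass

@dataclass
class Incident:
    policy: str
    business_unit: str
    violations: int   # e.g. number of account numbers found
    handler: str
    status: str = "open"

def handler_queue(incidents, handler):
    """Pre-filtered, severity-sorted view for one handler."""
    mine = [i for i in incidents if i.handler == handler and i.status == "open"]
    return sorted(mine, key=lambda i: i.violations, reverse=True)

def close(incident):
    """One-click disposition right from the queue."""
    incident.status = "closed"
```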

Most organizations tend to distribute incident handling among a group of people as only part of their job. Incidents will be either automatically or manually routed around depending on the policy and the severity. Practically speaking, unless you are a large enterprise this could be a part-time responsibility for a single person, with some additional people in other departments like legal and human resources able to access the system or reports as needed for bigger incidents.

Initial investigation

Some incidents might be handled right from the initial incident queue; especially ones where a blocking action was triggered. But due to the nature of dealing with sensitive information there are plenty of alerts that will require at least a little initial investigation.

Most DLP tools provide all the initial information you need when you drill down on a single incident. This may even include the email or file involved with the policy violations highlighted in the text. The job of the handler is to determine if this is a real incident, the severity, and how to handle.

Useful information at this point is a history of other violations by that user and other violations of that policy. This helps you determine if there is a bigger issue/trend. Technical details will help you reconstruct more of what actually happened, and all of this should be available on a single screen to reduce the amount of effort needed to find the information you need.

If the handler works for the security team, he or she can also dig into other data sources if needed, such as a SIEM or firewall logs. This isn’t something you should have to do often.

Initial disposition

Based on the initial investigation the handler closes the incident, assigns it to someone else, escalates to a higher authority, or marks it for a deeper investigation.

Escalation and Case Management

Anyone who deploys DLP will eventually find incidents that require a deeper investigation and escalation. And by “eventually” we mean “within hours” for some of you.

DLP, by its nature, will find problems that require investigating your own employees. That’s why we emphasize having a good incident handling process from the start, since these cases might lead to someone being fired. When you escalate, consider involving legal and human resources. Many DLP tools include case management features so you can upload supporting documentation and produce needed reports, plus track your investigative activities.


The last (incredibly obvious) step is to close the incident. You’ll need to determine a retention policy, and if your DLP tool doesn’t support your retention needs you can always output a report with all the salient incident details.

As with a lot of what we’ve discussed, you’ll probably handle most incidents within minutes (or less) in the DLP tool, but we’ve detailed a common process for those times you need to dig in deeper.


Most DLP systems keep old incidents in the database, which will obviously fill it up over time. Periodically archiving old incidents (such as anything 1 year or older) is a good practice, especially since you might need to restore the records as part of a future investigation.

Managing Policies

Anytime you look at adding a significant new policy you should follow the Full Deployment process we described above, but there are still a lot of day to day policy maintenance activities. These tend not to take up a lot of time, but if you skip them for too long you might find your policy set getting stale and either not offering enough security, or causing other issues due to being out of date.

Policy distribution

If you manage multiple DLP components or regions you will need to ensure policies are properly distributed and tuned for the destination environment. If you distribute policies across national boundaries this is especially important since there might be legal considerations that mandate adjusting the policy.

This includes any changes to policies. For example, if you adjust a US-centric policy that’s been adapted to other regions, you’ll then need to update those regional policies to maintain consistency. If you manage remote offices with their own network connections you want to make sure policy updates are pushed out properly and are consistent.

Adding policies

Brand new policies take the same effort as the initial policies, except that you’ll be more familiar with the system. Thus we suggest you follow the Full Deployment process again.

Policy reviews

As with anything, today’s policy might not apply the same in a year, or two, or five. The last thing you want to end up with is a disastrous mess of stale yet highly customized and poorly understood policies, as you often see on firewalls.

Reviews should consist of:

  • Periodic reviews of the entire policy set to see if it still accurately reflects your needs and if new policies are required, or older ones should be retired.
  • Scheduled reviews and testing of individual policies to confirm that they still work as expected. Put it on the calendar when you create a new policy to check it at least annually. Run a few basic tests, and look at all the violations of the policy over a given time period to get a sense of how it works. Review the users and groups assigned to the policy to see if they still reflect the real users and business units in your organization.
  • Ad-hoc reviews when a policy seems to be providing unexpected results. A good tool to help figure this out is your trending reports – any big changes or deviations from a trend are worth investigating at the policy level.
  • Policy reviews during product updates, since these may change how a policy works or give you new analysis or enforcement options.

Updates and tuning

Even effective policies will need periodic updating and additional tuning. While you don’t necessarily need to follow the entire Full Deployment process for minor updates, they should still be tested in a monitoring mode before you move into any kind of automated enforcement.

Also make sure you communicate any noticeable changes to affected business units so you don’t catch them by surprise. We’ve heard plenty of examples of someone in security flipping a new enforcement switch or changing a policy in a way that really impacted business operations. Maybe that’s the goal, but it’s always best to communicate and hash things out ahead of time.

If you find a policy really seems ineffective then it’s time for a full review. For example, we know of one very large DLP user who had unacceptable levels of false positives on their account number protection due to the numbers being too similar to other numbers commonly in use in regular communications. They solved the problem (after a year or more) by switching from pattern matching to a database fingerprinting policy that checked against the actual account numbers in a customer database.
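Here is a rough sketch of how that database fingerprinting approach works: hash the real account numbers from the customer database, then alert only when an extracted candidate matches a known hash. The numbers and formats below are made up, and real tools handle normalization, partial matches, and secure distribution of the fingerprint set – this is just the core idea.

```python
# Sketch of database fingerprinting: match candidate numbers against
# hashes of the actual account numbers rather than a broad pattern.
# Hashing means the policy doesn't carry a cleartext account list.
# Account numbers and the 10-digit format are made up for illustration.
import hashlib
import re

def fingerprint(value: str) -> str:
    return hashlib.sha256(value.encode()).hexdigest()

# Built from the customer database (values here are invented):
KNOWN_ACCOUNTS = {fingerprint(a) for a in ["8814702365", "2239914407"]}

# Only extract candidates with this pattern; never alert on it alone.
CANDIDATE = re.compile(r"\b\d{10}\b")

def scan(text: str):
    """Alert only on numbers that match actual account numbers."""
    return [m for m in CANDIDATE.findall(text) if fingerprint(m) in KNOWN_ACCOUNTS]
```

A random 10-digit invoice number no longer triggers an alert, which is precisely how the false positive problem above gets solved.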

Retiring policies

There are a lot of DLP policies you might use for a limited time, such as a partial document matching policy to protect corporate financials before they are released. After the release date, there’s no reason to keep the policy.

We suggest you archive these policies instead of deleting them. And if your tool supports it, set expiration dates on policies… with notification so it doesn’t shut down and leave a security hole without you knowing about it.

Backup and archiving

Even if you are doing full system backups it’s a good idea to perform periodic policy set backups. Many DLP tools offer this as part of the feature set. This allows you to migrate policies to new servers/appliances or recover policies when other parts of the system fail and a full restore is problematic.

We aren’t saying these sorts of disasters are common; in fact we’ve never heard of one, but we’re paranoid security folks.

Archiving old policies also helps if you need to review them while reviewing an old incident as part of a new investigation or a legal discovery situation.


Analysis, as opposed to incident handling, focuses on big-picture trends. We suggest three kinds of analysis:

Trend analysis

Often built into the DLP server’s dashboard, this analysis looks across incidents to evaluate overall trends such as:

  • Are overall incidents increasing or decreasing?
  • Which policies are generating more or fewer incidents over time?
  • Which business units experience more incidents?
  • Are there any sudden increases in violations by a business unit, or of a policy, that might not be seen if overall trends aren’t changing?
  • Is a certain type of incident tied to a business process that should be changed?

The idea is to mine your data to evaluate how your risk is increasing or decreasing over time. When you’re in the muck of day to day incident handling it’s often hard to notice these trends.
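A trivial sketch of what this kind of trend mining boils down to: monthly incident counts per policy, with any month well above that policy's own average flagged for a closer look. The record format and threshold here are arbitrary choices for illustration.

```python
# Sketch of simple trend analysis over incident records: monthly
# counts per policy, flagging months that jump well above that
# policy's average. Record format is hypothetical.
from collections import defaultdict

def monthly_counts(incidents):
    """incidents: iterable of (policy, 'YYYY-MM') tuples."""
    counts = defaultdict(lambda: defaultdict(int))
    for policy, month in incidents:
        counts[policy][month] += 1
    return counts

def spikes(counts, factor=2.0):
    """Months where a policy exceeds `factor` times its own average."""
    flagged = []
    for policy, months in counts.items():
        avg = sum(months.values()) / len(months)
        flagged += [(policy, m) for m, n in months.items() if n > factor * avg]
    return flagged
```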

Risk analysis

A risk analysis is designed to show what you are missing. DLP tools only look for what you tell them to look for, and thus won’t catch unprotected data you haven’t built a policy for.

A risk analysis is essentially the Quick Wins process. You turn on a series of policies with no intention of enforcing them, but merely to gather information and see if there are any hot spots you should look at more in depth or create dedicated policies for.

Effectiveness analysis

This helps assess the effectiveness of your DLP tool usage. Instead of looking at general reports, think of it as testing the tool again. Try some common scenarios to circumvent your DLP to figure out where you need to make changes.

Content discovery/classification

Content discovery is the process of scanning storage for the initial identification of sensitive content and tends to be a bit different than network or endpoint deployments. While you can treat it the same, identifying policy violations and responding to them, many organizations view content discovery as a different process, often part of a larger data security or compliance project.

Content discovery projects will often turn up huge amounts of policy violations due to files being stored all over the place. Compounding the problem is the difficulty of identifying the file owner or business unit that’s using the data, and why they have it. Thus you tend to need more analysis, at least on your first run through a server or other storage repository, to find the data, identify who uses and owns it, the business need (if any), and alternative options to keep the data more secure.


We’ve covered most non-product-specific troubleshooting throughout this series. Problems people encounter tend to fall into the following categories:

  • Too many false positives or negatives, which you can manage using our policy tuning and analysis recommendations.
  • System components not talking to each other. For example, some DLP tools separate out endpoint and network management (often due to acquiring different products) and then integrate them at the user interface level. Unless there is a simple network routing issue, fixing these may require the help of your vendor.
  • Component integrations to external tools like web and email gateways may fail. Assuming you were able to get them to talk to each other previously, the culprit is usually a software update introducing an incompatibility. Unfortunately, you’ll need to run it down in the log files if you can’t pick out the exact cause.
  • New or replacement tools may not work with your existing DLP tool. For example, swapping out a web gateway or using a new edge switch with different SPAN/Mirror port capabilities.

We really don’t hear about too many problems with DLP tools outside of getting the initial installation properly hooked into infrastructures and tuning policies.

Maintenance for DLP tools is relatively low, consisting mostly of five activities (two of which we already discussed):

  • Full system backups, which you will definitely do for the central management server, and possibly any remote collectors/servers depending on your tool. Some tools don’t require this since you can swap in a new default server or appliance and then push down the configuration.
  • Archiving old incidents to free up space and resources. But don’t be too aggressive since you generally want a nice library of incidents to support future investigations.
  • Archiving and backing up policies. Archiving policies means removing them from the system, while backups include all the active policies. Keeping these separate from full system backups provides more flexibility for restoring to new systems or migrating to additional servers.
  • Health checks to ensure all system components are still talking to each other.
  • Updating endpoint and server agents to the latest versions (after testing, of course).


Ongoing reporting is an extremely important aspect of running a Data Loss Prevention tool. It helps you show management and other stakeholders that you, and your tool, are providing value and managing risk.

At a minimum you should produce quarterly, if not monthly, rollup reports showing trends and summarizing overall activity. Ideally you’ll show decreasing policy violations, but if there is an increase of some sort you can use that to get the resources to investigate the root cause.

You will also produce a separate set of reports for compliance. These may be on a project basis, tied to any audit cycles, or scheduled like any other reports. For example, running quarterly content discovery reports showing you don’t have any unencrypted credit card data in a storage repository and providing these to your PCI assessor to reduce potential audit scope. Or running monthly HIPAA reports for the HIPAA compliance officer (if you work in healthcare).

Although you can have the DLP tool automatically generate and email reports, depending on your internal political environment you might want to review these before passing them to outsiders in case there are any problems with the data. Also, it’s never a good idea to name employees in general reports – keep identifications to incident investigations and case management summaries that have a limited audience.


Wednesday, February 22, 2012

Implementing DLP: Deploy

By Rich

Up until this point we’ve focused on all the preparatory work before you finally turn on the switch and start using your DLP tool in production. While it seems like a lot, in practice (assuming you know your priorities) you can usually be up and running with basic monitoring in a few days. With the pieces in place, now it’s time to configure and deploy policies to start your real monitoring and enforcement.

Earlier we defined the differences between the Quick Wins and Full Deployment processes. The easy way to think about it is Quick Wins is more about information gathering and refining priorities and policies, while Full Deployment is all about enforcement. With the Full Deployment option you respond and investigate every incident and alert. With Quick Wins you focus more on the big picture. To review:

  • The Quick Wins process is best for initial deployments. Your focus is on rapid deployment and information gathering vs. enforcement to help guide your full deployment. We previously detailed this process in a white paper and will only briefly review it in this series.
  • The Full Deployment process is what you’ll use for the long haul. It’s a methodical series of steps for full enforcement policies. Since the goal is enforcement (even if enforcement is alert/response and not automated blocking/filtering) we spend more time tuning policies to produce desired results.

We generally recommend you start with the Quick Wins process since it gives you a lot more information before jumping into a full deployment, and in some cases might realign your priorities based on what you find.

No matter which approach you take it helps to follow the DLP Cycle. These are the four high-level phases of any DLP project:

  1. Define: Define the data or information you want to discover, monitor, and protect. Definition starts with a statement like “protect credit card numbers”, but then needs to be converted into a granular definition capable of being loaded into a DLP tool.
  2. Discover: Find the information in storage or on your network. Content discovery is determining where the defined data resides, while network discovery determines where it’s currently being moved around on the network, and endpoint discovery is like content discovery but on employee computers. Depending on your project priorities you will want to start with a surveillance project to figure out where things are and how they are being used. This phase may involve working with business units and users to change habits before you go into full enforcement mode.
  3. Monitor: Ongoing monitoring with policy violations generating incidents for investigation. In Discover you focus on what should be allowed and setting a baseline; in Monitor you start capturing incidents that deviate from that baseline.
  4. Protect: Instead of identifying and manually handling incidents you start implementing real-time automated enforcement, such as blocking network connections, automatically encrypting or quarantining emails, blocking files from moving to USB, or removing files from unapproved servers.
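To illustrate the Define phase, here is how a statement like “protect credit card numbers” might be converted into something granular enough for a tool: a candidate pattern plus a Luhn checksum to throw out most random 16-digit strings. Real DLP tools do this (and much more) internally; this is just a sketch of the idea.

```python
# Sketch of the Define phase: turning "protect credit card numbers"
# into a loadable definition. A candidate regex finds 16-digit
# strings (with optional spaces/dashes); the Luhn checksum discards
# most random digit runs, cutting false positives.
import re

CANDIDATE = re.compile(r"\b(?:\d[ -]?){15}\d\b")

def luhn_valid(number: str) -> bool:
    digits = [int(d) for d in re.sub(r"\D", "", number)]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:      # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def find_card_numbers(text: str):
    return [m.group() for m in CANDIDATE.finditer(text) if luhn_valid(m.group())]
```

The well-known test number 4111 1111 1111 1111 passes the checksum, while an arbitrary 16-digit reference number does not, so it never generates an alert.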

Define Reports

Before you jump into your deployment we suggest defining your initial report set. You’ll need these to show progress, demonstrate value, and communicate with other stakeholders.

Here are a few starter ideas for reports:

  • Compliance reports are a no brainer and are often included in the products. For example, showing you scanned all endpoints or servers for unencrypted credit card data could save significant time and resources by reducing scope for a PCI assessment.
  • Since our policies are content based, reports showing violation types by policy help figure out what data is most at risk or most in use (depending on how you have your policies set). These are very useful to show management to align your other data security controls and education efforts.
  • Incidents by business unit are another great tool, even if focused on a single policy, in helping identify hot spots.
  • Trend reports are extremely valuable in showing the value of the tool and how well it helps with risk reduction. Most organizations we talk with who generate these reports see big reductions over time, especially when they notify employees of policy violations.

Never underestimate the political value of a good report.

Quick Wins Process

We previously covered Quick Wins deployments in depth in a dedicated whitepaper, but here is the core of the process:

The differences between a long-term DLP deployment and our “Quick Wins” approach are goals and scope. With a Full Deployment we focus on comprehensive monitoring and protection of very specific data types. We know what we want to protect (at a granular level) and how we want to protect it, and we can focus on comprehensive policies with low false positives and a robust workflow. Every policy violation is reviewed to determine if it’s an incident that requires a response.

In the Quick Wins approach we are concerned less about incident management, and more about gaining a rapid understanding of how information is used within our organization. There are two flavors to this approach – one where we focus on a narrow data type, typically as an early step in a full enforcement process or to support a compliance need, and the other where we cast a wide net to help us understand general data usage to prioritize our efforts. Long-term deployments and Quick Wins are not mutually exclusive – each targets a different goal and both can run concurrently or sequentially, depending on your resources.

Remember: even though we aren’t talking about a full enforcement process, it is absolutely essential that your incident management workflow be ready to go when you encounter violations that demand immediate action!

Choose Your Flavor

The first step is to decide which of two general approaches to take:

  • Single Type: In some organizations the primary driver behind the DLP deployment is protection of a single data type, often due to compliance requirements. This approach focuses only on that data type.
  • Information Usage: This approach casts a wide net to help characterize how the organization uses information, and identify patterns of both legitimate use and abuse. This information is often very useful for prioritizing and informing additional data security efforts.

Choose Your Deployment Architecture

Earlier we had you define your priorities and choose your deployment architecture, which, at this point, you should have implemented. For the Quick Wins process you select one of the main channels (network, storage, or endpoint) as opposed to trying to start with all of them. (This is also true in a Full Deployment).

Network deployments typically provide the most immediate information with the lowest effort, but depending on what tools you have available and your organization’s priorities, it may make sense to start with endpoints or storage.

Define Your Policies

The last step before hitting the “on” switch is to configure your policies to match your deployment flavor.

In a single type deployment, either choose an existing category that matches the data type in your tool, or quickly build your own policy. In our experience, pre-built categories common in most DLP tools are almost always available for the data types that commonly drive a DLP project. Don’t worry about tuning the policy – right now we just want to toss it out there and get as many results as possible. Yes, this is the exact opposite of our recommendations for a traditional, focused DLP deployment.

In an information usage deployment, turn on all the policies or enable promiscuous monitoring mode. Most DLP tools only record activity when there are policy violations, which is why you must enable the policies. A few tools can monitor general activity without relying on a policy trigger (either full content or metadata only). In both cases our goal is to collect as much information as possible to identify usage patterns and potential issues.


Monitor

Now it’s time to turn on your tool and start collecting results.

Don’t be shocked – in both deployment types you will see a lot more information than in a focused deployment, including more potential false positives. Remember, you aren’t concerned with managing every single incident, but want a broad understanding of what’s going on on your network, in endpoints, or in storage.


Analyze

Now we get to the most important part of the process – turning all that data into useful information.

Once we collect enough data, it’s time to start the analysis process. Our goal is to identify broad patterns and identify any major issues. Here are some examples of what to look for:

  • A business unit sending out sensitive data unprotected as part of a regularly scheduled job.
  • Which data types broadly trigger the most violations.
  • The volume of usage of certain content or files, which may help identify valuable assets that don’t cleanly match a pre-defined policy.
  • Particular users or business units with higher numbers of violations or unusual usage patterns.
  • False positive patterns, for tuning long-term policies later.

All DLP tools provide some level of reporting and analysis, but ideally your tool will allow you to set flexible criteria to support the analysis.
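As an illustration, much of this pattern analysis can also be done outside the tool against an incident export. Here is a minimal Python sketch, assuming a hypothetical CSV export with user, business unit, policy, and channel columns (real export formats vary by product):

```python
import csv
from collections import Counter
from io import StringIO

# Hypothetical incident export; real column names vary by product.
EXPORT = """user,business_unit,policy,channel
alice,finance,PCI,email
bob,engineering,source-code,usb
alice,finance,PCI,email
carol,hr,PII,web
dave,finance,PCI,ftp
"""

def summarize(export_csv: str, top: int = 3):
    """Tally violations by policy, business unit, and user."""
    rows = list(csv.DictReader(StringIO(export_csv)))
    return {
        "policies": Counter(r["policy"] for r in rows).most_common(top),
        "units": Counter(r["business_unit"] for r in rows).most_common(top),
        "users": Counter(r["user"] for r in rows).most_common(top),
    }

print(summarize(EXPORT))
```

Even this crude tally surfaces the kinds of patterns listed above – a business unit dominating one policy’s violations, for example.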

Full Deployment Process

Even if you start with the Quick Wins process (and we recommend you do) you will always want to move into a Full Deployment. This is the process you will use anytime you add policies or your environment changes (e.g. you add endpoint monitoring to an existing network deployment).

Before we get too deep, keep in mind that we are breaking things out very granularly to fit the widest range of organizations. Many of you won’t need to go into this much depth, due to your size or the nature of your policies and priorities. Don’t get hung up on our multi-step process – many of you won’t need to move so cautiously, and can run through multiple steps in a single day.

The key to success is to think incrementally; all too often we encounter organizations that throw out a new default policy and then try to handle every incident immediately. While DLP generally doesn’t suffer from a high false positive rate, that doesn’t mean you won’t get a lot of hits on an untuned policy. For example, if you set a credit card policy incorrectly you will alert more on employees buying sock monkeys on Amazon with their personal cards than on major leaks.

In the Full Deployment process you pick a single type of data or information to protect, create a single policy, and slowly roll it out and tune it until you get full coverage. As with everything DLP, this might move quickly, or it could take many months to work out the kinks if you have a complex policy or are trying to protect data that’s hard to distinguish from allowed usage.

At a high level this completely follows the DLP Cycle, but we will go into greater depth.

Define the policy

This maps to your initial priorities. You want to start with a single kind of information to protect that you can define well at a technical level. Some examples include:

  • Credit card numbers
  • Account numbers from a customer database
  • Engineering plans from a particular server/directory
  • Healthcare data
  • Corporate financials

Picking a single type to start with helps reduce management overhead and allows you to tune the policy.

The content analysis policy itself needs to be as specific as possible while reflecting the real-world usage of the data to be protected. For example, if you don’t have a central source where engineering plans are stored it will be hard to properly protect them. You might be able to rely on a keyword, but that tends to result in too many false positives. For customer account numbers you might need to pull directly from a database if, for example, there’s no pattern other than a 7 or 10 digit number – which, in the US, will also match every phone number.
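To make the false positive problem concrete, here is a simplified Python sketch contrasting a bare digit-run pattern with the same pattern filtered through the Luhn checksum that real payment card numbers satisfy. This is roughly the kind of refinement a DLP pattern-matching policy applies internally; the exact techniques are product-specific, and the sample text is invented:

```python
import re

# Naive pattern: any 13-16 digit run "looks like" a card number.
NAIVE = re.compile(r"\b\d{13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum that real payment card numbers satisfy."""
    digits = [int(d) for d in number][::-1]
    total = 0
    for i, d in enumerate(digits):
        if i % 2 == 1:          # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0

def find_candidates(text: str) -> list:
    """Keep only digit runs that pass the Luhn check."""
    return [m for m in NAIVE.findall(text) if luhn_valid(m)]

# 4111... is a classic Visa test number; the tracking number is noise.
sample = "Order 4111111111111111 shipped; tracking 1234567890123456."
print(find_candidates(sample))   # ['4111111111111111']
```

The naive pattern flags both digit runs; the checksum drops the tracking number. A context-aware policy (proximity keywords, issuer prefixes, database matching) tightens this further.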

We covered content analysis techniques in our Understanding and Selecting a Data Loss Prevention Solution paper and suggest you review that while determining which content analysis techniques to use. It includes a worksheet to help guide you through the selection.

In most cases your vendor will provide some prebuilt policies and categories that can jump start your own policy development. It’s totally acceptable to start with one of those and evaluate the results.

Deploy to a subset

The next step is to deploy the policy in monitoring mode on a limited subset of your overall coverage goal. This is to keep the number of alerts down and give you time to adjust the policy. For example:

  • In a network deployment, limit yourself to monitoring a smaller range of IP addresses or subnets. Another option is to start with a specific channel, like email, before moving on to web or general network monitoring. If your organization is big enough, you’ll use a combination of both at the start.
  • For endpoints, limit yourself to both a subset of systems and a subset of endpoint options. Don’t try to monitor USB usage, cut and paste, and local storage all at once – pick one to start.
  • For storage scanning pick either a single system, or even a subdirectory of that system depending on the overall storage volume involved.

The key is to start small so you don’t get overloaded during the tuning process. It’s a lot easier to grow a smaller deployment than deal with the fallout of a poorly-tuned policy overwhelming you. We stick to monitoring mode so we don’t accidentally break things.

Analyze and tune

You should start seeing results pretty much the moment you turn the tool on. Hopefully you followed our advice and have your incident response process ready to go, because even when you aren’t trying, odds are you will find things that require escalation.

During analysis and tuning you iteratively look at the results and adjust the policy. If you see too many false positives, or real positives that are allowed in that context, you adjust the policy. An example might be refining policies to apply differently to different user groups (executives, accounting, HR, legal, engineering, etc.). Or you might need to toss out your approach to use a different option – such as switching to database fingerprinting/matching from a pattern-based rule due to the data being too close to similar data in regular use.

Another option, if your tool supports full network capture or you are using DLP in combination with a network forensics tool, is to collect a bunch of traffic and test policy changes against it immediately – instead of tuning a policy, running it for a few days or weeks to see results, then tuning again.

You also need to test the policy for false negatives by generating traffic that should trigger it, such as test email messages – it doesn’t need to be fancy.
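One simple approach is to seed obviously fake sensitive values into otherwise innocuous messages and confirm each one generates an alert. A minimal Python sketch, with hypothetical seed values and addresses – you would actually send these through the monitored mail path (e.g. via smtplib) and check the DLP console for hits:

```python
from email.message import EmailMessage

# Hypothetical seed values - obvious fakes that should still trip the policy.
SEED_VALUES = [
    "4111-1111-1111-1111",   # test credit card number
    "ACCT-0000042",          # fake account number format
]

def build_test_messages(sender: str, recipient: str) -> list:
    """Compose one message per seeded value. Send each through the
    monitored path and confirm it triggers an alert; a silent pass
    is a false negative worth investigating."""
    messages = []
    for value in SEED_VALUES:
        msg = EmailMessage()
        msg["From"] = sender
        msg["To"] = recipient
        msg["Subject"] = "DLP false-negative test"
        msg.set_content(f"Routine status update. Reference: {value}")
        messages.append(msg)
    return messages

for m in build_test_messages("tester@example.com", "sink@example.com"):
    print(m["Subject"], "->", m.get_content().strip())
```

Keep the seed values clearly fake so a missed alert never leaks anything real.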

The goal is to align results with your expectations and objectives during the limited deployment.

Manage incidents and expand scope

Once the policy is tuned you can switch into full incident-handling mode. This doesn’t yet include preventative controls like blocking, but it does mean fully investigating and handling incidents. At this point you should start generating user-visible alerts and working with business units and individual employees to change habits. Some organizations falsely believe it’s better not to tell employees they are being monitored, or when they violate policies, on the theory that uninformed users won’t try to circumvent security and malicious activity is easier to catch. This is backwards: in every DLP deployment we are aware of, the vast majority of risk comes from employee mistakes or poorly managed business processes, not malicious activity. The evidence in the DLP deployments we’ve seen clearly shows that educating users when they make mistakes dramatically reduces the overall number of incidents.

Because user education so effectively reduces the overall number of incidents, we suggest taking the time to work within the limited initial deployment scope – this, in turn, lowers the overhead as you expand scope.

As you see the results you want you slowly expand scope by adding additional network, storage, or endpoint coverage, such as additional network egress points/IP ranges, additional storage servers, or more endpoints.

As you expand bit by bit you continue to enforce and tune the policy and handle incidents. This allows you to adapt the policies to meet the needs of different business units and avoid being overwhelmed in situations where there are a lot of violations. (At this point it’s more an issue of real violations than false positives).

If you are a smaller organization or don’t experience too many violations with a policy you can mostly skip this step, but even if it’s only for a day we suggest starting small.

Protect iteratively

At this point you will be dealing with a smaller number of incidents, if any. If you want to implement automatic enforcement, like network filtering or USB blocking, now is a good time.

Some organizations prefer to wait a year or more before moving into enforcement, and there’s nothing wrong with that. What you don’t want to do is implement preventative controls on too wide a scale at once. As with monitoring, we suggest starting incrementally, to give yourself time to deal with the support calls and ensure the blocking works as you expect.

Add component/channel coverage

At this point you should have a reliable policy deployed at wide scale, potentially blocking policy violations. The next step is to expand by adding additional component coverage (such as adding endpoint to a network deployment) or expanding channels within a component (additional network channels like email or web gateway integration, or additional endpoint functionality). This, again, gives you time to tune the policies to best fit the conditions.

As we said earlier, many organizations will be able to blast through some basic policies pretty quickly without being overloaded by the results. But it’s still a good idea to keep this more incremental process in mind in case you need it. If you started with the Quick Wins process you’ll have a good idea of the effort needed to tune your policies before you ever start.


Wednesday, February 15, 2012

Implementing DLP: Deploying Storage and Endpoint

By Rich

Storage deployment

From a technical perspective, deploying storage DLP is even easier than the most basic network DLP. You can simply point it at an open file share, load up the proper access rights, and start analyzing. The problem most people run into is figuring out which servers to target, which access rights to use, and whether the network and storage repository can handle the overhead.

Remote scanning

All storage DLP solutions support remotely scanning a repository by connecting to an open file share. To run a scan they need an account with sufficient privileges (often administrator-level read access) to connect to a share on the server being scanned.

But straightforward or not, there are three issues people commonly encounter:

  1. Sometimes it’s difficult to figure out where all the servers are and what file shares are exposed. To resolve this you can use a variety of network scanning tools if you don’t have a good inventory to start.
  2. After you find the repositories you need to gain access rights. And those rights need to be privileged enough to view all files on the server. This is a business process issue, not a technical problem, but most organizations need to do a little legwork to track down at least a few server owners.
  3. Depending on your network architecture you may need to position DLP servers closer to the file repositories. This is very similar to a hierarchical network deployment, but here we position servers closer to the storage to reduce network impact or work around internal network restrictions (not that everyone segregates their internal network, even though that single step is one of the most powerful security tools in our arsenal). For very large repositories you don’t want to install a server agent on, you might even need to connect the DLP server to the same switch. We have even heard of organizations adding a second network interface on a private segment to support particularly intense scanning.

All of this is configured in the DLP management console, where you specify the servers to scan, enter the credentials, assign policies, and set scan frequency and schedule.
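For a rough sense of what a remote scan does, here is a deliberately simplified Python sketch that walks a share mounted locally and flags files matching a pattern. A real DLP scanner connects over the file-sharing protocol with its own credentials and uses far more sophisticated content analysis; the mount point and pattern here are illustrative only:

```python
import os
import re

# Simplistic SSN-style rule; real policies layer validation and context.
PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

def scan_share(mount_point: str) -> list:
    """Walk a locally mounted share (e.g. /mnt/finance_share - hypothetical)
    and return paths of files containing pattern hits."""
    hits = []
    for root, _dirs, files in os.walk(mount_point):
        for name in files:
            path = os.path.join(root, name)
            try:
                with open(path, "r", errors="ignore") as f:
                    if PATTERN.search(f.read()):
                        hits.append(path)
            except OSError:
                continue   # unreadable file: a real tool would log this
    return hits
```

Even this toy version hints at why scan scheduling and placement matter: every byte scanned traverses the network unless the analysis runs on (or near) the repository.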

Server agents

Server agents support higher performance without network impact, because the analysis is done right on the storage repository, with only results pushed back to the DLP server. This assumes you can install the agent and the server has the processing power and memory to support the analysis. Some agents also provide additional context you can’t get from remote scanning.

Installing the server agent is no more difficult than installing any other software, but as we have mentioned (multiple times) you need to make sure you test to understand compatibility and performance impact. Then you configure the agent to connect to the production DLP server.

Unless you run into connection issues due to your network architecture, you then move over to the DLP management console to tune the configuration. The main things to set are scan frequency, policies, and performance throttles. Agents rarely run all the time – you choose a schedule, similar to antivirus, to reduce overhead and scan during slower hours.

Depending on the product, some agents require a constant connection to the DLP server. They may compress data and send it to the server for analysis rather than checking everything locally. This is very product-specific, so work with your vendor to figure out which option works best for you – especially if their server agent’s internal analysis capabilities are limited compared to the DLP server’s. As an example, some document and database matching policies impose high memory requirements which are infeasible on a storage server, but may be acceptable on the shiny new DLP server.

Document management system/NAS integration

Certain document management systems and Network Attached Storage products expose plugin architectures or other mechanisms that allow the DLP tool to connect directly, rather than relying on an open file share.

This method may provide additional context and information, as with a server agent. This is extremely dependent on which products you use, so we can’t provide much guidance beyond “do what the manual says”.

Database scanning

If your product supports database scanning, you will usually make an ODBC connection to the database and then configure what to scan.

As with storage DLP, deployment of database DLP may require extensive business process work: to find the servers, get permission, and obtain credentials. Once you start scanning, it is extremely unlikely you will be able to scan all database records. DLP tools tend to focus on scanning the table structure and table names to pick out high-risk areas such as credit card fields, and then they scan a certain number of rows to see what kind of data is in the fields.

So the process becomes:

  1. Identify the target database.
  2. Obtain credentials and make an ODBC connection.
  3. Scan attribute names (field/column names).
  4. (Optional) Define which fields to scan/monitor.
  5. Analyze the first n rows of identified fields.

We only scan a certain number of rows because the focus isn’t on comprehensive realtime monitoring – that’s what Database Activity Monitoring is for – and to avoid unacceptable performance impact. But scanning a small number of rows should be enough to identify which tables hold sensitive data, which is hard to do manually.
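As a sketch of that process, the following Python uses SQLite purely as a stand-in for a real ODBC connection to a production database: it reads column names, samples the first n rows, and flags columns that look like they hold card numbers, either by name or by sampled content. The table and data are invented for illustration:

```python
import re
import sqlite3

CARD_NAME = re.compile(r"(card|ccn|pan)", re.I)   # suspicious column names
CARD_DATA = re.compile(r"\b\d{13,16}\b")          # suspicious sampled values

def profile_table(conn, table: str, sample_rows: int = 100) -> list:
    """Flag columns whose name or sampled data suggests card numbers."""
    cols = [r[1] for r in conn.execute(f"PRAGMA table_info({table})")]
    suspicious = {c for c in cols if CARD_NAME.search(c)}
    for row in conn.execute(f"SELECT * FROM {table} LIMIT {sample_rows}"):
        for col, val in zip(cols, row):
            if isinstance(val, str) and CARD_DATA.search(val):
                suspicious.add(col)
    return sorted(suspicious)

# Stand-in database; a DLP tool would connect over ODBC instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, card_no TEXT, note TEXT)")
conn.execute("INSERT INTO customers VALUES ('a', '4111111111111111', 'vip')")
print(profile_table(conn, "customers"))   # ['card_no']
```

The row limit is the whole point: the goal is locating which tables and columns hold sensitive data, not monitoring every record.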

Endpoint deployment

Endpoints are, by far, the most variable component of Data Loss Prevention. There are massive differences between the various products on the market, and endpoint agents face far tighter performance constraints because they must fit on general-purpose workstations and laptops rather than dedicated servers. Fortunately, as widely as the features and functions vary, the deployment process is consistent.

  1. Test, then test more: I realize I have told you to test your endpoint agents at least 3 times by now, but this is the single most common problem people encounter. If you haven’t already, make sure you test your agents on a variety of real-world systems in your environment to make sure performance is acceptable.
  2. Create a deployment package or enable in your EPP tool: The best way to deploy the DLP agent is to use whatever software distribution tool you already use for normal system updates. This means building a deployment package with the agent configured to connect to the DLP server. Remember to account for any network restrictions that could isolate endpoints from the server. In some cases the DLP agent may be integrated into your existing EPP (Endpoint Protection Platform) tool. Most often you will need to deploy an additional agent, but if it is fully integrated you won’t need to push a software update, and can configure and enable it either through the DLP management console or in the EPP tool itself.
  3. Activate and confirm deployment: Once the agent is deployed go back to your DLP management console to validate that systems are covered, agents are running, and they can communicate with the DLP server. You don’t want to turn on any policies yet – for now you’re just confirming that the agents deployed successfully and are communicating.
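Confirming deployment mostly means comparing last-check-in timestamps against a freshness threshold. A minimal Python sketch, with hypothetical hostnames and an assumed export of agent check-in times from the console:

```python
from datetime import datetime, timedelta

# Hypothetical check-in records, as a console export might provide them.
CHECKINS = {
    "laptop-001": datetime(2012, 2, 15, 9, 0),
    "laptop-002": datetime(2012, 2, 13, 17, 30),
    "desktop-014": datetime(2012, 2, 15, 8, 45),
}

def stale_agents(checkins, now, max_age=timedelta(hours=24)) -> list:
    """Endpoints whose agent has not reported within max_age."""
    return sorted(h for h, seen in checkins.items() if now - seen > max_age)

now = datetime(2012, 2, 15, 12, 0)
print(stale_agents(CHECKINS, now))   # ['laptop-002']
```

Cross-reference the stale list against your software distribution tool’s inventory to separate failed installs from systems that are simply offline.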


Monday, February 13, 2012

Implementing DLP: Deploying Network DLP

By Rich

Deploying on the network is usually very straightforward – especially since much of the networking support is typically built into the DLP server.

If you encounter complications they are generally:

  • due to proxy integration incompatibilities,
  • around integrating with a complex email infrastructure (e.g., multiple regions), or
  • in highly distributed organizations with large numbers of network egress points.

Passive sniffing

Sniffing is the most basic network DLP monitoring option. There are two possible components involved:

  • All full-suite DLP tools include network monitoring capabilities on the management server or appliance. Once you install it, connect it to a network SPAN or mirror port to monitor traffic.
  • Since the DLP server can normally only monitor a single network gateway, various products also support hierarchical deployment, with dedicated network monitoring DLP servers or appliances deployed to other gateways. This may be a full DLP server with some features turned off, a DLP server for a remote location that pulls policies and pushes alerts back to a central management server, or a thinner appliance or software designed only to monitor traffic and send information back to the management server.

Integration involves mapping network egress points and then installing the hardware on the monitoring ports. High-bandwidth connections may require a server or appliance cluster; or multiple servers/appliances, each monitoring a subset of the network (either IP ranges or port/protocol ranges).
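Splitting coverage by IP range is simple to express. A small Python sketch using the standard ipaddress module, with illustrative subnet choices – each monitoring server or appliance would apply its own scope filter like this:

```python
import ipaddress

# Hypothetical scope: this monitor covers only two internal subnets.
MONITORED = [ipaddress.ip_network(n) for n in ("10.1.0.0/16", "10.2.5.0/24")]

def in_scope(src_ip: str) -> bool:
    """True if traffic from this source falls in this monitor's subnets."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in MONITORED)

print(in_scope("10.1.44.7"))   # True
print(in_scope("10.9.0.1"))    # False
```

The same idea applies to splitting by port/protocol: each cluster member filters to its slice so no single box has to keep up with the full egress bandwidth.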

If you don’t have a SPAN or mirror port you’ll need to add a network tap. The DLP tool needs to see all egress traffic, so a normal connection to a switch or router is inadequate.

In smaller deployments you can also deploy DLP inline (bridge mode), and keep it in monitoring mode (passthrough and fail open). Even if your plan is to block, we recommend starting with passive monitoring.


Email integration

Email integrates a little differently because the SMTP protocol is asynchronous. Most DLP tools include a built-in Mail Transport Agent (MTA). To integrate email monitoring you enable the feature in the product, then add it into the chain of MTAs that route SMTP traffic out of your network.

Alternatively, you might be able to integrate DLP analysis directly into your email security gateway, if your vendors have a partnership.

You will generally want to add your DLP tool as the next hop after your email server. If you also use an email security gateway, that means pointing your mail server to the DLP server, and the DLP server to the mail gateway.

If you integrate directly with the mail gateway your DLP tool will likely add x-headers to analyzed mail messages. This extra metadata instructs the mail gateway how to handle each message (allow, block, etc.).
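The mechanism is plain header annotation. A short sketch using Python’s email library – the header names and values here are purely illustrative, since each DLP/gateway pairing defines its own:

```python
from email import message_from_string

RAW = """From: alice@example.com
To: partner@example.org
Subject: Q3 numbers

Attached are the draft financials.
"""

def tag_for_gateway(raw: str, action: str, policy: str) -> str:
    """Annotate an analyzed message with x-headers the downstream
    mail gateway acts on. Header names are hypothetical."""
    msg = message_from_string(raw)
    msg["X-DLP-Action"] = action    # e.g. allow / block / quarantine
    msg["X-DLP-Policy"] = policy    # which policy fired
    return msg.as_string()

tagged = tag_for_gateway(RAW, "quarantine", "corporate-financials")
print(tagged)
```

The gateway is configured separately to match on those headers and apply the corresponding disposition, then typically strips them before the message leaves.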

Web gateways and other proxies

As we have mentioned, DLP tools are commonly integrated with web security gateways (proxies) to allow more granular management of web (and FTP) traffic. They may also integrate with instant messaging gateways, although that is very product specific.

Most modern web gateways support something called the ICAP protocol (Internet Content Adaptation Protocol) for extending proxy servers. If your web gateway supports ICAP you can configure it to pass traffic to your DLP server for analysis. Proxying connections enable analysis before the content leaves your organization. You can, for example, allow someone to use webmail but block attachments and messages containing sensitive information.

So much traffic now travels over SSL connections that you might want to integrate with a web gateway that performs SSL interception (sometimes called TLS inspection). These work by installing a trusted certificate authority (CA) certificate on all your endpoints (a straightforward configuration update) and performing a “man-in-the-middle” interception on all SSL traffic. Traffic is encrypted inside your network and from the proxy to the destination website, but the proxy itself has access to the decrypted content.

Note: this is essentially attacking and spying on your own users, so we strongly recommend notifying them before you start intercepting SSL traffic for analysis.

If you have SSL interception up and running on your gateway, there are no additional steps beyond ICAP integration.

Additional proxies, such as instant messaging, have their own integration requirements. If the products are compatible this is usually the same process as integrating a web gateway: just turn the feature on in your DLP product and point both sides at each other.

Hierarchical deployments

Until now we have mostly described fairly simple deployments, focused on a single appliance or server. That’s the common scenario for small and some mid-size organizations, but the rest of you have multiple network egress points to manage – possibly in very distributed situations, with limited bandwidth in each location.

Hopefully you all purchased products which support hierarchical deployment. To integrate, you place additional DLP servers or appliances on each network gateway, then configure them to slave to the primary DLP server/appliance in your network core. The actual procedure varies by product, but here are some things to look out for:

  • Different products have different management traffic bandwidth requirements. Some work great in all situations, but others are too bandwidth-heavy for some remote locations.
  • If your remote locations don’t have a VPN or private connection back to your core network, you will need to establish one to handle management traffic.
  • If you plan on allowing remote locations to manage their own DLP incidents, now is the time to set up a few test policies and workflow to verify that your tool can support this.
  • If you don’t have web or instant messaging proxies at remote locations, and don’t filter that traffic, you obviously lose a major enforcement option. Inconsistent network security hampers DLP deployments (and isn’t good for the rest of your security, either!).
  • We are only discussing multiple network deployments here, but you might use the same architecture to cover remote storage repositories or even endpoints.

The remote servers or appliances will receive policies pushed by your main management server and then perform all analysis and enforcement locally. Incident data is sent back to the main DLP console for handling unless you delegated to remote locations.

As we have mentioned repeatedly, if hierarchical deployment is a requirement, please be sure to test this capability before putting money down on a product. This is not the sort of problem you want to try solving during deployment.


Tuesday, February 07, 2012

Implementing and Managing a Data Loss Prevention (DLP) Solution: Index of Posts

By Rich

We’re pretty deep into our series on Implementing DLP, so it’s time to put together an index to tie together all the posts. I will keep this up to date as new content goes up, and in the end it will be the master list for all eternity. Or until someone hacks our site and deletes everything. Whichever comes first.


Implementing DLP: Starting Your Integration

By Rich

With priorities fully defined, it is now time to start the actual integration.

The first stop is deploying the DLP tool itself. This tends to come in one of a few flavors – and keep in mind that you often need to license different major features separately, even if they all deploy on the same box. This is the heart of your DLP deployment and needs to be in place before you do any additional integration.

  • DLP Server Software: This is the most common option and consists of software installed on a dedicated server. Depending on your product this could actually run across multiple physical servers for different internal components (like a back-end database) or to spread out functions. In a few cases products require different software components running concurrently to manage different functions (such as network vs. endpoint monitoring). This is frequently a legacy of mergers and acquisitions – most products are converging on a single software base with, at most, additional licenses or plugins to provide additional functions.

Management server overhead is usually pretty low, especially in anything smaller than a large enterprise, so this server often handles some amount of network monitoring, functions as the email MTA, scans at least some file servers, and manages endpoint agents. A small to medium sized organization generally only needs to deploy additional servers for load balancing, as a hot standby, or to cover remote network or storage monitoring with multiple egress points or data centers.

Integration is easy – install the software and position the physical server wherever needed, based on deployment priorities and network configuration. We are still in the integration phase of deployment and will handle the rest of the configuration later.

  • DLP Appliance: In this scenario the DLP software comes preinstalled on dedicated hardware. Sometimes it’s merely a branded server, while in other cases the appliance includes specialized hardware. There is no software to install, so the initial integration is usually a matter of connecting it to the network and setting a few basic options – we will cover the full configuration later.

As with a standard server, the appliance usually includes all DLP functions (which you might still need licenses to unlock). The appliance can generally run in an alternative remote monitor mode for distributed deployment.

  • DLP Virtual Appliance: The DLP software comes preinstalled in a virtual machine for deployment as a virtual server. This is similar to an appliance, but requires a bit more work: you need to get it running on your virtualization platform of choice, configure the network, and then set the initial configuration options as you would for a physical server or appliance.

For now just get the tool up and running so you can integrate the other components. Do not deploy any policies or turn on monitoring yet.

Directory Server Integration

The most important deployment integration is with your directory servers and (probably) the DHCP server. This is the only way to tie activity back to actual users, rather than to IP addresses.

This typically involves two components:

  • An agent or connection to the directory server itself to identify users.
  • An agent on the DHCP server to track IP address allocation.

So when a user logs onto the network, their IP address is correlated against their user name, and this is passed on to the DLP server. The DLP server can now track which network activity is tied to which user, and the directory server enables it to understand groups and roles.
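Conceptually, the correlation is two joins: IP to hostname via the most recent DHCP lease, then hostname to user via the most recent logon. A simplified Python sketch with hypothetical event data (real integrations consume live agent feeds, handle lease expiry, and deal with shared machines):

```python
from datetime import datetime

# Hypothetical event feeds from the DHCP and directory server agents.
LEASES = [  # (time, ip, hostname)
    (datetime(2012, 2, 6, 8, 0), "10.0.1.15", "laptop-001"),
    (datetime(2012, 2, 6, 9, 30), "10.0.1.15", "laptop-002"),
]
LOGONS = [  # (time, hostname, user)
    (datetime(2012, 2, 6, 8, 1), "laptop-001", "alice"),
    (datetime(2012, 2, 6, 9, 31), "laptop-002", "bob"),
]

def user_for(ip, at):
    """Most recent lease for the IP at time `at`, joined to the most
    recent logon on that host. Returns None if either join fails."""
    lease = max((l for l in LEASES if l[1] == ip and l[0] <= at),
                default=None, key=lambda l: l[0])
    if lease is None:
        return None
    logon = max((e for e in LOGONS if e[1] == lease[2] and e[0] <= at),
                default=None, key=lambda e: e[0])
    return logon[2] if logon else None

print(user_for("10.0.1.15", datetime(2012, 2, 6, 9, 0)))   # alice
print(user_for("10.0.1.15", datetime(2012, 2, 6, 10, 0)))  # bob
```

Note how the same IP maps to different users over the day – exactly why IP-only attribution is unreliable and this integration matters.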

This same integration is also required for storage or endpoint deployment. For storage the DLP tool knows which users have access to which files based on file permissions – not that they are always accurate. On an endpoint the agent knows which policies to run based on who is logged in.


Monday, February 06, 2012

Implementing DLP: Integration Priorities and Components

By Rich

It might be obvious by now, but the following charts show which DLP components, integrated with which existing infrastructure, you need based on your priorities. I have broken this out into three different images to make them more readable. Why images? Because I have to dump all this into a white paper later, and building them in a spreadsheet and taking screenshots is a lot easier than mucking with HTML-formatted charts.

Between this and our priorities post and chart you should have an excellent idea of where to start, and how to organize, your DLP deployment.


Wednesday, February 01, 2012

Implementing DLP: Integration, Part 1

By Rich

At this point all planning should be complete. You have determined your incident handling process, started (or finished) cleaning up directory servers, defined your initial data protection priorities, figured out which high-level implementation process to start with, mapped out the environment so you know where to integrate, and performed initial testing and perhaps a proof of concept.

Now it’s time to integrate the DLP tool into your environment. You won’t be turning on any policies yet – the initial focus is on integrating the technical components and preparing to flip the switch.

Define a Deployment Architecture

Earlier you determined your deployment priorities and mapped out your environment. Now you will use them to define your deployment architecture.

DLP Component Overview

We have covered the DLP components a bit as we went along, but it’s important to know all the technical pieces you can integrate depending on your deployment priorities. This is just a high-level overview, and we go into much more detail in our Understanding and Selecting a Data Loss Prevention Solution paper.

This list includes many different possible components, but that doesn’t mean you need to buy a lot of different boxes. Small and mid-sized organizations might be able to get everything except the endpoint agents on a single appliance or server.

  • Network DLP consists of three major components and a few smaller optional ones:
    1. Network monitor or bridge/proxy – this is typically an appliance or dedicated server placed inline or passively off a SPAN or mirror port. It’s the core component for network monitoring.
    2. Mail Transport Agent – few DLP tools integrate directly into a mail server; instead they insert their own MTA as a hop in the email chain.
    3. Web gateway integration – many web gateways support the ICAP protocol, which DLP tools use to integrate and analyze proxy traffic. This enables more effective blocking and provides the ability to monitor SSL encrypted traffic if the gateway includes SSL intercept capabilities.
    4. Other proxy integration – the only other proxies we see with any regularity are for instant messaging portals, which can also be integrated with your DLP tool to support monitoring of encrypted communications and blocking before data leaves the organization.
    5. Email server integration – the email server is often separate from the MTA, and internal communications may never pass through the MTA which only has access to mail going to or coming from the Internet. Integrating directly into the mail server (message store) allows monitoring of internal communications. This feature is surprisingly uncommon.
  • Storage DLP includes four possible components:
    1. Remote/network file scanner – the easiest way to scan storage is to connect to a file share over the network and scan remotely. This component can be positioned close to the file repository to increase performance and reduce network saturation.
    2. Storage server agent – depending on the storage server, local monitoring software may be available. This reduces network overhead, runs faster, and often provides additional metadata, but may affect local performance because it uses CPU cycles on the storage server.
    3. Document management system integration or agent – document management systems combine file storage with an application layer and may support direct integration or the addition of a software agent on the server/device. This provides better performance and additional context, because the DLP tool gains access to management system metadata.
    4. Database connection – a few DLP tools support ODBC connections to scan inside databases for sensitive content.
  • Endpoint DLP primarily relies on software agents, although you can also scan endpoint storage using administrative file shares and the same remote scanning techniques used for file repositories. There is huge variation in the types of policies and activities which can be monitored by endpoint agents, so it’s critical to understand what your tool offers.
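
The remote/network file scanner above is conceptually the simplest of these components: mount or connect to a share, walk it, and inspect file contents. The sketch below is a purely illustrative toy version of that idea — the pattern, size cap, and demo files are all invented for illustration, and real DLP engines use far more sophisticated content analysis than a single regex:

```python
import os
import re

# Illustrative pattern: 16-digit sequences that could be card numbers.
PAN_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")

def scan_share(mount_point, pattern=PAN_PATTERN, max_bytes=1_000_000):
    """Walk a mounted file share and report files matching a pattern."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(mount_point):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                with open(path, "r", errors="ignore") as f:
                    content = f.read(max_bytes)  # cap reads to limit I/O
            except OSError:
                continue  # unreadable file; a real tool would log this
            if pattern.search(content):
                hits.append(path)
    return hits

# Demo against a throwaway directory (illustrative only)
import tempfile
demo_dir = tempfile.mkdtemp()
with open(os.path.join(demo_dir, "export.csv"), "w") as f:
    f.write("invoice 4111 1111 1111 1111 total")
with open(os.path.join(demo_dir, "notes.txt"), "w") as f:
    f.write("nothing sensitive here")
hits = scan_share(demo_dir)
```

Positioning this scanner close to the repository, as noted above, matters precisely because every file's contents must cross the network to wherever the scan runs.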

There are a few other components which aren’t directly involved with monitoring or blocking but impact integration planning:

  • Directory server agent/connection – required to correlate user activity with user accounts.
  • DHCP server agent/connection – to associate an assigned IP address with a user, which is required for accurate identification of users when observing network traffic. This must work directly with your directory server integration because the DHCP servers themselves are generally blind to user accounts.
  • SIEM connection – while DLP tools include their own alerting and workflow engines, some organizations want to push incidents to their Security Information and Event Management tools.
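
A common lowest-common-denominator version of that SIEM connection is forwarding incidents as CEF-formatted syslog messages over UDP. The sketch below shows the general shape; the vendor/product strings, field choices, and collector hostname are illustrative assumptions, not any particular product's schema:

```python
import socket

def cef_message(signature_id, name, severity, extensions):
    """Build a CEF (Common Event Format) string for a DLP incident."""
    ext = " ".join(f"{k}={v}" for k, v in extensions.items())
    # CEF header: Version|Vendor|Product|ProductVersion|SignatureID|Name|Severity
    return f"CEF:0|ExampleVendor|ExampleDLP|1.0|{signature_id}|{name}|{severity}|{ext}"

def send_to_siem(message, host="siem.example.com", port=514):
    """Ship the event to a syslog collector over UDP (assumed endpoint)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        # <134> = syslog priority for facility local0, severity info
        sock.sendto(f"<134>{message}".encode(), (host, port))

msg = cef_message(100, "PAN detected in outbound email", 7,
                  {"suser": "jdoe", "app": "smtp", "cnt": 3})
```

Even when you plan to work incidents in the DLP console, pushing events like this gives the SOC a unified view alongside other alert sources.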

In our next post I will share a chart that maps priorities directly to technical components.


Monday, January 30, 2012

Implementing DLP: Final Deployment Preparations

By Rich

Map Your Environment

No matter which DLP process you select, before you can begin the actual implementation you need to map out your network, storage infrastructure, and/or endpoints. You will use the map to determine where to push out the DLP components.

  1. Network: You don’t need a complete and detailed topographical map of your network, but you do need to identify a few key components.
    1. All egress points. These are where you will connect DLP monitors to a SPAN or mirror port, or install DLP inline.
    2. Email servers and MTAs (Mail Transport Agents). Most DLP tools include their own MTA which you simply add as a hop in your mail chain, so you need to understand that chain.
    3. Web proxies/gateways. If you plan on sniffing at the web gateway you’ll need to know where these are and how they are configured. DLP typically uses the ICAP protocol to integrate. Also, if your web proxy doesn’t intercept SSL… buy a different proxy. Monitoring web traffic without SSL is nearly worthless these days.
    4. Any other proxies you might integrate with, such as instant messaging gateways.
  2. Storage: Put together a list of all storage repositories you want to scan. The list should include the operating system type, file shares / connection types, owners, and login credentials for remote scanning. If you plan to install agents, test compatibility on test/development systems first.
  3. Endpoints: This one can be more time-consuming. You need to compile a list of endpoint architectures and deployments – preferably from whatever endpoint management tool you already use for things like configuration and software updates. Mapping machine groups to user and business groups makes it easier to deploy endpoint DLP by business units. You need system configuration information for compatibility and testing. As an example, as of this writing no DLP tool supports Macs so you might have to rely on network DLP or exposing local file shares to monitor and scan them.

You don’t need to map out every piece of every component unless you’re doing your entire DLP deployment at once. Focus on the locations and infrastructure needed to support the project priorities you established earlier.

Test and Proof of Concept

Many of you perform extensive testing or a full proof of concept during the selection process, but even if you did it’s still important to push down a layer deeper, now that you have more detailed deployment requirements and priorities.

Include the following in your testing:

  • For all architectures: Test a variety of policies that resemble the kinds you expect to deploy, even if you start with dummy data. This is very important for testing performance – there are massive differences between using something like a regular expression to look for credit card numbers vs. database matching against hashes of 10 million real credit card numbers. And test mixes of policies to see how your tool supports multiple policies simultaneously, and to verify which policies each component supports – for example, endpoint DLP is generally far more limited in the types and sizes of policies it supports. If you have completed directory server integration, test it to ensure policy violations tie back to real users. Finally, practice with the user interface and workflow before you start trying to investigate live incidents.
  • Network: Integrate out-of-band and confirm your DLP tool is watching the right ports and protocols, and can keep up with traffic. Test integration – including email, web gateways, and any other proxies. Even if you plan to deploy inline (common in SMB) start by testing out-of-band.
  • Storage: If you plan to use any agents on servers or integrated with NAS or a document management system, test them in a lab environment first for performance impact. If you will use network scanning, test for performance and network impact.
  • Endpoint: Endpoints often require the most testing due to the diversity of configurations in most organizations, the more-limited resources available to the DLP engine, and all the normal complexities of mucking with users’ workstations. The focus here is on performance and compatibility, along with confirming which content analysis techniques really work on endpoints (the typical sales exec is often a bit … obtuse … about this). If you will use policies that change based on which network the endpoint is on, also test that.
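
The performance point above — regex matching vs. exact matching against millions of hashed values — is worth internalizing during testing. A pattern-based policy typically pairs a regex with a checksum such as the Luhn algorithm to cut false positives. Here is a rough sketch of the pattern-based side, purely for illustration (real engines are far more optimized and handle separators, context, and proximity rules):

```python
import re

# Naive candidate pattern: unbroken runs of 13-16 digits
CARD_PATTERN = re.compile(r"\b\d{13,16}\b")

def luhn_valid(number: str) -> bool:
    """Luhn checksum — filters out most random digit strings."""
    digits = [int(d) for d in number]
    checksum = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:       # double every second digit from the right
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0

def find_candidate_pans(text: str):
    """Return digit runs that look like card numbers and pass Luhn."""
    return [m for m in CARD_PATTERN.findall(text) if luhn_valid(m)]
```

Exact matching against 10 million real card numbers, by contrast, means hashing every candidate token in the data stream and probing a large lookup structure — a completely different CPU and memory profile, which is why you should test with realistic policy mixes rather than a single toy rule.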

Finally, if you are deploying multiple DLP components – such as multiple network monitors and endpoint agents – it’s wise to verify they can all communicate. We have talked with some organizations that found limitations here and had to adjust their architectures.


Friday, January 27, 2012

Implementing DLP: Picking Priorities and a Deployment Process

By Rich

At this point you should be in the process of cleaning your directory servers, with your incident handling process outlined in case you find any bad stuff early in your deployment. Now it’s time to determine your initial priorities to figure out whether you want to start with the Quick Wins process or jump right into full deployment.

Most organizations have at least a vague sense of their DLP priorities, but translating them into deployment priorities can be a bit tricky. It’s one thing to know you want to use DLP to comply with PCI, but quite another to know exactly how to accomplish that.

On the right is an example of how to map out high-level requirements into a prioritized deployment strategy. It isn’t meant to be canonical, but should provide a good overview for most of you. Here’s the reasoning behind it:

DLP Priorities

  • Compliance priorities depend on the regulation involved. For PCI your best bet is to use DLP to scan storage for Primary Account Numbers. You can automate this process and use it to define your PCI scope and reduce assessment costs. For HIPAA the focus often starts with email to ensure no one is sending out unencrypted patient data. The next step is often to find where that data is stored – both in departments and on workstations. If we were to add a third item it would probably be web/webmail, because that is a common leak vector.
  • Intellectual Property Leaks tend to be either document based (engineering plans) or application/database based (customer lists). For documents – assuming your laptops are already encrypted – USB devices are usually one of the top concerns, followed by webmail. You probably also want to scan storage repositories, and maybe endpoints, depending on your corporate culture and the kind of data you are concerned about. Email turns out to be a less common source of leaks than the other channels, so it’s lower on the list. If the data comes out of an application or database then we tend to worry more about network leaks (an insider or an attacker), webmail, and then storage (to figure out all the places it’s stored and at risk). We also toss in USB above email, because all sorts of big leaks have shown USB is a very easy way to move large amounts of data.
  • Customer PII is frequently exposed by being stored where it shouldn’t be, so we start with discovery again. Then, from sources such as the Verizon Data Breach Investigations Report and the Open Security Foundation DataLossDB we know to look at webmail, endpoints and portable storage, and lastly email.

You will need to mix and match these based on your own circumstances – and we highly recommend using data-derived reports like the ones listed above to help align your priorities with evidence, rather than operating solely on gut feel. Then adapt based on what you know about your own organization – which may include things like “the CIO said we have to watch email”.

If you followed our guidance in Understanding and Selecting a DLP Solution you can feed the information from that worksheet into these priorities.

Now you should have a sense of what data to focus on and where to start. The next step is to pick a deployment process.

Here are some suggestions for deciding which to start with. The easy answer is to almost always start with the Quick Wins process…

  • Only start with the full deployment process if you have already prioritized what to protect, have a good sense of where you need to protect it, and believe you understand the scope you are dealing with. This is usually the case when you have a specific compliance or IP protection initiative with well-defined data and a well-defined scope (e.g., where to look for the data, or where to monitor and/or block it).
  • For everyone else we suggest starting with the Quick Wins process. It will highlight your hot spots and help you figure out where to focus your full deployment.

We’ll discuss each of those processes in more depth later.


Wednesday, January 25, 2012

Implementing DLP: Getting Started

By Rich

In our Introduction to Implementing and Managing a DLP Solution we started describing the DLP implementation process. Now it’s time to put the pedal to the metal and start cranking through it in detail.

No matter which path you choose (Quick Wins or Full Deployment), we break out the implementation process into four major steps:

  1. Prepare: Determine which process you will use, set up your incident handling procedures, prepare your directory servers, define priorities, and perform some testing.
  2. Integrate: Next you will determine your deployment architecture and integrate with your existing infrastructure. We cover most integration options – even if you only plan on a limited deployment (and no, you don’t have to do everything all at once).
  3. Configure and Deploy: Once the pieces are integrated you can configure initial settings and start your deployment.
  4. Manage: At this point you are up and running. Managing is all about handling incidents, deploying new policies, tuning and removing old ones, and system maintenance.

As we write this series we will go into depth on each step, while keeping our focus on what you really need to know to get the job done.

Implementing and managing DLP doesn’t need to be intimidating. Yes, the tools are powerful and seem complex, but once you know what you’re doing you’ll find it isn’t hard to get value without killing yourself with too much complexity.


One of the most important keys to a successful DLP deployment is preparing properly. We know that sounds a bit asinine because you can say the same thing about… well, anything, but with DLP we see a few common pitfalls in the preparation stage. Some of these steps are non-intuitive – especially for technical teams who haven’t used DLP before and are more focused on managing the integration.

Focusing on the following steps, before you pull the software or appliance out of the box, will significantly improve your experience.

Define your incident handling process

Pretty much the instant you turn on your DLP tool you will begin to collect policy violations. Most of these won’t be the sort of thing that require handling and escalation, but nearly every DLP deployment I have heard of quickly found things that required intervention. ‘Intervention’ here is a polite way of saying someone had a talk with human resources and legal – after which it is not uncommon for that person to be escorted to the door by the nice security man in the sharp suit.

It doesn’t matter if you are only doing a bit of basic information gathering, or prepping for a full-blown DLP deployment – it’s essential to get your incident handling process in place before you turn on the product. I also recommend at least sketching out your process before you go too far into product selection. Many organizations involve non-IT personnel in the day-to-day handling of incidents, and this affects user interface and reporting requirements.

Here are some things to keep in mind:

  • Criteria for escalating something from a single incident into a full investigation.
  • Who is allowed access to the case and historical data – such as previous violations by the same employee – during an investigation.
  • How to determine whether to escalate to the security incident response team (for external attacks) vs. to management (for insider incidents).
  • The escalation workflow – who is next in the process and what their responsibilities are.
  • If and when an employee’s manager is involved. Some organizations involve line management early, while others wait until an investigation is more complete.

The goal is to have your entire process mapped out, so if you see something you need to act on immediately – especially something that could get someone fired – you have a process to manage it without causing legal headaches.

Clean directory servers

Data Loss Prevention tools tie in tightly to directory servers to correlate incidents to users. This can be difficult because not all infrastructures are set up to tie network packets or file permissions back to the human sitting at a desk (or in a coffee shop).

Later, during the integration steps, you will tie into your directory and network infrastructure to link network packets back to users. But right now we’re more focused on cleaning up the directory itself so you know which network names connect to which users, and whether groups and roles accurately reflect employees’ job and rights.

Some of you have completed something along these lines already for compliance reasons, but we still see many organizations with very messy directories.

We wish we could say it’s easy, but if you are big enough, with all the common things like mergers and acquisitions that complicate directory infrastructures, this step may take a remarkably long time. One possible shortcut is to look at tying your directory to your human resources system and using HR as the authoritative source.

But in the long run it’s pretty much impossible to have an effective data security program without being able to tie activity to users, so you might look at something like an entitlement management tool to help clean things up.
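
At its core, using HR as the authoritative source boils down to diffing a directory export against the HR roster and working the discrepancies. A toy sketch of that comparison — the account names and both exports are hypothetical:

```python
def directory_gaps(directory_accounts, hr_roster):
    """Compare directory account names against the HR roster.

    directory_accounts: set of account names exported from the directory
    hr_roster: set of account names derived from the HR system
    Returns accounts with no HR record (candidates for disabling) and
    employees with no directory account (provisioning gaps).
    """
    orphaned = directory_accounts - hr_roster
    missing = hr_roster - directory_accounts
    return orphaned, missing

orphaned, missing = directory_gaps(
    {"jdoe", "asmith", "old_contractor"},
    {"jdoe", "asmith", "bnew"},
)
```

The hard part, of course, isn't the diff — it's establishing a reliable join key between the two systems and deciding who owns each orphaned account.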

This is already running long, so we will wrap up implementation in the next post…


Monday, January 23, 2012

Implementing and Managing a DLP Solution

By Rich

I have been so tied up with the Nexus, CCSK, and other projects that I haven’t been blogging as much as usual… but not to worry, it’s time to start a nice, juicy new technical series. And once again I return to my bread and butter: DLP. As much as I keep thinking I can simply run off and play with pretty clouds, something in DLP always drags me back in. This time it’s a chance to dig in and focus on implementation and management (thanks to McAfee for sponsoring something I’ve been wanting to write for a long time). With that said, let’s dig in…

In many ways Data Loss Prevention (DLP) is one of the most far-reaching tools in our security arsenal. A single DLP platform touches our endpoints, network, email servers, web gateways, storage, directory servers, and more. There are more potential integration points than nearly any other security tool – with the possible exception of SIEM. And then we need to build policies, define workflow, and implement blocking… all based on nebulous concepts like “customer data” and “intellectual property”.

It’s no wonder many organizations are intimidated by the thought of implementing a large DLP deployment. Yet, based on our 2010 survey data, somewhere upwards of 40% of organizations use some form of DLP.

Fortunately implementing and managing DLP isn’t nearly as difficult as many security professionals expect. Over the nearly 10 years we have covered the technology – talking with probably hundreds of DLP users – we have collected countless tips, tricks, and techniques for streamlined and effective deployments that we’ve compiled into straightforward processes to ease most potential pains.

We are not trying to pretend deploying DLP is simple. DLP is one of the most powerful and important tools in our modern security arsenal, and anything with that kind of versatility and wide range of integration points can easily be a problem if you fail to appropriately plan or test.

But that’s where this series steps in. We’ll lay out the processes for you, including different paths to meet different needs – all to help you get up and running; and to stay there as quickly, efficiently, and effectively as possible. We have watched the pioneers lay the trails and hit the land mines – now it’s time to share those lessons with everyone else.

Keep in mind that despite what you’ve heard, DLP isn’t all that difficult to deploy. There are many misperceptions, in large part due to squabbling vendors (especially non-DLP vendors). But it doesn’t take much to get started with DLP.

On a practical note this series is a follow-up to our Understanding and Selecting a Data Loss Prevention Solution paper now in its second revision. We pick up right where that paper left off, so if you get lost in any terminology we suggest you use that paper as a reference.

On that note, let’s start with an overview and then we’ll delve into the details.

Quick Wins for Long Term Success

One of the main challenges in deploying DLP is to show immediate value without drowning yourself in data. DLP tools are generally not too bad for false positives – certainly nowhere near as bad as IDS. That said, we have seen many people deploy these tools without knowing what they wanted to look for – which can result in a lot of what we call false real positives: real alerts on real policy violations, just not things you actually care about.

The way to handle too many alerts is to deploy slowly and tune your policies, which can take a lot of time and may even focus you on protecting the wrong kinds of content in the wrong places. So we have compiled two separate implementation options:

  • The Quick Wins process is best for initial deployments. Your focus is on rapid deployment and information gathering rather than enforcement, and will help guide your full deployment later. We detailed this process in a white paper and will only briefly review it here.
  • The Full Deployment process is what you’ll use for the long haul. It’s a methodical series of steps for deploying full enforcement policies. Because the goal is enforcement (even if enforcement means alert and response rather than automated blocking and filtering), you spend more time tuning policies to produce useful results.

The key difference is that the Quick Wins process isn’t intended to block every single violation – just really egregious problems. It’s about getting up and running and quickly showing value by identifying key problem areas and helping set you up for a full deployment. The Full Deployment process is where you dig in, spend more time on tuning, and implement long-term policies for enforcement.

The good news is that we designed these to work together. If you start with Quick Wins, everything you do will feed directly into full deployment. If you already know where you want to focus you can jump right into a full deployment without bothering with Quick Wins. In either case the process guides you around common problems and should speed up implementation.

In our next post we’ll show you where to get started and start laying out the processes…


Tuesday, August 30, 2011

Detecting and Preventing Data Migrations to the Cloud

By Rich

One of the most common modern problems facing organizations is managing data migrating to the cloud. The very self-service nature that makes cloud computing so appealing also makes unapproved data transfers and leakage possible. Any employee with a credit card can subscribe to a cloud service and launch instances, deliver or consume applications, and store data on the public Internet. Many organizations report that individuals or business units have moved (often sensitive) data to cloud services without approval from, or even notification to, IT or security.

Aside from traditional data security controls such as access controls and encryption, there are two other steps to help manage unapproved data moving to cloud services:

  1. Monitor for large internal data migrations with Database Activity Monitoring (DAM) and File Activity Monitoring (FAM).
  2. Monitor for data moving to the cloud with URL filters and Data Loss Prevention.

Internal Data Migrations

Before data can move to the cloud it needs to be pulled from its existing repository. Database Activity Monitoring can detect when an administrator or other user pulls a large data set or replicates a database.

File Activity Monitoring provides similar protection for file repositories such as file shares.

These tools can provide early warning of large data movements. Even if the data never leaves your internal environment, this is the kind of activity that shouldn’t occur without approval.

These tools can also be deployed within the cloud (public and/or private, depending on architecture), and so can also help with inter-cloud migrations.
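
Conceptually, the "large data movement" detection these tools perform reduces to thresholding activity per user over a window. A heavily simplified sketch of that logic — the event feed, user names, and threshold are all invented for illustration, and commercial DAM/FAM products use far richer baselining:

```python
from collections import defaultdict

def flag_bulk_pulls(query_events, row_threshold=100_000):
    """Flag users whose total rows fetched in a window exceed a threshold.

    query_events: iterable of (user, rows_returned) tuples from an audit feed.
    """
    totals = defaultdict(int)
    for user, rows in query_events:
        totals[user] += rows
    return {user: rows for user, rows in totals.items() if rows > row_threshold}

alerts = flag_bulk_pulls([
    ("app_svc", 500),          # normal application traffic
    ("dba1", 250_000),         # looks like a full-table export
    ("dba1", 50_000),
])
```

The value is the early warning: a quarter-million-row pull by a single account is worth a conversation whether or not the data ever leaves the building.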

Movement to the Cloud

While DAM and FAM indicate internal movement of data, a combination of URL filtering (web content security gateways) and Data Loss Prevention (DLP) can detect data moving from the enterprise into the cloud.

URL filtering allows you to monitor (and prevent) users connecting to cloud services. The administrative interfaces for these services typically use different addresses than the consumer side, so you can distinguish between someone accessing an admin console to spin up a new cloud-based application and a user accessing an application already hosted with the provider.

Look for a tool that offers a list of cloud services and keeps it up to date, as opposed to one where you need to create a custom category and manage the destination addresses yourself. Also look for a tool that distinguishes between different users and groups so you can allow access for different employee populations.

For more granularity, use Data Loss Prevention. DLP tools look at the actual data/content being transmitted, not just the destination. They can generate alerts (or block) based on the classification of the data. For example, you might allow corporate private data to go to an approved cloud service, but block the same content from migrating to an unapproved service. Similar to URL filtering, you should look for a tool that is aware of the destination address and comes with pre-built categories. Since all DLP tools are aware of users and groups, that should come by default.
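
Combined, the two controls amount to a destination check plus a content check. A toy gatekeeper illustrating that decision — the approved-host list, the crude SSN classifier, and the policy itself are all invented for illustration:

```python
import re

APPROVED_CLOUD_HOSTS = {"files.approved-cloud.example"}  # assumed allow-list
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")       # crude PII classifier

def allow_upload(destination_host: str, payload: str) -> bool:
    """Allow unless the content is sensitive and the destination unapproved."""
    sensitive = bool(SSN_PATTERN.search(payload))
    if not sensitive:
        return True                       # non-sensitive data can go anywhere
    return destination_host in APPROVED_CLOUD_HOSTS
```

A real DLP policy would classify content far more carefully and act per user and group, but the shape of the decision is the same.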

This combination isn’t perfect, and there are plenty of scenarios where they might miss activity, but that is a whole lot better than completely ignoring the problem. Unless someone is deliberately trying to circumvent security, these steps should capture most unapproved data migrations.