Understanding and Selecting a DLP Solution: Part 4, Data-At-Rest Technical Architecture

Welcome to part 4 of our series on Data Loss Prevention/Content Monitoring and Filtering solutions. If you’re new to the series, you should check out Part 1, Part 2, and Part 3 first. I apologize for getting distracted with some other priorities (especially the Data Security Lifecycle); I just realized it’s been about two weeks since my last DLP post in this series. Time to stick the nose to the grindstone (I grew up in a tough suburb) and crank the rest of this guide out.

Last time we covered the technical architectures for detecting policy violations for data moving across the network in communications traffic, including email, instant messaging, web traffic, and so on. Today we’re going to dig into an often overlooked, but just as valuable, feature of most major DLP products: Content Discovery.

As I’ve previously discussed, the most important component of a DLP/CMF solution is its content awareness. Once you have a good content analysis engine the potential applications increase dramatically. While catching leaks on the fly is fairly powerful, it’s only one small part of the problem. Many customers are finding that it’s just as valuable, if not more valuable, to figure out where all that data is stored in the first place. Sure, enterprise search tools might be able to help with this, but they really aren’t tuned well for this specific problem. Enterprise data classification tools can also help, but based on discussions with a number of clients they don’t tend to work well for finding specific policy violations. Thus we see many clients opting to use the content discovery features of their DLP product.

Author’s Note: It’s the addition of robust content discovery that I consider the dividing line between a Data Loss Prevention solution and a Content Monitoring and Filtering solution. DLP is more network focused, while CMF begins the expansion to robust content protection. I use the name DLP extensively since it’s the industry standard, but over time we’ll see this migrate to CMF, and eventually to Content Monitoring and Protection, as I discussed in this post.

The biggest advantage of content discovery in a DLP/CMF tool is that it allows you to take a single policy and apply it across data no matter where it’s stored, how it’s shared, or how it’s used. For example, you can define a policy that requires credit card numbers to only be emailed when encrypted, never be shared via HTTP or HTTPS, only be stored on approved servers, and only be stored on workstations/laptops by employees on the accounting team. All of this is done in a single policy on the DLP/CMF management server.

We can break discovery out into three major modes:

  • Endpoint Discovery: scanning workstations and laptops for content.
  • Storage Discovery: scanning mass storage, including file servers, SAN, and NAS.
  • Server Discovery: application-specific scanning of stored data in email servers, document management systems, and databases (not currently a feature of most DLP products, but beginning to appear in some Database Activity Monitoring products).

These modes perform their analysis using one of three technologies:

  • Remote Scanning: a connection is made to the server or device using a file sharing or application protocol, and scanning is performed remotely. This is essentially mounting a remote drive and scanning it from a scanning server that takes policies from, and sends results to, the central policy server. For some vendors this is an appliance, for others it’s a server, and for smaller deployments it’s integrated into the central management server.
  • Agent-Based Scanning: an agent is installed on the system/server to be scanned and scanning is performed locally. Agents are platform specific and use local CPU cycles, but can potentially perform significantly faster than remote scanning, especially for large repositories. For endpoints, this should be a feature of the same agent used for enforcing Data-In-Use controls.
  • Temporal-Agent Scanning: rather than deploying a full-time agent, you install a memory-resident agent that performs a scan, then exits without leaving anything running or stored on the local system. This offers the performance of agent-based scanning in situations where you don’t want a full-time agent running.

Any of these technologies can work for any of the modes, and enterprises will typically deploy a mix depending on policy and infrastructure requirements. We currently see some technology limitations of each approach that affect deployment:

  • Remote scanning can significantly increase network traffic, and its performance is limited by network bandwidth and by the network performance of both the target and the scanner. Because of these practical limitations, some solutions can only scan gigabytes per day per server (sometimes hundreds of GB, but below TB/day), which may not be sufficient for very large storage.
  • Agents, temporal or permanent, are limited by processing power and memory on the target system, which often translates to restrictions on the number of policies that can be enforced and the types of content analysis that can be used. For example, most endpoint agents are not capable of enforcing large data sets of partial document matching or database fingerprinting. This is especially true of endpoint agents, which are more limited.
  • Agents don’t support all platforms.

Once a policy violation is discovered, the discovery solution can take a variety of actions:

  • Alert/Report: create an incident in the central management server just like a network violation.
  • Warn: notify the user via email that they may be in violation of policy.
  • Quarantine/Notify: move the file to the central management server and leave a .txt file with instructions on how to request recovery of the file.
  • Quarantine/Encrypt: encrypt the file in place, usually leaving a plain text file on how to request decryption.
  • Quarantine/Access Control: change the access controls to restrict access to the file.
  • Remove/Delete: either transfer the file to the central server without notification, or just delete it.

Together, the different deployment architectures, discovery techniques, and enforcement options give you a powerful set of tools for protecting data-at-rest and supporting compliance initiatives. For example, we’re starting to see increasing deployments of CMF to support PCI compliance- more for the ability to ensure (and
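
To make the discovery idea a bit more concrete, here is a minimal sketch in Python of an agent-style scan for one policy: credit card numbers may only be stored in approved locations. Everything in it is an assumption for illustration (the `discover` function, the `Incident` record, the `approved_dirs` parameter, and the regex-plus-Luhn check); it is not how any particular DLP product works, and real content analysis engines go well beyond pattern matching.

```python
import re
from dataclasses import dataclass
from pathlib import Path

# Hypothetical pattern: 13 to 16 digits, optionally separated by spaces or hyphens.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,15}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(text: str) -> list[str]:
    """Return candidate card numbers in the text that pass the Luhn check."""
    found = []
    for match in CARD_PATTERN.finditer(text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            found.append(digits)
    return found


@dataclass
class Incident:
    """Illustrative stand-in for an incident sent to a management server."""
    path: str
    match_count: int
    action: str


def discover(root: Path, approved_dirs: set[str], action: str = "alert") -> list[Incident]:
    """Walk root and report files outside approved directories that hold card numbers."""
    incidents = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        # Skip anything stored under a directory the policy approves.
        if any(part in approved_dirs for part in path.relative_to(root).parts[:-1]):
            continue
        try:
            text = path.read_text(errors="ignore")
        except OSError:
            continue  # unreadable file: skip it in this sketch
        hits = find_card_numbers(text)
        if hits:
            incidents.append(Incident(str(path), len(hits), action))
    return incidents
```

Calling `discover(Path("/some/share"), {"approved"})` would return one `Incident` per offending file, each tagged with the action to take. In this sketch the action is only a label; a real deployment would map it to one of the enforcement options listed above.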


Home Security Tip: Nuke It From Orbit

I say we take off and nuke the entire site from orbit. It’s the only way to be sure. -Ripley (Sigourney Weaver) in Aliens

While working at home has some definite advantages, like the Executive Washroom, Executive Kitchen, and Executive HDTV, all this working at home alone can get a little isolating. I realized the other month that I spend more hours every day with my cats than with any human being, including my wife. Thus I tend to work out of the local coffee shop a day or two a week. Nice place, free WiFi (that I help secure on occasion), and a friendly staff.

Today I was talking with one of the employees about her home computer. A while ago I referred her to AVG Free antivirus and had her turn on her Windows firewall. AVG quickly found all sorts of nasties- including, as she put it, “47 things in that quarantine thing called Trojans. What’s that?” Uh oh. That’s bad.

I warned her that her system, even with AV on it, was probably so compromised that it would be nearly impossible to recover. She asked me how much it would cost to go over and fix it, and I didn’t have the heart to tell her. Truth is, as most of you professional IT types know, it might be impossible to clean out all the traces of malware from a system compromised like that. I’m damn good at this kind of stuff, yet if it were my computer I’d just nuke it from orbit- wipe the system and start from scratch.

While I have pretty good backups, this can be a bit of a problem for friends and family. Here’s how I go about it on a home system for them:

  • Copy off all important files to an external drive- USB or hard drive, depending on how much they have.
  • Wipe the system and reinstall Windows from behind a firewall (a home wireless router is usually good enough, a cable or DSL modem isn’t).
  • Install all the Windows updates. Read a book or two, especially if you need to install Service Pack 2 on XP.
  • Install Office (hey, maybe try OpenOffice) and any other applications.
  • Double check that you have SP2, IE7, and the latest Firefox installed.
  • Install any free security software you want, and enable the Microsoft Malicious Software removal tool and Windows firewall. See Security Mike for more, even though he hasn’t shown me his stuff yet.
  • Set up their email and such.
  • Take the drive with all their data on it, and scan it from another computer. Say a Mac with ClamAV installed? I usually scan with two different AV engines, and even then I might warn them not to recover those files.
  • Restore their files.

This isn’t perfect, but I haven’t had anyone get re-infected yet using this process. Some of the really nasty stuff will hide in data files, but if you hold onto the files for a few weeks, at least one AV engine will usually catch it. It’s a risk analysis; if they don’t need the files I recommend they trash them. If they really need the stuff we can restore it as carefully as possible and keep an eye on things. If it’s a REALLY bad infection I’ll take the files on my Mac, convert them to plain text or a different file format, then restore them. You do the best you can, and can always nuke it again if needed.

In her case, I also recommended she change any bank account passwords and her credit card numbers. It’s the only way to be sure…
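
If you want to script the "scan the data drive from another computer" step, a minimal Python sketch might look like the one below. It assumes ClamAV's `clamscan` command-line scanner is installed on the scanning machine; the helper names (`build_clamscan_command`, `interpret_exit_code`, `scan`) are made up for illustration. It covers only one engine, so it is a starting point rather than the full two-engine routine described above.

```python
import shutil
import subprocess

# Exit codes as documented in the clamscan man page:
# 0 = no virus found, 1 = virus(es) found, 2 = some error(s) occurred.
EXIT_MEANINGS = {0: "clean", 1: "infected", 2: "error"}


def build_clamscan_command(target: str) -> list[str]:
    """Build a recursive scan command that prints only infected files."""
    return ["clamscan", "--recursive", "--infected", target]


def interpret_exit_code(code: int) -> str:
    """Translate a clamscan exit code into a short label."""
    return EXIT_MEANINGS.get(code, "unknown")


def scan(target: str) -> str:
    """Run clamscan on the target path and return a short result label."""
    if shutil.which("clamscan") is None:
        return "scanner-missing"  # ClamAV is not installed on this machine
    result = subprocess.run(
        build_clamscan_command(target), capture_output=True, text=True
    )
    return interpret_exit_code(result.returncode)
```

Anything other than "clean" is a reason to hold the files back, and even a clean result from one engine is not proof the data is safe.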


Movement In The DLP Market?

Rumor has it that a major deal in the DLP market might drop soon. As in an acquisition. Since it’s just a rumor I’ll keep the names to myself for now, but it’s an interesting development. One that will probably stir the market and maybe get things moving, even if the acquisition itself fails.


Woops- Comments Should Really Be Open Now

A while back I opened up the comments so you didn’t have to register, but somewhere along the line that setting was reset. They should be open now, and I’ll keep them open until the spam or trolls force me to change things.


Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments/input factor into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.