Content Monitoring And Protection
|
Sign Up!
|
|
|
|
|
Project Quant
|
|
The patch management metrics project.
|
|
|
Tag Cloud
|
|
|
 |
|
Entries Calendar
|
| S |
M |
T |
W |
T |
F |
S |
| 28 | 1 |
2 |
3 |
4 |
5 |
6 |
| 7 |
8 |
9 |
10 |
11 |
12 |
13 |
| 14 |
15 |
16 |
17 |
18 |
19 |
20 |
| 21 |
22 |
23 |
24 |
25 |
26 |
27 |
| 28 |
29 |
30 |
31 |
1 |
2 |
3 |
|
|
By Rich
One of the more difficult aspects of the analyst gig is sorting through all the information you get, and isolating out any inherent biases. The kinds of inquiries we get from clients can all too easily skew our perceptions of the industry, since people tend to come to us for specific reasons, and those reasons don't necessarily represent the mean of the industry. Aside from all the vendor updates (and customer references), our end user conversations usually involve helping someone with a specific problem -- ranging from vendor selection, to basic technology education, to strategy development/problem solving. People call us when they need help, not when things are running well, so it's all too easy to assume a particular technology is being used more widely than it really is, or a problem is bigger or smaller than it really is, because everyone calling us is asking about it. Countering this takes a lot of outreach to find out what people are really doing even when they aren't calling us.
Over the past few weeks I've had a series of opportunities to work with end users outside the context of normal inbound inquiries, and it's been fairly enlightening. These included direct client calls, executive roundtables such as one I participated in recently with IANS (with a mix from Fortune 50 to mid-size enterprises), and some outreach on our part. They reinforced some of what we've been thinking, while breaking other assumptions. I thought it would be good to compile these together into a "state of the industry" summary. Since I spend most of my time focused on web application and data security, I'll only cover those areas:

When it comes to web application and data security, if there isn't a compliance requirement, there isn't budget -- Nearly all of the security professionals we've spoken with recognize the importance of web application and data security, but they consistently tell us that unless there is a compliance requirement it's very difficult for them to get budget. That's not to say it's impossible, but non-compliance projects (however important) are way down the priority list in most organizations. In a room of a dozen high-level security managers of (mostly) large enterprises, they all reinforced that compliance drove nearly all of their new projects, and there was little support for non-compliance-related web application or data security initiatives. I doubt this surprises any of you.
"Compliance" may mean more than compliance -- Activities that are positioned as helping with compliance, even if they aren't a direct requirement, are more likely to gain funding. This is especially true for projects that could reduce compliance costs. They will have a longer approval cycle, often 9 months or so, compared to the 3-6 months for directly-required compliance activities. Initiatives directly tied to limiting potential data breach notifications are the most cited driver. Two technology examples are full disk encryption and portable device control.
PCI is the single biggest compliance driver for web application and data security -- I may not be thrilled with PCI, but it's driving more web application and data security improvements than anything else.
The term Data Loss Prevention has lost meaning -- I discussed this in a post last week. Even those who have gone through a DLP tool selection process often use the term to encompass more than the narrow definition we prefer.
It's easier to get resources to do some things manually than to buy a tool -- Although tools would be much more efficient and effective for some projects, in terms of costs and results, manual projects using existing resources are easier to get approval for. As one manager put it, "I already have the bodies, and I won't get any more money for new tools." The most common example cited was content discovery (we'll talk more about this a few points down).
Most people use DLP for network (primarily email) monitoring, not content discovery or endpoint protection -- Even though we tend to think discovery offers equal or greater value, most organizations with DLP use it for network monitoring.
Interest in content discovery, especially DLP-based, is high, but resources are hard to get for discovery projects -- Most security managers I talk with are very interested in content discovery, but they are less educated on the options and don't have the resources. They tell me that finding the data is the easy part -- getting resources to do anything about it is the limiting factor.
The Web Application Firewall (WAF) market and Security Source Code Tools markets are nearly equal in size, with more clients on WAFs, and more money spent on source code tools per client -- While it's hard to fully quantify, we think the source code tools cost more per implementation, but WAFs are in slightly wider use.
WAFs are a quicker hit for PCI compliance -- Most organizations deploying WAFs do so for PCI compliance, and they're seen as a quicker fix than secure source code projects.
Most WAF deployments are out of band, and false positives are a major problem for default deployments -- Customers are installing WAFs for compliance, but are generally unable to deploy them inline (initially) due to the tuning requirements.
Full drive encryption is mature, and well deployed in the early mainstream -- Full drive encryption, while not perfect, is deployable in even large enterprises. It's now considered a level-setting best practice in financial services, and usage is growing in healthcare and insurance. Other asset recovery options, such as remote data destruction and phone home applications, are now seen as little more than snake oil. As one CISO told us, "I don't care about the laptop, we just encrypt it and don't worry about it when it goes missing".
File and folder encryption is not in wide use -- Very few organizations are performing any wide scale file/folder encryption, outside of some targeted encryption of PII for compliance requirements.
Database encryption is hard, and not widely used -- Most organizations are dissatisfied with database encryption options, and do not deploy it widely. Within a large organization there is likely some DB encryption, with preference given to file/folder/media protection over column level encryption, but most organizations prefer to avoid it. Performance and key management are cited as the primary obstacles, even when using native tools. Current versions of database encryption (primarily native encryption) do perform better than older versions, but key management is still unsatisfactory. Large encryption projects, when initiated, take an average of 12-18 months.
Large enterprises prefer application-level encryption of credit card numbers, and tokenization -- When it comes to credit card numbers, security managers prefer to encrypt it at the application level, or consolidate numbers into a central source, using representative "tokens" throughout the rest of the application stack. These projects take a minimum of 12-18 months, similar to database encryption projects (the two are often tied together, with encryption used in the source database).
Email encryption and DRM tend to be workgroup-specific deployments -- Email encryption and DRM use is scattered throughout the industry, but is still generally limited to workgroup-level projects due to the complexity of management, or lack of demand/compliance from users.
Database Activity Monitoring usage continues to grow slowly, mostly for compliance, but not quickly enough to save lagging vendors -- Many DAM deployments are still tied to SOX auditing, and it's not as widely used for other data security initiatives. Performance is reasonable when you can use endpoint agents, which some DBAs still resist. Network monitoring is not seen as effective, but may still be used when local monitoring isn't an option. Network requirements, depending on the tool, may also inhibit deployments.
My main takeaway is that security managers know what they need to do to protect information assets, but they lack the time, resources, and management support for many initiatives. There is also broad dissatisfaction with security tools and vendors in general, in large part due to poor expectation setting during the sales process, and deliberately confusing marketing. It's not that the tools don't work, but that they're never quite as easy as promised.
It's an interesting dilemma, since there is clear and broad recognition that data security (and by extension, web application security) is likely our most pressing overall issue in terms of security, but due to a variety of factors (many of which we covered in our Business Justification for Data Security paper), the resources just aren't there to really tackle it head-on.
–Rich
Posted at Monday 1st June 2009 8:18 am
Filed under:
(12) Comments •
(0) Trackbacks •
Permalink
By Rich
As most of you know, I've been covering DLP for entirely too long. It's a major area of our research, with an entire section of our site dedicated to it.
To be honest, I never really liked the term "Data Loss Prevention". When this category first appeared, I used the term Content Monitoring and Filtering. The vendors didn't like it, but since I wrote (with a colleague) the Gartner Magic Quadrant, they sort of rolled with it. The vendors preferred DLP since it sounded better for marketing purposes (I have to admit, it's sexier than CMF). Once market momentum took over and end users started using DLP more than CMF, I rolled with it and followed the group consensus.
I never liked Data Loss Prevention since, in my mind, it could mean pretty much anything that "prevents data loss". Which is, for the most part, any security tool on the market. My choice was to either jump on the DLP bandwagon, or stick to my guns and use CMF, even though no one would know what I was talking about. Thus I transitioned over, started using DLP, and focused my efforts on providing clear definitions and advice related to the technology.
Over the past 2 weeks I've come to realize that DLP, as a term for a specific category of technology, is pretty much dead. I've been invited to multiple DLP conferences/speaking opportunities, none of which are focused on what I'd consider DLP tools. I've been asked to help work on DLP training materials that don't even have a chapter on DLP tools. I've had multiple end-user conversations on DLP... almost always referring to a different technology.
The DLP vendors did such a good job of coming up with a sexy name for their technology that the rest of the world decided to use it... even when they had nothing to do with DLP. Thus, any vendor reading this can consider this post my official recommendation that you drop the term DLP, and move to Content Monitoring and Protection (CMP -- a term Chris Hoff first suggested that I've glommed onto). Or just make something else up.
I'll continue using DLP on this site, but the non-DLP vendors have won and the term is completely diluted and no longer refers to a specific technology. Thus I'll stop being incredibly anal about it, and you might see me associated with "DLP" when it has nothing to do with pure-play DLP as I've historically defined it.
That said, when I'm writing about it I still intend to use the term DLP in my personal writing in accordance with my very specific definition (below), and will start using 'CMP' more heavily.
Data Loss Prevention/Content Monitoring and Protection is:
Products that, based on central policies, identify, monitor, and protect data at rest, in motion, and in use through deep content analysis.
For the record, I get all uppity about mangled definitions because all too often they're used to create market confusion, and reduce value to users. People end up buying things that don't do what they expected.
–Rich
Posted at Tuesday 26th May 2009 2:18 pm
Filed under:
(11) Comments •
(0) Trackbacks •
Permalink
By Rich
By the time I post this you won't be able to find a tech news site that isn't covering this one. I know, since my name was on the list of analysts the press could contact and I spent a few hours talking to everyone covering the story yesterday. Rather than just reciting the press release, I'd like to add some analysis, put things into context, and speculate wildly. For the record, this is a big deal in the long term, and will likely benefit all of the major DLP vendors, even though there's nothing earth shattering in the short term.
As you read this, Microsoft and RSA are announcing a partnership for Data Loss Prevention. Here are the nitty gritty details, not all of which will be apparent from the press release:
- This month, the RSA DLP product (Tablus for you old folks) will be able to assign Microsoft RMS (what Microsoft calls DRM) rights to stored data based on content discovery. The way this works is that the RMS administrator will define a data protection template (what rights are assigned to what users). The RSA DLP administrator then creates a content detection policy, which can then apply the RMS rights automatically based on the content of files. The RSA DLP solution will then scan file repositories (including endpoints) and apply the RMS rights/controls to protect the content.
- Microsoft has licensed the RSA DLP technology to embed into various Microsoft products. They aren't offering much detail at this time, nor any timelines, but we do know a few specifics. Microsoft will slowly begin adding the RSA DLP content analysis engine to various products. The non-NDA slides hint at everything from SQL Server, Exchange, and Sharepoint, to Windows and Office. Microsoft will also include basic DLP management into their other management tools.
- Policies will work across both Microsoft and RSA in the future as the products evolve. Microsoft will be limiting itself to their environment, with RSA as the upgrade path for fuller DLP coverage.
And that's it for now. RSA DLP 6.5 will link into RMS, with Microsoft licensing the technology for future use in their products. Now for the analysis:
- This is an extremely significant development in the long term future of DLP. Actually, it's a nail in the coffin of the term "DLP" and moves us clearly and directly to what we call "CMP"- Content Monitoring and Protection. It moves us closer and closer to the DLP engine being available everywhere (and somewhat commoditized), and the real value in being in the central policy management, analysis, workflow, and incident management system. DLP/CMP vendors don't go away- but their focus changes as the agent technology is built more broadly into the IT infrastructure (this definitely won't be limited to just Microsoft).
- It's not very exciting in the short term. RSA isn't the first to plug DLP into RMS (Workshare does it, but they aren't nearly as big in the DLP market). RSA is only enabling this for content discovery (data at rest) and rights won't be applied immediately as files are created/saved. It's really the next stages of this that are interesting.
- This is good for all the major DLP vendors, although a bit better for RSA. It's big validation for the DLP/CMP market, and since Microsoft is licensing the technology to embed, it's reasonable to assume that down the road it may be accessible to other DLP vendors (be aware- that's major speculation on my part).
- This partnership also highlights the tight relationship between DLP/CMP and identity management. Most of the DLP vendors plug into Microsoft Active Directory to determine users/groups/roles for the application of content protection policies. One of the biggest obstacles to a successful DLP deployment can be a poor directory infrastructure. If you don't know what users have what roles, it's awfully hard to create content-based policies that are enforced based on users and roles.
- We don't know how much cash is involved, but financially this is likely good for RSA (the licensing part). I don't expect it to overly impact sales in the short term, and the other major DLP vendors shouldn't be too worried for now. DLP deals will still be competitive based on the capabilities of current products, more than what's coming in an indeterminate future.
Now just imagine a world where you run a query on a SQL database, and any sensitive results are appropriately protected as you place them into an Excel spreadsheet. You then drop that spreadsheet into a Powerpoint presentation and email it to the sales team. It's still quietly protected, and when one sales guy tries to email it to his Gmail account, it's blocked. When he transfers it to a USB device, it's encrypted using a company key so he can't put it on his home computer. If he accidentally sends it to someone in the call center, they can't read it. In the final PDF, he can't cut out the table and put it in another document. That's where we are headed- DLP/CMP is enmeshed into the background, protecting content through it's lifecycle based on central policies and content and context awareness.
In summary, it's great in the long term, good but not exciting in the short term, and beneficial to the entire DLP market, with a slight edge for RSA. There are a ton of open questions and issues, and we'll be watching and analyzing this one for a while.
As always, feel free to email me if you have any questions.
–Rich
Posted at Thursday 4th December 2008 1:39 am
Filed under:
(8) Comments •
(0) Trackbacks •
Permalink
By Rich
In Part 1 I talked about the definition of endpoint DLP, the business drivers, and how it integrates with full-suite solutions. Today (and over the next few days) we're going to start digging into the technology itself.
Base Agent Functions
There is massive variation in the capabilities of different endpoint agents. Even for a single given function, there can be a dozen different approaches, all with varying degrees of success. Also, not all agents contain all features; in fact, most agents lack one or more major areas of functionality.
Agents include four generic layers/features:
- Content Discovery: Scanning of stored content for policy violations.
- File System Protection: Monitoring and enforcement of file operations as they occur (as opposed to discovery, which is scanning of content already written to media). Most often, this is used to prevent content from being written to portable media/USB. It's also where tools hook in for automatic encryption or application of DRM rights.
- Network Protection: Monitoring and enforcement of network operations. Provides protection similar to gateway DLP when a system is off the corporate network. Since most systems treat printing and faxing as a form of network traffic, this is where most print/fax protection can be enforced (the rest comes from special print/fax hooks).
- GUI/Kernel Protection: A more generic category to cover data in use scenarios, such as cut/paste, application restrictions, and print screen.
Between these four categories we cover most of the day to day operations a user might perform that places content at risk. It hits our primary drivers from the last post- protecting data from portable storage, protecting systems off the corporate network, and supporting discovery on the endpoint. Most of the tools on the market start with file and (then) networking features before moving on to some of the more complex GUI/kernel functions.
Agent Content Awareness
Even if you have an endpoint with a quad-core processor and 8 GB of RAM, the odds are you don't want to devote all of that horsepower to enforcing DLP.
Content analysis may be resource intensive, depending on the types of policies you are trying to enforce. Also, different agents have different enforcement capabilities which may or may not match up to their gateway counterparts. At a minimum, most endpoint tools support rules/regular expressions, some degree of partial document matching, and a whole lot of contextual analysis. Others support their entire repertoire of content analysis techniques, but you will likely have to tune policies to run on a more resource constrained endpoint.
Some tools rely on the central management server for aspects of content analysis, to offload agent overhead. Rather than performing all analysis locally, they will ship content back to the server, then act on any results. This obviously isn't ideal, since those policies can't be enforced when the endpoint is off the enterprise network, and it will suck up a fair bit of bandwidth. But it does allow enforcement of policies that are otherwise totally unrealistic on an endpoint, such as database fingerprinting of a large enterprise DB.
One emerging option is policies that adapt based on endpoint location. For example, when you're on the enterprise network most policies are enforced at the gateway. Once you access the Internet outside the corporate walls, a different set of policies is enforced. For example, you might use database fingerprinting (exact database matching) of the customer DB at the gateway when the laptop is in the office or on a (non split tunneled) VPN, but drop to a rule/regex for Social Security Numbers (or account numbers) for mobile workers. Sure, you'll get more false positives, but you're still able to protect your sensitive information while meeting performance requirements.
Next up: more on the technology, followed by best practices for deployment and implementation.
–Rich
Posted at Wednesday 2nd July 2008 3:34 am
Filed under:
(2) Comments •
(0) Trackbacks •
Permalink
By Rich
As the first analyst to ever cover Data Loss Prevention, I've had a bit of a tumultuous relationship with endpoint DLP. Early on I tended to exclude endpoint only solutions because they were more limited in functionality, and couldn't help at all with protecting data loss from unmanaged systems. But even then I always said that, eventually, endpoint DLP would be a critical component of any DLP solution. When we're looking at a problem like data loss, no individual point solution will give us everything we need.
Over the next few posts we're going to dig into endpoint DLP. I'll start by discussing how I define it, and why I don't generally recommend stand-alone endpoint DLP. I'll talk about key features to look for, then focus on best practices for implementation.
It won't come as any surprise that these posts are building up into another one of my whitepapers. This is about as transparent a research process as I can think of. And speaking of transparency, like most of my other papers this one is sponsored, but the content is completely objective (sponsors can suggest a topic, if it's objective, but they don't have input on the content).
Definition
As always, we need to start with our definition for DLP/CMP:
"Products that, based on central policies, identify, monitor, and protect data at rest, in motion, and in use through deep content analysis".
Endpoint DLP helps manage all three parts of this problem. The first is protecting data at rest when it's on the endpoint; or what we call content discovery (and I wrote up in great detail). Our primary goal is keeping track of sensitive data as it proliferates out to laptops, desktops, and even portable media. The second part, and the most difficult problem in DLP, is protecting data in use. This is a catch all term we use to describe DLP monitoring and protection of content as it's used on a desktop- cut and paste, moving data in and out of applications, and even tying in with encryption and enterprise Document Rights Management (DRM). Finally, endpoint DLP provides data in motion protection for systems outside the purview of network DLP- such as a laptop out in the field.
Endpoint DLP is a little difficult to discuss since it's one of the fastest changing areas in a rapidly evolving space. I don't believe any single product has every little piece of functionality I'm going to talk about, so (at least where functionality is concerned) this series will lay out all the recommended options which you can then prioritize to meet your own needs.
Endpoint DLP Drivers
In the beginning of the DLP market we nearly always recommended organizations start with network DLP. A network tool allows you to protect both managed and unmanaged systems (like contractor laptops), and is typically easier to deploy in an enterprise (since you don't have to muck with every desktop and server). It also has advantages in terms of the number and types of content protection policies you can deploy, how it integrates with email for workflow, and the scope of channels covered. During the DLP market's the first few years, it was hard to even find a content-aware endpoint agent.
But customer demand for endpoint DLP quickly grew thanks to two major needs- content discovery on the endpoint, and the ability to prevent loss through USB storage devices. We continue to see basic USB blocking tools with absolutely no content awareness brand themselves as DLP. The first batches of endpoint DLP tools focused on exactly these problems- discovery and content-aware portable media/USB device control.
The next major driver for endpoint DLP is supporting network policies when a system is outside the corporate gateway. We all live in an increasingly mobile workforce where we need to support consistent policies no matter where someone is physically located, nor how they connect to the Internet.
Finally, we see some demand for deeper integration of DLP with how a user interacts with their system. In part, this is to support more intensive policies to reduce malicious loss of data. You might, for example, disallow certain content from moving into certain applications, like encryption. Some of these same kinds of hooks are used to limit cut/paste, print screen, and fax, or to enable more advanced security like automatic encryption or application of DRM rights.
The Full Suite Advantage
As we've already hinted, there are some limitations to endpoint only DLP solutions. The first is that they only protect managed systems where you can deploy an agent. If you're worried about contractors on your network or you want protection in case someone tries to use a server to send data outside the walls, you're out of luck. Also, because some content analysis policies are processor and memory intensive, it is problematic to get them running on resource-constrained endpoints. Finally, there are many discovery situations where you don't want to deploy a local endpoint agent for your content analysis- e.g. when performing discovery on a major SAN.
Thus my bias towards full-suite solutions. Network DLP reduces losses on the enterprise network from both managed and unmanaged systems, and servers and workstations. Content discovery finds and protects stored data throughout the enterprise, while endpoint DLP protects systems that leave the network, and reduces risks across vectors that circumvent the network. It's the combination of all these layers that provides the best overall risk reduction. All of this is managed through a single policy, workflow, and administration server; rather than forcing you to create different policies; for different channels and products, with different capabilities, workflow, and management.
In our next post we'll discuss the technology and major features to look for, followed by posts on best practices for implementation.
–Rich
Posted at Monday 30th June 2008 10:57 am
Filed under:
(2) Comments •
(0) Trackbacks •
Permalink
By Rich
Since Friday is usually "trash" day (when you dump articles you don't expect anyone to read) I don't usually post anything major. But thanks to some unexpected work that hit yesterday, I wasn't able to get part 2 of this series out when I wanted to. If you can tear yourself away from those LOLCatz long enough, we're going to talk about web browsers, WAFs, and web application gateways. These are the first two components of Application and Database Monitoring and Protection (ADMP), which I define as:
Products that monitor all activity in a business application and database, identify and audit users and content, and, based on central policies, protect data based on content, context, and/or activity.
Browser Troubles
As we discussed in part 1, one of the biggest problems in web application security is that the very model of the web browsers and the World Wide Web is not conducive to current security needs. Browsers are the ultimate mashup tool- designed to take different bits from different places and seamlessly render them into a coherent whole. The first time I started serious web application programming (around 1995/96) this blew my mind. I was able to embed disparate systems in ways never before possible. And not only can we embed content within a browser, we can embed browsers within other content/applications. The main reason, as a developer, I converted from Netscape to IE was that Microsoft allowed IE to be embedded in other programs, which allowed us to drop it into our thick VR application. Netscape was stand alone only; seriously limiting its deployment potential.
This also makes life a royal pain on the security front where we often need some level of isolation. Sure, we have the same-origin policy, but browsers and web programming have bloated well beyond what little security that provides. Same-origin isn't worthless, and is still an important tool, but there are just too many ways around it. Especially now that we all use tabbed browsers with a dozen windows open all the time. Browsers are also stateless by nature, no matter what AJAX trickery we use. XSS and CSRF, never mind some more sophisticated attacks, take full advantage of the weak browser/server trust models that result from these fundamental design issues.
In short, we can't trust the browser, the browser can't trust the server, and individual windows/tabs/sessions in the browser can't trust each other. Fun stuff!
WAF Troubles
I've talked about WAFs before, and their very model is also fundamentally flawed. At least how we use WAFs today. The goal of a WAF is, like a firewall, to drop known bad traffic or only allow known good traffic. We're trying to shield our web applications from known vulnerabilities, just like we use a regular firewall to block ports, protocols, sources, and destinations. Actually, a WAF is closer to IPS than it is to a stateful packet inspection firewall.
But web apps are complex beasts; every single one a custom application, with custom vulnerabilities. There's no way a WAF can know all the ins and outs of the application behind it, even after it's well tuned. WAFs also only protect against certain categories of attacks- mostly some XSS and SQL injection. They don't handle logic flaws, CSRF, or even all XSS. I was talking with a reference yesterday of one of the major WAFs, and he had no trouble slicing through it during their eval phase using some standard techniques.
To combat this, we're seeing some new approaches. F5 and WhiteHat have partnered to feed the WAF specific vulnerability information from the application vulnerability assessment. Imperva just announced a similar approach, with a bunch of different partners.
These advances are great to see, but I think WAFs will also need to evolve in some different ways. I just don't think the model of managing all this from the outside will work effectively enough.
Enter ADMP
The idea of ADMP is that we build a stack of interconnected security controls from the browser to the database. At all levels we both monitor activity and include enforcement controls. The goal is to start with browser session virtualization connected to a web application gateway/WAF. Then traffic hits the web server and web application server, both with internal instrumentation and anti-exploitation. Finally, transactions drop to the database, where they are again monitored and protected.
All of the components for this model exist today, so it's not science fiction. We have browser session virtualization, WAFs, SSL-VPNs (that will make sense in a minute), application security services and application activity monitoring, and database activity monitoring. In addition to the pure defensive elements, we'll also tie in to the applications at the design and code level through security services for adaptive authentication, transaction authentication, and other shared services (happy Dre? :) ). The key is that this will all be managed through a central console via consistent policies.
In my mind, this is the only thing that makes sense. We need to understand the applications and the databases that back them. We have to do something at the browser level since even proper parameterization and server side validation can't meet all our needs. We have to start looking at transactions, business context, and content, rather than just packets and individual requests.
Point solutions at any particular layer have limited effectiveness. But if we stop looking at our web applications as pieces, and rather design security that addresses them as a whole, we'll be in much better shape. Not that anything is perfect, but we're looking at risk reduction, not risk elimination. A web application isn't just a web server, just some J2EE code, or just a DB- it's a collection of many elements working together to perform business transactions, and that's how we need to look at them for effective security.
The Browser and Web Application Gateway
A little while back I wrote about the concept of browser session virtualization. To plagiarize myself and save a little writing time so I can get my behind to happy hour:
What we ideally need is a way to completely isolate our content in the browser. One way to do this is session virtualization, pioneered by GreenBorder, who was later acquired by Google (the GreenBorder site is just in support mode now). When a user connects to our site, we push down some code to create a virtual environment in the browser that we strictly control. We wall off that session, which could just be an isolated iFrame in a page, so that it only accesses content we send it. Basically, we break the normal browser model and hijack what we need. This would, for example, help stop CSRF since other browser elements won't be able to trigger a connection to our application. Done right, it limits man in the middle attacks, even if the user authorizes a bad digital certificate.
To work properly, this needs to be tied to a gateway that controls the session. While we could do it from the web/app server itself, I suspect we'll see this as a web application firewall feature, just as we see similar features from SSL-VPNs. I think isolated WAFs have a very limited lifespan, but this is exactly the kind of feature that will extend their value. Better yet, we can tie this in to our Application and Database Monitoring and Protection to build a browser-to-database protected path. We can completely track a transaction or piece of content from the database server to the browser and back.
We could even use this to isolate out potentially "bad" content in an in-browser sandbox. For example, it could be a way to enable all those social networking widgets in a more controlled way but locking in potentially bad content instead of known good.
Will this protect us from keystroke sniffers or a completely compromised host? Nope, but it will definitely help with a large number of our current browser security issues. If we combine it with full ADMP and additional methods like transaction authentication, I think we can regain a bit of control of the web application security mess.
Thus we see one migration path for a WAF. A user goes to connect to the application and hits the WAF, which is now more of a Web Application Gateway. The gateway, like an SSL-VPN, sends the session virtualization code down to the browser. We do this outside of the web application for performance reasons. The secure virtual session is established and the gateway then allows communications with the application behind it.
For things like retail and financial sites that include only limited third party content if any, we can monitor activity from the browser through to the application and work within the isolated session. It both improves our ability to control what's being sent to the browser, and gives us a higher degree of confidence that what's coming from the browser is safe. We still validate everything, but since we're tied to the application itself we can validate in the browser and at the gateway before we even hit the app (and further validate there). Since in a controlled environment we know what transactions should be allowed (or not), we have more ability to detect and block "bad" transactions like SQL injection from the user.
In less controlled environments- thing MySpace or Gmail and everything in between- the gateway also becomes a filter for third party content. Like Checkpoint's new ForceField. The gateway filters out, to the best of its ability, harmful third party content coming from third party sites. Basically, it becomes an SSL-VPN for secure browsing.
This is obviously not viable for all sites due to bandwidth considerations, and in those circumstances we'll drop this part and stick to the rest of the ADMP stack, or only virtualize our pieces of content knowing the user is at risk for the third party stuff we're still linking them to.
Future of the WAF, Option 2
I've just described a scenario where the WAF extends into a Secure Web Application Gateway that adds virtualization, encryption, and content filtering. That doesn't mean WAFs won't also still exist in non-virtualized situations, since that will still be a massive volume of sites out there.
For these sites the WAF continues to progress with deeper application integration and application understanding, and works with the elements I'll describe later that will be embedded into the applications and databases. Rather than hanging around outside the application with barely any idea what's going on behind it, the WAF will take its cues from the app, help manage sessions, and monitor activity outside the app to block the few things we know we can pick up at that layer.
Why use the WAF at all? To give us a choke point and offload some of the monitoring and processing that could hurt application performance. Let's be honest- maybe WAFs will eventually go away, but performance problems alone will probably keep next-gen WAFs viable for a while. There are also plenty of things we can now block before they ever hit the application controls, which, by nature of being integrated at the app level, will be more complex and delicate.
But again, by tightly integrating with our other layers, instead of naievely assuming that an external black box will magically solve our problems, we get a much higher level of functionality. Feeding in vulnerability data as we're just starting to do is a good beginning, but once we plug in deeper to the application and database servers we'll get entirely new levels of functionality.
Part 2 Conclusions
What I've described today is how we can build a (more) trusted path from the browser to the face of the application. WAFs will add gateway capabilities, both protecting the applications behind them and the browsers in front of them. Since this won't be the right approach in all circumstances, WAFs will also evolve with tighter integration to the application and other ADMP stack components.
Again, this might sound like little more than the usual analyst fiction, but all the components are here today. Also, I don't expect my predictions to be totally accurate. I'm roughly guessing I'm at 85% or so.
Next week I'll start digging into the application and database. We'll talk about application instrumentation, anti-exploitation, DAM, trusted transaction paths, and shared security services.
–Rich
Posted at Friday 27th June 2008 6:12 am
Filed under:
(3) Comments •
(0) Trackbacks •
Permalink
By Rich
I've been spending the past few weeks wandering around the country for various shows, speaking to some of the best and brightest in the world of application and database security. Heck, I even hired one of them. During some of my presentations I laid out my vision for where I believe application (especially web application) and database security are headed. I've hinted at it here on the blog, discussing the concepts of ADMP, the information-centric security lifecycle, and DAM, but it's long past time I detailed the big picture.
I'm not going to mess around and write these posts so they are accessible to the non-geeks out there. If you don't know what secure SDLC, DAM, SSL-VPN, WAF, and connection pooling mean, this isn't the series for you. That's not an insult, it's just that this would drag out to 20+ pages if I didn't assume a technical audience.
Will all of this play out exactly as I describe? No way in hell. If everything I predict is 100% correct I'm just predicting common knowledge. I'm shooting for a base level of 80% accuracy, with hopes I'm closer to 90%. But rather than issuing some proclamation from the mount, I'll detail why I think things are going where they are. You can make your own decisions as to my assumptions and the accuracy of the predictions that stem from them.
Also, apologies to Dre's friends and family. I know this will make his head explode, but that's a cost I'm willing to pay. Special thanks to Chris Hoff and the work we've been doing on disruptive innovation, since that model drives most of what I'm about to describe. Finally, this is just my personal opinion as to where things will go. Adrian is also doing some research on the concept of ADMP, and may not agree with everything I say. Yes, we're both Securosis, but when you're predicting uncertain futures no one can speak with absolute authority. (And, as Hoff says, no one can tell you you're wrong today).
Forces and Assumptions
Based on the work I've been doing with Hoff, I've started to model future predictions by analyzing current trends and disruptive innovations. Those innovations that force change, rather than ones that merely nudge us to steer slightly around some new curves. In the security world, these forces (disruptions) come from three angles- business innovation, threat innovation, and efficiency innovation. The businesses we support are innovating for competitive advantage, as are the bad guys. For both of them, it's all about increasing the top line. The last category is more internal- efficiency innovation to increase the bottom line. Here's how I see the forces we're dealing with today, in no particular order:
- Web browsers are inherently insecure. The very model of the world wide web is to pull different bits from different places, and render them all in a single view through the browser. Images from over here, text from over here, and, using iframes, entire sites from yet someplace else. It's a powerful tool, and I'm not criticizing this model; it just is what it is. From a security standpoint, this makes our life more than a little difficult. Even with a strictly enforced same origin policy, it's impossible to completely prevent cross-site issues, especially when people keep multiple sessions to multiple sites open all at the same time. That's why we have XSS, CSRF, and related attacks. We are trying to build a trust model where one end can never be fully trusted.
- We have a massive repository of insecure code that grows daily. I'm not placing the blame on bad programmers; many of the current vulnerabilities weren't well understood when much of this code was written. Even today, some of these issues are complex and not always easy to remediate. We are also discovering new vulnerability classes on a regular basis, requiring review and remediation on any existing code. We're talking millions of applications, never mind many millions of lines of code. Even the coding frameworks and tools themselves have vulnerabilities, as we just saw with the latest Ruby issues.
- The volume of sensitive data that's accessible online grows daily. The Internet and web applications are powerful business tools. It only makes sense that we connect more of our business operations online, and thus more of our sensitive data and business operations are Internet accessible.
- The bad guys know technology. Just as it took time for us to learn and adopt new technologies, the bad guys had to get up to speed. That window is closed, and we have knowledgeable attackers.
- The bad guys have an economic infrastructure. Not only can they steal things, but they have a market to convert the bits to bucks. Pure economics give them viable business models that depend on generating losses for us.
- Bad guys attack us to steal or assets (information) or hijack them to use against others (e.g., to launch a big XSS attack). They also sometimes attack us just to destroy our assets, but not often (less economic incentive, even for DoS blackmail).
- Current security tools are not oriented to the right attack vectors. Even WAFs offer limited effectiveness since they are more tied to our network security models than our data/information-centric models.
- We do not have the resources to clean up all existing code, and we can't guarantee future code, even using a secure SDLC, won't be vulnerable. This is probably my most contentious assumption, but most of the clients I work with just don't have the resources to completely clean what they do have, and even the best programmers will still make mistakes that slip through to production.
- Code scanning tools and vulnerability analysis tools can't catch everything, and can't eliminate all false positives. They'll never catch logic flaws, and even if we had a perfect tool, the second a new vulnerability appeared we'd have to go back and fix everything we'd built up to that point.
- We're relying on more and more code and web services developed by others. From machine-generated web applications, to frameworks and off-the-shelf web apps we customize, to mashups where we directly pass content generated by someone else to our users.
- "Web applications" is a misnomer- we mean the entire stack: web servers, web application servers, the databases behind them, and all the various interconnected n tiers. Many of these are internally accessible, creating an additional vector for attack.
To rephrase these a bit:
- Bad guys are focused on our web applications, and are intelligent and motivated.
- We have a lot of insecure code, keep generating more, and can't rely on secure development to fix it all.
- WAFs and code scanning help, but aren't enough.
- We need to protect a big, complex stack including content and services outside our control.
Following these forces, I'm drawing some assumptions about what any solution needs to look like:
- We need to include browser elements, but can't trust the browser.
- We need to monitor and enforce at the transaction level, both for audibility and for logic flaws and other security issues.
- Such monitoring and enforcement needs to run from the browser to the database.
- Any solution needs to understand the application and database, not just layer over it.
- We need to filter anything we pass on to the user.
- We need to focus on protecting the information.
I'm not totally thrilled with how I've laid this out, but I think it's reasonably understandable. Tomorrow I'll walk through how I think the technology will develop, and where today's tools fit in. I suspect Adrian will start chiming in, once he gets off the road, with his own interpretations.
–Rich
Posted at Wednesday 25th June 2008 7:37 am
Filed under:
(3) Comments •
(0) Trackbacks •
Permalink
By Rich
Yes, it's one of those weeks, with two webcasts and a conference (SANS Pen Testing and Application Security in Vegas).
For this one we'll be talking about DLP content discovery for Vontu/Symantec. It's not just me; there will be a customer case study (yes, an honest to goodness security person willing to talk about what they've done). Here's the official description, and you can register here:
Where Is Your Confidential Data and How Do You Protect It? A Real Life Customer Success
Do you know where your confidential data is stored and how to protect it? Industry analysts predict that data discovery will be the single fastest-growing segment of the Data Loss Prevention (DLP) market in 2008 and beyond. In this webcast, you will get the opportunity to hear firsthand how Sharp HealthCare implemented a DLP solution to secure their sensitive customer data stored across the organization, and what business results they are seeing today. Join Rich Mogull, founder of Securosis LLC and former Gartner analyst, and Starla Rivers, Technical Security Architect at Sharp, as they address how to easily deploy DLP and quickly realize the solution benefits.
–Rich
Posted at Sunday 1st June 2008 11:07 am
Filed under:
(2) Comments •
(0) Trackbacks •
Permalink
By Rich
Last night I had this recurring dream I seem to have a few times a year. It involves a plane crash, but not one that I'm on. The dream always changes, but in every case I'm out and about someplace, I look up and see a struggling plane, it crashes, and I rush over to help. The dream almost always end before I do anything, and since I'm no longer a field medic portions of it usually involve me figuring out how I can help. Must be my overblown, currently unused hero complex or something. Never doubt the bounds of my ego.
This has absolutely nothing to do with what I want to talk about today; I just think it's weird.
I swear I posted on this before, but I can't find it anywhere. During a dinner with one of the Database Activity Monitoring vendors (the best DAM name in the industry) I mentioned their market size was equal to, or slightly larger than, the pure-play DLP market (that's where we toss out peripheral products that only use DLP as a feature). I assumed this was common knowledge, but their jaws dropped. We ran through some back of the envelope calculations, and placed DLP at about $70M in 2007, with DAM right in the same range. My estimates might be off by up to $20M, but that's basically a rounding error when we're talking total market revenues.
Here are a few caveats. My estimates don't include a lot of peripheral vendors, and I slice down as best I can to estimate DLP vs. DAM specific sales. For example, Orchestria launched a new DLP product in 2007, but I only include a fraction of their revenue in the market rollup since most of the revenue is still coming from their compliance product. Same for Verdasys, Proofpoint, Imperva (also has web application firewalls), and other vendors either with multiple product lines, or where DLP or DAM is a feature of something else. I work with most of the major DLP and DAM vendors, and while some share their revenues under NDA, some don't, and these are mostly private companies. I'm also certain a couple of them are lying to me.
So there you have it. DLP is heavily hyped, but DAM is essentially the same size without as much hype. I believe this is because DAM, in some cases, directly addresses a compliance requirement, while DLP isn't usually required (although can often really help).
–Rich
Posted at Wednesday 14th May 2008 3:12 am
Filed under:
(1) Comments •
(0) Trackbacks •
Permalink
By Rich
In our last post we finished our review of DLP content discovery best practices by discussion rolling out and maintaining your deployment. Today we're going to focus on a couple of use cases that illustrate how it all works together. I'm writing these as fake case studies, which is probably really obvious considering my lack of creativity in the names.
DLP Content Discovery for Risk Reduction and to Support PCI Audit
RetailSportsCo is a mid-sized online and brick-and-mortar sporting goods retailer, with about 4,000 headquarters employees and another 2,000 retail employees, across 50 locations. They classify as a Level 2 merchant due to their credit card transaction volume and are currently PCI complaint, but struggled through the process and ended up getting a series of compensating controls approved by their auditor, but only for their first year.
During the audit it was discovered that credit card information had proliferated uncontrolled throughout the organization. It was scattered through hundreds of files on dozens of servers; mostly Excel spreadsheets and Access databases used, and later ignored, by different business units. Since storage of unencrypted credit card numbers is prohibited by PCI, their auditor required them to remove or secure these files. Audit costs for the first year increased significantly due to the time spent by the auditor validating that the information was destroyed or secured.
RetailSportsCo purchased a DLP solution and created a discovery policy to locate credit card information across all storage repositories and employee systems. The policy was initially deployed against the customer relations business unit servers, where over 75 files containing credit card numbers were discovered. After consultation with the manager of the department and employee notification, the tool was switched into enforcement mode and all these files were quarantined back into an encrypted repository.
In phase 2 of the project, DLP endpoint agents were installed on the laptops of sales and customer relations employees (about 100 employees). Users and managers were educated, and the tool discovered and removed approximately 150 additional files. Phase 3 added coverage of all known storage repositories at corporate headquarters. Phase 4 expanded scanning to storage at retail locations, over a period of 5 months. The final phase will add coverage of all employee systems in the first few months of the coming year, leveraging their workstation configuration management system for a scaled deployment.
Audit reports were generated showing exactly which systems were scanned, what was found, and how it was removed or protected. Their auditor accepted the report, which reduced audit time and costs materially (more than the total cost of the DLP solution). One goal of the project is to scan the entire enterprise at least once a quarter, with critical systems scanned on either a daily or weekly basis. RetailSportsCo has improved security and reduced risk by reducing the potential number of targets, and reduced compliance costs by being able to provide auditors with acceptable reports demonstrating compliance.
DLP Content Discovery to Reduce Competitive Risk (Industrial Espionage)
EngineeringCo is a large high-technology manufacturer of consumer goods with 51,000 employees. In the past they've suffered from industrial espionage, when the engineering plans for new and existing products were stolen. They also suffered a rash of unintentional exposures and product plans were accidentally placed in public locations, including the corporate website.
EngineeringCo acquired a DLP content discovery solution to reduce these exposure risks and protect their intellectual property. Their initial goal was to reduce the risk of exposure of engineering and product plans. Unlike RetailSportsCo, they decided to start with endpoints, then move into scanning enterprise storage repositories. Since copies of all engineering and product plans reside in the enterprise content management system, they chose a DLP solution that could integrate and continuously monitor selected locations and automatically build partial-document matching policies for all documents. The policy was tested and refined to ignore common language in the files, such as corporate headers and footers, which initially caused every document using the corporate template to register in the DLP tool.
EngineeringCo started with a phased deployment to install the DLP endpoint discovery agent on all corporate systems. In phase 1, the tool was rolled out to 100 systems per week, starting with product development teams. The initial policy allowed those teams access to the sensitive information, but documented what was on their systems. Those reports were later mated to their encryption tool to ensure that no unencrypted laptops hold the sensitive data. Phase 2 expanded deployment to the broader enterprise, initially in alerting mode. After 90 days the product was switched into enforcement mode and any identified content outside of the product development teams was quarantined with an alert sent to the user, who could request an exemption. Initial alert rates were high, but user education reduced levels to only a dozen or so "violations" a week during the 90-day grace period.
In the coming year EngineeringCo plans to refine their policy to restrict product development employees from placing registered documents onto portable storage. The network component of their DLP tool already restricts emailing and other file transfers outside of the enterprise. They also plan on adding policies to protect employee healthcare information and customer account information.
These are, of course, fictional best practices examples, but they're drawn from discussions with dozens of DLP clients. The key takeaways are:
- Start small, with a few simple policies and a limited scanning footprint.
- Grow deployments as you reduce incidents/violations to keep your incident queue under control and educate employees.
- Start with monitoring/alerting and employee educations, then move on to enforcement.
- This is risk reduction, not risk elimination. Use the tool to identify and reduce exposures but don't expect it to magically solve all your data security problems.
- When you add new policies, test first with a limited audience before rolling them out to the entire scope, even if you are already covering the entire enterprise with other policies.
–Rich
Posted at Thursday 1st May 2008 3:32 am
Filed under:
(4) Comments •
(0) Trackbacks •
Permalink
By Rich
In Part 3 of our series we finished our review of the technical architecture and selection; now we're going to delve into best practices for deployment. We will focus on setting expectations, prioritization, and defining your internal processes. The main obstacle to successful deployments isn't a technology weakness, but rather the failure of the enterprise to understand what to protect, decide how to protect it, and recognize what's reasonable in a real-world environment.
Setting Expectations
The single most important factor for any successful DLP deployment -- content discovery or otherwise -- is properly setting expectations at the start of the project. DLP tools are powerful, but far from a magic bullet or black box that makes all data completely secure. When setting expectations you need to pull key stakeholders together in a single room and define what's achievable with your solution. All discussion at this point assumes you've already selected a tool. Some of these practices deliberately overlap steps during the selection process since at this point you'll have a much clearer understanding of the capabilities of your chosen tool.
In this phase, you discuss and define the following:
- What kinds of content you can protect, based on the content analysis capabilities of your tool.
- Expected accuracy rates for those different kinds of content; for example, you'll have a much higher false positive rate with statistical/conceptual techniques than partial document or database matching.
- Protection options: Can you encrypt? Move files? Change access controls?
- Performance, based on scanning techniques.
- How much of the infrastructure you'd like to cover (which servers, endpoints, and other storage repositories).
- Scanning frequency (days? hours? near continuous?).
- Reporting and workflow capabilities.
It's extremely important to start defining a phased implementation. It's completely unrealistic to expect to monitor every nook and cranny of your infrastructure with your initial rollout. Nearly every organization finds they are more successful with a controlled, staged rollout that slowly expands breadth of infrastructure coverage and types of content to protect.
Prioritization
If you haven't already prioritized your information during the selection process, you need to pull all major stakeholders together (business units, legal, compliance, security, IT, HR, etc.) and determine which kinds of information are more important, and which to protect first. I recommend you first rank major information types (e.g., customer PII, employee PII, engineering plans, corporate financials), then re-order them based on priority for monitoring/protecting within your DLP content discovery tool.
In an ideal world your prioritization should directly align with the order of protection, but while some data might be more important to the organization (engineering plans) other data may need to be protected first due to exposure or regulatory requirements (PII). You'll also need to tweak the order based on the capabilities of your tool.
After your prioritize information types to protect, run through and determine approximate timelines for deploying content policies for each type. Be realistic, and understand that you'll need to both tune new policies and leave time for the organizational to become comfortable with any required business changes.
We'll look further at how to roll out policies and what to expect in terms of deployment times later in this series.
Define Process
DLP tools are, by their very nature, intrusive. Not in terms of breaking things, but intrusive in terms of the depth and breadth of what they find. Organizations are strongly advised to define their business processes for dealing with DLP policy creation and violations before turning on the tools. Here's a sample of a process for defining new policies:
- Business unit requests policy from DLP team to protect content type.
- DLP team meets with business unit to determine goals and protection requirements.
- DLP team engages with legal/compliance to determine any legal or contractual requirements or limitations.
- DLP team defines draft policy.
- Draft policy tested in monitoring (alert only) mode without full workflow and tuned to acceptable accuracy.
- DLP team defines workflow for selected policy.
- DLP team reviews final policy and workflow with business unit to confirm needs have been met.
- Appropriate business units notified of new policy and any required changes in business processes.
- Policy deployed in production environment in monitoring mode, but will full workflow enabled.
- Protection certified as stable.
- Protection/enforcement actions enabled.
And here's one for policy violations:
- Violation detected; appears in incident handling queue.
- Incident handler confirms incident and severity.
- If action required, incident handler escalates and opens investigation.
- Business unit contact for triggered policy notified.
- Incident evaluated.
- Protective actions taken.
- If file moved/protected, notify user and drop placeholder file with contact information.
- Notify employee manager and HR if corrective actions required.
- Perform required employee education.
- Close incident.
These are, of course, just rough samples in text form, but they should give you a good idea of where to start.
–Rich
Posted at Tuesday 29th April 2008 4:01 am
Filed under:
(1) Comments •
(0) Trackbacks •
Permalink