Understanding and Selecting Data Masking: Management and Advanced Features

In this post we will examine many of the features and functions of masking that go beyond the basics of data collection and transformation. The first, and most important, is the management interface for the masking product. Central management is the core addition that transforms masking from a simple tool into an enterprise data security platform. Central management is not new, but its capabilities, maturity, and integration are evolving rapidly. In the second part of today’s post we will discuss advanced masking functions we are beginning to see, to give you an idea of where these products are heading. All these products provide management of the basic functions, but the basics don’t fully encompass today’s principal use cases. The advanced feature set and management interfaces differentiate the various products, and are likely to drive your choice of product.

Central Management

This is the proverbial “single pane of glass” for management of data, policies, data repositories, and task automation. The user interface is how you interact with data systems and control the flow of information. A good UI can simplify your job, but a bad one will make you want to never use the product! Management interfaces have evolved to accommodate both IT management and non-technical stakeholders, allowing them to set policy, define workflows, understand risk, and manage where data goes. Some products even provide the capability to manage endpoint agents. Keep in mind that each masking platform has its own internal database to store policies, masks, reports, user credentials, and other pertinent information; and some offer visualization technologies and dashboards to help you see exactly what is going on with your data. The following is a list of management features to consider when evaluating the suitability of a masking platform:

Policy Management: A policy is nothing more than a rule on how sensitive data is to be treated.
Policies usually consist of a data mask – the thing that transforms data – and a data source the mask is applied to. Every masking platform comes with several predefined masks, as well as an interface to customize masks to your needs. The policy interface goes one step further by associating a mask with a data source. Some platforms go further still, allowing a policy to be automatically applied to specific data types, such as credit card numbers, regardless of source or destination. Policy management is typically simplified with predefined policy sets, as we will discuss below.

Discovery: For most customers discovery has become a must-have feature, not least because it is essential for regulatory compliance. Data discovery is an active process that first finds data repositories and then scans them for sensitive data. The discovery process works by scanning files and databases, matching content to known patterns (such as 9-digit Social Security numbers) or to metadata definitions (data that describes data structure). As sensitive data is discovered, the discovery tool creates a report containing both the location and a list of the sensitive data types found. Once data is discovered there are many options for what to do next. The report can be sent to interested parties, archived for compliance, or even fed back into the masking product for automatic policy enforcement. The discovery results can be used to build a catalog of metadata, physically map locations within a data center, and even present a risk score based on location and data type. Discovery can be tuned to look in specific locations, refined to look for as few or as many data types as the user is interested in, and automated to find preselected patterns on a regular schedule.

Credential Management: Selection, extraction, and discovery of information from different data sources all require credentialed access (typically a user name and password) to the file or database in question.
The goal is to automate masking as much as possible, so it would be infeasible to expect users to provide a user name and password to begin every masking task. The masking platform needs to either securely store credentials or use credentials from an access management system like LDAP or Active Directory, and supply them seamlessly as needed.

Data Set Management: For managing test data sets, as well as for compliance, you need to track which data you mask and where you send it. This information is used to orchestrate moving data around the organization – managing which systems get which masked data, tracking when the last update was performed, and so on. As an example, think about the propagation of medical records: an insurance company, a doctor’s office, a clinical trial organization, and the federal government all receive different subsets of the data, with different masks applied depending on which information each needs. This is the core function of data management tools, many of which have added masking capabilities. Similarly, masking vendors have added data management capabilities in response to customer demand for complex data orchestration. The formalization of how data sets are managed is also key for both automation and visualization, two topics we will discuss below.

Data Subsetting: For large enterprises, masking is often applied across hundreds or thousands of databases. In these cases it’s important to be as efficient as possible, to avoid overtaxing databases or saturating networks with traffic. People who manage data define the smallest data subset possible that still satisfies application testers’ needs for production-quality masked data. This involves cutting down the number of rows exported or viewed, and possibly reducing the number of columns. Defining a common set of columns also helps clone a single masked data set for multiple environments, reducing the computational burden of creating masked clones.
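To make the discovery and policy ideas above more concrete, here is a minimal Python sketch, not any vendor's implementation. It assumes data is already loaded as in-memory rows, and every name in it (`PATTERNS`, `MASKS`, `discover`, `apply_policies`) is hypothetical. Discovery matches values against known patterns and reports the location and data types found; the report is then fed back so that a mask is applied by data type, regardless of which table or column the data lives in.

```python
import re
from collections import defaultdict

# Illustrative patterns only; real discovery engines use far more robust
# detectors (checksums, metadata, context) than two regular expressions.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-?\d{2}-?\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d{4}[- ]?){3}\d{4}\b"),
}


def mask_ssn(value):
    """Keep only the last four digits of a Social Security number."""
    digits = re.sub(r"\D", "", value)
    return "XXX-XX-" + digits[-4:]


def mask_card(value):
    """Replace all but the last four digits of a card number with '*'."""
    digits = re.sub(r"\D", "", value)
    return "*" * (len(digits) - 4) + digits[-4:]


# A policy here is simply "data type -> mask", independent of the source.
MASKS = {"ssn": mask_ssn, "credit_card": mask_card}


def discover(tables):
    """Scan every value and report which sensitive types appear where.

    Returns {(table, column): {data_type, ...}}.
    """
    report = defaultdict(set)
    for table, rows in tables.items():
        for row in rows:
            for column, value in row.items():
                for data_type, pattern in PATTERNS.items():
                    if pattern.search(str(value)):
                        report[(table, column)].add(data_type)
    return dict(report)


def apply_policies(tables, report):
    """Return masked copies of the rows; the input tables are not modified."""
    masked = {}
    for table, rows in tables.items():
        masked_rows = []
        for row in rows:
            new_row = dict(row)
            for column in row:
                for data_type in sorted(report.get((table, column), ())):
                    mask = MASKS[data_type]
                    # Values in flagged columns are handled as strings.
                    new_row[column] = PATTERNS[data_type].sub(
                        lambda m: mask(m.group()), str(new_row[column])
                    )
            masked_rows.append(new_row)
        masked[table] = masked_rows
    return masked


sample = {
    "customers": [
        {"name": "Ann", "ssn": "123-45-6789", "notes": "card 4111 1111 1111 1111"},
    ]
}
found = discover(sample)
print(found)
print(apply_policies(sample, found))
```

Keeping the discovery report as plain data is a deliberate choice in this sketch: the same structure could be archived for compliance, sent to interested parties, or used to drive masking, which mirrors the options described above.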
Automation: Automation of masking, data collection, and distribution tasks is a core function of every masking platform. The automated application of masking policies, and integration with third party systems that rely on masked data, drastically reduce workload. Some systems offer very rudimentary automation capabilities, such as UNIX cron jobs, while others have very complex features to manage remote jobs and work

Malware Analysis Quant [Final Paper]

Those of you who have followed Securosis for a while know that our Quant research is the big daddy of all our projects. We build a very granular process map for a certain function, build a metrics model, and in some cases survey our community to figure out what they do and what they don’t. We have already tackled Patch Management, Network Security Operations, and Database Security Options. Our latest Quant study tackled Malware Analysis. Here’s an excerpt from the Introduction to provide some context:

It has been clear for a while that today’s anti-malware defenses basically don’t work, and as a result way too much malware makes it through your defenses. When you get an infection you start a process to figure out what happened. First you figure out what the attack is, how it works, how to stop it (or work around it), and how far it has spread within your organization. That’s all before you can even think about fixing it. To the best of our knowledge, no one has built a specific process map for what this looks like, or a model for figuring out how much it costs to deal with malware on an operational basis. We built the process map and cost model to help folks understand the true impact of malware attacks. It’s not pretty, and many folks, I’m sure, would rather not know. But this research is for those who want to understand malware analysis.

You can see from the process map below that this isn’t a process for the faint of heart, and that’s why most organizations fail in their malware defense efforts. But many organizations do a fair job of fighting malware because they take a very structured and analytical approach to understanding attacks, isolating attack vectors, finding already compromised devices, and updating controls to prevent reinfection. Check out the full report and the accompanying metrics model (.xlsx).
As you read this report it is worth keeping the Quant philosophy in mind: the high-level process framework is intended to cover all the tasks involved, but that doesn’t mean you need to do everything. Individual organizations pick and choose the appropriate steps for them. This exhaustive model can help you understand the operational processes of analyzing malware. We would like to thank Sourcefire for sponsoring the research, and all the folks who took a few minutes to fill out the survey. And finally, if you are interested in the blog posts that iteratively built up the series, check out the Malware Analysis Quant Index of Posts.

Totally Transparent Research is the embodiment of how we work at Securosis. It’s our core operating philosophy, our research policy, and a specific process. We initially developed it to help maintain objectivity while producing licensed research, but its benefits extend to all aspects of our business.

Going beyond Open Source Research, and a far cry from the traditional syndicated research model, we think it’s the best way to produce independent, objective, quality research.

Here’s how it works:

  • Content is developed ‘live’ on the blog. Primary research is generally released in pieces, as a series of posts, so we can digest and integrate feedback, making the end results much stronger than traditional “ivory tower” research.
  • Comments are enabled for posts. All comments are kept except for spam, personal insults of a clearly inflammatory nature, and completely off-topic content that distracts from the discussion. We welcome comments critical of the work, even if somewhat insulting to the authors. Really.
  • Anyone can comment, and no registration is required. Vendors or consultants with a relevant product or offering must properly identify themselves. While their comments won’t be deleted, the writer/moderator will “call out”, identify, and possibly ridicule vendors who fail to do so.
  • Vendors considering licensing the content are welcome to provide feedback, but it must be posted in the comments - just like everyone else. There is no back channel influence on the research findings or posts.
  • Analysts must reply to comments and defend the research position, or agree to modify the content.
  • At the end of the post series, the analyst compiles the posts into a paper, presentation, or other delivery vehicle. Public comments and input factor into the research, where appropriate.
  • If the research is distributed as a paper, significant commenters/contributors are acknowledged in the opening of the report. If they did not post their real names, handles used for comments are listed. Commenters do not retain any rights to the report, but their contributions will be recognized.
  • All primary research will be released under a Creative Commons license. The current license is Non-Commercial, Attribution. The analyst, at their discretion, may add a Derivative Works or Share Alike condition.
  • Securosis primary research does not discuss specific vendors or specific products/offerings, unless used to provide context, contrast or to make a point (which is very very rare).
  • Although quotes from published primary research (and published primary research only) may be used in press releases, said quotes may never mention a specific vendor, even if the vendor is mentioned in the source report. Securosis must approve any quote to appear in any vendor marketing collateral.
  • Final primary research will be posted on the blog with open comments.
  • Research will be updated periodically to reflect market realities, based on the discretion of the primary analyst. Updated research will be dated and given a version number.
  • For research that cannot be developed using this model, such as complex principles or models that are unsuited for a series of blog posts, the content will be chunked up and posted at or before release of the paper to solicit public feedback, and provide an open venue for comments and criticisms.
  • In rare cases Securosis may write papers outside of the primary research agenda, but only if the end result can be non-biased and valuable to the user community to supplement industry-wide efforts or advances. A “Radically Transparent Research” process will be followed in developing these papers, where absolutely all materials are public at all stages of development, including communications (email, call notes).
  • Only the free primary research released on our site can be licensed. We will not accept licensing fees on research we charge users to access.
  • All licensed research will be clearly labeled with the licensees. No licensed research will be released without indicating the sources of licensing fees. Again, there will be no back channel influence. We’re open and transparent about our revenue sources.

In essence, we develop all of our research out in the open, and not only seek public comments, but keep those comments indefinitely as a record of the research creation process. If you believe we are biased or not doing our homework, you can call us out on it and it will be there in the record. Our philosophy involves cracking open the research process, and using our readers to eliminate bias and enhance the quality of the work.

On the back end, here’s how we handle this approach with licensees:

  • Licensees may propose paper topics. The topic may be accepted if it is consistent with the Securosis research agenda and goals, but only if it can be covered without bias and will be valuable to the end user community.
  • Analysts produce research according to their own research agendas, and may offer licensing under the same objectivity requirements.
  • The potential licensee will be provided an outline of our research positions and the potential research product so they can determine if it is likely to meet their objectives.
  • Once the licensee agrees, development of the primary research content begins, following the Totally Transparent Research process as outlined above. At this point, there is no money exchanged.
  • Upon completion of the paper, the licensee will receive a release candidate to determine whether the final result still meets their needs.
  • If the content does not meet their needs, the licensee is not required to pay, and the research will be released without licensing or with alternate licensees.
  • Licensees may host and reuse the content for the length of the license (typically one year). This includes placing the content behind a registration process, posting on white paper networks, or translation into other languages. The research will always be hosted at Securosis for free without registration.

Here is the language we currently place in our research project agreements:

Content will be created independently of LICENSEE with no obligations for payment. Once content is complete, LICENSEE will have a 3 day review period to determine if the content meets corporate objectives. If the content is unsuitable, LICENSEE will not be obligated for any payment and Securosis is free to distribute the whitepaper without branding or with alternate licensees, and will not complete any associated webcasts for the declining LICENSEE. Content licensing, webcasts and payment are contingent on the content being acceptable to LICENSEE. This maintains objectivity while limiting the risk to LICENSEE. Securosis maintains all rights to the content and to include Securosis branding in addition to any licensee branding.

Even this process itself is open to criticism. If you have questions or comments, you can email us or comment on the blog.