Project Quant: Database Security - Masking

For the last task in the Protect phase of Project Quant for Database Security, we'll cover the discrete tasks for implementing data masking. In a nutshell, masking applies a function to data in order to obfuscate sensitive information while retaining its usefulness for reporting or testing. Common forms of masking include randomly re-assigning first and last names, and creating fake credit card and Social Security numbers. The new values retain the format applications expect, but reveal nothing sensitive if the database is compromised.
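To make the idea concrete, here is a minimal sketch of two such functions: one replaces digits while keeping the format, the other re-assigns names among rows. The function names and sample values are hypothetical and only for illustration; they are not taken from any particular masking product.

```python
import random

def mask_digits(value: str, rng: random.Random) -> str:
    """Replace every digit with a random digit; separators and length are kept."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in value)

def shuffle_column(values: list, rng: random.Random) -> list:
    """Randomly re-assign values among rows; the overall set of values is kept."""
    shuffled = list(values)
    rng.shuffle(shuffled)
    return shuffled

# A fixed seed is used only so this sketch is repeatable.
rng = random.Random(42)

# Hypothetical sample data.
masked_ssn = mask_digits("123-45-6789", rng)
masked_names = shuffle_column(["Alice", "Bob", "Carol", "Dave"], rng)

print(masked_ssn)    # same NNN-NN-NNNN layout, different digits
print(masked_names)  # same names, different row order
```

Random digits are enough to keep the layout, but fields with a check digit (credit card numbers, for example) would need a function that also produces a valid check digit, or the application may reject the masked value.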

Masking has evolved into two different models: the traditional Extract, Transform, Load (ETL) model, which alters copies of the data; and the dynamic model, which masks data in place. Conventional ETL functions extract real data and produce an obfuscated derivative to be loaded into test and analytics systems. Dynamic masking is newer, and available in two variants. The first overwrites the sensitive values in place, and the second provides a new database 'View'. With views, authorized users may access either the original or the obfuscated data, while regular users always see the masked version. Which masking model to use is generally determined by security and compliance requirements.
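The view variant can be sketched with Python's built-in sqlite3 module. The table, view, and column names are hypothetical, and keeping the last four digits is an assumption made for the example. SQLite has no user accounts, so the permissions that decide who may query the base table are not shown; in a server database that part would be handled with grants on the table and the view.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base table holding the original, sensitive values.
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT)")
conn.execute("INSERT INTO customers (name, ssn) VALUES ('Alice Smith', '123-45-6789')")

# The view applies the mask at query time; the stored data is not altered.
conn.execute(
    "CREATE VIEW customers_masked AS "
    "SELECT id, name, 'XXX-XX-' || substr(ssn, -4) AS ssn FROM customers"
)

masked_rows = conn.execute("SELECT name, ssn FROM customers_masked").fetchall()
original_rows = conn.execute("SELECT name, ssn FROM customers").fetchall()

print(masked_rows)    # what a regular user would see
print(original_rows)  # what an authorized user would see
```

Because the original values remain in the base table, this approach limits what ordinary queries expose but does not protect the data if the database itself is compromised.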

Plan

  • Time to confirm data security & compliance requirements. What data do you need to protect and how?
  • Time to identify preservation requirements. Define precisely what reports and analytics are dependent upon the data, and what values must be preserved.
  • Time to specify masking model (ETL, Dynamic).
  • Time to generate baseline test data. Create sample test cases and capture results with expected return values and data ranges.
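As a rough illustration of the baseline step, the sketch below records a row count plus the range and distinct count of each field that reports depend on, so the same figures can be recomputed after masking. The helper name, field names, and sample rows are hypothetical.

```python
def capture_baseline(rows: list, fields: list) -> dict:
    """Record row count plus min, max, and distinct count for each field."""
    baseline = {"row_count": len(rows), "fields": {}}
    for field in fields:
        values = [row[field] for row in rows]
        baseline["fields"][field] = {
            "min": min(values),
            "max": max(values),
            "distinct": len(set(values)),
        }
    return baseline

# Hypothetical sample of production-like rows.
sample = [
    {"name": "Alice", "age": 34, "zip": "85001"},
    {"name": "Bob", "age": 51, "zip": "85004"},
    {"name": "Carol", "age": 29, "zip": "85001"},
]

baseline = capture_baseline(sample, ["age", "zip"])
print(baseline)
```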

Acquire

  • Variable: Time to evaluate masking tools/products.
  • Optional: Cost to acquire masking tool. This function may be in-house or provided by free tools.
  • Time to acquire access & permissions. Access to data and databases required to extract and transform.

Setup

  • Optional: Time to install masking tool.
  • Variable: Time to select appropriate obfuscation function for each field, to both preserve necessary values and address security goals.
  • Time to configure. Map rules to fields.
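Mapping rules to fields can be as simple as a table from field name to obfuscation function. The sketch below assumes three hypothetical rules (`pseudonym`, `redact`, `keep`) and hypothetical field names; a commercial tool would express the same mapping through its own configuration.

```python
import hashlib

def pseudonym(value: str) -> str:
    """Replace a value with a stable token derived from a hash (illustration only)."""
    return "user_" + hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]

def redact(value: str) -> str:
    """Overwrite every character, keeping only the length."""
    return "X" * len(value)

def keep(value: str) -> str:
    """Leave the value untouched because reports depend on it."""
    return value

# One rule per field. A field with no rule raises KeyError, so nothing
# is passed through unmasked by accident.
RULES = {"name": pseudonym, "ssn": redact, "zip": keep}

def mask_row(row: dict, rules: dict) -> dict:
    return {field: rules[field](value) for field, value in row.items()}

masked = mask_row({"name": "Alice Smith", "ssn": "123-45-6789", "zip": "85001"}, RULES)
print(masked)
```

The unsalted hash is only there to show a deterministic rule; for values that are easy to guess, such as names, it can be reversed by hashing candidate values, so a real deployment would want a keyed or randomized function.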

Deploy & Test

  • Time to perform transformations. Time to extract or replace, and generate new data.
  • Time to verify value preservation and test application functions against baseline. Run functional test and analytics reports to verify functions.
  • Time to collect sign-offs and approval.
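The verification step can be sketched as two checks: fields that must be preserved still hold the same values, and sensitive fields no longer match the original. The function and field names are hypothetical, and the row-by-row comparison assumes the masked set is in the same row order as the original.

```python
from collections import Counter

def verify_masking(original: list, masked: list, preserved: list, sensitive: list) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    if len(original) != len(masked):
        return ["row count changed"]
    problems = []
    # Preserved fields must contain the same values with the same frequencies.
    for field in preserved:
        if Counter(r[field] for r in original) != Counter(r[field] for r in masked):
            problems.append(f"{field}: preserved values changed")
    # Sensitive fields must not carry the original value in the same row.
    for field in sensitive:
        leaked = sum(1 for o, m in zip(original, masked) if o[field] == m[field])
        if leaked:
            problems.append(f"{field}: {leaked} value(s) left unmasked")
    return problems

# Hypothetical data: one correct masking run and one faulty run.
original = [{"zip": "85001", "ssn": "123-45-6789"}, {"zip": "85004", "ssn": "987-65-4321"}]
good = [{"zip": "85001", "ssn": "XXX-XX-0001"}, {"zip": "85004", "ssn": "XXX-XX-0002"}]
bad = [{"zip": "85001", "ssn": "123-45-6789"}, {"zip": "00000", "ssn": "XXX-XX-0002"}]

good_report = verify_masking(original, good, ["zip"], ["ssn"])
bad_report = verify_masking(original, bad, ["zip"], ["ssn"])
print(good_report)
print(bad_report)
```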

Document

  • Time to document specific techniques used to obfuscate.

—Adrian Lane


Comments:

By Dan F  on  02/13  at  07:24 PM

Adrian,

Good article, but I'm confused by the difference between ETL, which you state “alters copies of the data”, and the second variant of dynamic masking, which is described as providing a new database “view”, presumably at the application layer on top of the original data. Just so I’m clear: is the difference that with ETL the data is permanently changed, whereas the dynamic method is just an application-level insertion of masked data?

On a separate note, if dynamic masking only provides views of the data, it would seem that this technique carries the same risk from a breach of the DB itself as no masking at all? Certainly ‘casual’ leakage is prevented and the data can be used while staying compliant, but it is still very vulnerable. Just want to ensure I’m thinking about this correctly.

Thanks for the knowledge.

Dan

By Adrian Lane  on  02/22  at  12:46 PM

@Dan F - I am trying to differentiate between altering a copy of the data and altering the original. Traditional masking, the extract, transform, and load (ETL) variety, _is_ permanent. But it does not work on the original data, and the masked copy needs to be periodically updated. A View is a database construct. It is provided by the database. Third-party tools leverage the view capability and apply obfuscation to the view.

The term ‘dynamic’ masking makes more sense with views than it does with in-place transformation of the original data set. Views create a dynamic copy of the data, and the mask does _not_ alter the original. I think the term dynamic will make a lot more sense as the market evolves to apply obfuscation as data is inserted into the database ... dynamically ... rather than through a manual or batched ETL job. Some databases contain a mix of real and obfuscated data, depending upon how they mask and how they update decision support or testing data.

Let me know if I failed to answer the question.

-Adrian
