Project Quant: Database Security - Masking
In our last task in the Protect phase of Quant for Database Security, we'll cover the discrete tasks for implementing data masking. In a nutshell, masking applies a function to data to obfuscate sensitive information while retaining its usefulness for reporting or testing. Common forms of masking include randomly re-assigning first and last names, and generating fake credit card and Social Security numbers. The new values retain the format applications expect, but reveal nothing sensitive if the database is compromised.
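To make the idea concrete, here is a minimal sketch of two such functions, assuming card numbers arrive as strings and names as a simple list. The function names are ours for illustration and do not come from any particular masking product. Note that the random digits preserve only length and separators; they are not guaranteed to pass a Luhn check, which some applications may require.

```python
import random

def mask_card_number(card: str, rng: random.Random) -> str:
    """Replace every digit with a random digit, keeping separators and length."""
    return "".join(str(rng.randrange(10)) if ch.isdigit() else ch for ch in card)

def shuffle_names(names: list[str], rng: random.Random) -> list[str]:
    """Randomly re-assign names across rows; the overall set of names is unchanged."""
    shuffled = names[:]  # copy so the caller's list is not modified
    rng.shuffle(shuffled)
    return shuffled

# A fixed seed is used here only to make the example repeatable.
rng = random.Random(42)
masked_card = mask_card_number("4111-1111-1111-1111", rng)
masked_names = shuffle_names(["Alice", "Bob", "Carol", "Dave"], rng)
```

The masked card number still looks like a card number to the application, and the shuffled list still contains realistic names, but neither can be tied back to the original row.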
Masking has evolved into two different models: the traditional Extract, Transform, Load (ETL) model, which alters copies of the data; and the dynamic model, which masks data in place. Conventional ETL functions extract real data and produce an obfuscated derivative to load into test and analytics systems. Dynamic masking is newer and comes in two variants: the first overwrites the sensitive values in place, and the second provides a new database view. With views, authorized users may access either the original or the obfuscated data, while regular users always see the masked version. Which masking model to use is generally determined by security and compliance requirements.
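The view-based variant can be sketched with an in-memory SQLite database. The table, view, and column names below are hypothetical. SQLite has no per-user permissions, so this shows only the masking itself; on a production database you would rely on its own grants to keep regular users away from the base table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT)")
conn.execute("INSERT INTO customers (name, ssn) VALUES ('Alice Example', '123-45-6789')")

# Hypothetical masked view: expose only the last four digits of the SSN.
# The base table is left untouched, so the original values remain available
# to whoever is authorized to query it directly.
conn.execute("""
    CREATE VIEW customers_masked AS
    SELECT id, name, 'XXX-XX-' || substr(ssn, -4) AS ssn
    FROM customers
""")

original_ssn = conn.execute("SELECT ssn FROM customers").fetchone()[0]
masked_ssn = conn.execute("SELECT ssn FROM customers_masked").fetchone()[0]
```

Regular users would be pointed at customers_masked, while authorized users keep access to customers.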
Plan
- Time to confirm data security & compliance requirements. What data do you need to protect and how?
- Time to identify preservation requirements. Define precisely what reports and analytics are dependent upon the data, and what values must be preserved.
- Time to specify masking model (ETL or dynamic).
- Time to generate baseline test data. Create sample test cases and capture results with expected return values and data ranges.
Acquire
- Variable: Time to evaluate masking tools/products.
- Optional: Cost to acquire masking tool. This function may be in-house or provided by free tools.
- Time to acquire access & permissions. This covers the access to data and databases required to extract and transform.
Setup
- Optional: Time to install masking tool.
- Variable: Time to select appropriate obfuscation function for each field, to both preserve necessary values and address security goals.
- Time to configure. Map rules to fields.
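As a rough illustration of mapping rules to fields, the sketch below pairs each field with an obfuscation function. Everything here is an assumption for the example: the field names, the three functions, and the key. A commercial tool would typically express the same mapping through its own configuration.

```python
import hashlib
import hmac
import random
from typing import Callable

SECRET_KEY = b"example-only-key"  # hypothetical; manage a real key securely
_rng = random.Random()

def keep(value: str) -> str:
    """Leave the value untouched because reports depend on it as-is."""
    return value

def random_digits(value: str) -> str:
    """Swap each digit for a random one, preserving length and separators."""
    return "".join(str(_rng.randrange(10)) if c.isdigit() else c for c in value)

def pseudonym(value: str) -> str:
    """Keyed, deterministic stand-in: the same input always yields the same token."""
    digest = hmac.new(SECRET_KEY, value.encode("utf-8"), hashlib.sha256).hexdigest()
    return "user_" + digest[:8]

# Hypothetical rule map: field name -> obfuscation function.
RULES: dict[str, Callable[[str], str]] = {
    "last_name": pseudonym,        # must stay consistent across tables for joins
    "card_number": random_digits,  # only the format matters to the application
    "zip_code": keep,              # preserved for regional analytics
}

def mask_record(record: dict[str, str]) -> dict[str, str]:
    """Apply the mapped rule to each field; an unmapped field raises KeyError."""
    return {field: RULES[field](value) for field, value in record.items()}

row = {"last_name": "Lane", "card_number": "4111 1111 1111 1111", "zip_code": "85001"}
masked_row = mask_record(row)
```

Failing on unmapped fields is a deliberate choice in this sketch: a column nobody wrote a rule for should stop the job rather than slip through unmasked.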
Deploy & Test
- Time to perform transformations. Extract or replace the original values, and generate the new data.
- Time to verify value preservation and test application functions against baseline. Run functional tests and analytics reports to verify that functions still work.
- Time to collect sign-offs and approval.
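One way to approach the verification step is to compute the same summary over the original and masked data and compare the results. The sketch below assumes rows are plain dictionaries and that row count, total amount, and rows per zip code are the values that must be preserved. Both the sample data and the choice of metrics are invented for illustration.

```python
from collections import Counter

def summarize(rows: list[dict]) -> dict:
    """Compute the values the reports depend on; amounts are integer cents."""
    return {
        "row_count": len(rows),
        "total_amount": sum(r["amount"] for r in rows),
        "rows_per_zip": Counter(r["zip_code"] for r in rows),
    }

# Hypothetical sample data: names are masked, amounts and zip codes preserved.
original = [
    {"name": "Alice Example", "zip_code": "85001", "amount": 1250},
    {"name": "Bob Sample", "zip_code": "85001", "amount": 300},
    {"name": "Carol Test", "zip_code": "10001", "amount": 4999},
]
masked = [
    {"name": "user_3fa1", "zip_code": "85001", "amount": 1250},
    {"name": "user_9c02", "zip_code": "85001", "amount": 300},
    {"name": "user_b77e", "zip_code": "10001", "amount": 4999},
]

baseline = summarize(original)  # captured during the Plan phase
result = summarize(masked)      # recomputed after the transformation

# Any metric that differs from the baseline is reported with both values.
mismatches = {k: (baseline[k], result[k]) for k in baseline if baseline[k] != result[k]}

# Sanity check on the security goal: no original name survives in its own row.
leaked = [m["name"] for o, m in zip(original, masked) if o["name"] == m["name"]]
```

An empty mismatches dictionary and an empty leaked list would both be reasonable preconditions for collecting sign-off.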
Document
- Time to document specific techniques used to obfuscate.
—Adrian Lane
