One of the new feature that’s being developed with SQL Server 2016. If you are a DBA then you will like this feature very much. In the real world when you move the data from production to development or uat server we are supposed to obfuscate the data so that the developer wont see the actual live data. Till SQL Server 2016 we are achieving this with dynamic DML scripts which will mask the data. Based on the volume of records in the database we will end up long wait hours to obfuscate it.
I believe this one feature will be verful in masking the data. Let’s evaluate it
SQL Server 2016 CTP 2.0
Enable the trace flags below to play around with this feature
SQL Server 2016 CTP 2.1 and above
This feature is enabled by default. Dont enable the trace flags. If you enable these trace flags then dynamic masking feature will be disabled.
Let’s create some sample data to validate the new feature. I just trying with few records
CREATE TABLE [UserData]( [ID] [INT] IDENTITY(1, 1) , [FirstName] VARCHAR(100) , [LastName] VARCHAR(100) , [DOB] DATE , [Email] VARCHAR(200) , [PANNumber] VARCHAR(20)); GO INSERT INTO [dbo].[UserData] ( [FirstName] , [LastName] , [DOB] , [Email] , [PANNumber] ) VALUES( 'Vidhya', 'Sagar', '01-01-2010', 'firstname.lastname@example.org', 'ABCDE5589D' ), ( 'John', 'Gilbert', '01-01-1980', 'email@example.com', 'UEADC5589E' ), ( 'SQL', 'Server', '01-01-2012', 'firstname.lastname@example.org', 'POHGE5589Z' ), ( 'Ram', 'Kumar', '01-01-2014', 'email@example.com', 'BERAT5589A' ), ( 'Madhi', 'MVP', '01-01-2013', 'firstname.lastname@example.org', 'SJUGH5589K' );
Masking can be done in any of the 3 methods. Each metho
DEFAULT – Full mask. Use this method to completely mask the entire data.
EMAIL – Use this method to mask email address. This method will leave the first character and the suffix “.com” as it is and then it will mask the other characters
PARTIAL – This method is a custom approach to mask the data. You can decide which characters to unmask in the first and last column value.
Let’s try to evaluate all these methods. Use the scripts below to mask the data
ALTER TABLE UserData ALTER COLUMN PANNumber ADD MASKED WITH (FUNCTION = 'default()') -- Trying full mask on Char type ALTER TABLE UserData ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()') -- Trying email mask ALTER TABLE UserData ALTER COLUMN LastName ADD MASKED WITH (FUNCTION = 'partial(2,"XXXXX",2)')-- Trying partial mask ALTER TABLE UserData ALTER COLUMN DOB ADD MASKED WITH (FUNCTION = 'default()') -- Trying full mask on Date type
Testing with different permissions
- If the user is not part of db_owner but if the user is part of all other db roles then he wont be able to see the actual data.
- Now let’s provide db_owner permission to the same user and give a try.
- Since he is the db owner he is able to view all the unmasked data. Let’s say if you have a requirement to mask the data to db owners then you can use the new permission command which will revoke the masked data permission to the user.
DENY UNMASK TO ReadOnlyUser
If you notice the data is masked even though he got db owner permission.
Searching the masked data
Lets say you know the actual data and you dont have access to unmasked data, still you will be able to search the table.
SELECT * FROM UserData WHERE PANNumber='POHGE5589Z'
MASK and UNMASK are the two new permission related to dynamic masking. Sample code below
Script to find the masked columns
GRANT UNMASK TO UserName --Enables user to view unmasked data even though its masked DENY UNMASK TO UserName --Denies user to view unmasked data even though he got elevated access
These permissions are not applicable for sysadmins 🙂
Identify Masked Columns
New DMV is accompanying to get the details so we can use that to get the data. You can also use sys.columns DMV
SELECT OBJECT_NAME(object_id),Name [ColumnName],masking_function FROM sys.masked_columns SELECT OBJECT_NAME(object_id) [TableName],name [ColumnName] FROM sys.columns WHERE is_masked=1
Drop Masking for a Column
If you no longer need to mask the data then you can goahead and drop the masking function on the columns. Sample command is below
ALTER TABLE UserData ALTER COLUMN PANNumber DROP MASKED ALTER TABLE UserData ALTER COLUMN LastName DROP MASKED; ALTER TABLE UserData ALTER COLUMN DOB DROP MASKED; ALTER TABLE UserData ALTER COLUMN Email DROP MASKED;
- Very less time to do the masking compared to the regular obfuscation script
- Easier to deploy
- Control to show masked data
- Index can be used when you leave the first character unmasked
- Computed, Encrypted columns can’t be part of data masking
- Still production data is visible so if you take backup from DEV server and restore it in local machine (where you got admin rights), you can still see the data