Data Masking Definition

WHAT IS DATA MASKING?

Data masking is a method of hiding sensitive information so that the original values are not exposed.

People sometimes use terms such as data anonymization, data de-identification, data scrambling, data scrubbing, and data obfuscation interchangeably. Often there are subtle differences in how a given vendor defines the process. Yet, while the industry holds differing opinions on the subject and gives some subsets of data masking algorithms different names, one thing is clear.

One has to understand the relative value of security and be able to estimate the risks. Whether or not you hold that invoking the "expert determination" process turns "regular data masking" into a "de-identification process," it is good practice to estimate how easily the information can be re-identified after data masking algorithms are applied. We will not make exaggerated claims that "rare people in the world are experts in de-identification." Along with HIPAA, we maintain that any person with statistical knowledge can do the trick; however, that person first has to learn the specifics.

Data masking is not just a science of algorithms. It is also a science of public data sets. This kind of knowledge definitely calls for training, together with an understanding of k-anonymity and l-diversity. Both terms are mathematical expressions of statistical "common sense." Those who want to learn more and are mathematically inclined can keep reading below.
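To make the two terms concrete, here is a minimal Python sketch of how k-anonymity and l-diversity could be measured on a masked table. It is an illustration under stated assumptions, not a description of any particular product: the records, the column choices (ZIP code and age band as quasi-identifiers, diagnosis as the sensitive value), and the function names are all hypothetical, and the l-diversity shown is the simple "distinct values" variant.

```python
from collections import defaultdict

# Hypothetical, already-masked toy records: (zip_code, age_band, diagnosis).
# Assumption: zip_code and age_band are the quasi-identifiers,
# and diagnosis is the sensitive value we want to protect.
RECORDS = [
    ("021**", "30-39", "flu"),
    ("021**", "30-39", "asthma"),
    ("021**", "30-39", "flu"),
    ("100**", "40-49", "diabetes"),
    ("100**", "40-49", "diabetes"),
]


def equivalence_classes(records):
    """Group the sensitive values by their quasi-identifier combination."""
    groups = defaultdict(list)
    for *quasi, sensitive in records:
        groups[tuple(quasi)].append(sensitive)
    return groups


def k_anonymity(records):
    """Size of the smallest group: each record blends in with at least k-1 others."""
    return min(len(values) for values in equivalence_classes(records).values())


def l_diversity(records):
    """Fewest distinct sensitive values found in any group (distinct l-diversity)."""
    return min(len(set(values)) for values in equivalence_classes(records).values())


print(k_anonymity(RECORDS))  # 2
print(l_diversity(RECORDS))  # 1
```

In this toy table every record shares its quasi-identifiers with at least one other record, so the data is 2-anonymous. The second group, however, contains only one diagnosis, so anyone known to belong to it has that diagnosis revealed. That gap is the "common sense" that l-diversity expresses on top of k-anonymity.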
