De-identification vs. Data Masking
{TOC}
THE SOURCE OF CONFUSION
So, OK, you have heard about data masking, de-identification, anonymization, scrubbing, and tokenization, and now you are confused. You are not quite certain how to distinguish between them all. Indeed, everybody in the industry takes a different position on whether they are the same concept.
The first stop for definitions is always the international standards. Standards provide definitions and requirements that are commonly accepted among practitioners around the world. Yet in the case of data de-identification or data masking, neither term is mentioned in the existing ISO standards. The term that ISO does use (in particular in ISO/TS 25237) is pseudonymization (ISO/TS 25237: Health informatics – Pseudonymization, First edition, 2008-12-01 (Informatique de santé — Pseudonymisation)).
Another place to look is the compliance bodies, and here we have some help in the form of the HHS guidance on "Safe Harbor" and its 18 elements of data masking.
Also, there is plenty of information on the internet and in books, one of the most popular being "Anonymizing Health Data" by Khaled El Emam and Luk Arbuckle.
Sometimes internet sources and books add to the confusion. For example, they state that data masking is applied to elements that are not later used in analytics, and they cite Social Security numbers and names as subjects for data masking rather than de-identification. Such a claim can confuse a lot of people. A person's Social Security number may well be the subject of analytical reports on that person's many health ailments by health insurance companies, which also use data de-identification techniques under both HIPAA and GLBA, whereas attributes such as dates of birth may be omitted from a report on water quality and its consumers per geographic region. Last names may make more sense in such reports, as they might indicate some ancestral line and correlation. Thus, basing the distinction between de-identification and masking on such a premise would not be quite accurate.
Some of the sources use a functional definition to distinguish data masking from de-identification.
Sensitive Data Definition
De-identification in the Context of Security
18 elements of Data Masking per HIPAA