National Institute of Standards and Technology releases draft guide on De-Identifying Government Data Sets. A very useful guide for all those who practice privacy and cyber security.
November 18, 2022
De-identifying data is a critical part of managing data, avoiding reputational damage if there is a data breach, and complying with privacy legislation. It is fundamental yet poorly understood, let alone implemented. The National Institute of Standards and Technology has released the third draft of its De-Identifying Government Data Sets. As with many NIST reports, it is lengthy, not to mention highly technical. But it is worth reading. NIST provides the best technical guides in the privacy and cyber security sphere.
This is an excellent guide because it sets out clearly what de-identification involves, why it is important, what the risks are and how organisations and agencies should approach de-identification. The United Kingdom’s Information Commissioner has prepared excellent guidance on Anonymisation, pseudonymisation and privacy enhancing technologies. Given the nature of recent data breaches in Australia, de-identifying older records is important. The guidance in Australia is inadequate.
The abstract provides:
De-identification is a process that is applied to a dataset with the goal of preventing or limiting informational risks to individuals, protected groups, and establishments while still allowing for meaningful statistical analysis. Government agencies can use de-identification to reduce the privacy risk associated with collecting, processing, archiving, distributing, or publishing government data. Previously, NISTIR 8053, De-Identification of Personal Information, provided a survey of de-identification and re-identification techniques. This document provides specific guidance to government agencies that wish to use de-identification. Before using de-identification, agencies should evaluate their goals for using de-identification and the potential risks that de-identification might create. Agencies should decide upon a de-identification release model, such as publishing de-identified data, publishing synthetic data based on identified data, or providing a query interface that incorporates de-identification. Agencies can create a Disclosure Review Board to oversee the process of de-identification. They can also adopt a de-identification standard with measurable performance levels and perform re-identification studies to gauge the risk associated with de-identification. Several specific techniques for de-identification are available, including de-identification by removing identifiers and transforming quasi-identifiers and the use of formal privacy models. People performing de-identification generally use special-purpose software tools to perform the data manipulation and calculate the likely risk of re-identification. However, not all tools that merely mask personal information provide sufficient functionality for performing de-identification. 
This document also includes an extensive list of references, a glossary, and a list of specific de-identification tools, which is only included to convey the range of tools currently available and is not intended to imply a recommendation or endorsement by NIST.
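To make the abstract's reference to "removing identifiers and transforming quasi-identifiers" and calculating "the likely risk of re-identification" a little more concrete, here is a minimal sketch in Python. It is not taken from the NIST draft. The records, the field names, the choice of ten-year age bands and truncated postcodes, and the use of the smallest group size as a rough risk indicator (a k-anonymity style measure) are all assumptions made purely for illustration.

```python
from collections import Counter

# Hypothetical records; every field name and value is made up for illustration.
records = [
    {"name": "A. Smith", "age": 34, "postcode": "3000", "diagnosis": "asthma"},
    {"name": "B. Jones", "age": 37, "postcode": "3004", "diagnosis": "diabetes"},
    {"name": "C. Brown", "age": 52, "postcode": "3121", "diagnosis": "asthma"},
    {"name": "D. White", "age": 58, "postcode": "3122", "diagnosis": "flu"},
    {"name": "E. Green", "age": 31, "postcode": "3006", "diagnosis": "flu"},
]

# Assumed classification of fields: which ones identify a person directly,
# and which ones could identify a person when combined.
DIRECT_IDENTIFIERS = {"name"}
QUASI_IDENTIFIERS = ("age", "postcode")


def generalise(record):
    """Drop direct identifiers and coarsen the quasi-identifiers."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    decade = (record["age"] // 10) * 10
    out["age"] = f"{decade}-{decade + 9}"            # exact age -> ten-year band
    out["postcode"] = record["postcode"][:2] + "**"  # keep first two digits only
    return out


def smallest_group(rows, keys=QUASI_IDENTIFIERS):
    """Size of the smallest set of rows sharing the same quasi-identifier values.

    A value of 1 means at least one row is unique on those fields.
    """
    counts = Counter(tuple(row[k] for k in keys) for row in rows)
    return min(counts.values())


deidentified = [generalise(r) for r in records]

print("smallest group before:", smallest_group(records))
print("smallest group after: ", smallest_group(deidentified))
```

In this toy data every raw record is unique on age and postcode, while after generalisation each record shares its age band and postcode prefix with at least one other. A larger smallest group suggests lower risk on these fields only. It says nothing about other attributes or outside data sources, which is part of why the guide recommends re-identification studies and formal privacy models rather than masking alone.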