A Post-randomization Method for Rigorous Identification Risk Control in Releasing Microdata

Written by:
RRS2020-01

Abstract

One significant concern in releasing survey microdata is that the records of some survey units may be identified by matching on certain variables, called key or pseudo-identifying variables, whose values can be obtained easily from other sources. For categorical key variables, Nayak, Zhang and You [Int. Stat. Rev., 86(2), 2018, 300-321] developed a novel approach for measuring and controlling identification risks. For any ξ > 1/3, their approach can guarantee that no unit's probability of correct identification exceeds ξ. We present another post-randomization method that provides this guarantee for more stringent bounds, including ξ ≤ 1/3. We use data partitioning and unbiased post-randomization as two effective tools for preserving data utility. We illustrate and assess the procedure by applying it to a data set publicly released by the U.S. Census Bureau.
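To make the idea of unbiased post-randomization concrete, the following is a minimal sketch of one generic construction for a single categorical variable. It is not the procedure of this report and does not implement the ξ guarantee or the data partitioning step. The function names and the retention parameter `theta` are assumptions introduced for illustration only. Each value is kept with probability `theta` and otherwise redrawn from the empirical category distribution, so the expected category counts after randomization equal the original counts.

```python
import random
from collections import Counter

def invariant_pram_matrix(values, theta):
    """Hypothetical helper: transition probabilities
    P[a][b] = theta * [a == b] + (1 - theta) * pi[b],
    where pi[b] is the empirical share of category b in `values`."""
    counts = Counter(values)
    n = len(values)
    cats = sorted(counts)
    pi = {c: counts[c] / n for c in cats}
    return {
        a: {b: theta * (1.0 if a == b else 0.0) + (1 - theta) * pi[b] for b in cats}
        for a in cats
    }

def expected_counts(values, P):
    """Expected category counts after randomizing `values` with P."""
    counts = Counter(values)
    return {b: sum(counts[a] * P[a][b] for a in P) for b in P}

def apply_pram(values, P, rng):
    """Independently replace each value v by a draw from row P[v]."""
    out = []
    for v in values:
        cats = list(P[v])
        weights = [P[v][b] for b in cats]
        out.append(rng.choices(cats, weights=weights, k=1)[0])
    return out

if __name__ == "__main__":
    # Toy data, not taken from any Census Bureau file.
    data = ["a"] * 5 + ["b"] * 3 + ["c"] * 2
    P = invariant_pram_matrix(data, theta=0.6)
    print(expected_counts(data, P))  # compare with Counter(data)
    print(apply_pram(data, P, random.Random(0)))
```

The unbiasedness here follows from a one-line calculation: with original counts T_b and n units, the expected count of category b is theta * T_b + (1 - theta) * n * pi[b] = T_b. A smaller `theta` perturbs more records at the cost of more sampling noise in the released counts.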

Page Last Revised - October 8, 2021