Post-randomization for Identification Risk Limited Microdata Release from General Surveys

Written by:
RRS2018-11

Abstract

Before releasing survey data, statistical agencies usually perturb the original data to keep each survey unit's information confidential. One significant concern is identity disclosure, which occurs when an intruder correctly identifies the records of a survey unit by matching the values of some key (or pseudo-identifying) variables. Nayak, Zhang and You (2018) developed a post-randomization method that strictly controls identification risk in released survey microdata. Under simple random sampling, the procedure also preserves the observed frequencies, and hence statistical estimates, well. We show that in general surveys the procedure may induce considerable bias in commonly used survey-weighted estimators. We propose a modified procedure that better preserves weighted estimates. The modified procedure is illustrated and empirically assessed with an application to a publicly available U.S. Census Bureau data set.

Page Last Revised - October 28, 2021