A Comparison of Statistical Disclosure Control Methods: Multiple Imputation Versus Noise Multiplication

Written by:
RRS2013-02

Introduction

When survey organizations and statistical agencies such as the U.S. Census Bureau release microdata to the public, a major concern is controlling disclosure risk while preserving the quality and utility of the released data. Popular statistical disclosure control methods, such as data swapping, multiple imputation (MI), top coding/bottom coding (especially for income data), and multiplication with random noise, are often applied before the data are released.

Multiple imputation has long been a viable methodology for handling missing data (see Rubin, 1987). Following the initial proposal by Rubin (1993), Reiter and his colleagues expanded its scope in a series of papers (e.g., Drechsler and Reiter, 2010; Raghunathan, Reiter, and Rubin, 2003; Reiter, 2003, 2004, 2005a, 2005b) and provided a rigorous foundation for its use, so that statistical agencies can now employ this method to protect sensitive data while data users can still carry out valid inference. When MI is applied for statistical disclosure control, the multiply imputed data that are ultimately released are usually referred to as synthetic data. More recently, An and Little (2007) used multiple imputation as an alternative to top coding. Recall that top coding consists of censoring the data above a specified threshold; it is commonly used with income data to protect the identity of those in the top income bracket. We refer to the recent monograph by Drechsler (2011) for a detailed discussion of multiple imputation as a tool for disclosure control.

Noise perturbation by addition or multiplication has also been advocated by some statisticians as a possible data confidentiality protection mechanism (Hwang, 1986; Little, 1993; Kim and Winkler, 2003), and there has recently been renewed interest in this topic (Nayak, Sinha and Zayatz, 2011; Sinha, Nayak and Zayatz, 2011).
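To make two of the masking methods named above concrete, the following is a minimal sketch of top coding and noise multiplication applied to a small list of incomes. The income values, the threshold of 200,000, the function names, and the choice of a Uniform(0.9, 1.1) noise distribution are all illustrative assumptions; the text above does not specify a noise distribution or a threshold, and this is not the procedure used by any agency.

```python
import random


def top_code(values, threshold):
    """Censor each value above `threshold` by replacing it with the threshold."""
    return [min(v, threshold) for v in values]


def multiply_noise(values, low=0.9, high=1.1, seed=None):
    """Multiply each value by an independent factor drawn from Uniform(low, high).

    The uniform distribution is an assumption made for illustration only.
    """
    rng = random.Random(seed)
    return [v * rng.uniform(low, high) for v in values]


# Hypothetical income microdata.
incomes = [32_000, 54_500, 71_200, 250_000, 1_200_000]

top_coded = top_code(incomes, threshold=200_000)  # top bracket is censored
noisy = multiply_noise(incomes, seed=0)           # every record is perturbed
```

Top coding changes only the records above the threshold, whereas noise multiplication perturbs every record by a random relative amount; in both cases the released values differ from the originals, so inference on the masked data has to account for the masking mechanism.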

Page Last Revised - October 28, 2021