U.S. flag

An official website of the United States government

Skip Header


A Local l-Diversity Mechanism for Privacy Protected Categorical Data Collection

Written by:
RRS2019-08

Abstract

We consider the task of protecting respondent's privacy when collecting data on categorical variables. Any mechanism for masking the true value of a respondent can be viewed as a randomized response (RR) procedure, and its prudent planning depends crucially on the given privacy criterion. We examine some existing privacy criteria and describe their drawbacks. We show that a previous notion of average security is inappropriate. Several other criteria, which simply impose upper bounds on the parity of the RR design, inflict severe data utility loss, unless the number of categories is fairly small. This applies to local differential privacy (LDP), which is a leading privacy criterion, and reveals substantial statistical inefficiency of the RAPPOR procedure, which has been in use by Google, Apple and others. We propose a new privacy procedure that is similar to l-diversity but, works locally for each respondent. The procedure is simple to implement and its privacy protection is easy to understand and communicate to survey participants. We give an unbiased estimator of the probability vector of all categories and prove its minimaxity within a class of estimators under squared error loss. We argue and believe that the new procedure offers a better privacy-utility trade-off than LDP.

Page Last Revised - October 28, 2021
Is this page helpful?
Thumbs Up Image Yes Thumbs Down Image No
NO THANKS
255 characters maximum 255 characters maximum reached
Thank you for your feedback.
Comments or suggestions?

Top

Back to Header