Last week, Census Bureau researchers presented preliminary findings at advisory committee meetings from experiments designed to show how privacy protection techniques used in earlier censuses would affect results if applied today to 2010 Census data.
The findings offer stakeholders a tool for comparing the trade-offs between the earlier methods and the TopDown Algorithm, the new approach designed for the P.L. 94-171 redistricting data and based on the principles of differential privacy.
The research concludes that reusing the 1980 suppression techniques would significantly limit the amount of data that could be published. Relaxing and extending the 2010 Census swapping algorithm would not improve the re-identification outcomes, regardless of the swapping rate used.
By releasing this analysis, we aim to give stakeholders a better empirical understanding of the need to modernize our disclosure avoidance methods. Early results from our internal research in 2017 found that the protections used for the 2010 Census could no longer withstand a data reconstruction attack. Continuing to use those methods for the 2020 Census was not an option.
Also important: if the Census Bureau were to revert to a system based on traditional methods, the shift would require significant time (at least six months) after such a decision to retool systems and processes and conduct quality assurance.
To ensure timely redistricting data delivery, the Data Stewardship Executive Policy Committee is set to make a determination on the privacy loss budget for the 2020 redistricting data in early June.
Suppression – removing information from published tables to protect privacy – was last used as a primary disclosure avoidance technique in 1980.
If the suppression rules from the 1980 Census were applied to the 2010 Census P.L. 94-171 redistricting data, whole tables would be suppressed for geographies with between 1 and 14 persons. These counts would be for a reduced set of race and ethnicity categories based on OMB Directive 15, representing only 14 OMB-designated race and ethnicity groups as follows:
For additional tables, individual cells would be suppressed (replaced with “0”) if counts in those cells were 1 or 2:
When the previously used suppression rules were applied to the 2010 Census Summary File 1 (SF1) tables, more than 38% of person tables and 32% of housing unit tables would be suppressed at the block level.
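The two suppression rules described above can be sketched in a few lines of Python. This is only an illustration of the thresholds stated in the text, not the Census Bureau's implementation: the function names, the dictionary layout, and the category labels are hypothetical, and the 1-to-14 range is assumed to be inclusive.

```python
from typing import Dict

# Thresholds taken from the text; all names below are illustrative assumptions.
WHOLE_TABLE_MIN, WHOLE_TABLE_MAX = 1, 14  # geographies with 1 to 14 persons
SMALL_CELL_VALUES = {1, 2}                # cell counts that get replaced with 0


def suppress_whole_table(total_persons: int) -> bool:
    """Return True if a geography's whole table would be suppressed."""
    return WHOLE_TABLE_MIN <= total_persons <= WHOLE_TABLE_MAX


def suppress_small_cells(table: Dict[str, int]) -> Dict[str, int]:
    """Replace cell counts of 1 or 2 with 0, leaving other cells unchanged."""
    return {cell: (0 if count in SMALL_CELL_VALUES else count)
            for cell, count in table.items()}


# Hypothetical block with 9 persons: its whole table would be withheld.
print(suppress_whole_table(9))   # True
# Hypothetical cell counts for a table subject to cell-level suppression.
print(suppress_small_cells({"group_a": 40, "group_b": 2, "group_c": 1}))
# {'group_a': 40, 'group_b': 0, 'group_c': 0}
```

Because most census blocks contain few people, even simple thresholds like these remove a large share of block-level tables, which is consistent with the suppression percentages reported above.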
The research team analyzed the impact of applying a relaxed and extended version of the 2010 Census swapping algorithm to the data. Options explored included combinations of the following:
The analysis revealed:
These results imply that mid-level swap rates, as implemented, may match the TopDown Algorithm in accuracy but would do little to reduce re-identification.
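For readers unfamiliar with swapping, the general idea of exchanging household locations at a chosen swap rate can be sketched as follows. This is a generic, simplified illustration and not the Census Bureau's swapping algorithm, whose details are not given here. The `Household` record, the `swap_households` function, and the choice to match partners only on household size are all assumptions made for the example.

```python
import random
from collections import defaultdict
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class Household:
    hh_id: int
    block: str   # geography identifier (hypothetical)
    size: int    # number of persons in the household


def swap_households(households: List[Household], swap_rate: float,
                    seed: int = 0) -> List[Household]:
    """Swap the blocks of roughly swap_rate * N households.

    Each selected household trades blocks with a same-size household
    from a different block, so block population totals do not change.
    """
    rng = random.Random(seed)
    result = list(households)
    by_size = defaultdict(list)
    for i, hh in enumerate(result):
        by_size[hh.size].append(i)

    n_target = int(len(result) * swap_rate)
    order = list(range(len(result)))
    rng.shuffle(order)
    swapped = set()
    for i in order:
        if len(swapped) >= n_target:
            break
        if i in swapped:
            continue
        # Eligible partners: same size, different block, not yet swapped.
        partners = [j for j in by_size[result[i].size]
                    if j != i and j not in swapped
                    and result[j].block != result[i].block]
        if not partners:
            continue
        j = rng.choice(partners)
        block_i, block_j = result[i].block, result[j].block
        result[i] = replace(result[i], block=block_j)
        result[j] = replace(result[j], block=block_i)
        swapped.update((i, j))
    return result
```

In this sketch, raising `swap_rate` moves more households between blocks. Matching partners on size keeps block population totals fixed, while other characteristics of a block can change, which is where the accuracy cost of swapping comes from.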
We are hosting a webinar this Friday, June 4, to walk through the research and take audience questions. The webinar will be recorded and posted as part of our series on Disclosure Avoidance. There you will also find transcripts and recordings for the previous webinars in the series.
Details:
Time: 2:00 – 3:00 pm (ET)
WebEx log-in: Click here to join the meeting
WebEx event number (If needed): 199 855 0149
WebEx event password (If needed): Census#1
Audio: Listen to the webinar in one of two ways:
Using your computer's speakers (choose "Audio Broadcast," which is 1-way audio), or using your telephone (call 888-996-4917, code: 9385910#)
The Census Bureau’s Data Stewardship Executive Policy Committee (DSEP) will meet in early June to review the latest data regarding the TopDown Algorithm and approve settings and parameters. Its decisions will be informed by the feedback we’ve received from numerous stakeholders, which has resulted in ongoing fine-tuning of the algorithm since the release of the last demonstration data set on April 28. Additional fine-tuning as directed by the DSEP will continue through June, with quality control analysis leading to the FTP release of the redistricting data by August 16.
Early June:
Late June:
By August 16:
September:
By September 30:
* Released via Census Bureau FTP site.
** Released via data.census.gov.