Developing the DAS: Demonstration Data and Progress Metrics

Customizing protections for each data product is an iterative process that requires data user engagement and feedback. As we produce demonstration data and performance metrics during Disclosure Avoidance System (DAS) development, we’ll post that information here.

For more information, view this brief: Why the Census Bureau Chose Differential Privacy

2010 Demonstration Data Products Suite: Redistricting and DHC

The “2010 Demonstration Data Products Suite – Redistricting and DHC” is a suite of files based on 2010 Census results to help data users analyze the impact of the new 2020 Census Disclosure Avoidance System.

The files incorporate the final production settings chosen for both the 2020 Census Redistricting Data (Public Law 94-171) Summary File and the Demographic and Housing Characteristics File (DHC). Included in the released suite of files is the 2010 Census Production Settings Redistricting Data (P.L. 94-171) Demonstration Noisy Measurement File (2023-04-03) (2010 Redistricting NMF).

Noisy Measurement Files are the intermediate output of the Disclosure Avoidance System’s TopDown Algorithm (TDA). The TDA generates noisy measurements when it applies differentially private noise to each of the tabulations from the confidential data. Because the noise can result in internal and hierarchical inconsistencies within the tables we publish, the TDA completes a final step called “post-processing.” Post-processing corrects those inconsistencies before the tables or Privacy-Protected Microdata Files (PPMFs) are published.
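The two steps described above can be illustrated with a minimal sketch. It is not the TDA: the rounded Gaussian noise, the function names, the sample block counts, and the proportional-rescaling rule are all assumptions chosen for illustration, and the actual noise distribution and post-processing method are not reproduced here. The sketch only shows why noisy counts can be negative or fail to add up to a parent total, and what it means to restore consistency.

```python
import random


def noisy_measurements(true_counts, scale, rng):
    """Add zero-mean integer noise to each count (illustrative noise only)."""
    return [c + round(rng.gauss(0, scale)) for c in true_counts]


def post_process(noisy_children, parent_total):
    """Return non-negative integers that sum exactly to parent_total.

    Hypothetical rule: clip negatives to zero, rescale proportionally,
    and give leftover units to the largest fractional remainders.
    """
    clipped = [max(0, n) for n in noisy_children]
    total = sum(clipped)
    if total == 0:
        # Nothing to scale: spread the parent total as evenly as possible.
        base, extra = divmod(parent_total, len(clipped))
        return [base + (1 if i < extra else 0) for i in range(len(clipped))]
    # Integer arithmetic keeps the rescaling exact.
    products = [n * parent_total for n in clipped]
    result = [p // total for p in products]
    leftover = parent_total - sum(result)
    order = sorted(range(len(products)),
                   key=lambda i: products[i] % total, reverse=True)
    for i in order[:leftover]:
        result[i] += 1
    return result


rng = random.Random(0)
true_blocks = [3, 0, 12, 5]      # made-up block counts
tract_total = sum(true_blocks)   # parent total assumed to be held fixed
noisy = noisy_measurements(true_blocks, scale=2.0, rng=rng)
consistent = post_process(noisy, tract_total)
print("noisy:", noisy, "sum:", sum(noisy))
print("post-processed:", consistent, "sum:", sum(consistent))
```

The noisy list may contain negative values or sum to something other than the tract total; the post-processed list never does.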

This public release gives researchers and data scientists the opportunity to independently process the files, complete analysis and conduct valuable assessments of the confidentiality protections.

Demographic and Housing Characteristics File (DHC) Development

This product provides detailed demographic and housing characteristics about the nation and local communities. We encourage data users to aggregate small populations and geographies to improve accuracy and diminish implausible results.

  • Subjects: Age, sex, race, Hispanic or Latino origin, household type, family type, relationship to householder, group quarters population, housing occupancy, and housing tenure.
  • Access: data.census.gov. Direct links to popular tables are available on the DHC webpage.
  • Lowest level of geography: Varies, with many tables available for census blocks.
  • Release date: May 25, 2023.
  • Data Table Guide: Includes the list of tables, lowest level of geography by table, and table shells.
  • Equivalent 2010 Census product: The 2010 Census Summary File 1 is the closest equivalent to the DHC. For information about the differences between these products, visit the DHC webpage.
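The advice above to aggregate small populations and geographies can be illustrated with a small simulation. This is a sketch under stated assumptions: each block receives independent zero-mean Gaussian noise with the same spread, which is a simplification and not the DAS itself, and the function name, block counts, and noise scale are made up. Under those assumptions, the noise in a sum of many blocks grows more slowly than the sum itself, so the relative error of the aggregate is smaller than the typical relative error of a single block.

```python
import random
import statistics


def relative_errors(block_counts, scale, trials, seed=0):
    """Compare mean relative error per block with that of the aggregate.

    block_counts must all be positive because each is used as a divisor.
    """
    rng = random.Random(seed)
    true_total = sum(block_counts)
    block_errs, total_errs = [], []
    for _ in range(trials):
        noisy = [c + rng.gauss(0, scale) for c in block_counts]
        block_errs.extend(abs(n - c) / c for n, c in zip(noisy, block_counts))
        total_errs.append(abs(sum(noisy) - true_total) / true_total)
    return statistics.mean(block_errs), statistics.mean(total_errs)


# 100 hypothetical blocks of 10 people each, noise standard deviation 3.
block_err, total_err = relative_errors([10] * 100, scale=3.0, trials=200)
print(f"mean relative error, single block: {block_err:.3f}")
print(f"mean relative error, all blocks combined: {total_err:.3f}")
```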

For more information about how differential privacy is applied to the DHC: Disclosure Avoidance and the 2020 Census: How the TopDown Algorithm Works

A subset of DHC tables was included in early iterations of DAS demonstration data. In the Redistricting Data section below, see:

  • 2010 Demonstration Data Products Baseline 2019-10-29
  • DAS Development Update 2020-05-27

Detailed Demographic and Housing Characteristics File A (Detailed DHC-A)

Improvements in the design, processing and coding of the 2020 Census allow the release of data for almost five times as many detailed race and ethnic groups as were possible in 2010.

  • Subjects: Population counts and sex by age statistics for approximately 1,500 detailed racial and ethnic groups, such as German, Lebanese, Jamaican, Chinese, Native Hawaiian, and Mexican, as well as American Indian and Alaska Native (AIAN) tribes and villages like the Navajo Nation.
  • Access: data.census.gov. Direct links to popular tables and summary file tables are available on the Detailed DHC-A webpage.
  • Geographies: Nation, state, county, places (cities and towns), census tracts, and American Indian/Alaska Native/Native Hawaiian (AIANNH) areas. Please note, the amount of data available for the detailed racial and ethnic groups and AIAN tribes and villages depends on their population size within a specific geography. This approach allows the Census Bureau to produce as much detail as possible while ensuring strong confidentiality protections.
  • Release date: September 21, 2023.
  • Detailed Race and Ethnicity Crosswalk 2010 to 2020: Outlines which codes were used to tabulate each group in the 2010 Census and 2020 Census.
  • Equivalent 2010 Census product: The Detailed DHC-A (in combination with the forthcoming Detailed DHC-B) is the successor to the 2010 Census Summary File 2 and the 2010 Census American Indian and Alaska Native Summary File.

Detailed Demographic and Housing Characteristics File B (Detailed DHC-B)

  • Subjects: Household type and tenure information for the same detailed race and ethnicity groups and American Indian and Alaska Native tribes and villages mentioned for the Detailed DHC-A. 

  • Access: data.census.gov.

  • 2020 geographies: Nation, state, county, places (cities and towns), census tracts, and American Indian/Alaska Native/Native Hawaiian (AIANNH) areas.

  • Planned release date: September 2024. 

Additional information about the release of the Detailed DHC-B is available in the newsletter Census Bureau Provides Updates on 2020 Census Data Products.

Supplemental Demographic and Housing Characteristics File (S-DHC)

The S-DHC tables reflect especially complex relationships between the characteristics about households and the people living in them. These complex characteristics supplement the data about households and people available in the DHC product. We often refer to these tables as “complex person-household join tables” or “join tables.”  Some tables are repeated by race and ethnicity.

  • Subjects: Data that combine characteristics about households and the people living in them, including the total population in households, average household size by age and tenure, average family size, household and family type for people under 18 years old, and total population in households by tenure. 

  • Access: data.census.gov.

  • 2020 geographies: Nation, state.

  • Planned release date: September 2024. 

2020 Census Redistricting Data (P.L. 94-171)

Public Law 94-171 directs the Census Bureau to provide the data that may be used for redistricting to the governors and the officers or public bodies having responsibility for redistricting in each of the 50 states.

This product is the first from the 2020 Census that includes demographic and housing characteristics about detailed geographic areas including states, counties and places.

  • Subjects: Voting age, race, Hispanic or Latino origin, housing occupancy status, group quarters population by major group quarters type

  • Lowest level of geography: Census block

  • Access: FTP site beginning August 12, 2021 (links to data files and support materials are available on the Decennial Census P.L. 94-171 Redistricting Data Summary Files page); data.census.gov beginning September 16, 2021

  • Date: Released on FTP August 12, 2021; the same data released on data.census.gov on September 16, 2021

Beginning in October 2019, the Census Bureau released a series of demonstration data products that applied iterative development versions of the 2020 Census Disclosure Avoidance System (DAS) to published 2010 Census data. The first two demonstration data sets focused simultaneously on both redistricting and Demographic and Housing Characteristics data (DHC, known in earlier censuses as Summary File 1, or “SF1”). In August 2020, pandemic-triggered operational delays required the Census Bureau to focus development on the redistricting data in an attempt to meet the statutory data release deadline. Demonstration data from September 17, 2020, forward focused solely on the redistricting data. (See Development Timeline)

The Census Bureau produced Detailed Summary Metrics and Privacy-Protected Microdata Files (PPMFs) to assist with data user analysis. IPUMS NHGIS converted the PPMFs into tabular format for ease of use. Data users evaluated each iteration and provided feedback that helped shape the algorithm and settings throughout the development process. On June 8, 2021, the Census Bureau’s Data Stewardship Executive Policy Committee (DSEP) chose the final settings for production of the redistricting data. The data were released August 12, 2021.

Note that the Privacy-Protected Microdata Files are the underlying untabulated microdata files used to generate the Detailed Summary Metrics. Although the data in them look like individual records, they are all privacy-protected through the application of differentially private statistical noise.
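Converting untabulated microdata into tabular format amounts to counting records by geography and characteristic. The sketch below shows the idea on a few invented records. The field names ("block", "voting_age") and values are hypothetical and do not reflect the actual PPMF record layout, which is documented in the record layouts that accompany each release.

```python
from collections import Counter

# Invented, privacy-irrelevant sample records; NOT the real PPMF layout.
records = [
    {"block": "B001", "voting_age": True},
    {"block": "B001", "voting_age": False},
    {"block": "B001", "voting_age": True},
    {"block": "B002", "voting_age": True},
]


def tabulate(rows, keys):
    """Count rows for each combination of the given fields."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


by_block = tabulate(records, ["block"])
by_block_and_age = tabulate(records, ["block", "voting_age"])

for cell, count in sorted(by_block_and_age.items()):
    print(cell, count)
```

Each key of the resulting counter is one table cell, and its value is the number of microdata records that fall in that cell.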

On June 8, 2021, the U.S. Census Bureau’s Data Stewardship Executive Policy Committee (DSEP) selected the settings and parameters for the Disclosure Avoidance System (DAS) for the 2020 Census redistricting data (P.L. 94-171).

This is the sixth and final set of Privacy-Protected Microdata Files (PPMFs) for the redistricting data that allow data users to compare the effect of the Disclosure Avoidance System settings on previously published 2010 Census data. These and previous PPMFs are only intended to demonstrate the redistricting data, not the Demographic and Housing Characteristics File (DHC) or other 2020 Census data products.

There are two sets of Privacy-Protected Microdata Files (PPMFs), record layouts, and Detailed Summary Metrics in this release:

  • One set with a global privacy-loss budget ("epsilon") of 12.2 (10.3 for persons and 1.9 for housing units, approximating the anticipated final PLB level). In the FTP directory, those files include “12-2” in the file names (e.g., ppmf_20210428_eps12-2_P.csv and ppmf_20210428_eps12-2_U.csv).
  • A second set with a global privacy-loss budget ("epsilon") of 4.5 (4.0 for persons and 0.5 for housing units, as used for prior demonstration data). In the FTP directory, those files include “4-5” in the file names (e.g., ppmf_20210428_eps4-5_P.csv and ppmf_20210428_eps4-5_U.csv).
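The arithmetic and naming pattern in the two bullets above can be checked with a short sketch. The budget values and file names come from the text. The dictionary layout, the function names, and the reading of the “P” and “U” suffixes as the person and housing-unit files are assumptions made for illustration. Decimal is used instead of float so that 10.3 + 1.9 compares exactly equal to 12.2.

```python
from decimal import Decimal

RELEASE_DATE = "20210428"  # date stamp that appears in the example file names

# Global budget label -> its person and housing-unit components (from the text).
BUDGETS = {
    "12-2": {"persons": Decimal("10.3"), "housing_units": Decimal("1.9")},
    "4-5": {"persons": Decimal("4.0"), "housing_units": Decimal("0.5")},
}


def global_epsilon(budget):
    """Sum the person and housing-unit components of a budget."""
    return budget["persons"] + budget["housing_units"]


def ppmf_file_names(label):
    """Build the two file names for a budget label (suffix meaning assumed)."""
    return [f"ppmf_{RELEASE_DATE}_eps{label}_{suffix}.csv" for suffix in ("P", "U")]


for label, budget in BUDGETS.items():
    total = global_epsilon(budget)
    # The label is the global epsilon with "." replaced by "-".
    assert str(total).replace(".", "-") == label
    print(f"epsilon {total}: {', '.join(ppmf_file_names(label))}")
```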

We encourage data users to closely analyze this demonstration data. Feedback received by May 28, 2021, will be considered. Email feedback to: 2020DAS@census.gov; include “April PPMF” in the subject line.

Particularly useful feedback would describe:

  • Fitness-for-use: Based on your analysis, would the data needed for your applications (redistricting, Voting Rights Act analysis, estimates, projections, funding data sets, etc.) be satisfactory?
    • How did you come to that conclusion?
    • If your analysis found the data to be unsatisfactory, by how much would accuracy need to change to improve the use of the data for your required or programmatic use case(s)?
    • Have you identified any improbable results in the data that would be helpful for us to understand?
  • Privacy: Do the proposed products present any confidentiality concerns that we should address in the DAS?
  • Improvements: Are there improvements you’ve identified that you want to make sure we retain in the final design? Be specific about the geography and error metric for the proposed improvement.

We will provide additional metrics and educational webinars throughout the month of May to help you with that analysis. (Subscribe to our newsletter for the release and other updates.)

Subscribe

Subscribe to our digital newsletter for the latest updates in DAS development.

Share Your Thoughts

We appreciate your engagement and encourage you to email comments and suggestions to 2020DAS@census.gov.

Page Last Revised - January 29, 2024