Census Bureau Releases 2020 Census Microdata

AUGUST 5, 2024 - The U.S. Census Bureau today released the 2020 Census Privacy-Protected Microdata File (PPMF). The 2020 PPMF includes two protected microdata files: a person file and a housing unit (or household) file. The PPMF is the successor to the former decennial census Public Use Microdata Sample (PUMS), last issued for the 2010 Census.

As with the 2010 PUMS, the PPMF presents the data as rows of privacy-protected individual records (called microdata) rather than the aggregated totals found in data tables. This allows data users to generate custom tabulations not included in the Redistricting Data Summary File (P.L. 94-171), Demographic Profile, and the Demographic and Housing Characteristics File.
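To illustrate what a custom tabulation from microdata means, the following is a minimal Python sketch. The records, field names (`block`, `age`, `sex`), and age bands are hypothetical and do not reflect the PPMF's actual layout; the point is only that row-level records can be grouped in ways a published table may not offer.

```python
from collections import Counter

# Hypothetical person-level records; the field names and values are
# illustrative only and are not the PPMF's actual layout.
person_rows = [
    {"block": "0001", "age": 34, "sex": "F"},
    {"block": "0001", "age": 71, "sex": "M"},
    {"block": "0002", "age": 8, "sex": "F"},
    {"block": "0002", "age": 67, "sex": "F"},
]


def age_group(age):
    """Assign a custom age band that a published table might not offer."""
    if age < 18:
        return "under 18"
    if age < 65:
        return "18 to 64"
    return "65 and over"


# Count people in each (block, age group, sex) cell.
tabulation = Counter(
    (row["block"], age_group(row["age"]), row["sex"]) for row in person_rows
)

for cell, count in sorted(tabulation.items()):
    print(cell, count)
```

With aggregated tables, the same question could be answered only if a table with exactly these age bands and this geography had already been published.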

Data users are invited to join a webinar on August 6 to learn more about accessing and using the microdata files.

Differences Between the PPMF and the PUMS File Formats

The Census Bureau has released official microdata files every decade since the 1960 Census. This decade's release differs substantially from its predecessors, including:

  • 100 Percent Coverage for the PPMF vs. 10 Percent for the PUMS

The disclosure avoidance methodologies used for the 2010 PUMS only allowed for publication of a 10 percent systematic microdata sample of the full census population (i.e., complete census records for approximately 30 million people and over 13 million housing units). However, the new methodology used for the 2020 PPMF enabled publication of privacy-protected records for every person and housing unit included in the count (i.e., complete census records for over 331 million people and over 140 million housing units).

The 10 percent sample meant that the 2010 PUMS could not directly support analyses on small or less-populous geographies.

Because the 2020 PPMF includes full geographic detail down to the census block, it can support analyses on smaller tabulations or custom geographies. The precision of those analyses depends on the amount of privacy-loss budget allocated to the underlying statistics being analyzed. The privacy-loss budget is a chosen limit on how much disclosure protection is traded for increased accuracy.

Note that since it is a 100 percent file, the PPMF data are not impacted by the sampling error that affected the 2010 PUMS.
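The effect of a 10 percent systematic sample on a small area can be sketched as follows. The 50-record population and the `systematic_estimate` helper are hypothetical and built only for illustration; this is not the procedure used to draw the 2010 PUMS.

```python
# Hypothetical small area: 50 person records, where 1 marks a person
# with some characteristic of interest (every 7th record here).
records = [1 if i % 7 == 0 else 0 for i in range(50)]

# A 100 percent file lets the user count every record directly.
full_count = sum(records)


def systematic_estimate(rows, start, step=10):
    """Estimate the total from every `step`-th record, scaled up by `step`."""
    sample = rows[start::step]
    return sum(sample) * step


# The estimate depends on which record the sample happens to start from.
estimates = [systematic_estimate(records, start) for start in range(10)]

print("full count:", full_count)
print("estimates by starting offset:", estimates)
```

In this toy area the full count is 8, while every sampled estimate is either 0 or 10. The estimates average out to the true value across starting points, but any single sample misses it, which is the sampling error that a 100 percent file avoids. The PPMF still carries disclosure avoidance noise, as discussed in the final bullet.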

  • The PPMF Has Less Attribute and Characteristic Detail than the 2010 Census PUMS

The 2020 Census PPMF reflects the privacy-protected microdata output of the TopDown Algorithm used for the 2020 Census Redistricting Data (P.L. 94-171) Summary File, the Demographic Profile, and the Demographic and Housing Characteristics File (DHC). The PPMF is therefore limited to the attribute and characteristic detail from those data products.

For example, race detail in the PPMF is limited to the 63 race categories (e.g., “White Alone,” or “Asian and Black”) reflected in those data products and does not include any of the detailed or regional racial categories (e.g., “Scottish” or “South Asian”) found in the 2020 Census Detailed DHC-A (released in September 2023) and 2020 Census Detailed DHC-B data products (released August 1, 2024).

Note, however, that the 2010 PUMS restricted demographic detail to characteristics for which the national population was at least 10,000 persons. Many of the detailed race categories were not included in the 2010 PUMS for this reason. However, even though many of the 126 race-by-Hispanic ethnicity cells in the 2020 PPMF as well as other characteristics tabulated in the DHC do not reach this 10,000 national population threshold, all are included in the PPMF data releases.
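The 63 and 126 figures follow from simple counting, sketched below in Python. The list of six race groups is supplied here as context and is not part of this release: 63 is the number of ways to report one or more of six groups, and crossing those with two Hispanic-origin categories gives 126 cells.

```python
from itertools import combinations

# Six race groups assumed for this sketch; respondents may report one
# or more of them.
race_groups = [
    "White",
    "Black or African American",
    "American Indian and Alaska Native",
    "Asian",
    "Native Hawaiian and Other Pacific Islander",
    "Some Other Race",
]

# Every non-empty combination of the six groups is one race category.
race_categories = [
    combo
    for size in range(1, len(race_groups) + 1)
    for combo in combinations(race_groups, size)
]

# Cross each race category with Hispanic / not Hispanic.
race_by_ethnicity_cells = len(race_categories) * 2

print(len(race_categories), race_by_ethnicity_cells)
```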

Similarly, the TopDown Algorithm was run independently for the person and housing unit universes, producing distinct, unlinked PPMF files for each. As a result, the PPMF does not allow the person-household joins needed to tabulate complex household structure or composition information, e.g., the count of own children by race of householder.

The 2010 Census PUMS, by contrast, included many of those attribute and characteristic details that will be missing from the 2020 Census PPMF.

Consequently, while the 2020 Census PPMF will permit extensive custom tabulations for attributes and characteristics related to those included in the redistricting, Demographic Profile, and DHC data products, it will not support many of the types of analysis that require information about household structure.

  • Both Formats Impacted by Disclosure Avoidance Protections

Both the PUMS and PPMF protect data confidentiality by adding statistical noise, which can distort data. The 2010 PUMS data were protected and impacted by the 2010 Census’ noise infusion methods: swapping, geographic aggregation, category collapsing, data synthesis, and top/bottom-coding. Each of those disclosure avoidance mechanisms inherently introduced uncertainty (error) and many (e.g., swapping, synthetic data generation) had the potential to also introduce bias.

The 2020 PPMF data are protected and impacted by the “TopDown Algorithm,” a noise infusion method based on the mathematical framework known as “differential privacy.” The algorithm applies more tailored noise to specific data points. The 2020 PPMF also includes small amounts of disclosure avoidance-induced bias as a result of the post-processing of the 2020 Census data through the TopDown Algorithm.
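How post-processing can turn unbiased noise into bias can be seen in a toy example. The sketch below is not the TopDown Algorithm: it uses rounded Gaussian noise as a stand-in and a simple "no negative counts" rule as the only post-processing step, and the function names and the noise scale are assumptions chosen for illustration.

```python
import random

random.seed(2020)  # fixed seed so the sketch is repeatable


def noisy_count(true_count, scale=2.0):
    """Add rounded Gaussian noise: a stand-in, not the Bureau's mechanism."""
    return true_count + round(random.gauss(0, scale))


def post_process(value):
    """Force the published value to be a valid, nonnegative count."""
    return max(0, value)


true_count = 1  # a very small cell, where the effect is most visible

raw = [noisy_count(true_count) for _ in range(10_000)]
published = [post_process(value) for value in raw]

raw_mean = sum(raw) / len(raw)
published_mean = sum(published) / len(published)

print("true count:", true_count)
print("mean before post-processing:", round(raw_mean, 2))
print("mean after post-processing:", round(published_mean, 2))
```

The raw noisy values scatter symmetrically around the true count, but replacing negative values with zero removes only the low side of that scatter, so the average published value for a small cell ends up above the true count. Larger cells are rarely pushed below zero and are affected far less.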

Learn more about the scope and potential impact of disclosure avoidance-induced error and bias in the blog “What to Expect: Disclosure Avoidance and the 2020 Census Demographic and Housing Characteristics File.”

You can also visit our website for the latest information on this topic, including technical documentation, fact sheets, and metrics.

Webinar on August 6

Join us on August 6 for a webinar to learn more about accessing and using the microdata files.

Log-In Details:

• Date: Tuesday, August 6
• Time: 3:00-4:00 p.m. ET
• WebEx link
• WebEx Event Number (if needed): 2821 339 0853
• WebEx Event Password (if needed): Census#1

You can find recordings, transcripts, and slides for all disclosure avoidance webinars on the series webinar webpage.

As always, please contact us at 2020DAS@census.gov with questions. 

Page Last Revised - August 5, 2024