This week, the U.S. Census Bureau will release the final data product from the 2020 Census. On Sept. 19, we’ll release the Supplemental Demographic and Housing Characteristics File (S-DHC).
While the pandemic delayed our operations, we moved deliberately to ensure we produced the high-quality statistics the public expects and to implement new confidentiality protections.
In this blog post, our goal is to equip you for the release of the S-DHC.
The S-DHC does what its name implies – it supplements the data we released in May 2023 through the Demographic and Housing Characteristics File (DHC). The DHC provided information about people (age, sex, race, ethnicity, relationship to the householder) and households (household type such as family/nonfamily and owner/renter) – in mostly separate tables.
On the other hand, the S-DHC combines data about people and households in the same tables. Specifically, it provides statistics for the average size of families and households, as well as counts of people living in certain types of households.
For example, from the DHC we learned:
In the S-DHC, we’ll learn:
The differences may seem subtle, but combining these details about the structure of households and the people living in them complicates protecting the confidentiality of the data, which we’ll talk more about below.
As a result of the need for stronger disclosure avoidance techniques, we are only releasing the S-DHC data at the national and state levels – a decision that enables us to both protect respondent confidentiality and provide quality statistics.
The S-DHC will provide eight tables, and six of them will be repeated for race and Hispanic origin groups.
The tables available are:
* The tables marked with an asterisk are available by the following race and Hispanic origin groups:
Note that the S-DHC data are not available for detailed race and ethnicity groups, such as Chinese or Mexican, unlike the recent Detailed DHC-B and Detailed DHC-A products.
As we mentioned above, protecting the combined person and household data is complicated and requires robust disclosure avoidance methods. Combining the data increases the risk of disclosing information about individuals because information for each person in the household (especially the householder) is linked to the information for everyone else in the household. This interrelationship makes it much harder to obscure the effect that one person’s record has on the others, which in turn makes it harder to guarantee that they are protected.
As with other 2020 Census data products, we protected the data by adding “statistical noise” (small, random additions or subtractions to the data). With this data product, we’ve also taken a couple of additional steps:
Figure 1. Example of Credible Intervals
Providing the intervals was an innovative step for us, as it marks the first time that we have published decennial census statistics with associated estimates of disclosure avoidance-related error. While they don’t reflect all sources of error, such as coverage error and truncation error, we hope they will help you gauge the impact of confidentiality protections on the quality of the S-DHC data.
Finally, we’ll note that because noise is added to each statistic independently, there will be some inconsistencies in the S-DHC, just like there were with other 2020 Census data products. For example:
With these situations in mind, we encourage you to use caution when aggregating published counts to produce statistics for custom groups or geographies. Each published count carries its own noise, so a total built by adding counts together accumulates the noise from every count in the sum.
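To see why adding counts together accumulates noise, here is a minimal Python sketch. It is illustrative only: it assumes each published count carries independent, zero-mean Gaussian noise with the same standard deviation `sigma`, which is a simplification and not a description of the actual 2020 Census disclosure avoidance system. The function names, the counts, and the value of `sigma` are all hypothetical.

```python
import random
import statistics


def noisy_count(true_count, sigma, rng):
    """Return true_count plus zero-mean Gaussian noise.

    Gaussian noise is an assumption made for illustration only.
    """
    return true_count + rng.gauss(0.0, sigma)


def simulate_sum_error_sd(true_counts, sigma, trials=20000, seed=0):
    """Estimate the standard deviation of the error in a sum of noisy counts.

    Each trial adds fresh, independent noise to every count, sums the noisy
    counts, and records how far that sum lands from the true total.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    true_total = sum(true_counts)
    errors = []
    for _ in range(trials):
        noisy_total = sum(noisy_count(c, sigma, rng) for c in true_counts)
        errors.append(noisy_total - true_total)
    return statistics.pstdev(errors)


# Hypothetical values: one published count versus a custom total of 16 counts.
single_sd = simulate_sum_error_sd([1000], sigma=5.0)
summed_sd = simulate_sum_error_sd([1000] * 16, sigma=5.0)

print(f"error SD of one noisy count:     {single_sd:.2f}")
print(f"error SD of a sum of 16 counts:  {summed_sd:.2f}")
```

Under these assumptions, the error in a sum of 16 counts is about four times (the square root of 16) as large as the error in a single count. Where a published total exists for the group or geography you need, using it directly avoids this buildup.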
We hope you will find the S-DHC informative about the people living in certain types of households in your state and in the country. While we have released similar, more recent data from the American Community Survey, the S-DHC draws on far more responses because it comes from the 2020 Census. We’ve done our best to protect those responses and provide you with timely, relevant data.
While we’re excited the S-DHC wraps up the 2020 Census data products, we’re already looking ahead to the 2030 Census data products. In the coming months, we plan to share more information about our 2030 Census research on disclosure avoidance, data product planning, and public engagement opportunities.
Thank you for your input along the way as we developed the 2020 Census data products!