At the U.S. Census Bureau, we often say our goal is to count everyone once, only once, and in the right place. Sometimes in an effort to count everyone in a census, we end up counting some people more than once. The Census Bureau refers to a person counted more than once as a “duplicate.”
Today, we’ll talk through situations where that can happen and how we resolve duplicates in the 2020 Census.
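There are three main reasons for duplicates in a census: we receive more than one response for an address, people are counted in more than one place because of a complex living situation, or there is an issue with the address itself.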
We use a special algorithm to resolve the first situation and a series of steps to resolve the second and third.
We might receive more than one response for an address if, for example, a roommate or spouse responds to the census without realizing another member of the household has already responded.
For the 2020 Census, we allowed households to respond online or by phone with or without their Census ID — a unique 12-digit number that links the household’s response to our address list. (The paper invitations and questionnaires had the Census ID pre-printed.)
Allowing responses without an ID made it even easier for households to respond, but it also made it easier for more than one person to respond for the household.
We’ve developed sophisticated procedures that take these situations into account and build upon our decades of census and survey-taking experience. Each census, we use what we call the “Primary Selection Algorithm” (the details of which are protected for quality assurance reasons) to determine whom to count when we receive more than one response for a single address.
Sometimes people are initially counted in more than one place because of the complexity of their living situation.
A few examples include:
These situations are trickier to untangle, and we must rely on people providing us information about their living situations to do so. From there, we determine where to count them using what we call the “residence criteria.” These criteria are based on a longstanding principle set by Congress: count people at their usual residence, the place where they live and sleep most of the time.
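As a rough illustration of the “most of the time” idea only, the sketch below picks the address where a person reported spending the most nights. The function name, the nights-per-address input, and the handling of ties are all assumptions made for this example; the actual residence criteria cover many special situations that a simple count like this does not capture.

```python
from typing import Dict, Optional


def usual_residence(nights_by_address: Dict[str, int]) -> Optional[str]:
    """Return the address with strictly the most nights.

    Hypothetical helper: returns None when there is no data or when the
    top two addresses are tied, since a tie cannot be settled by a
    night count alone.
    """
    if not nights_by_address:
        return None
    # Sort addresses from most nights to fewest.
    ranked = sorted(nights_by_address.items(), key=lambda kv: kv[1], reverse=True)
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return None  # tie: needs more information than this sketch has
    return ranked[0][0]


# Example: a person who splits time between two homes.
choice = usual_residence({"12 Oak St": 250, "9 Lake Rd": 115})
```

In this sketch a tie is deliberately left unresolved rather than broken arbitrarily, mirroring the point above that these cases depend on people giving more information about their living situation.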
To help sort out people’s living situations, the 2020 Census included a question that asked, “Does this person usually live or stay somewhere else?” If “yes,” people could select among multiple options to indicate the reason.
We then conducted a special operation, called Coverage Improvement, to phone a subset of the households that responded “yes.”
From those phone interviews, we tried to determine:
Sometimes when we initially counted people living in group quarters (places such as college dorms, prisons and nursing homes), the facility would provide an address for where the person stayed when they were not at the group quarters. (More information about how we count group quarters is available in the recent 2020 Census Group Quarters blog.)
During data processing, we used the alternate address in conjunction with the residence criteria to resolve instances when an individual was initially counted in both places.
The process described above was not enough to resolve all duplicated people for this subset of the population. For example, some households didn’t cooperate with the follow-up phone interviews, and some group quarters didn’t provide alternate addresses for their residents.
To further resolve duplication:
Sometimes, duplicates occur because of an issue with the address, such as:
We relied on statistical matching that considered geographic distance to identify and resolve these situations.
Our research suggests that when we find duplicates within a limited area, such as the same block, they are more likely to reflect a duplicated address than a complex living situation. Where the same people had been linked across two addresses, we applied enhanced address-matching methods to those addresses to identify and remove duplicated addresses in these select areas.
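The reasoning above can be sketched in a few lines of code. This is a simplified illustration, not the Census Bureau's method: it swaps the statistical matching described here for an exact match on normalized name and birth date, and it uses "same block" as the only measure of geographic distance. The record fields, function names, and labels are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import combinations
from typing import List, Tuple


@dataclass(frozen=True)
class PersonRecord:
    # Hypothetical fields chosen for this example.
    record_id: str
    name: str
    birth_date: str  # e.g. "1980-04-01"
    address_id: str
    block_id: str


def match_key(rec: PersonRecord) -> Tuple[str, str]:
    """Exact-match key: lowercased, whitespace-normalized name plus birth date."""
    return (" ".join(rec.name.lower().split()), rec.birth_date)


def classify_duplicates(records: List[PersonRecord]) -> List[Tuple[str, str, str]]:
    """Link records that share a match key and label each linked pair.

    Pairs at different addresses in the same block are labeled "address"
    (likely a duplicated address); pairs in different blocks are labeled
    "living_situation". Pairs at the same address are skipped, since that
    is the multiple-responses case handled separately.
    """
    groups = defaultdict(list)
    for rec in records:
        groups[match_key(rec)].append(rec)

    pairs = []
    for group in groups.values():
        for a, b in combinations(group, 2):
            if a.address_id == b.address_id:
                continue
            kind = "address" if a.block_id == b.block_id else "living_situation"
            pairs.append((a.record_id, b.record_id, kind))
    return pairs


records = [
    PersonRecord("r1", "Ana  Diaz", "1980-04-01", "addr-1", "block-7"),
    PersonRecord("r2", "ana diaz", "1980-04-01", "addr-2", "block-7"),
    PersonRecord("r3", "Sam Lee", "2001-09-15", "addr-3", "block-7"),
    PersonRecord("r4", "Sam Lee", "2001-09-15", "addr-9", "block-42"),
]
result = classify_duplicates(records)
```

Here the two "Ana Diaz" records sit at different addresses in the same block, so the pair is flagged as a probable address issue, while the "Sam Lee" records are in different blocks and are flagged as a living-situation case. A real matching process would score partial agreement across many fields rather than require exact equality.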
As we tally census results, we compare these results to other benchmark data, as discussed in the recent 2020 Census Data Review blog. From our review, we could tell that more duplicates remained in the 2020 Census, even after taking all the steps we describe above.
Consequently, after data collection was complete, we took the following steps to identify and remove additional duplicated individuals from the census:
In summary, we used long-established procedures to unduplicate multiple 2020 Census responses for the same address. We also worked to resolve duplication when individuals were enumerated at two different addresses by following up with households, using statistical matching techniques, and examining potentially duplicated addresses.
These extensive steps enable us to get closer to our goal of counting everyone once, only once, and in the right place.