Data Ingest and Linkage

When files are acquired and transmitted to Census, they are initially accessible only to a small staff responsible for inventorying the contents of each file, conducting basic quality control checks, and removing sensitive Personally Identifying Information (PII). This staff works in a secured physical environment and on a highly restricted computing cluster behind the Census firewall.


The processing and de-identification staff confirms that the received files are exactly as described in the legal agreement. Census is never permitted to receive more than has been specified in the applicable agreement. The staff also confirms that the variables and documentation are sound enough for Census to use them.
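The check against the agreement can be pictured as a comparison between the variables that arrived and the variables that were agreed to. The following is a minimal sketch only, not a description of Census tooling; the function name `check_against_agreement` and the variable names in the example are hypothetical.

```python
def check_against_agreement(received_columns, agreed_columns):
    """Compare the variables in a received file with those in the agreement.

    Both arguments are iterables of variable names. The names and the
    report layout are illustrative assumptions, not an official format.
    """
    received = set(received_columns)
    agreed = set(agreed_columns)
    return {
        # Variables that arrived but are not covered by the agreement.
        "unexpected": sorted(received - agreed),
        # Variables the agreement lists but the file does not contain.
        "missing": sorted(agreed - received),
        # True only when the file matches the agreement exactly.
        "ok": received == agreed,
    }


report = check_against_agreement(
    received_columns=["name", "dob", "ssn", "benefit_amount", "phone"],
    agreed_columns=["name", "dob", "ssn", "benefit_amount"],
)
print(report)
```

In this example the file would be flagged because it carries a variable, `phone`, that the hypothetical agreement does not list.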


Next, a data linkage team replaces sensitive PII with a unique key that can be used to link the records to other databases held at Census. The probabilistic linkage process relies on variables such as name, address, date of birth, and Social Security Number. This PII is used to link the incoming file to a “reference file” composed of censuses, surveys, and other federal records. The reference file contains PII from these other files and a Protected Identification Key (PIK), which uniquely identifies each record. When a linkage can be made between the incoming file and the reference file, the PIK is appended to the incoming file.
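The linkage step described above can be illustrated with a short sketch. This is a deliberately simplified stand-in for probabilistic linkage, not the method Census uses: it scores exact agreement on each field after light normalization, and the field weights, the threshold, and all function and field names are assumptions made for illustration.

```python
# Hypothetical integer weights per PII field and a hypothetical threshold.
WEIGHTS = {"ssn": 5, "dob": 2, "name": 2, "address": 1}
THRESHOLD = 7


def normalize(value):
    """Lowercase and collapse whitespace so trivial differences do not block a match."""
    return " ".join(str(value).lower().split()) if value else ""


def match_score(incoming, reference):
    """Sum the weights of the fields on which the two records agree."""
    score = 0
    for field, weight in WEIGHTS.items():
        a = normalize(incoming.get(field))
        b = normalize(reference.get(field))
        if a and a == b:
            score += weight
    return score


def assign_pik(incoming, reference_file):
    """Return the incoming record with its PII removed and a PIK appended.

    The PIK is None when no reference record scores at or above THRESHOLD.
    """
    best_pik, best_score = None, 0
    for ref in reference_file:
        score = match_score(incoming, ref)
        if score > best_score:
            best_pik, best_score = ref["pik"], score
    # Keep only the non-PII variables, then append the key.
    linked = {k: v for k, v in incoming.items() if k not in WEIGHTS}
    linked["pik"] = best_pik if best_score >= THRESHOLD else None
    return linked


# Entirely fictitious records for demonstration.
reference_file = [
    {"pik": "P001", "name": "Ada Lovelace", "dob": "1815-12-10",
     "ssn": "000-00-0001", "address": "1 Main St"},
    {"pik": "P002", "name": "Alan Turing", "dob": "1912-06-23",
     "ssn": "000-00-0002", "address": "5 Park Ave"},
]
incoming = {"name": "ada  lovelace", "dob": "1815-12-10",
            "ssn": "000-00-0001", "address": "2 Other Rd",
            "benefit_amount": 120}

print(assign_pik(incoming, reference_file))
```

A production linkage would typically allow partial agreement (for example, misspelled names) and estimate its weights from data; the fixed weights here only show how agreement across several fields can be combined into a single link decision.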

Page Last Revised - December 16, 2021