An official website of the United States government
Here’s how you know
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
Secure .gov websites use HTTPS
A lock (
) or https:// means you’ve safely connected to the .gov website. Share sensitive information only on official, secure websites.
Record linkage is the means of combining information from a variety of computerized files. It is also referred to as data cleaning (McCallum and Wellner 2003) or object identification (Tejada et al. 2002). The basic methods compare name and address information across pairs of files to determine those pairs of records that are associated with the same entity. An entity might be am business, a person, or some other type of unit that is listed. Based on economic relationships, straightforward extensions of methods might create functions and associated metrics for comparing information such as receipts or taxable income. The most sophisticated methods use information from multiple lists (Winkler 1999b), create new functional relationships between variables in two files that can be associated with new metrics for identifying corresponding entities (Scheuren and Winkler 1997), or use graph theoretic ideas for representing linkage relationships as conditional random fields that be partitioned into clusters representing individual entities (McCallum and Wellner 2003, Wei 2004).
Share
Related Information
WORKING PAPER
Statistical Research Reports and StudiesSome content on this site is available in several different electronic formats. Some of the files may require a plug-in or additional software to view.
Top