Consider two files of records. Within each file, each record corresponds to a different population unit, but the two files cover the same general population. We want to identify "matches," i.e., pairs of records (one from each file) that correspond to the same population unit.

Each record contains data in K fields, which correspond to characteristics such as age, race, etc. For each pair of records, we may observe the pattern of agreement and disagreement among the fields. Using this information, we want to identify matches as accurately as possible. The problem of how best to use the field information has been addressed for K=3, under the assumption that the events "agreement in field i," i=1, ..., K, are mutually stochastically independent, both for true matches and for true nonmatches. We address the problem for K>3 and avoid relying on the independence assumption by fitting interaction terms that reflect positive stochastic dependence.
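
As a rough illustration of the independence baseline described above (not the interaction model this paper develops), the sketch below scores each agreement pattern for K=3 by a log likelihood ratio. The per-field probabilities `m` and `u`, and the helper names `pattern_prob` and `match_weight`, are hypothetical and chosen only for illustration; they do not come from the paper.

```python
from itertools import product
from math import log2

# Hypothetical per-field agreement probabilities (illustrative values only).
m = [0.9, 0.8, 0.95]  # assumed P(agreement in field i | true match)
u = [0.1, 0.2, 0.05]  # assumed P(agreement in field i | true nonmatch)


def pattern_prob(pattern, probs):
    """Probability of an agreement pattern if fields agree independently."""
    p = 1.0
    for agree, q in zip(pattern, probs):
        p *= q if agree else 1.0 - q
    return p


def match_weight(pattern):
    """Log2 likelihood ratio of the pattern: matches versus nonmatches."""
    return log2(pattern_prob(pattern, m) / pattern_prob(pattern, u))


# All 2^K agreement patterns (1 = agree, 0 = disagree), highest weight first.
patterns = list(product([1, 0], repeat=len(m)))
ranked = sorted(patterns, key=match_weight, reverse=True)

for pattern in ranked:
    print(pattern, round(match_weight(pattern), 2))
```

Under these assumed values, pairs that agree in every field receive the largest weight and pairs that disagree in every field the smallest. When agreements are positively dependent, a product of per-field probabilities of this kind can misstate how strongly a pattern points to a match, which is the situation the interaction terms are meant to address.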

Related Information

WORKING PAPER

Statistical Research Reports and Studies

Page Last Revised - October 28, 2021
