Coverage Error Models for Census and Survey Data

Written by:
RR83-07

Introduction

Coverage error in surveys and censuses has become an important statistical issue, as evidenced by the fact that the U.S. Bureau of the Census has been sued in Federal court more than 50 times regarding the completeness of the 1980 census. The main purpose of this article is to discuss certain aspects of coverage error and to provide a careful exposition of some alternative statistical models for such error.

Coverage error has been studied for several decades. In the U.S., coverage error has been estimated for each of the past four decennial censuses of population and housing, starting with the 1950 census. In Canada, coverage error has been estimated for each of the past five quinquennial censuses, starting with the 1961 census. Other countries, such as Australia, Austria, Finland, and Korea, have also produced estimates of the coverage error associated with their population censuses. Despite the apparently vast amount of research on coverage error, previous authors have not, to our knowledge, presented explicit statistical models for such error, although models have been implicit in all of the previous work.

The models we discuss are equivalent to the capture-recapture models employed in estimating the size and density of wildlife populations, and to the dual-system models employed in estimating the number of human vital events. They are also related to the log-linear models employed in the analysis of discrete multivariate data. Capture-recapture models originated in the 17th century, and the modern development dates from Peterson (1896), Lincoln (1930), and Schnabel (1938). Excellent recent reviews are given by Seber (1973) and Otis et al. (1978). The application to human vital events was initiated by the pioneering work of Sekar and Deming (1949). Extensive recent discussion is presented by Marks, Seltzer, and Krotki (1974). Bishop, Fienberg, and Holland (1975) discuss the subject of log-linear models and their relation to the capture-recapture problem.
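As a rough illustration of the capture-recapture idea in a census setting, the sketch below computes the classical two-sample (dual-system) estimate of population size. It is not taken from the report: the function names, the variable names, and the counts in the usage lines are hypothetical, and the calculation assumes that the census and the follow-up survey capture people independently and that matching between the two lists is error-free.

```python
def dual_system_estimate(n_census, n_survey, n_matched):
    """Two-sample capture-recapture estimate of total population size.

    n_census  -- persons counted in the census (first "capture")
    n_survey  -- persons counted in the follow-up survey (second "capture")
    n_matched -- persons found in both lists
    Assumes independent captures and error-free matching (illustrative only).
    """
    if n_matched <= 0:
        raise ValueError("at least one matched person is required")
    if n_matched > min(n_census, n_survey):
        raise ValueError("matches cannot exceed either list size")
    return n_census * n_survey / n_matched


def census_coverage_rate(n_census, n_survey, n_matched):
    """Estimated share of the population that the census counted."""
    n_hat = dual_system_estimate(n_census, n_survey, n_matched)
    return n_census / n_hat


# Hypothetical counts for a single area.
n_hat = dual_system_estimate(n_census=900, n_survey=100, n_matched=90)
rate = census_coverage_rate(n_census=900, n_survey=100, n_matched=90)
print(n_hat, rate)
```

Under these assumptions the estimated coverage rate reduces to the share of survey persons who were also found in the census (n_matched / n_survey), which is why the quality of matching matters so much in practice.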

In Section 2 we present the basic coverage error model and discuss several important special cases that are useful in estimating the level of error. The model denoted M_th is the one employed implicitly in several of the previous coverage error studies. The basic model is extended in Section 3 to include the sampling error associated with a postenumeration survey. Statistical adjustments to census data designed to compensate for coverage error are discussed briefly in Section 4. We expose a clear connection between the basic coverage error model and the method of small domain estimation known as synthetic estimation. The paper closes with a general summary in Section 5, where we discuss future research possibilities as well as possibilities for relaxing some of the assumptions imposed in earlier sections.

Page Last Revised - October 28, 2021