
State of Statistical Data Editing and Current Research Problems

Written by:
Working Paper Number RR99-01

Introduction

This paper is my description of the state of statistical data editing and current research problems. It is not intended to be a complete description of all areas. Rather, it represents sub-areas of statistical data editing that I will describe in sufficient detail so that the discussion of a few research problems is more easily understood.

I define statistical data editing (SDE) as those methods that are used to edit (i.e., clean up) and impute (i.e., fill in) missing or contradictory data. The end result of SDE is data that can be used for intended analytic purposes. These include primary purposes such as estimation of totals and subtotals for publications that are free of self-contradictory information and that do not contradict published totals in other sources. Self-contradictory information might include groups of items that do not add to desired subtotals, or totals for subgroups that exceed a known proportion of the total for the entire group. Other uses of the data after SDE might include preparation of variances of estimates for a number of sub-domains and micro-data analyses. If only a few published totals need to be accurate, then an efficient use of resources may be to perform detailed edits on only a few records that affect the estimated totals. If many analyses need to be performed on a large number of sub-domains, or if the full set of accurate micro-data is needed, then a very large number of edits, follow-ups, and corrections may be needed.
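The two kinds of self-contradiction mentioned above can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the tolerance, and the maximum-share threshold are all hypothetical, chosen only to show what a balance check and a proportion check might look like in practice.

```python
def fails_balance_edit(components, reported_total, tol=0.5):
    """Return True when the items do not add to the reported total.

    `tol` is an assumed rounding tolerance, not a value from the paper.
    """
    return abs(sum(components) - reported_total) > tol


def fails_ratio_edit(subgroup_total, group_total, max_share):
    """Return True when a subgroup exceeds a known share of the whole group.

    `max_share` is an assumed bound supplied by the analyst (e.g. 0.5).
    """
    if group_total <= 0:
        # With no positive group total, any positive subgroup is contradictory.
        return subgroup_total > 0
    return subgroup_total / group_total > max_share


# Hypothetical record: three items reported with a total of 65.
print(fails_balance_edit([10, 20, 30], 65))   # items sum to 60, so the edit fails
print(fails_ratio_edit(60, 100, 0.5))         # subgroup is 60% of the group
```

A record that fails either check would be a candidate for follow-up or imputation; which records are actually reviewed depends on how much they affect the published totals.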

SDE can be used in all phases of survey processing. These phases include frame development, form design, proposed analytic purposes for which the data are collected, and quality assurance. This paper focuses primarily on SDE as it applies to analytic purposes, and places most emphasis on those procedures typically applied after the initial receipt of survey or other data. The main goal of SDE might be improved procedures and greater automation to enhance the ability of survey managers and analysts to provide accurate published estimates and micro-data.

Page Last Revised - October 28, 2021