For many economic programs, there is a need to distinguish between the survey (sampling) unit, the reporting unit, and the tabulation unit:
A survey unit is an entity selected from the underlying statistical population of similarly constructed units. Examples of survey units for different economic programs include establishments, Employer Identification Numbers (EINs), firms, state and local government entities, and building permit-issuing offices. Some programs use different survey units for different segments of the total population. Examples include the Annual Retail Trade Survey (ARTS) and the Survey of Construction (SOC). The ARTS samples EINs and firms (which can consist of one or more establishments), and the SOC samples residential housing permits and newly constructed housing units in areas where no permit is required. For cross-sectional or longitudinal surveys, the survey unit may change in composition over time (perhaps due to mergers, acquisitions, or divestitures).
A reporting unit is an entity from which data are collected. Reporting units are the vehicle for obtaining data and may or may not correspond to a survey unit for several reasons. First, the composition of the originally sampled entity can change over the sample’s life cycle, as noted above. Second, for some surveys, an entity may request (or the Census Bureau may ask the entity) to report data in several separate pieces corresponding to different parts of the business or other entity type. For example, a large, diverse company in a company-based collection may request a separate form for each region or line of business in which it operates, or may ask to report separately for each of its establishments to align with its record-keeping practices. Similarly, many government programs have a central collection agency that provides the data for several governments, but issue additional mail-outs to obtain supplemental items that are not obtained by the central collection agency.
A tabulation unit houses the data used for estimation (or tabulation, in the case of a census). As with reporting units, the tabulation units may not correspond to a survey unit. Some programs consolidate establishment or plant-level data to a company level or parent government level to create tabulation units, so that the tabulation unit is often equivalent to the survey unit. Other programs create artificial units that split a reporting unit’s data among the different categories in which the reporting unit operates; for example, creating separate tabulation units by industry. In this case, the tabulation unit represents a portion of a survey unit.
For each program, the "statistical period" describes the reference period for the data collection. For example, an annual program might collect data on the prior year’s business; the statistical period refers to the prior year, but the data are obtained in the current calendar year. During a given statistical period, all three types of units can be active, inactive, or ineligible. An active unit is in business and is in-scope for the program during the statistical period. An inactive unit is not operating or is not in-scope during the statistical period but is believed to have been active in the past and can potentially become active and in-scope in the future; examples include seasonal businesses for monthly or quarterly programs (temporarily idle) or businesses that operate in more than one industry, with the primary activity for a given statistical period being conducted in an "out-of-scope" industry. Finally, a survey unit may become ineligible and permanently excluded from subsequent computations due to a merger or acquisition, a permanent classification category change, or a death. All units are considered active until verified evidence to the contrary is provided.
Economic programs compute two different types of response rates: a unit response rate and weighted item response rates. The Unit Response Rate (URR) is defined as the ratio of the total unweighted number of "responding units" to the total number of units eligible for collection. URRs are indicators of the performance of data collection for obtaining usable responses. Consequently, the majority of business programs base URRs on their reporting units, whereas the majority of ongoing government programs base URRs on the survey units¹ that correspond to the tabulation units. Other exceptions are addressed on a case-by-case basis. The formulae for the URR provided in Section 2.1 and the detailed unit nonresponse rate breakdowns presented in Section 2.2.1 use the term "reporting unit" for simplicity. A program can produce at most one URR per statistical period and per release phase².
Quantity and Total Quantity Response Rates (QRR and TQRR) are item-level indicators of the "quality" of each estimate. In contrast to the URR, these weighted response rates are computed for individual data items, so that a program may produce several QRRs and TQRRs per statistical period and release. Both are weighted measures that take the size of the tabulation unit into account as well as the associated sampling parameters. These rates measure the proportion of each estimate obtained directly or indirectly from the survey unit and are consequently based on the tabulation units. The QRR measures the weighted proportion of an estimate obtained directly from the respondent for the survey/census; the TQRR expands the rate to include data from equivalent quality sources.
To compute the weighted item response rates, it is necessary to determine the source of the final tabulated value of the associated data item for each tabulation unit i. This value could be directly obtained from respondent data, indirectly obtained from other equivalent quality data sources, or imputed. The classification process is straightforward for items that are directly obtained from the survey questionnaire (i.e., form items), less so for items that are obtained as functions of collected items (i.e., derived items). The formulae provided in Sections 2.1 and 2.2.2 can be applied to either form or derived items, but require that the item value classification process be performed immediately prior and that the classification process or rules be documented.
1 The central collection unit may provide the responses for the majority of the program data (e.g., providing responses from all associated sample units for most of the program items). Supplemental mailings are used to obtain the rest of the items.
2 Leading indicator surveys often have more than one official release of the same estimate. For example, a program might release a preliminary estimate for the current statistical period along with a revised estimate from the prior period. Response rates should be computed at each release phase, and it is expected that the response rates (unit or item) will generally increase for the same estimate with each release.
1.1 Eligibility Status
The total number of active reporting units in a statistical period is defined as NRU. These reporting units can be classified by their eligibility status: eligible for data collection (E), ineligible (IA), unknown eligibility (U), or data obtained from qualified administrative sources (A). Reporting units that have been determined to be out-of-scope for data collection during the statistical period are excluded from all computations, as are inactive cases. Note that the U cases are assumed to be active and in-scope in the absence of evidence otherwise. Reporting units may be considered eligible in one survey or census but ineligible for another, depending upon the target population. For example, a reporting unit that was in business after October 2004 is eligible for the 2004 Annual Retail Trade Survey, but is ineligible for the October 2004 Monthly Retail Trade Survey.
Term | E (Total Eligible) |
Definition | The count of reporting units that were eligible for data collection in the statistical period. |
Variable | ei – An indicator variable for whether a reporting unit is eligible for data collection in the statistical period. These include chronic refusal units (eligible reporting units that have notified the Census Bureau that they will not participate in a given program). If a reporting unit is eligible, ei = 1, else ei = 0. |
Computation | The sum of the indicator variable for eligibility (ei) over all the reporting units in the statistical period. |
Term | IA (Total Ineligible/Inactive) |
Definition | The count of reporting units that were ineligible for data collection in the current statistical period. |
Variable | iai – An indicator variable for whether a reporting unit in the statistical period has been confirmed as not a member of the target population at the time of data collection. An attempt was made to collect data, and it was confirmed that the reporting unit was not a member of the target population at that time. These reporting units are not included in the URR calculations for the periods in which they are ineligible. Information confirming ineligibility may come from observation, from a respondent, or from another source. Some examples of ineligible reporting units include firms that went out of business prior to the survey reference period, firms in an industry that is out-of-scope for the survey in question, and governments that reported data from outside of the reference period. If a reporting unit is ineligible, iai = 1, else iai = 0. |
Computation | The sum of the indicator variable for ineligibility (iai) over all the reporting units in the statistical period. |
Term | U (Total Unknown Eligibility) |
Definition | The count of reporting units in the statistical period for which eligibility could not be determined. |
Variable | ui – An indicator variable for whether the eligibility of a reporting unit in the statistical period could not be determined. If a reporting unit is of unknown eligibility, ui = 1, else ui = 0. For example, units whose returns are marked as "undeliverable as addressed" have unknown eligibility (ui = 1), as do unreturned mailed forms. |
Computation | The sum of the indicator variable for unknown eligibility (ui) over all the reporting units in the statistical period. |
Term | A (Administrative data used as source) |
Definition | The count of reporting units in the statistical period that belong to the target population and were pre-selected to use administrative data rather than collect survey data. |
Variable | ai – An indicator variable for whether administrative data of equivalent quality to reported data, rather than survey data, were obtained for an eligible reporting unit in the statistical period. The decision not to collect survey data must have been made for survey efficiency or to reduce respondent burden and not because that reporting unit had been a refusal in the past. These reporting units are excluded from the URR calculations because they were not sent questionnaires, and thus could not respond, although their data are included in the calculation of the TQRRs. If a reporting unit is pre-selected to receive administrative data, ai = 1, else ai = 0. |
Computation | The sum of the indicator variable for units pre-selected to use administrative data (ai) over all the reporting units in the statistical period. |
The relationship among the counts of reporting units in the statistical period in the four eligibility categories is given by NRU = E + IA + U + A. For the ith reporting unit, ei + iai + ui + ai = 1. Note that the value of NRU may change by statistical period.
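The identity above can be checked mechanically once every active reporting unit carries exactly one eligibility code. The following is a minimal sketch, not part of the standard: the status codes and the `eligibility_counts` helper are hypothetical names chosen for illustration.

```python
from collections import Counter

# Hypothetical eligibility codes, one per active reporting unit:
# "E" eligible, "IA" ineligible, "U" unknown eligibility, "A" administrative data.
statuses = ["E", "E", "U", "IA", "A", "E", "U", "E"]


def eligibility_counts(statuses):
    """Return the counts E, IA, U, and A and verify they sum to N_RU."""
    allowed = ("E", "IA", "U", "A")
    unexpected = set(statuses) - set(allowed)
    if unexpected:
        raise ValueError(f"unexpected status codes: {sorted(unexpected)}")
    tally = Counter(statuses)
    counts = {code: tally.get(code, 0) for code in allowed}
    # Each unit has exactly one code, so e_i + ia_i + u_i + a_i = 1 per unit
    # and the four counts must add up to the number of reporting units.
    if sum(counts.values()) != len(statuses):
        raise AssertionError("counts do not sum to N_RU")
    return counts


counts = eligibility_counts(statuses)
print(counts)
```

Storing a single code per unit, rather than four separate indicator columns, makes the per-unit constraint hold by construction.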
1.2 Response Status
Response status is determined only for the eligible active reporting units in the statistical period.
Term | R (Response) |
Definition | The count of reporting units in the statistical period that were eligible for data collection in the statistical period and classified as a response. |
Variable | rui – An indicator variable for whether an eligible reporting unit in the statistical period responded to the survey. To be classified as a response, the respondent for the reporting unit must have provided sufficient data, and the data must satisfy all the critical edits. The definition of sufficient data will vary across surveys. Programs must designate required data items before the data collection begins. If a reporting unit responded, rui = 1, else rui = 0 (note rui = 0 for reporting units which were eligible but did not respond and for reporting units classified as IA, U, or A). |
Computation | The sum of the indicator variable for eligible reporting units that responded (rui) over all the reporting units in the statistical period. |
1.3 Reasons for Nonresponse
To improve interpretation of the response rate and better manage resources, it is recommended that, whenever possible, reasons for (or types of) nonresponse be measured on a flow basis. These terms are used to describe "unit nonresponse" and will be presented in unweighted tabulations. Five specific terms describing nonresponse reasons are defined below. The first three terms (REF, CREF, and INSF) define nonresponse reasons for eligible reporting units. The final two terms (UAA and OTH) define the reasons for reporting units with unknown eligibility.
Term | REF (Refusal) |
Definition | The count of eligible reporting units in the statistical period that were classified as "refusal." |
Variable | refi – An indicator variable for whether an eligible reporting unit in the statistical period refused to respond to the survey. If a reporting unit refuses to respond, refi = 1, else refi = 0. |
Computation | Sum of the indicator variable for "refusal" (refi) over all the reporting units in the statistical period. |
Term | CREF (Chronic refusal) |
Definition | The count of eligible reporting units in the statistical period that were classified as "chronic refusals." |
Variable | crefi – An indicator variable for whether an eligible reporting unit in the statistical period was a "chronic refusal." A chronic refusal is a reporting unit that informed the Census Bureau that it would not participate in a given program. The Census Bureau does not send questionnaires to chronic refusals, but they are considered to be eligible reporting units. Chronic refusals comprise a subset of refusals. If a reporting unit is a chronic refusal, crefi = 1, else crefi = 0. |
Computation | The sum of the indicator variable for "chronic refusal" (crefi) over all the reporting units in the statistical period. |
Term | INSF (Insufficient data) |
Definition | The count of eligible reporting units in the statistical period that were classified as providing insufficient data. |
Variable | insfi - An indicator variable for whether an eligible reporting unit in the statistical period returned a questionnaire, but did not provide sufficient data to qualify as a response. If a reporting unit returned a questionnaire but failed to provide sufficient data to qualify as a response, insfi = 1, else insfi = 0. |
Computation | The sum of the indicator variable for "insufficient data" (insfi) over all the reporting units in the statistical period. |
Term | UAA (Undeliverable as addressed) |
Definition | The count of reporting units in the statistical period that were classified as "undeliverable as addressed." |
Variable | uaai – An indicator variable for whether a reporting unit in the statistical period had a questionnaire returned as "undeliverable as addressed." These reporting units are of unknown eligibility. If a questionnaire is returned as "undeliverable as addressed," uaai = 1, else uaai = 0. |
Computation | The sum of the indicator variable for "undeliverable as addressed" (uaai) over all the reporting units in the statistical period. |
Term | OTH (Other nonresponse) |
Definition | The count of reporting units in the statistical period that were classified as "other nonresponse." |
Variable | othi – An indicator variable for whether a reporting unit in the statistical period was a nonresponse for a reason other than refusal, insufficient data, or undeliverable as addressed. These reporting units are of unknown eligibility. If a reporting unit does not respond for reasons other than refusal, insufficient data, or undeliverable as addressed, othi = 1, else othi = 0. |
Computation | The sum of the indicator variable for "other nonresponse" (othi) over all the reporting units in the statistical period. |
1.4 Quantity Response Rate Terms
The total number of active tabulation units in the statistical period is defined as NTU. Recall that the number of tabulation units NTU may differ from the number of reporting units NRU, depending on the economic program. After a program creates tabulation units and performs any necessary data allocation procedures (from reporting unit(s) to tabulation unit(s)), the individual data items are classified according to the final source of data obtained for the units: data reported by the respondent, equivalent-quality-to-reported data obtained from program-approved outside sources (such as company annual reports, Securities and Exchange Commission (SEC) filings, trade association statistics), or imputed data. Tabulation units that have been determined to be out-of-scope for data collection during the statistical period are excluded from all computations, as are inactive cases.
Variable | vti (Tabulated value of data item t for tabulation unit i in the statistical period) |
Definition | The quantity stored in the variable for item t for the ith tabulation unit in the statistical period. This quantity may be reported, equivalent-quality-to-reported, or imputed. |
Term | Rt (Reported data tabulation units for item t) |
Definition | The count of eligible tabulation units that provided reported data during the studied statistical period for item t that satisfied all critical edits. This count will vary by item and by statistical period. |
Variable | rti – An indicator variable for whether tabulation unit i in the statistical period provided reported data for item t that satisfied all edits. If the tabulated item t value for unit i (ti) contains reported data, then rti = 1, else rti = 0. |
Computation | The sum of the indicator variable for reported data (rti) over all the tabulation units (NTU) in the statistical period. |
Term | Qt (Equivalent-quality-data tabulation units for item t) |
Definition | The count of eligible tabulation units that use equivalent-quality-to-reported data for item t. Note that these data are indirectly obtained for the tabulation unit. This count will vary by item and by statistical period. |
Variable | qti – An indicator variable for whether tabulation unit i in the statistical period contains equivalent-quality-to-reported data for item t. Such data can come from three sources: data directly substituted from another census or survey s (for the same reporting unit, data item concept, and time period), administrative data d, or data obtained from some other equivalent source c validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual reports, Securities and Exchange Commission (SEC) filings, trade association statistics). If the tabulated item t value for unit i (ti) contains equivalent-quality-to-reported data, then qti = 1, else qti = 0. |
Computation | The sum of the indicator variable for equivalent-quality-to-reported data (qti) over all tabulation units (NTU) in the statistical period. |
Term | St (Substituted data tabulation units for item t) |
Definition | The count of eligible tabulation units containing directly substituted data for item t. This count will vary by item and by statistical period. |
Variable | sti – An indicator variable for whether a tabulation unit in the statistical period contains directly substituted data from another census or survey for item t. The same reporting unit must provide the item value (in the other program), and the item concept and time period for the substituted values must agree between the two programs. If the tabulated item t value for unit i (ti) contains directly substituted data from another survey, sti = 1, else sti = 0. |
Computation | The sum of the indicator variable for directly substituted data (sti) over all tabulation units (NTU) in the statistical period. |
Term | Dt (Administrative data tabulation units for item t) |
Definition | The count of eligible tabulation units containing administrative data for item t. This count will vary by item and by statistical period. |
Variable | dti – An indicator variable for whether a tabulation unit in the statistical period contains administrative data for item t. If the tabulated item t value for unit i (ti) contains administrative data, dti = 1, else dti = 0. |
Computation | The sum of the indicator variable for administrative data (dti) over all tabulation units (NTU) in the statistical period. |
Term | Ct (Equivalent source data tabulation units for item t) |
Definition | The count of eligible tabulation units containing equivalent-source data that is neither administrative data nor data substituted directly from another economic program for item t. This count will vary by item and by statistical period. |
Variable | cti – An indicator variable for whether a tabulation unit in the statistical period contains equivalent-source data validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (e.g., company annual reports, SEC filings, trade association statistics) for item t. If the tabulated item t value for unit i (ti) contains equivalent-source data, then cti = 1, else cti = 0. |
Computation | The sum of the indicator variable for equivalent-source data (cti) over all tabulation units (NTU) in the statistical period. |
Term | Mt (Imputed data tabulation units for item t) |
Definition | The count of eligible tabulation units containing imputed data for item t. This count will vary by item and by statistical period. |
Variable | mti – An indicator variable for whether a tabulation unit in the statistical period contains imputed data for item t. If the tabulated item t value for unit i (ti) contains imputed data, mti = 1, else mti = 0. |
Computation | The sum of the indicator variable for imputed data (mti) over all tabulation units (NTU) in the statistical period. |
The relationship among Qt, St, Dt, and Ct for item t in a statistical period is given by Qt = St + Dt + Ct. The relationship among the counts of tabulation units for item t in the statistical period is given by NTU = Rt + Qt + Mt.
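As an illustration of the two identities above, the sketch below tallies the final data source of one item across tabulation units and checks that the counts are consistent. The single-letter source codes and the `source_counts` helper are assumptions made for this example only.

```python
# Hypothetical final-source codes for item t, one per tabulation unit:
# "R" reported, "S" substituted from another survey or census,
# "D" administrative, "C" other equivalent source, "M" imputed.
sources = ["R", "R", "S", "D", "M", "C", "R", "M"]


def source_counts(sources):
    """Return R_t, S_t, D_t, C_t, M_t plus the derived Q_t for one item."""
    codes = ("R", "S", "D", "C", "M")
    unexpected = set(sources) - set(codes)
    if unexpected:
        raise ValueError(f"unexpected source codes: {sorted(unexpected)}")
    n = {code: sources.count(code) for code in codes}
    # Q_t = S_t + D_t + C_t: all equivalent-quality-to-reported sources.
    n["Q"] = n["S"] + n["D"] + n["C"]
    # N_TU = R_t + Q_t + M_t: every unit's value has exactly one source.
    if n["R"] + n["Q"] + n["M"] != len(sources):
        raise AssertionError("source counts do not sum to N_TU")
    return n


n = source_counts(sources)
print(n)
```

Because the classification is made per item, the same tabulation unit may be "R" for one item and "M" for another, so the tally has to be repeated for each item t.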
Variable | fi (Nonresponse weight adjustment factor) |
Definition | A tabulation unit nonresponse weight adjustment factor for the ith tabulation unit in the statistical period. The variable fi is set equal to 1 for surveys that use imputation to account for unit nonresponse. |
Variable | wi (Sample weight) |
Definition | The design weight for the ith tabulation unit in the statistical period. The design weight includes subsampling factors and outlier adjustments, but excludes post-sampling adjustments for nonresponse and for coverage. This variable represents the inverse unbiased probability of selection for the tabulation unit. |
Variable | ti (Design-weighted value of item t for tabulation unit i) |
Definition | The design-weighted tabulated quantity of the variable for item t for the ith tabulation unit in the statistical period (i.e., ti = wi × vti). Note that this value has not been adjusted for unit nonresponse. |
Term | T (Total value for item t) |
Definition | The estimated (weighted) total of data item t for the entire population represented by the tabulation units in the statistical period. T is based on the value of the data provided by the respondent, equivalent-quality-to-reported data, or imputed data. The calculation of T incorporates subsampling factors, weighting adjustment factors for unit nonresponse (adjustment-to-sample procedures only), and outlier-adjustment factors, but does not include post-stratification or other benchmarking adjustments. |
Computation | The product of the design-weighted tabulated value of item t for the ith tabulation unit in the statistical period (ti) and the nonresponse weight adjustment factor (fi), summed over all tabulation units (NTU) in the statistical period. |
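The computation of T can be sketched as follows for a survey that adjusts weights for unit nonresponse, so that some fi exceed 1. The dictionary keys `w`, `f`, and `v` stand for wi, fi, and vti; the values are made up for illustration.

```python
def weighted_total(units):
    """Estimated total T: sum over tabulation units of f_i * t_i,
    where t_i = w_i * v_ti is the design-weighted tabulated value."""
    return sum(u["f"] * (u["w"] * u["v"]) for u in units)


# Hypothetical tabulation units: one certainty unit (w = 1, f = 1) and two
# sampled units whose weights are adjusted for unit nonresponse (f = 1.25).
units = [
    {"w": 1.0, "f": 1.0, "v": 500.0},
    {"w": 10.0, "f": 1.25, "v": 40.0},
    {"w": 10.0, "f": 1.25, "v": 20.0},
]

T = weighted_total(units)
print(T)  # 500 + 500 + 250
```

For a program that imputes for unit nonresponse, every `f` would be 1 and T reduces to the sum of the design-weighted values.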
The rates defined below serve as quality indicators in the process control sense for non-negatively valued items such as total employees or total payroll. For items that can take on positive and negative values, such as income or earnings on investments, the program should plan to develop two sets of weighted item response rates (QRRs and TQRRs), one from negatively valued data and one from non-negatively valued data, or propose alternative quality indicators that provide adequate transparency into data quality and assist in taking corrective actions.
2.1 Primary Response Rates
Rate | URR (Unit Response Rate) |
Definition | The proportion, based on unweighted counts, of reporting units in the statistical period that were eligible or of unknown eligibility and that responded to the survey (expressed as a percentage). |
Computation | URR = [R/(E+U)] * 100 |
Rate | QRR (Quantity Response Rate for data item t) |
Definition | The proportion of the estimated (weighted) total (T) of data item t reported by the active tabulation units in the statistical period (expressed as a percentage). |
Computation | $QRR = \left[ \sum_{i=1}^{N_{TU}} r_{ti}\, t_i \Big/ T \right] \times 100$ |
Rate | TQRR (Total Quantity Response Rate for data item t) |
Definition | The proportion of the estimated (weighted) total (T) of data item t reported by the active tabulation units in the statistical period or from sources determined to be equivalent-quality-to-reported data (expressed as a percentage). |
Computation | $TQRR = \left[ \sum_{i=1}^{N_{TU}} (r_{ti} + q_{ti})\, t_i \Big/ T \right] \times 100$ |
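The three primary rates can be sketched in a few lines. This is an illustrative reading of the definitions rather than an official implementation: it assumes the QRR and TQRR numerators sum the design-weighted values ti = wi × vti (without the adjustment factor fi) over units whose value is reported or of equivalent quality, and the field names and source codes are hypothetical.

```python
def urr(R, E, U):
    """Unit Response Rate: unweighted respondents over eligible + unknown."""
    denominator = E + U
    if denominator == 0:
        raise ValueError("E + U must be positive")
    return 100.0 * R / denominator


def item_rates(units):
    """Return (QRR, TQRR) for one item from per-unit w, f, v, and source."""
    T = sum(u["f"] * u["w"] * u["v"] for u in units)
    if T <= 0:
        raise ValueError("rates are defined here for positive totals only")
    # Assumption: numerators use t_i = w_i * v_ti, not adjusted by f_i.
    reported = sum(u["w"] * u["v"] for u in units if u["source"] == "R")
    equivalent = sum(
        u["w"] * u["v"] for u in units if u["source"] in {"S", "D", "C"}
    )
    return 100.0 * reported / T, 100.0 * (reported + equivalent) / T


# Hypothetical data for a program that imputes for nonresponse (f = 1).
units = [
    {"w": 1.0, "f": 1.0, "v": 600.0, "source": "R"},
    {"w": 5.0, "f": 1.0, "v": 40.0, "source": "R"},
    {"w": 5.0, "f": 1.0, "v": 20.0, "source": "D"},
    {"w": 10.0, "f": 1.0, "v": 10.0, "source": "M"},
]

urr_value = urr(R=60, E=70, U=10)
qrr, tqrr = item_rates(units)
print(urr_value, qrr, tqrr)
```

In this example the large certainty unit dominates the total, so the QRR is high even though only half of the tabulation units reported; this is the sense in which the weighted rates reflect unit size where the URR does not.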
2.2 Detailed Response and Nonresponse Rates
2.2.1 Unit Nonresponse Rate Breakdowns
The following breakdowns provide unweighted unit nonresponse rates.
Rate | REF rate (Refusal Rate) |
Definition | The ratio of reporting units in the statistical period that were classified as "refusal" to the sum of eligible units and units of unknown eligibility (expressed as a percentage). |
Computation | REF rate = [REF/(E+U)] * 100 |
Rate | CREF rate (Chronic Refusal Rate) |
Definition | The ratio of reporting units in the statistical period that were classified as "chronic refusals" to the sum of eligible units and units of unknown eligibility (expressed as a percentage). |
Computation | CREF rate = [CREF/(E+U)] * 100 |
Rate | INSF rate (Insufficient Data Rate) |
Definition | The ratio of reporting units in the statistical period that were classified as "insufficient data" to the sum of eligible units and units of unknown eligibility (expressed as a percentage). |
Computation | INSF rate = [INSF/(E+U)] * 100 |
Rate | UAA rate (Undeliverable as Addressed Rate) |
Definition | The ratio of reporting units in the statistical period that were classified as "undeliverable as addressed" to the sum of eligible units and units of unknown eligibility (expressed as a percentage). |
Computation | UAA rate = [UAA/(E+U)] * 100 |
Rate | OTH rate (Other Reason for Nonresponse Rate) |
Definition | The ratio of reporting units in the statistical period that were classified as "other reason for nonresponse" to the sum of eligible units and units of unknown eligibility (expressed as a percentage). |
Computation | OTH rate = [OTH/(E+U)] * 100 |
Rate | U rate (Unknown Eligibility Rate) |
Definition | The ratio of reporting units in the statistical period that were classified as "unknown eligibility" to the sum of eligible units and units of unknown eligibility (expressed as a percentage). |
Computation | U rate = [U/(E+U)] * 100 |
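Because all six breakdowns share the denominator E + U, they can be computed in one pass. The sketch below uses made-up counts and a hypothetical `unit_nonresponse_rates` helper; it assumes, as in Section 1.3, that UAA and OTH together account for the units of unknown eligibility.

```python
def unit_nonresponse_rates(counts):
    """Return the unweighted breakdown rates, each as a percentage of E + U."""
    denominator = counts["E"] + counts["U"]
    if denominator == 0:
        raise ValueError("E + U must be positive")
    keys = ("REF", "CREF", "INSF", "UAA", "OTH", "U")
    return {key: 100.0 * counts[key] / denominator for key in keys}


# Hypothetical unweighted counts for one statistical period.
counts = {
    "E": 80, "U": 20,        # eligible and unknown-eligibility units
    "REF": 10, "CREF": 4,    # refusals; chronic refusals are a subset of REF
    "INSF": 5,               # returned but insufficient data
    "UAA": 12, "OTH": 8,     # reasons for unknown eligibility
}

rates = unit_nonresponse_rates(counts)
print(rates)
```

Since chronic refusals are a subset of refusals, the CREF rate should be read as a component of the REF rate and not added to it.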
2.2.2 Total Quantity Response Rate Breakdowns
The following breakdowns provide weighted response rates.
Rate | Q rate (Equivalent-Quality-to-Reported Data Rate) |
Definition | The proportion of the total estimate for item t derived from equivalent-quality-to-reported data for tabulation units in the statistical period (expressed as a percentage). |
Computation | $Q\ rate = \left[ \sum_{i=1}^{N_{TU}} q_{ti}\, t_i \Big/ T \right] \times 100$ |
Rate | S rate (Survey Substitution Rate) |
Definition | The proportion of the total estimate for item t derived from substituted other survey or census data for tabulation units in the statistical period (expressed as a percentage). To be tabulated in this rate, substituted data items must be obtained from the same reporting unit in the same time period as the target program, and the item concept between the two programs must agree. |
Computation | $S\ rate = \left[ \sum_{i=1}^{N_{TU}} s_{ti}\, t_i \Big/ T \right] \times 100$ |
Rate | D rate (Administrative Data Rate) |
Definition | The proportion of the total estimate of item t derived from administrative data for tabulation units in the statistical period (expressed as a percentage). |
Computation | $D\ rate = \left[ \sum_{i=1}^{N_{TU}} d_{ti}\, t_i \Big/ T \right] \times 100$ |
Rate | C rate (Other Source Rate) |
Definition | The proportion of the total estimate of item t derived from other source data validated by a study approved by the program manager in collaboration with the appropriate Research and Methodology area (such as company annual reports, SEC filings, trade association statistics) for tabulation units in the statistical period (expressed as a percentage). |
Computation | $C\ rate = \left[ \sum_{i=1}^{N_{TU}} c_{ti}\, t_i \Big/ T \right] \times 100$ |
Rate | M rate (Imputation Rate) |
Definition | The proportion of the total estimate of item t derived from imputes for tabulation units in the statistical period (expressed as a percentage). |
Computation | $M\ rate = \left[ \sum_{i=1}^{N_{TU}} m_{ti}\, t_i \Big/ T \right] \times 100$ |
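The weighted breakdowns can be sketched by accumulating design-weighted values by final data source. As before, this is a hedged illustration: it assumes each numerator sums ti = wi × vti over the units with the given source, and the field names, source codes, and data are hypothetical.

```python
def breakdown_rates(units):
    """Return the Q, S, D, C, and M rates (percent of T) for one item."""
    T = sum(u["f"] * u["w"] * u["v"] for u in units)
    if T <= 0:
        raise ValueError("rates are defined here for positive totals only")
    # Design-weighted totals by source; reported ("R") units are skipped.
    totals = {"S": 0.0, "D": 0.0, "C": 0.0, "M": 0.0}
    for u in units:
        if u["source"] in totals:
            totals[u["source"]] += u["w"] * u["v"]
    rates = {code: 100.0 * value / T for code, value in totals.items()}
    # Q covers all equivalent-quality sources: substituted, admin, other.
    rates["Q"] = 100.0 * (totals["S"] + totals["D"] + totals["C"]) / T
    return rates


# Hypothetical data for a program that imputes for nonresponse (f = 1).
units = [
    {"w": 1.0, "f": 1.0, "v": 500.0, "source": "R"},
    {"w": 2.0, "f": 1.0, "v": 50.0, "source": "S"},
    {"w": 4.0, "f": 1.0, "v": 25.0, "source": "D"},
    {"w": 5.0, "f": 1.0, "v": 10.0, "source": "C"},
    {"w": 10.0, "f": 1.0, "v": 25.0, "source": "M"},
]

rates = breakdown_rates(units)
print(rates)
```

With every fi equal to 1, the reported share, the Q rate, and the M rate account for the whole of T, which gives a quick consistency check on the source classification.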