Benchmarking to the 2012 Economic Census

Estimation Method

Total estimates are computed using the Horvitz-Thompson estimator, that is, as the sum of weighted data (reported or imputed) for all selected sampling units that meet the sample canvass and tabulation criteria. The weight for a given sampling unit is the reciprocal of its probability of selection into the sample. Estimates are input to a benchmarking procedure as described below. Variances are estimated using the method of random groups and are used to determine if measured changes are statistically significant.
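As an illustration only, the estimator and the variance method can be sketched in a few lines of Python. The function names, the example figures, and the use of the textbook random-group formula (the variance of the mean of k group-level estimates) are assumptions for this sketch; the production system may differ in its details.

```python
from statistics import mean


def horvitz_thompson_total(values, selection_probs):
    """Sum of reported or imputed values, each weighted by 1 / P(selection)."""
    return sum(y / p for y, p in zip(values, selection_probs))


def random_group_variance(group_totals):
    """Textbook random-group variance of the mean of k group-level estimates.

    Each entry of group_totals is an estimate of the same total computed
    from one random group (assumption: groups are already weighted up).
    """
    k = len(group_totals)
    center = mean(group_totals)
    return sum((t - center) ** 2 for t in group_totals) / (k * (k - 1))


# Hypothetical sampling units: revenue and probability of selection.
revenue = [120.0, 45.0, 300.0]
probs = [1.0, 0.25, 0.5]
total = horvitz_thompson_total(revenue, probs)

# Hypothetical estimates of the same total from three random groups.
variance = random_group_variance([890.0, 910.0, 900.0])
```

A unit selected with certainty (probability 1.0) represents only itself, while a unit selected with probability 0.25 stands in for four units.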

Benchmarking

Final results of the 2012 Economic Census are used to benchmark the SAS estimates. Benchmarking of total revenue estimates is described first, followed by benchmarking of total expenses and e-commerce revenue. Finally, remaining detail data items that sum to the benchmarked total revenue or total expenses estimates are benchmarked.

Prior to benchmarking, two operations are performed:

- Historical corrections are made to current sample data back to 2010.
- Revenue estimates from the current sample for 2011 and subsequent years are linked to the published estimates from the prior sample. For a given detailed industry based on 2007 NAICS, the linking is performed by multiplying the Horvitz-Thompson revenue estimate from the current sample by a ratio (the "revenue ratio"). The numerator and denominator of the ratio are as follows:
  - The numerator is the 2010 published revenue estimate for the industry on a 2007 NAICS basis from the prior sample.
  - The denominator is the 2010 Horvitz-Thompson estimate of revenue for the industry on a 2007 NAICS basis from the current sample.
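The linking step amounts to a single multiplication per year. The sketch below uses hypothetical function names and made-up figures for one detailed industry; it is meant only to show the arithmetic.

```python
def revenue_ratio(prior_published_2010, current_ht_2010):
    """Ratio of the prior sample's published 2010 revenue to the
    current sample's 2010 Horvitz-Thompson revenue estimate."""
    return prior_published_2010 / current_ht_2010


def link_to_prior_sample(current_ht_by_year, ratio):
    """Multiply each Horvitz-Thompson estimate by the revenue ratio."""
    return {year: est * ratio for year, est in current_ht_by_year.items()}


# Hypothetical Horvitz-Thompson revenue estimates from the current sample.
current_ht = {2010: 800.0, 2011: 840.0, 2012: 900.0}
ratio = revenue_ratio(1000.0, current_ht[2010])
modified = link_to_prior_sample(current_ht, ratio)
```

By construction, the 2010 value in `modified` equals the prior sample's published 2010 estimate, so the linked series joins the prior series without a break in level.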

For most industry levels added to SAS in the 2009 survey year ("expansion" industries), linking to the prior sample is handled in a different way. Since the estimates in the previous sample were not benchmarked to an Economic Census, the Horvitz-Thompson revenue estimates for 2010 forward are not changed. Further, the prior sample's 2009 estimate is multiplied by the inverse of the revenue ratio. All the expansion industry levels from NAICS 52 (except 522310) are the exceptions; these are linked to the prior sample as described above.

The resulting revenue estimates (call these "modified" revenue estimates) are input to the benchmarking program. Using this program, the modified revenue estimates for survey years 2007 and later are revised in a manner that:

- Uses the 2007 and 2012 Economic Census revenue totals as fixed constraints.
- Minimizes the sum of squared differences between the year-to-year changes of the input and revised estimates for 2007 through the end of the time series.
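The two conditions above can be illustrated with a simplified sketch. It rests on an assumption that is not stated in this document: change is measured through the revised-to-input ratio (a proportional Denton-style simplification), which may differ from the objective used by the production benchmarking program. Under that assumption, and with consecutive annual estimates, the minimizing solution interpolates the ratio linearly between census years and holds it flat outside them. The function name and all figures are hypothetical.

```python
def benchmark(modified, census_totals):
    """Revise a series so that it hits the census totals exactly.

    modified: {year: input estimate}, consecutive years, starting at the
        first year to be revised.
    census_totals: {year: census total} used as fixed constraints.
    """
    anchors = sorted(census_totals)
    # Revised-to-input ratio is pinned at each census year.
    fixed = {y: census_totals[y] / modified[y] for y in anchors}
    revised = {}
    for y in sorted(modified):
        if y <= anchors[0]:
            ratio = fixed[anchors[0]]      # flat before the first constraint
        elif y >= anchors[-1]:
            ratio = fixed[anchors[-1]]     # flat after the last constraint
        else:
            lo = max(a for a in anchors if a <= y)
            hi = min(a for a in anchors if a > y)
            # Linear interpolation keeps the year-to-year ratio steps equal,
            # which minimizes their sum of squares between fixed endpoints.
            ratio = fixed[lo] + (y - lo) / (hi - lo) * (fixed[hi] - fixed[lo])
        revised[y] = modified[y] * ratio
    return revised


# Hypothetical modified revenue and census totals for one industry.
modified_revenue = {2007: 100.0, 2008: 104.0, 2009: 102.0, 2010: 108.0,
                    2011: 112.0, 2012: 120.0, 2013: 126.0}
census_revenue = {2007: 100.0, 2012: 126.0}
benchmarked = benchmark(modified_revenue, census_revenue)

# A series with a single constraint is scaled by one ratio throughout.
single = benchmark({2010: 50.0, 2011: 55.0, 2012: 60.0}, {2012: 66.0})
```

In this sketch the years after the last census year inherit the ratio from that year, and a series with only one constraint is multiplied by the same ratio in every year.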

Refer to the revised total revenue estimates output from the benchmarking operation as "benchmarked" estimates. Expansion industries cannot be benchmarked in this manner, since they have no 2007 estimate to connect to the Economic Census. For these industries, the same procedure is used, but with the 2012 Economic Census revenue total as the single constraint. Two industries did not use any Economic Census revenue constraints: NAICS 485110 and 488510. This was because the SAS and Economic Census instructions differ too much on what to count as revenue. For these industries, the benchmarked estimates are set equal to the modified estimates.

Note that total revenue estimates for 2007 and prior years are not generally revised. Exceptions occur for industries where the Economic Census publishes an updated measurement of 2007 revenues in the 'Comparative Statistics' tables. In those cases, the procedure is revised to use three fixed constraints in benchmarking (the 2002, 2007, and 2012 Economic Census revenue totals) and to minimize the same sum of squared differences for 2002 through the end of the time series. Revenue estimates are revised as far back as 2003 for those industries. This longer benchmarking span was applied for the following industries by NAICS: 561510, 621399, 621491, 623311, 623312, 811212, 813212, 813311.

A mathematical result of the benchmarking methodology is that all benchmarked estimates following the end of the last benchmark year (2012) can be calculated simply by multiplying the corresponding input estimates by the ratio of the benchmarked-to-modified estimate for the last benchmark year. Modified revenue estimates for years after 2012 are multiplied by this ratio, called a carry-forward factor (or census adjustment factor), to derive published total revenue estimates for 2013 and subsequent years. The carry-forward factor can change during the life of the sample if historical corrections change the modified estimate for 2012. At the end of the current sample, the carry-forward factor remains the same until the next benchmarking operation. For expansion industries with only one benchmarking constraint, benchmarked estimates for all years are simply the modified estimates multiplied by the carry-forward factor.
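The carry-forward calculation described in this paragraph can be sketched as follows. The function names and figures are hypothetical.

```python
def carry_forward_factor(benchmarked_last, modified_last):
    """Benchmarked-to-modified ratio for the last benchmark year."""
    return benchmarked_last / modified_last


def publish_after_benchmark(modified_by_year, factor, last_benchmark_year=2012):
    """Scale modified estimates for years after the last benchmark year."""
    return {year: value * factor
            for year, value in modified_by_year.items()
            if year > last_benchmark_year}


# Hypothetical 2012 values: benchmarked 126.0, modified 120.0.
factor = carry_forward_factor(126.0, 120.0)
published = publish_after_benchmark({2012: 120.0, 2013: 126.0, 2014: 130.0}, factor)
```

If a historical correction later changes the modified 2012 estimate, recomputing `factor` with the new denominator changes every published value after 2012, which is why the factor can move during the life of the sample.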

A method similar to the one for benchmarking total revenue is used to benchmark total expenses. First, the revenue ratio described above is applied to the Horvitz-Thompson total expense estimates for each detailed industry for 2010 and subsequent years, resulting in modified total expense estimates for these years. Note that the revenue ratio was designed to make the modified estimate of revenue for 2010 equal to the published estimate from the prior sample. Because total expenses are linked using the revenue ratio, the same equality will not hold for the modified 2010 total expenses estimate, which could lead to an overstated change in the estimate between 2009 and 2010. To address this, the benchmarking operation is employed to adjust the 2008 and 2009 estimates, using two fixed constraints: the published 2007 estimate from the prior sample and the 2010 Horvitz-Thompson estimate after applying the revenue ratio.

The resulting total expenses estimates (call these "modified" total expenses estimates) are input to the benchmarking program. Using this program, the modified total expenses estimates for 2007 through the end of the time series are revised in a manner that:

- Uses as constraints the modified 2007 and 2012 total expenses estimates multiplied by the benchmarked-to-modified ratio of total revenue for that year (note that we do not benchmark using the total expenses estimates from the Economic Census). For 2012, this ratio is equal to the revenue carry-forward factor; for 2007, this ratio is usually equal to one.
- Minimizes the sum of squared differences between the year-to-year changes of the modified and revised total expenses estimates for 2007 through the end of the time series.
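The expense constraints are derived from revenue rather than taken from the Economic Census. A minimal sketch of that calculation, with a hypothetical function name and made-up figures:

```python
def expense_constraints(modified_expenses, benchmarked_revenue,
                        modified_revenue, years=(2007, 2012)):
    """Modified total expenses times the benchmarked-to-modified
    revenue ratio for each constraint year."""
    return {y: modified_expenses[y] * benchmarked_revenue[y] / modified_revenue[y]
            for y in years}


# Hypothetical values for one industry.
modified_expenses = {2007: 80.0, 2012: 96.0}
modified_revenue = {2007: 100.0, 2012: 120.0}
benchmarked_revenue = {2007: 100.0, 2012: 126.0}

constraints = expense_constraints(modified_expenses, benchmarked_revenue,
                                  modified_revenue)
```

Here the 2007 revenue ratio is one, so the 2007 constraint equals the modified 2007 total expenses estimate, while the 2012 constraint is scaled by the revenue carry-forward factor.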

The same mathematical result of the benchmarking methodology described above for total revenue estimates also applies to total expenses. That is, modified total expenses estimates for 2012 and subsequent years can be multiplied by the same carry-forward factor described above to calculate published total expenses estimates for 2012 and subsequent years.

The two types of exceptions mentioned for benchmarking total revenue still apply. For industries with updated 2007 Economic Census revenue, the 2007 benchmarked-to-modified revenue ratio will not be one, so an additional constraint is added: the modified 2002 total expenses estimate (multiplied by the 2002 benchmarked-to-modified revenue ratio, which will always be equal to one). For expansion industries, the only constraint is from 2012.

The method for linking and benchmarking e-commerce revenue is nearly identical to the method for linking and benchmarking total expenses. There are three differences, two caused by the benchmarking operation's inability to handle time series containing estimates of zero (it is impossible to minimize year-to-year changes when some year-to-year changes are not real numbers). First, the benchmarking operation used to get modified e-commerce estimates is applied to selected aggregate industry levels, designed to avoid zero values. Modified e-commerce estimates for detailed levels are calculated to preserve the ratios of each detail level to its aggregate level from before applying the benchmarking operation (a process called 'raking'). Second, the range of this first benchmarking operation is 2004 to 2010 instead of 2007 to 2010. Third, any detailed industry level with a zero estimate in the revision range for the second benchmarking operation uses an alternative method: the modified e-commerce estimates for each year are multiplied by the benchmarked-to-modified revenue ratio for the same year.
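The raking step for a single year can be sketched as below. The function name, the industry labels, and the figures are hypothetical; the point is that each detail level keeps its share of the aggregate, so a zero estimate stays zero and never enters a year-to-year comparison.

```python
def rake_to_aggregate(detail_before, aggregate_after):
    """Scale detail-level estimates so they sum to the adjusted aggregate
    while keeping each detail level's share of the aggregate unchanged."""
    aggregate_before = sum(detail_before.values())
    return {level: value * aggregate_after / aggregate_before
            for level, value in detail_before.items()}


# Hypothetical e-commerce estimates for three detailed industries in one
# year, before the benchmarking operation is applied to their aggregate.
detail = {"industry_a": 0.0, "industry_b": 30.0, "industry_c": 10.0}
raked = rake_to_aggregate(detail, aggregate_after=44.0)
```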

Estimates for data items that sum to total revenue or total expenses are indirectly benchmarked using the following procedure. First, the same method for producing modified total expenses estimates is used to produce modified estimates for these items. Then, the modified detail revenue or expenses are benchmarked by raking, keeping the ratio of the benchmarked item to the benchmarked revenue or expenses equal to the ratio of the modified item to the modified revenue or expenses.

For data items that do not sum to total revenue or total expenses, modified estimates are computed using the same method used to produce modified total expenses estimates. These modified estimates are then multiplied by the benchmarked-to-modified total revenue ratio for each year to produce benchmarked estimates.
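Both cases, items that sum to a benchmarked total and items that do not, reduce to multiplying a modified item by a benchmarked-to-modified ratio for the same year. The sketch below uses one hypothetical helper for both; the item names and figures are invented for illustration.

```python
def apply_ratio(modified_item, benchmarked_total, modified_total):
    """Scale a modified item by the benchmarked-to-modified ratio of a
    total, year by year."""
    return {y: modified_item[y] * benchmarked_total[y] / modified_total[y]
            for y in modified_item}


# Hypothetical totals for one industry.
modified_expenses = {2012: 96.0, 2013: 100.0}
benchmarked_expenses = {2012: 100.8, 2013: 105.0}
modified_revenue = {2012: 120.0, 2013: 126.0}
benchmarked_revenue = {2012: 126.0, 2013: 132.3}

# An item that sums to total expenses is raked against total expenses.
modified_payroll = {2012: 48.0, 2013: 52.0}
benchmarked_payroll = apply_ratio(modified_payroll, benchmarked_expenses,
                                  modified_expenses)

# Any other item uses the total revenue ratio instead.
modified_other = {2012: 10.0, 2013: 12.0}
benchmarked_other = apply_ratio(modified_other, benchmarked_revenue,
                                modified_revenue)
```

In the first case the item's share of total expenses is the same before and after benchmarking (one half in 2012 in this example).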

Benchmarked estimates for any sums of data items are obtained by adding the benchmarked estimates of the data items that comprise the sum.

Benchmarked estimates at aggregate industry levels are computed by summing the benchmarked estimates for the appropriate detailed industries comprising the aggregate.

Nonemployers

Estimates for employers plus nonemployers are only published for total revenue. All other estimates are based only on employer firms. Because of the industry levels at which we benchmark nonemployer totals, the benchmarked nonemployer totals published in SAS may not sum to the nonemployer totals published by Nonemployer Statistics.

Page Last Revised - December 20, 2021
