The Nonemployer Statistics by Demographics series (NES-D) provides information on the demographic characteristics of nonemployer businesses. The NES-D is the result of a research project by the Census Bureau to complete the picture of business ownership by demographics for the United States. Historically, the quinquennial Survey of Business Owners (SBO) provided the only comprehensive source of information on both employer and nonemployer businesses by demographic characteristics of the business owners. In 2017, the SBO was replaced by the Annual Business Survey (ABS). The ABS is an annual survey that collects demographic characteristics from employer businesses; it does not collect demographic data from nonemployer businesses. The NES-D was developed to produce estimates of owner demographics for nonemployer businesses similar to those the ABS produces for employer businesses. The NES-D is not a survey; rather, it leverages existing individual-level administrative records to assign demographic characteristics to the universe of nonemployer businesses. Demographic characteristics including sex, ethnicity, race, veteran status, owner age, place of birth, and U.S. citizenship are assigned to nonemployer business owners.
Together, the NES-D and the ABS will continue to provide the only source of detailed and comprehensive statistics on the scope, nature, and activities of all U.S. businesses by the demographic characteristics of the business owners. NES-D data are available annually by detailed geography and industry levels, receipt-size class, legal form of organization (LFO), and, from 2019 onward, urban and rural classification.
The NES-D adds demographic characteristics to the Nonemployer Statistics (NES) data and produces total firm counts and total receipts by those demographic characteristics. Details about the NES program are available on this page: https://www.census.gov/programs-surveys/nonemployer-statistics/about.html. To obtain the demographic characteristics, the NES-D uses various administrative records (AR) and Census Bureau data sources, including the Business Register (BR), Internal Revenue Service (IRS) tax Form 1040 data, tax Schedule K-1 data, Decennial Census and American Community Survey (ACS) data, the Social Security Administration’s database (Numident), and AR from the Department of Veterans Affairs (VA). The Census Bureau identifies and extracts the universe of nonemployer businesses from the BR. The nonemployer universe comprises businesses with no paid employment or payroll, annual receipts of $1,000 or more ($1 or more in the construction industries), and IRS tax filings for sole proprietorships (Form 1040, Schedule C), partnerships (Form 1065), or corporations (the Form 1120 series). The BR also provides the LFO of the business as well as its receipts, industry classification, and geography classification. For more information on how nonemployer businesses are identified and defined, visit the Nonemployer Statistics technical documentation page: https://www.census.gov/programs-surveys/nonemployer-statistics/technical-documentation/methodology.html
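The scope rule for the nonemployer universe can be illustrated with a small predicate. This is only a sketch: the record layout, the tax form labels, and the use of NAICS sector 23 to identify the construction industries are assumptions made for the example and do not describe the Business Register itself.

```python
from dataclasses import dataclass

# Hypothetical labels for the forms the text lists for nonemployer filers.
NONEMPLOYER_FORMS = {"1040-C", "1065", "1120"}


@dataclass
class BusinessRecord:
    naics: str         # industry code as a string, e.g. "236115"
    receipts: float    # annual receipts in dollars
    has_payroll: bool  # True if the business has paid employment or payroll
    tax_form: str      # hypothetical label for the form filed


def in_nonemployer_universe(rec: BusinessRecord) -> bool:
    """Apply the scope rule: no payroll, an eligible form, a receipts floor."""
    if rec.has_payroll or rec.tax_form not in NONEMPLOYER_FORMS:
        return False
    # Assumption: construction is identified by NAICS sector 23.
    floor = 1 if rec.naics.startswith("23") else 1000
    return rec.receipts >= floor
```

Under these assumptions, a construction sole proprietor with $500 in receipts is in scope, while a software consultant with the same receipts is not.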
The primary source of data for race and Hispanic origin information is Decennial Census and ACS data, with the Census Numident serving as a secondary source. To assign race and Hispanic origin responses, priority is given to the most recent data from Decennial Census and ACS; that is, first, to post-2011 ACS data, then the 2010 or most recent Census, followed by 2001-2010 ACS data, and finally Census 2000. If a race or Hispanic origin cannot be assigned for a business owner using Decennial Census or ACS data, then the Numident is used to assign the race or the Hispanic origin (for additional details on this topic, see https://www2.census.gov/ces/wp/2019/CES-WP-19-34.pdf).
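The source priority described above amounts to taking the first available response from an ordered list. The sketch below illustrates that idea; the source labels are hypothetical names chosen for the example, and "census_2010" stands in for "the 2010 or most recent Census".

```python
from typing import Optional

# Hypothetical source labels, ordered by the priority given in the text.
SOURCE_PRIORITY = [
    "acs_post_2011",
    "census_2010",
    "acs_2001_2010",
    "census_2000",
    "numident",  # fallback when no Decennial Census or ACS value exists
]


def assign_by_priority(responses: dict) -> Optional[str]:
    """Return the value from the highest-priority source that has one."""
    for source in SOURCE_PRIORITY:
        value = responses.get(source)
        if value is not None:
            return value
    return None  # nothing available; the value is left for imputation
```

For example, an owner with a Census 2000 response and a post-2011 ACS response is assigned the ACS response.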
Following the legacy SBO and the ABS, NES-D does not include a “multiple race” category for individuals indicating they are of multiple races. Instead, for owners who report multiple races in Decennial or ACS data, and are tabulated as “multiple race,” NES-D uses the detailed Census or ACS race information to assign the owner to each of the corresponding racial categories. For example, an owner who reports as white and American Indian and Alaska Native (AIAN) will be assigned and tabulated to both the white race category and the AIAN race category. For this reason, summed totals in NES-D tables for owner race and firm race will be greater than the summed totals for binary demographic categories such as Hispanic origin.
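A short sketch shows why race totals can exceed the number of owners. The three owners below are made-up illustrative data, not NES-D records.

```python
from collections import Counter


def tabulate_owner_race(owners):
    """Count each owner once in every race category the owner reports."""
    counts = Counter()
    for races in owners:
        for race in set(races):
            counts[race] += 1
    return counts


# Three hypothetical owners; the second reports two races.
owners = [{"White"}, {"White", "AIAN"}, {"Asian"}]
counts = tabulate_owner_race(owners)
# The category counts sum to 4 even though there are only 3 owners.
```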
The Census Numident is the primary source for the age, sex, place of birth, and U.S. citizenship status of business owners, with Decennial Census and ACS data serving as secondary sources. Finally, USVETS data from the Department of Veterans Affairs (VA) provide AR on the veteran status of business owners. Title 38 of the U.S. Code of Federal Regulations gives VA the authority to determine veterans’ status. Luque et al. (2019) (https://www2.census.gov/ces/wp/2019/CES-WP-19-01.pdf) discuss the USVETS data and note that the concept of a veteran captured by the SBO/ABS is broader than VA’s official definition of a veteran. This research also addresses the potential use of Department of Defense (DOD) data in future processing as an additional data source complementing VA’s data to bring the AR-based definition closer to the NES-D based veteran concept.
Anonymized unique individual identifiers are used to identify business owners and attach demographic characteristics from the data sources above to the nonemployer businesses. These identifiers, known as Protected Identification Keys (PIKs), are assigned to the individual records in the AR and census data sources upon receipt of the data at the Census Bureau, and are used as linking keys across the data sources to obtain demographic information. The demographic information acquired this way is attached to the nonemployer business owners.
Depending on the LFO of the business, two IRS forms are used to obtain PIKs: IRS Form 1040 for sole proprietors, and Schedule K-1 for owners of partnerships and S-corporations. For C-corporations, there is no tax form or business registry that clearly and unequivocally identifies all owners of this type of business. For this reason, the Census Bureau is unable to assign demographic characteristics to C-corporations. Research is currently underway to explore whether demographics can be reliably imputed for C-corporations. C-corporations constitute only about 2 percent of the nonemployer universe and approximately 4 percent of receipts. Data for C-corporations are included in the published tables but are not shown by the demographic characteristics of the firms.
The administrative and census records data sources mentioned above provide demographic characteristics coverage for the vast majority of identified nonemployer business owners (not including owners of C-corporations). Sex, age, place of birth, and U.S. citizenship status are available for approximately 97 percent of the identified owners. Similarly, data for Hispanic origin are available for about 95 percent of all records. Also, the administrative sources provide about 90 percent of the race category data, and the remaining records are imputed.
Missing demographic characteristics are imputed using the “hot deck” imputation method. In this method, missing values are imputed using donors from the same dataset whose characteristics are similar to those of the recipients. This imputation method is similar to the one used by the Annual Business Survey (for more information see https://www.census.gov/programs-surveys/abs/technical-documentation/methodology.html). For more details on NES-D administrative records coverage and related issues, see https://www2.census.gov/ces/wp/2019/CES-WP-19-34.pdf.
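The general idea of hot deck imputation can be sketched as follows. This is a generic random hot deck within imputation classes, not the NES-D production procedure: the matching variables, the random donor selection, and the function and field names are all assumptions for illustration.

```python
import random


def hot_deck_impute(records, field, class_keys, seed=0):
    """Fill missing `field` values with a value drawn from a donor record
    in the same imputation class (same values on `class_keys`)."""
    rng = random.Random(seed)
    # Build the donor pool for each class from records with a reported value.
    donors = {}
    for rec in records:
        if rec.get(field) is not None:
            key = tuple(rec[k] for k in class_keys)
            donors.setdefault(key, []).append(rec[field])
    imputed = []
    for rec in records:
        out = dict(rec)  # copy so the input records are not modified
        if out.get(field) is None:
            pool = donors.get(tuple(rec[k] for k in class_keys))
            if pool:  # the value stays missing if the class has no donor
                out[field] = rng.choice(pool)
        imputed.append(out)
    return imputed
```

A recipient only receives a value from a donor that shares its class, so a record in a class with no reported values is left missing in this sketch.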
Assigning demographic characteristics to owners of sole proprietorships, and by extension to the firms themselves, is straightforward. Only individuals can own sole proprietorships, and each sole proprietorship has only one owner. Hence, if the PIK of the sole proprietor can be linked to a given demographic data source, then the sole proprietor’s firm will be assigned that demographic characteristic.
For partnerships and S-corporations, the assignment of demographic characteristics to the firm as a whole is more complicated since these types of firms can have more than one owner, and not all owners are necessarily individuals. Following the ABS and legacy SBO, NES-D assigns firms to demographic groups by determining the total share of firm ownership held by individual members of each (demographic) group. A firm is assigned to a given group if owners of that group collectively own a majority stake (more than 50 percent) in the firm. NES-D uses ownership share information available in the Schedule K-1 data to determine the demographic group holding majority stake in the firm (see https://www2.census.gov/ces/wp/2019/CES-WP-19-34.pdf for additional details). Those characteristics that have only two categories at the individual level (e.g., sex, Hispanic origin or veteran status) also have a third category at the firm level: equally-owned. For characteristics that have more than two individual-level categories, such as race, it is possible that no one group will collectively own a majority of the firm. In such cases, the firm is not assigned to any race category.
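The majority-stake rule can be sketched as a short function. The input layout (a list of group and ownership-percent pairs) and the `binary` flag are assumptions for the example; the sketch also ignores rounding of ownership shares, which a production system would need to handle.

```python
def classify_firm(owners, binary):
    """Assign a firm to the demographic group holding more than 50 percent.

    owners: list of (group, ownership_percent) pairs for the firm's owners.
    binary: True for two-category characteristics (e.g. sex), which allow
            an equally-owned outcome; False for characteristics such as race.
    """
    totals = {}
    for group, pct in owners:
        totals[group] = totals.get(group, 0.0) + pct
    for group, total in totals.items():
        if total > 50:
            return group
    if binary and len(totals) == 2 and all(t == 50 for t in totals.values()):
        return "Equally-owned"
    return None  # no group holds a majority; the firm is not assigned
```

For instance, two female owners holding 30 and 25 percent together outweigh a male owner holding 45 percent, so the firm is classified as female-owned.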
Not all firms are eligible for demographic classification; these firms are labeled as “unclassifiable”. Following the methodology of the ABS (and legacy SBO), i) only firms where the person with the largest ownership share owns at least 10 percent of the firm are eligible for demographic assignment, ii) up to 4 owners with the largest ownership shares in the firm are considered in the assignment, iii) only person owners are used in the estimation, and hence iv) only firms with person owners are used in the calculation. Additionally, for NES-D, C-corporations are labeled as “unclassifiable” because firm-level demographic characteristics cannot be assigned at this time.
To make nonemployer owner-level and firm-level demographic estimates consistent with each other and with the ABS employer estimates, only the top 4 owners of classifiable firms are included in the calculations.
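The eligibility rules can be read as a filter that either returns the owners used for classification or marks the firm as unclassifiable. In this sketch the dictionary keys and the LFO label are hypothetical, and ties between equal ownership shares are not given any special treatment.

```python
def owners_for_classification(lfo, owners):
    """Return the owners used for demographic assignment, or None if the
    firm is unclassifiable.

    owners: list of dicts with hypothetical keys 'is_person' (bool) and
            'share' (ownership percent).
    """
    if lfo == "C-corporation":
        return None  # owners of C-corporations cannot be identified
    # Only person owners are considered, largest share first.
    persons = sorted(
        (o for o in owners if o["is_person"]),
        key=lambda o: o["share"],
        reverse=True,
    )
    # The largest person owner must hold at least 10 percent.
    if not persons or persons[0]["share"] < 10:
        return None
    return persons[:4]  # at most the top 4 owners enter the calculation
```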
NES-D also provides minority-owned, nonminority-owned, and equal minority-nonminority-owned categories based on the race and Hispanic origin of the owners. Specifically, owners classified as non-Hispanic and White are included in the nonminority group. Also see the Tabulation section of the ABS methodology: https://www.census.gov/programs-surveys/abs/technical-documentation/methodology.html.
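At the owner level, the minority flag can be sketched as below. The treatment of owners who report White together with another race is an assumption of this example (they are counted as minority here); the ABS tabulation documentation is the authority on that case.

```python
def is_minority(races, hispanic):
    """Owner-level flag: nonminority means non-Hispanic and White.

    Assumption: an owner reporting White plus another race is treated
    as minority in this sketch.
    """
    return bool(hispanic) or set(races) != {"White"}
```

The resulting flag is a two-category characteristic, so firm-level assignment can follow the same majority and equally-owned logic used for sex or veteran status.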
The industry classifications of firms are based on the 2017 North American Industry Classification System (NAICS) (https://www.census.gov/naics/). Industry codes come primarily from Internal Revenue Service filings and are self-classified by tax filers. In scope are all NAICS sectors except those classified in the following NAICS industries:
The 2021 NES-D provides data at the 2-digit to 4-digit NAICS level. The Census Bureau is researching if additional detail can be published in future years. For more information on the classification of industries, refer to the nonemployer statistics methodology: https://www.census.gov/programs-surveys/nonemployer-statistics/technical-documentation/methodology.html.
The 2021 NES-D provides data for the U.S., States, Metropolitan/Micropolitan Statistical Areas (MSA), and Counties. Most geography codes are derived from the business owner's mailing address identified from administrative records. Because the owner's mailing address may not be the same as the physical location of the business, the resulting geography codes do not always represent where business is conducted, but this represents the best information available regarding the location of the business. The Census Bureau is researching if additional detail can be published in future years.
Firms are classified as urban or rural based on the population of the Census block of their physical location or mailing address. Firms without an assigned Census block are designated as “Not classified”. Firms whose physical location or mailing address is on a Census block with at least 2,000 housing units or a population of at least 5,000 are classified as “Urban”. All other firms are classified as “Rural”.
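The urban and rural rule can be written as a three-way decision. The sketch follows the thresholds stated above; the input layout (a tuple of housing units and population, or None when no block is assigned) is an assumption for the example.

```python
def classify_urban_rural(block):
    """Classify a firm from its Census block.

    block: None if no Census block is assigned, otherwise a
           (housing_units, population) tuple for the firm's block.
    """
    if block is None:
        return "Not classified"
    housing_units, population = block
    if housing_units >= 2000 or population >= 5000:
        return "Urban"
    return "Rural"
```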
In accordance with U.S. Code, Title 13, Section 9, no data are published that would disclose the operations of an individual business. The Census Bureau has reviewed this data product to ensure appropriate access, use, and disclosure avoidance protection of the confidential source data (Project No. 7504866, Disclosure Review Board (DRB) approval number: CBDRB-FY24-0307).
Disclosure avoidance is the process used to protect the confidentiality of data provided by an individual or firm. Using disclosure avoidance procedures, the Census Bureau modifies or removes the characteristics which may disclose confidential information. Although it may appear that a table shows information about a specific individual or business, the Census Bureau has taken steps to mask or suppress the original data while making sure the results are still useful.
NES-D uses noise infusion as the primary method of disclosure avoidance for receipts. Noise infusion perturbs data values prior to tabulation by applying a random noise multiplier to receipts data. Disclosure protection is accomplished in a manner that causes the vast majority of cell values to be perturbed by, at most, a few percentage points. Each published cell value has an associated noise flag indicating the relative amount of distortion in the cell value resulting from the perturbation of the data for the contributors to the cell. In certain circumstances, some individual cells may be suppressed for additional disclosure avoidance. Suppressed data are replaced by one of the following symbols:
N - Not available or not comparable
S - Withheld because estimates did not meet publication standards
X - Not applicable
The level of noise applied to the receipts is identified by the following symbols:
G - Low noise: The cell value was changed by less than 2 percent by the application of noise.
H - Moderate noise: The cell value was changed by 2 percent or more but less than 5 percent by the application of noise.
J - High noise: The cell value was changed by 5 percent or more by the application of noise.
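The noise flag thresholds can be expressed as a small function that compares a cell value before and after noise is applied. This is a sketch of the flag definitions only, not of the noise infusion procedure itself; it assumes the original cell value is nonzero, and the function name is made up for the example.

```python
def noise_flag(original, published):
    """Return G, H, or J for the percent change caused by noise.

    Assumes `original` is nonzero.
    """
    pct_change = 100 * abs(published - original) / abs(original)
    if pct_change < 2:
        return "G"  # low noise: less than 2 percent
    if pct_change < 5:
        return "H"  # moderate noise: 2 percent or more, less than 5 percent
    return "J"      # high noise: 5 percent or more
```

For example, a receipts cell of 1,000 published as 1,020 has been changed by exactly 2 percent and would carry the H flag.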
The data sources used to produce these estimates are of the highest quality, and well-grounded in a body of proven administrative records research that shows the quality and suitability of those data sources to directly replace demographic information in business, as well as household, surveys. NES-D estimates are derived from AR data and are not subject to sampling error. Therefore, there is no relative standard error or standard error due to sampling.
The data compiled for NES-D are subject to non-sampling errors, which can be attributed to many sources. For instance, administrative records data may contain measurement error because of issues such as coverage problems (e.g., the data source may not cover certain populations as well as others); linking or matching issues which may lead to bias problems; conceptual and timing misalignments; reporting errors; definition and classification difficulties; errors in recording or coding the data obtained; and other errors of coverage, processing, and estimation for missing or misreported data. In the case of NES-D, coverage and bias problems are not as pronounced because the nonemployer business owners are well represented in tax and other administrative and census records data. The accuracy of tabulated data is determined by the joint effects of the various non-sampling errors. Precautionary steps were taken in all phases of the processing to minimize the effects of non-sampling errors. For a detailed discussion of this topic, see the Limitations and Challenges section of the NES-D working paper: https://www2.census.gov/ces/wp/2019/CES-WP-19-34.pdf.
Tabulations were produced for the final data and are linked to the ABS website (https://www.census.gov/programs-surveys/abs/data/tables.html). Data are tabulated by the sex, ethnicity, race, and veteran status of the firm owners. Business ownership is defined as having 51 percent or more of the stock or equity in the business and is categorized by firms classifiable by sex, ethnicity, race, and veteran status and firms unclassifiable by sex, ethnicity, race, and veteran status. Data are also tabulated for businesses owned equally (50% / 50%) by men and women, by Hispanics and non-Hispanics, by minorities and nonminorities, and by veterans and nonveterans. Firms classifiable by sex, ethnicity, race, and veteran status are categorized by the following:
During the 2020 NES-D processing, the Census Bureau noticed an increase in unclassifiable firms by demographics. Research determined that the increase was caused by missing shareholder percentages from the administrative records source data. To mitigate this issue, the Census Bureau substituted the shareholder’s ordinary business income to replace missing values. The missing values replacement technique was applied to 2020 and 2021 NES-D processing.
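One plausible reading of this substitution is that, when ownership percentages are missing, each owner's share is approximated by that owner's fraction of the firm's total ordinary business income. The sketch below illustrates that reading only; the exact production rule is not stated here, and the field names and the handling of zero or negative totals are assumptions.

```python
def fill_missing_shares(owners):
    """Return ownership percents for a firm's owners.

    owners: list of dicts with 'share' (percent, or None if missing) and
            'income' (the owner's ordinary business income).
    If any share is missing, all shares are derived from each owner's
    fraction of total income (an assumed rule, for illustration only).
    """
    if all(o["share"] is not None for o in owners):
        return [o["share"] for o in owners]
    total = sum(o["income"] for o in owners)
    if total <= 0:
        # Proportions cannot be derived; leave the shares as reported.
        return [o["share"] for o in owners]
    return [100 * o["income"] / total for o in owners]
```

Firms whose shares remain missing after this step would still be unclassifiable under the eligibility rules.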
To provide a comprehensive source of demographic business data, the Census Bureau has combined results from the ABS with results from the NES-D to produce estimates of all U.S. businesses. Combining employer estimates from the ABS with the nonemployer estimates from NES-D provides a complete picture of business ownership for the U.S.
There are a small number of firms that move between the employer and nonemployer frames each year. Because the ABS and the NES-D estimates are computed independently, this results in a small over-coverage bias for the combined estimates because firms are included in both the employer and nonemployer estimates. For 2021 this over-coverage is estimated to be about 75,000 total firms. The Census Bureau plans to conduct research to identify methods to limit this over-coverage in future estimates.
Some unpublished estimates can be derived directly from datasets by subtracting published estimates from their respective totals. However, the results obtained by such subtraction would be subject to poor response, high sampling variability, or other factors that may make the results misleading. Individuals who use such calculations in datasets to create new estimates should cite the Census Bureau as the source of the original estimates only.
NES-D and SBO estimates are not directly comparable due to differences in survey and administrative records responses, non-sampling error or other issues such as definitional differences between the survey and AR data, and allowable survey responses for sole proprietorships that do not have a parallel in tax data. The highlights of these differences are described below. For a detailed discussion of this topic, see the ‘Comparison to SBO’ section of the following working papers:
https://www2.census.gov/ces/wp/2019/CES-WP-19-34.pdf
https://www2.census.gov/ces/wp/2019/CES-WP-19-01.pdf
Regarding race, i) the SBO included a “Some-Other-Race” category (which is no longer allowed in business statistics or surveys), and ii) AR research finds that agreement rates for race between AR and survey responses are very high but tend to be lower for small population groups (e.g., American Indian and Alaska Native, Native Hawaiian and Other Pacific Islander) relative to other race groups.
Regarding firm ownership by sex, the SBO response allowed for sole proprietorships to be equally owned by a man and a woman (usually married couples), while tax records can only consider the sex of the person who appears as the owner of the sole proprietorship on tax Form 1040. Consequently, the AR-based firm equally-owned category is expected to be, and is, lower than the SBO estimate. Note, though, that for a large share of nonemployer sole proprietors, the 2012 SBO already used AR for direct substitution of core demographics including sex. This resulted in lower 2012 SBO equally-owned estimates for nonemployer businesses than in the 2007 SBO.
Regarding firm ownership by veteran status, the concept of veteran captured by the SBO/ABS is broader than VA’s (official) definition of a veteran. Specifically, VA’s veteran definition does not include some military personnel such as individuals who are currently on active military duty and individuals serving in the National Guard/Reserve Component who never served on active duty in the past. In addition, some older and healthier veterans are less well represented in VA’s data. For these reasons, AR-based estimates are expected to be, and are, lower than SBO estimates.