
Methodology


Survey Design

For purposes of this document, the following definitions are provided:

  • Building—a separate physical structure identified by the respondent containing one or more units.
  • Property—one or more buildings owned by a single entity (person, group, leasing company, and so on). For example, an apartment complex may have several buildings but they are owned as one property.

 

Target population:  All rental housing properties in the United States, circa 2020.

Sampling frame:  The RHFS sample frame is a single frame based on a subset of the 2019 American Housing Survey (AHS) sample units.  The RHFS frame included all 2019 AHS sample units that were identified as:

1)      rented or occupied without payment of rent

2)      units that are owner occupied and listed as “for sale or rent”

3)      vacant units for rent, for rent or sale, or rented but not yet occupied.

By design, the RHFS sample frame excluded public housing and transient housing types (i.e. boat, RV, van, other). Public housing units are identified in the AHS through a match with the Department of Housing and Urban Development (HUD) administrative records.

The RHFS frame is derived from the AHS sample, which is itself composed of housing units derived from the Census Bureau Master Address File. The AHS sample frame excludes group quarters housing. Group quarters are places where people live or stay in a group living arrangement. Examples include dormitories, residential treatment centers, skilled nursing facilities, correctional facilities, military barracks, group homes, and maritime or military vessels. As such, all of these types of group quarters housing facilities are, by design, excluded from the RHFS.

 

In some cases, nursing homes are co-located with other non-group quarters housing unit types, such as “assisted living” or “independent living” housing units. Since these units are “in scope” in the AHS sample frame, they are deemed “in scope” for the RHFS sample frame. However, it can be difficult to separate the group quarters units from the housing units (i.e., non-group quarters) when reporting total units for the RHFS property.  Census Bureau field representatives do their best to ensure the property information collected in the RHFS reflects only the housing units’ portion of the property.  Moreover, due to the often complicated and unique financial structure of these types of properties, it makes little sense to attempt to administer the RHFS questionnaire to them. Properties identified as assisted living facilities are counted in the RHFS for purposes of total rental units and are flagged as “assisted living”, but little information is collected about them.  The information collected for these properties included the following: number of units and buildings, year the property was acquired, year of oldest and newest buildings, how the property was acquired, type of ownership entity, percentage of units by bedroom type for the property, percentage of units by status (vacant for rent or for another reason, occupied by property owner/personnel), vacancy rate, and vacancy rate by bedroom unit type.

Finally, the 2021 RHFS sample was selected from the 2019 AHS but surveyed in 2021. The 2019 AHS sample was selected in late calendar year 2018 and includes new housing units built and ready for occupancy as of late 2018. As such, the RHFS sample does not include any rental properties built between late 2018 and 2020. The total number of rental units in 2+ unit properties built in 2019 was 321,000, and the number built in 2020 was 348,000[1].

 

Sampling unit:  Buildings with at least one unit that is either rented or vacant-for-rent.

Sample design:  The 2021 RHFS was sampled from all rental units as identified in the 2019 AHS sample.  AHS cases were stratified based on building size. Building size was based on the number of units in the building as reported by the AHS respondent. AHS respondents reporting single-family attached or detached units were put in the single-unit building stratum. Multiunit buildings were further categorized into four pre-defined strata based on the number of units (2-4, 5-24, 25-49, 50 or more).

Within each stratum, eligible buildings were sorted by geographic variables, including census region, state, urban/rural status, county, and zip code to obtain a stratified systematic sample.  The within-stratum sampling rates were determined to result in an expected coefficient of variation (CV)[2] of 10 percent for the aggregated stratum estimates at the national level.
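The sort-then-sample step can be illustrated with a short sketch. It shows systematic selection from a geographically sorted list in general and is not the Census Bureau's production procedure. The record fields, the made-up frame, the seed, and the names `systematic_sample` and `geo_key` are all assumptions introduced for the example.

```python
import random


def systematic_sample(sorted_units, n, rng):
    """Draw n units from an already-sorted list using a fixed skip interval.

    Hypothetical helper: sorting first spreads the sample across the sort
    variables (here, geography), which is the point of systematic selection.
    """
    frame_size = len(sorted_units)
    if n >= frame_size:
        # Target meets or exceeds the frame: every unit is taken with certainty.
        return list(sorted_units)
    interval = frame_size / n        # skip interval (greater than 1 here)
    start = rng.random() * interval  # random start in [0, interval)
    picks = []
    for k in range(n):
        index = min(int(start + k * interval), frame_size - 1)
        picks.append(sorted_units[index])
    return picks


def geo_key(building):
    """Sort order named in the text: region, state, urban/rural, county, zip."""
    return (building["region"], building["state"], building["urban"],
            building["county"], building["zip"])


# Made-up frame for ONE building-size stratum (values are illustrative only).
rng = random.Random(2021)
frame = [
    {
        "id": i,
        "region": rng.randint(1, 4),
        "state": rng.randint(1, 56),
        "urban": rng.choice([0, 1]),
        "county": rng.randint(1, 200),
        "zip": rng.randint(10000, 99999),
    }
    for i in range(50)
]
frame.sort(key=geo_key)
sample = systematic_sample(frame, 10, rng)
```

Repeating this within each building-size stratum, with a stratum-specific sample size, gives a stratified systematic sample. The early return mirrors a take-all stratum, where the target sample size is at least as large as the frame.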

Table 1 below shows the frame size and the final sample sizes by stratum. The sample size incorporates an oversample of buildings that was based on the results from the 2018 iteration of this survey.  This oversample was needed to account for nonresponse, ineligible properties, and sampled units that changed strata after being selected from the frame. Table 1 also displays the number of completed interviews by both original and final stratum.  Units in the 25-49 units stratum were selected with certainty because the target sample size exceeded the number of units in the frame for this stratum. The owners and/or property managers of the sampled buildings were contacted and asked about specific financing and property-related characteristics.

Table 1. 2021 RHFS Sample Size by Stratum

| Stratum | Frame Size | Sample Size | Completed Interviews by Original Stratum | Completed Interviews by Final Stratum |
|---|---|---|---|---|
| 1 unit | 11,404 | 1,250 | 400 | 562 |
| 2-4 units | 5,003 | 1,750 | 745 | 460 |
| 5-24 units | 8,956 | 5,750 | 2,745 | 860 |
| 25-49 units | 1,805 | 1,805 | 857 | 543 |
| 50+ units | 4,571 | 955 | 463 | 2,785 |
| Total Sample | 31,739 | 11,510 | 5,210 | 5,210 |

Frequency of sample redesign:  The RHFS sample is reselected approximately every three years.

Sample maintenance:  There are no sample maintenance procedures since the sample is selected from a new frame each iteration.

Data Collection

Data items requested and reference period covered: RHFS collects data on the financial, managerial, and physical characteristics of rental housing properties nationwide. The reference period of the survey was all twelve months of 2020. 

Key data items:  Key data items for RHFS are the definition of the property and the presence of a mortgage.

Type of request:  Voluntary

Frequency and mode of contact:  The 2021 RHFS includes single-family residential and multifamily residential properties with at least one housing unit intended for rent. Data collection will be conducted from June 2021 through November 2021. The reference period of the survey is all 12 months of 2020. Data collection is conducted in two phases. During Phase 1, cases that include respondent contact information will first receive a letter inviting them to self-respond online using the Census Bureau Respondent Portal, which allows respondents to complete surveys online.  Respondents will create an account or log in to an existing account, then link to the RHFS using the authentication code provided in the initial letter sent from the Census Bureau.  During Phase 2, field staff will begin searching for property owners and managers for cases that did not include sufficient respondent contact information and will conduct interviews.  Additionally, field staff will contact cases that did not self-respond during Phase 1 and conduct interviews.

Data collection unit:  Data were collected from owners, managers, or knowledgeable agents of rental housing properties. 

Special Procedures:  To convert incomplete interviews to completed interviews, administrative data from various sources were used to fill in property definition information, such as the number of buildings and units, and to indicate whether a property had debt.

Compilation of Data

Editing: Respondent data were reviewed for consistency across related items.

Nonresponse:  Nonresponse is defined as the inability to obtain requested data from an eligible survey unit. Two types of nonresponse are often distinguished. Unit nonresponse is the inability to obtain any of the substantive measurements about a unit. In most cases of unit nonresponse, the Census Bureau was unable to obtain any information from the survey unit after several attempts to elicit a response.  Item nonresponse occurs when the answer to a question is either missing or unusable.

Nonresponse adjustment and imputation:  A nonresponse adjustment factor, which is the ratio of the sample properties to the interviewed sample properties, is calculated and assigned to the interviewed sample properties.  Separate factors are computed for each stratum.

For details on the nonresponse adjustment factor, see the Estimation section.

Other macro-level adjustments:  Weights of the sampled rental units are adjusted to create rental housing properties based on the respondents’ answers to a specific set of questions. For details on the weighting adjustments, see the Estimation section.

Tabulation unit:  Rental housing properties and units within rental housing properties.

Estimation: Estimates of total rental housing properties were calculated using the final sample weights which include sampling, nonresponse, and population adjustments.

This final weight is the product of the following components:

  • Basic Weight (SMPWGT)
  • Sampling Adjustment Factor (PROP_ADJ)
  • Nonresponse Adjustment Factor (NR_ADJ)
  • Total Rental Units Control Factor, based on 2021 AHS (PS_ADJ)

The factors are successively multiplied by the sample weight to obtain the final weight. The completed responses receive a final weight greater than 0, and each ineligible sample unit receives a weight of 0.
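As a minimal sketch of how the four components combine, the function below multiplies them in the order listed and assigns a weight of 0 to ineligible units. The function name, the `eligible` flag, and the numeric inputs are illustrative assumptions, not RHFS production code or data.

```python
def final_weight(smpwgt, prop_adj, nr_adj, ps_adj, eligible=True):
    """Product of the four weight components; ineligible units get 0.

    Hypothetical helper. Argument names follow the component labels in the
    list above (SMPWGT, PROP_ADJ, NR_ADJ, PS_ADJ).
    """
    if not eligible:
        return 0.0
    return smpwgt * prop_adj * nr_adj * ps_adj


# Made-up component values for one completed response and one ineligible unit.
completed_case = final_weight(100.0, 0.5, 2.0, 1.1)
ineligible_case = final_weight(100.0, 0.5, 2.0, 1.1, eligible=False)
```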

Basic Weight (SMPWGT)

The basic weight is the inverse of the initial probability of selection of the address.  The sample address was selected from rental units identified in the 2019 AHS sample.  The sample weight (SMPWGT) is the product of the first stage weight (FWGT) and the second stage weight (SWGT) as described below:

First Stage Weight (FWGT)

The first stage (AHS sample rental units) of the RHFS design is the same as the 2019 AHS. To account for this stage in the sample, the AHS sample weight is used in the calculation of the basic weight for the sample.  The first stage sample weight for the sample AHS cases is the AHS basic weight provided at the time of RHFS sample selection.

Second Stage Weight (SWGT)

The second stage weight (SWGT) of the RHFS sample design accounts for the within-stratum sampling across the five building size strata shown in Table 1. It is the ratio of the number of units in each stratum to the number of units sampled from that stratum.

The sample weight is calculated as:  SMPWGT = FWGT * SWGT
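The sketch below works through this calculation under the assumption that the frame and sample counts published in Table 1 are the counts used in the SWGT ratio. The first stage weight of 1,500 is a made-up value, and the names `TABLE_1`, `second_stage_weight`, and `basic_weight` are hypothetical.

```python
# (frame size, sample size) by stratum, copied from Table 1.
TABLE_1 = {
    "1 unit": (11404, 1250),
    "2-4 units": (5003, 1750),
    "5-24 units": (8956, 5750),
    "25-49 units": (1805, 1805),
    "50+ units": (4571, 955),
}


def second_stage_weight(stratum):
    """SWGT: units in the stratum divided by units sampled from it."""
    frame_size, sample_size = TABLE_1[stratum]
    return frame_size / sample_size


def basic_weight(fwgt, stratum):
    """SMPWGT = FWGT * SWGT."""
    return fwgt * second_stage_weight(stratum)


swgt = {stratum: second_stage_weight(stratum) for stratum in TABLE_1}

# FWGT of 1500.0 is an illustrative value, not an actual AHS weight.
example_smpwgt = basic_weight(1500.0, "5-24 units")
```

Under this assumption the 25-49 units stratum has an SWGT of exactly 1, which matches its selection with certainty.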

Sampling Adjustment Factor (PROP_ADJ)

The sampling adjustment factor adjusts the sample weights so that the final weight is representative of the complete rental property instead of the originally selected sample AHS_ID, where AHS_ID refers to an AHS sample case. Calculating the correct probability of selection and subsequent weight requires the assumption that each part of the property is selected independently.  The probability of selection for the property is calculated as follows:

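$$\pi_{\text{property}} = 1 - \prod_{i=1}^{n} (1 - \pi_i)$$

Where:

  • $\pi_{\text{property}}$ is the calculated probability of selection for the total property
  • $\pi_i$ is the probability of selection for AHS_ID$_i$ of the property
  • $1 - \pi_i$ is the probability of not selecting AHS_ID$_i$
  • $n$ is the number of AHS_IDs that make up the property

Thus,

$$\text{weight}_{\text{property}} = \frac{1}{\pi_{\text{property}}}$$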

                       

For example, suppose AHS_ID1 was originally selected for interview, but the property is made up of three more AHS_IDs (AHS_ID2, AHS_ID3, AHS_ID4), each with an associated probability of selection (π1 = 0.5, π2 = 0.5, π3 = 0.26, π4 = 0.36).  Using the above formula gives πproperty = 1 - [(1 - 0.5)(1 - 0.5)(1 - 0.26)(1 - 0.36)] = 0.8816 and a new weight of 1/0.8816 ≈ 1.13.  A property cannot have a πproperty greater than 1.

This approach is based on this concept:  P(A or B or C or D selected) = 1- P(none of A or B or C or D are selected)
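The worked example can be reproduced with a few lines of code. This is a minimal sketch that assumes, as the text does, that each part of the property is selected independently. The function name is hypothetical.

```python
import math


def property_selection_probability(pis):
    """P(at least one AHS_ID selected) = 1 - P(none selected)."""
    return 1.0 - math.prod(1.0 - p for p in pis)


# Selection probabilities for AHS_ID1 through AHS_ID4 from the example.
pis = [0.5, 0.5, 0.26, 0.36]
pi_property = property_selection_probability(pis)
weight_property = 1.0 / pi_property

print(round(pi_property, 4))      # 0.8816
print(round(weight_property, 2))  # 1.13
```

Because every factor (1 - πi) lies between 0 and 1, the product does too, so the resulting property probability can never exceed 1.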

The adjustment factor is then calculated as

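$$\text{PROP\_ADJ} = \frac{\text{weight}_{\text{property}}}{\text{SMPWGT}} = \frac{1}{\pi_{\text{property}} \times \text{SMPWGT}}$$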

When this factor is applied to the SMPWGT we get the property weight (PROPWGT=SMPWGT* PROP_ADJ). This factor was calculated by looking at the respondent’s property definition, which includes the number of units per building.

Nonresponse Adjustment Factor (NR_ADJ)

The purpose of the nonresponse adjustment is to inflate the weights of responses to account for eligible nonresponses. There were 10,596 eligible sample cases out of the initial sample of 11,510.  Sampled cases were ineligible for several reasons, such as missing owner information, duplicates, public housing status or change in rental status. The calculations to compute the nonresponse adjustment are as follows:

Step 1. Assign both the complete response and the nonresponse cases to the appropriate cells as shown using region and sampling strata information.

Table 2. 2021 RHFS Nonresponse Adjustment Cell Assignment

|  | BUILDSTRAT=00 (1 Unit) | BUILDSTRAT=01 (2-4 Units) | BUILDSTRAT=02 (5-24 Units) | BUILDSTRAT=03 (25-49 Units) | BUILDSTRAT=04 (50+ Units) |
|---|---|---|---|---|---|
| Region 1 |  |  |  |  |  |
| Region 2 |  |  |  |  |  |
| Region 3 |  |  |  |  |  |
| Region 4 |  |  |  |  |  |

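Step 2. For each of the cells above, obtain the totals shown below, where $w_j$ is the weight of case $j$ before the nonresponse adjustment:

  • Weighted count of completed responses (WC),

$$WC_i = \sum_{j \,\in\, \text{completed responses in cell } i} w_j$$

  • Weighted count of non-responses (WNR),

$$WNR_i = \sum_{j \,\in\, \text{non-responses in cell } i} w_j$$

Step 3. Calculate the nonresponse adjustment as:

$$\text{NR\_ADJ}_i = \frac{WC_i + WNR_i}{WC_i}$$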

Step 4. Apply the calculated nonresponse adjustment factor to all complete responses in the appropriate cell.  This results in a non-response adjusted property weight (NR_PROPWGT).

NR_ADJi * PROPWGT = NR_PROPWGT
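Steps 1 through 4 can be sketched as follows. The sketch assumes the adjustment in each cell is the weighted count of completed responses plus non-responses divided by the weighted count of completed responses, which is consistent with the description of the factor as the ratio of sample properties to interviewed sample properties. The case records, field names, and weights are made up for illustration.

```python
from collections import defaultdict


def nonresponse_factors(cases):
    """Return NR_ADJ for each (region, BUILDSTRAT) cell.

    Cells with no completed responses are omitted here; in practice such
    cells would need to be combined with another cell (an assumption, since
    the source does not describe cell collapsing).
    """
    wc = defaultdict(float)   # weighted count of completed responses
    wnr = defaultdict(float)  # weighted count of non-responses
    for case in cases:
        cell = (case["region"], case["buildstrat"])
        if case["status"] == "complete":
            wc[cell] += case["propwgt"]
        elif case["status"] == "nonresponse":
            wnr[cell] += case["propwgt"]
    return {cell: (wc[cell] + wnr[cell]) / wc[cell] for cell in wc}


# Made-up eligible cases; ineligible cases would be excluded before this step.
cases = [
    {"region": 1, "buildstrat": "00", "status": "complete", "propwgt": 10.0},
    {"region": 1, "buildstrat": "00", "status": "complete", "propwgt": 30.0},
    {"region": 1, "buildstrat": "00", "status": "nonresponse", "propwgt": 20.0},
    {"region": 2, "buildstrat": "01", "status": "complete", "propwgt": 50.0},
]

factors = nonresponse_factors(cases)

# Step 4: NR_PROPWGT = NR_ADJ * PROPWGT for each completed response.
nr_propwgt = [
    factors[(c["region"], c["buildstrat"])] * c["propwgt"]
    for c in cases
    if c["status"] == "complete"
]
```

In the first cell the factor is 60 / 40 = 1.5, so the two completed responses carry the full weighted count of the cell after adjustment.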

Total Rental Units Control Factor (PS_ADJ)

The 2021 RHFS sample was selected from the 2019 AHS rental units. HUD and the Census Bureau deemed it desirable to ensure consistency between RHFS and AHS rental unit estimates. To carry this out, a Total Rental Units Control Factor, otherwise known as AHS control totals, was applied to the 2021 RHFS rental unit estimates. 

The application of the Total Rental Units Control Factor proceeded in four steps. In the first step, estimates of “AHS rental units by building size” were derived from the 2021 AHS. This included occupied rental units, vacant rented and vacant for rent units. The estimates excluded public housing and transient housing. Table 2 shows the weighted number of AHS Units that are in scope for RHFS.

Table 2. AHS Unit Control Totals

| Number of AHS units in building | AHS Units in RHFS Scope (weighted)[3] (in thousands) |
|---|---|
| 1 unit | 18,845 |
| 2 to 4 units | 7,698 |
| 5 to 24 units | 14,427 |
| 25 to 49 units | 2,497 |
| 50+ units | 6,080 |
| Total | 49,547 |

In step 2, we created an original-to-final stratum matrix. Recall that to stratify the AHS rental units for the RHFS, the AHS building size (structure type) variable was used. However, this “original” stratification variable reflected “units in a building,” not “units in a property.” A property in RHFS is defined as all units owned under a single mortgage, which may, and often does, include more than one building. For example, an AHS unit in a 10-unit building could be part of a larger property with five 10-unit buildings, meaning the property has 50 units. Similarly, a single unit may be part of a multifamily property. Because of the “buildings may not equal properties” issue, it was necessary to create an original-to-final stratification matrix summarizing the share of units that were originally stratified in one category (e.g., 2-4 units) but were determined to be in a different stratification category once the property was visited (e.g., 5-24 units). Table 3 shows the original-to-final stratification matrix. The largest re-stratification is found in the 5-24 unit stratum, which loses approximately 74% of its sample cases to other strata.  This stratum is particularly susceptible to re-stratification because most apartment complexes include buildings with between 5 and 24 units. The AHS respondent reports the number of units in their building, which typically falls in this range. However, for RHFS the property often includes the entire apartment complex, not just a single building.  The matrix also shows that about 3-9% of buildings in the strata with 2 or more units are re-stratified as single-unit properties in RHFS. This occurs because the rental unit is a condo or a single-family townhome, and the AHS respondent reports their building size as 2 or more units.

Table 3. Original-to-Final stratification matrix for AHS control totals

Columns show the original RHFS stratum (based on AHS building size). Rows show the final RHFS stratum (based on RHFS data collection).

| Final RHFS Stratum | Original: 1 unit | Original: 2-4 units | Original: 5-24 units | Original: 25-49 units | Original: 50+ units |
|---|---|---|---|---|---|
| 1 unit | 0.7889 | 0.0929 | 0.0470 | 0.0420 | 0.0304 |
| 2-4 units | 0.1055 | 0.4945 | 0.0160 | 0.0048 | 0.0043 |
| 5-24 units | 0.0377 | 0.1011 | 0.2631 | 0.0636 | 0.0043 |
| 25-49 units | 0.0176 | 0.0492 | 0.0663 | 0.3601 | 0.0261 |
| 50+ units | 0.0503 | 0.2623 | 0.6075 | 0.5294 | 0.9348 |

 

The third step was to multiply the “units by building size” vector in Table 2 by the original-to-final stratification matrix in Table 3. The final AHS control totals are presented in Table 4. Note that in the original AHS “units by building size” estimate for single units (Table 2), there were 18.845 million rental units. After the original-to-final stratification matrix is applied, there are 16.550 million rental units in properties with 1 unit. This means that nearly 2.3 million rental units classified as single unit in the AHS are part of multifamily properties in RHFS.

Table 4. Final AHS Control Totals

| Number of units in RHFS property | Post-Stratified Unit Controls (in thousands) |
|---|---|
| 1 unit | 16,550 |
| 2 to 4 units | 6,065 |
| 5 to 24 units | 5,470 |
| 25 to 49 units | 2,725 |
| 50+ units | 18,737 |
| Total | 49,547 |
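The third step is a matrix-vector product, and it can be checked against the published figures. The sketch below multiplies the Table 3 matrix (rows are final strata, columns are original strata) by the Table 2 unit counts. Because the published matrix entries are rounded to four decimal places, the results should be expected to match Table 4 only to within a thousand units or so. Variable names are illustrative.

```python
STRATA = ["1 unit", "2-4 units", "5-24 units", "25-49 units", "50+ units"]

# Table 2: AHS units in RHFS scope by building size, in thousands.
ahs_units = [18845, 7698, 14427, 2497, 6080]

# Table 3: rows are final strata, columns are original strata.
matrix = [
    [0.7889, 0.0929, 0.0470, 0.0420, 0.0304],
    [0.1055, 0.4945, 0.0160, 0.0048, 0.0043],
    [0.0377, 0.1011, 0.2631, 0.0636, 0.0043],
    [0.0176, 0.0492, 0.0663, 0.3601, 0.0261],
    [0.0503, 0.2623, 0.6075, 0.5294, 0.9348],
]

# Each final-stratum control is the row of shares applied to the unit vector.
controls = [sum(share * units for share, units in zip(row, ahs_units))
            for row in matrix]

# Table 4: published post-stratified unit controls, in thousands.
published = [16550, 6065, 5470, 2725, 18737]

for name, computed, table_value in zip(STRATA, controls, published):
    print(f"{name:12s} computed={computed:10.1f} published={table_value}")
```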

 

The fourth and final step was to apply the control total from Table 4. The non-response adjusted property weight (NR_PROPWGT) for each case i is used to compute a weighted total of units within each RHFS property size stratum k.

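$$\hat{U}_k = \sum_{i \in k} \text{NR\_PROPWGT}_i \times \text{units}_i$$

where $\text{units}_i$ is the number of rental units in property $i$.

The adjustment factor (PS_ADJ) for each stratum is then calculated as the ratio of each stratum’s control total, denoted $C_k$ (Table 4), to the weighted sum of units for that stratum:

$$\text{PS\_ADJ}_k = \frac{C_k}{\hat{U}_k}$$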
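A minimal sketch of this ratio adjustment is shown below. The property records, the field names, and the control total are made up, and the sketch assumes that the control total and the weighted sum are expressed in the same units.

```python
from collections import defaultdict


def ps_adjustments(properties, controls):
    """PS_ADJ per stratum: control total / weighted sum of units."""
    weighted_units = defaultdict(float)
    for prop in properties:
        weighted_units[prop["stratum"]] += prop["nr_propwgt"] * prop["units"]
    return {k: controls[k] / total for k, total in weighted_units.items()}


# Made-up responding properties in a single final stratum.
properties = [
    {"stratum": "2-4 units", "nr_propwgt": 500.0, "units": 3},
    {"stratum": "2-4 units", "nr_propwgt": 750.0, "units": 4},
]
controls = {"2-4 units": 6000.0}  # illustrative control total, in units

ps_adj = ps_adjustments(properties, controls)

# After the adjustment, the weighted units reproduce the control total.
adjusted_units = sum(
    p["nr_propwgt"] * ps_adj[p["stratum"]] * p["units"] for p in properties
)
```

Here the weighted sum of units is 500 × 3 + 750 × 4 = 4,500, so the factor is 6,000 / 4,500, or about 1.33.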

Sampling Error:  The sampling error of an estimate based on a sample survey is the difference between the estimate and the result that would be obtained from a complete census conducted under the same survey conditions. This error occurs because characteristics differ among sampling units in the population and only a subset of the population is measured in a sample survey. The particular sample used in this survey is one of a large number of samples of the same size that could have been selected using the same design. Because each unit in the sampling frame had a known probability of being selected into the sample, it was possible to estimate the sampling variability of the survey estimates.

Common measures of the variability among these estimates are the sampling variance, the standard error, and the coefficient of variation (CV), which is also referred to as the relative standard error (RSE). The sampling variance is defined as the squared difference, averaged over all possible samples of the same size and design, between the estimator and its average value. The standard error is the square root of the sampling variance. The CV expresses the standard error as a percentage of the estimate to which it refers. For example, an estimate of 200 units that has an estimated standard error of 10 units has an estimated CV of 5 percent. The sampling variance, standard error, and CV of an estimate can be estimated from the selected sample because the sample was selected using probability sampling. Note that measures of sampling variability, such as the standard error and CV, are estimated from the sample and are also subject to sampling variability. It is also important to note that the standard error and CV only measure sampling variability. They do not measure any systematic biases in the estimates.

The Census Bureau recommends that individuals using these estimates incorporate sampling error information into their analyses, as this could affect the conclusions drawn from the estimates.

To estimate the variance of the 2021 RHFS survey estimates, a method of successive difference replication as outlined by Ash (2014)[4] was adapted for use by RHFS. This method uses replicate weights to compare the variation in the sampled units by cycling through a pattern of replicate factors. The method described by Ash (2014) builds on the successive difference replication method developed by Fay and Train (1995).[5] The weighting procedure, including all adjustments, was repeated 100 times, once for each replicate (r = 1 to 100), to produce 100 sets of replicate weights.
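Once replicate weights exist, a variance estimate is obtained by recomputing the estimate under each set of replicate weights and summarizing how far the replicate estimates fall from the full-sample estimate. The sketch below assumes the multiplier of 4/R that is commonly used with successive difference replication. The source does not state the multiplier used for RHFS, so treat it as an assumption. The estimates are made up.

```python
import math


def sdr_variance(full_estimate, replicate_estimates):
    """Variance from replicate estimates, assuming the usual 4/R multiplier."""
    r = len(replicate_estimates)
    squared_diffs = sum((est - full_estimate) ** 2
                        for est in replicate_estimates)
    return 4.0 / r * squared_diffs


# Made-up example: 100 replicate estimates scattered around a full-sample
# estimate of 100.0 (half at 101.0, half at 99.0).
full_estimate = 100.0
replicates = [101.0] * 50 + [99.0] * 50

variance = sdr_variance(full_estimate, replicates)
standard_error = math.sqrt(variance)
```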

Confidence Interval:  The sample estimate and an estimate of its standard error allow us to construct interval estimates with prescribed confidence that the interval includes the average result of all possible samples with the same size and design. To illustrate, if all possible samples were surveyed under essentially the same conditions, and an estimate and its standard error were calculated from each sample:

1.        Approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the average estimate derived from all possible samples.

2.        Approximately 90 percent of the intervals from 1.645 standard errors below the estimate to 1.645 standard errors above the estimate would include the average estimate derived from all possible samples.

In the example above, the margin of error (MOE) associated with the 90 percent confidence interval is the product of 1.645 and the estimated standard error.
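The relationships among the standard error, the CV, and the 90 percent margin of error can be written out directly. The sketch below uses the 200-unit example from the text and then backs a standard error out of the published margin of error for the total number of properties (19,341 thousand with an MOE of 1,157.8 thousand, from Table 5). Function names are illustrative.

```python
Z_90 = 1.645  # multiplier for a 90 percent confidence interval


def cv_percent(estimate, standard_error):
    """Coefficient of variation: standard error as a percent of the estimate."""
    return 100.0 * standard_error / estimate


def moe_90(standard_error):
    """Margin of error at the 90 percent confidence level."""
    return Z_90 * standard_error


def confidence_interval_90(estimate, standard_error):
    """Lower and upper bounds of the 90 percent confidence interval."""
    margin = moe_90(standard_error)
    return estimate - margin, estimate + margin


# Example from the text: estimate of 200 units with a standard error of 10.
example_cv = cv_percent(200, 10)
example_ci = confidence_interval_90(200, 10)

# Backing out the standard error from a published MOE (Table 5, in thousands).
properties_estimate = 19341
properties_se = 1157.8 / Z_90
properties_cv = cv_percent(properties_estimate, properties_se)
```

The implied CV for the total number of properties is under 4 percent, comfortably within the 10 percent design target.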

An MOE is provided for each survey estimate displayed in the tables.  The sample was designed to result in an expected coefficient of variation of 10% at the national level.  The key estimates and MOEs are displayed below in Table 5.

Table 5. 2021 RHFS Key Estimates

| Key Estimate | Number of Properties (000s): Estimate | Number of Properties (000s): Margin of Error | Number of Units within Properties (000s): Estimate | Number of Units within Properties (000s): Margin of Error |
|---|---|---|---|---|
| Number of Properties | 19,341 | 1,157.8 |  |  |
| Number of Units |  |  | 49,547 | 2,377.6 |
| Properties with Mortgages | 6,433 | 511.0 | 24,794 | 2,607.8 |
| Properties without Mortgages | 11,792 | 813.2 | 18,818 | 1,037.8 |
| Properties with Mortgage Status Not Reported | 1,116 | 280.4 | 5,936 | 981.8 |

 

Nonsampling error:  

Nonsampling error encompasses all factors other than sampling error that contribute to the total error associated with an estimate. This error may also be present in censuses and other non-survey programs.  Nonsampling error arises from many sources: inability to obtain information on all units in the sample; response errors; differences in the interpretation of the questions; mismatches between sampling units and reporting units, requested data and data available or accessible in respondents' records, or with regard to reference periods; mistakes in coding or keying the data obtained; and other errors of collection, response, coverage, and processing.

The Census Bureau recommends that individuals using these estimates factor in this information when assessing their analyses of these data, as nonsampling error could affect the conclusions drawn from the estimates.

A potential source of nonsampling error in the estimates is nonresponse. Nonresponse is the inability to obtain all the intended measurements or responses about all selected units. Unit nonresponse is used to describe the inability to obtain any of the substantive measurements about a sampled unit.  For the 2021 survey, the average unit response rate was 49.2%. To mitigate the effect of nonresponse, a nonresponse adjustment was used to inflate the weights of responses to account for eligible nonresponses. For details on the nonresponse adjustment factor, see the Estimation section.
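As a rough check, an unweighted ratio of completed interviews to eligible sample cases lands on the same figure. The sketch below uses the 5,210 completed interviews from Table 1 and the 10,596 eligible cases out of 11,510 sampled that are reported in the Estimation section. The source describes the published rate as an average and may compute it differently, so the agreement should be read as a plausibility check rather than as the official formula.

```python
sampled_cases = 11510        # initial sample (Table 1)
eligible_cases = 10596       # eligible sample cases (Estimation section)
completed_interviews = 5210  # completed interviews (Table 1)

eligibility_rate = 100.0 * eligible_cases / sampled_cases
unweighted_response_rate = 100.0 * completed_interviews / eligible_cases

print(round(unweighted_response_rate, 1))  # 49.2
```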

 

Disclosure avoidance: Disclosure is the release of data that reveals information or permits deduction of information about a particular survey unit through the release of either tables or microdata. Disclosure avoidance is the process used to protect each survey unit’s identity and data from disclosure. Using disclosure avoidance procedures, the Census Bureau modifies or removes the characteristics that put information at risk of disclosure. Although it may appear that a table shows information about a specific survey unit, the Census Bureau has taken steps to disguise or suppress a unit’s data that may be “at risk” of disclosure while making sure the results are still useful.

Cell suppression (primary and complementary) is applied to estimates for the RHFS.  Cell suppression is a disclosure avoidance technique that protects the confidentiality of individual survey units by withholding cell values from release and replacing the cell value with a symbol, usually a “D.”  If the suppressed cell value were known, it would allow one to estimate an individual survey unit’s value too closely.

The cells that must be protected are called primary suppressions.

To make sure the cell values of the primary suppressions cannot be closely estimated by using other published cell values, additional cells may also be suppressed. These additional suppressed cells are called complementary suppressions.

The process of suppression does not usually change the higher-level totals. Values for cells that are not suppressed remain unchanged. Before the Census Bureau releases data, computer programs and analysts ensure primary and complementary suppressions have been correctly applied.

For more information on disclosure avoidance practices, see FCSM Statistical Policy Working Paper 22.

The Census Bureau has reviewed the estimates in Table Creator for unauthorized disclosure of confidential information and has approved the disclosure avoidance practices applied.  (Approval ID: CBDRB-FY23-014).

History of Survey Program:  Click here for information regarding previous RHFS sampling methodologies. 

 

[1] Ash, S. (2014). Using successive difference replication for estimating variances. Survey Methodology, 40(1), 47-59.

[2] Fay, R.E., and Train, G.F. (1995). Aspects of survey and model-based postcensal estimation of income and poverty characteristics for states and counties. Proceedings of the Section on Government Statistics, American Statistical Association, 154-159.

[3] AHS In Scope criteria: if ((INTSTATUS eq '1' and TENURE in ('2' '3')) or (INTSTATUS in ('2' '3') and VACANCY in ('01' '02' '04'))) and (HUDSUB_IUF ne '1') and (BLD ne '10').

[4] Ash, S. (2014). Using successive difference replication for estimating variances. Survey Methodology, 40(1), 47-59. 

[5] Fay, R.E., and Train, G.F. (1995). Aspects of survey and model-based postcensal estimation of income and poverty characteristics for states and counties. Proceedings of the Section on Government Statistics, American Statistical Association, 154-159.

Page Last Revised - October 8, 2021