Methodology

2015

Survey Design

For purposes of this document, the following definitions are provided:

  • Basic street address (BSA): The street number and name, stripped of information regarding the residential unit number. Residential units in the same building typically have the same BSA, but a BSA can and often does include more than one building.
  • Full address: The BSA plus the “within structure identifier,” which is the portion of an address that identifies the specific apartment, unit, suite, floor, trailer, or pad represented by the address record.
  • Building: A separate physical structure identified by the respondent. In some instances, more than one building has the same BSA. Presently, no additional information is available in the Master Address File (MAF) to determine if a BSA comprises more than one building.
  • Property: A BSA or a collection of BSAs and/or other buildings owned by a single entity (person, group, leasing company, and so on). For example, an apartment complex may have several buildings with unique BSAs, but they are owned as one property.

Target population: All rental housing properties in the United States, circa 2014.

Sampling frame: The Rental Housing Finance Survey (RHFS) sample frame consists of rental housing properties from two sources: the 2013 American Housing Survey (AHS) sample and the January 2014 Master Address File Extract Files (MAFXs) in the same geographic areas. The AHS provided a frame of single unit rental housing properties, while the MAFXs provided the frame of multiunit properties. For more information regarding the American Housing Survey, please see the AHS website.

For the single unit frame, all existing single unit rental housing properties were identified from the AHS sample using housing characteristics such as structure type and tenure collected in 2013. AHS respondents reporting single-family attached or detached or manufactured homes that were identified as one of the following were included in the single unit frame:

  • Rented or occupied without payment of rent
  • Vacant for rent
  • Vacant rented but not yet occupied

Public housing units were removed using an administrative list provided by the Department of Housing and Urban Development (HUD). For the multiunit frame, the public housing units were removed using an administrative list provided by HUD, and the remaining properties were created by combining unit-level addresses into a single basic street address based on either geographic coordinates or house number and street name. Next, out-of-scope properties such as manufactured homes and group quarters were removed. Finally, the two types of properties (single and multi) were combined into a single frame.

Although condominiums were considered in-scope, they were removed from the sampling frame to obtain a more representative sample. Initially, any unit selected into the RHFS sample from the MAFXs frame that self-identified as a condominium was automatically removed. Units with a single-family basic street address on the MAFXs frames were not selected, to avoid overlap with the single units on the AHS frame. However, some units were listed on the AHS frame as multi-units but had a single unit basic street address on the MAFXs frame, so they were not eligible to be selected from either frame.

Sampling unit:  Single unit rentals and Basic Street Addresses (BSAs) with more than one unit.

Sample design:  The 2015 RHFS single unit (SU) building sample was selected from single unit rentals identified in the 2013 AHS sample. The multiunit (MU) building sample was taken from the January 2014 MAFX[1], subset to include only states/counties from the 2013 AHS sample primary sampling units (PSUs). Single unit buildings made up one stratum. Multiunit buildings were further categorized into four pre-defined strata based on the number of units (2-4, 5-24, 25-49, 50 or more).

Within each stratum, BSAs were sorted by geographic variables, including census region, state, urban/rural status, and county to obtain a stratified systematic sample.  As previously indicated, a BSA[2] may contain multiple buildings (that is, one BSA for two or more buildings). The within-stratum sampling rates were determined to result in an expected coefficient of variation (CV) [3] of 6 percent for the aggregated stratum estimates at the national level.
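The stratified systematic selection described above can be sketched as follows. This is a minimal illustration, not the Census Bureau's production procedure: the function name `systematic_sample`, the record fields, the 12-record frame, and the random start are all assumptions made for the example.

```python
import random

def systematic_sample(sorted_units, n, rng=None):
    """Draw n units from an already-sorted stratum using a random start.

    A fractional skip interval is used so that exactly n units are drawn
    even when the stratum size is not a multiple of n.
    """
    rng = rng or random.Random()
    size = len(sorted_units)
    if not 0 < n <= size:
        raise ValueError("n must be between 1 and the stratum size")
    interval = size / n                  # skip interval
    start = rng.random() * interval      # random start within the first interval
    # Clamp the index to guard against floating-point rounding at the upper end.
    return [sorted_units[min(int(start + k * interval), size - 1)] for k in range(n)]

# Hypothetical frame for a single stratum (all values invented).
frame = [
    {"bsa_id": i, "region": 1 + i % 4, "state": f"S{i % 3}", "urban": i % 2, "county": i % 5}
    for i in range(12)
]
# Sort by the geographic variables named in the text before sampling.
frame.sort(key=lambda u: (u["region"], u["state"], u["urban"], u["county"]))
sample = systematic_sample(frame, 4, random.Random(2015))
```

Sorting before selection spreads the sample across regions, states, and counties, which is the practical reason for pairing a geographic sort with a fixed skip interval.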

Table 1 below shows the frame size and the final sample sizes by stratum. The sample size incorporates an oversample of BSAs that was based on the results from the 2012 iteration of this survey. This oversample was needed for three reasons: a high nonresponse rate, a large number of ineligible properties, and a large number of sampled units that changed strata after being selected from the frame. The owners and/or property managers were contacted and asked about specific financing and property-related characteristics.

Table 1. 2015 RHFS Sample Size by Stratum

Stratum         Frame Size   Sample Size   Completed Interviews   Completed Interviews
                                           by Original Stratum    by Final Stratum
1 unit               6,493         2,515                    991                    991
2-4 units        3,443,735           742                    263                    274
5-24 units         906,292         1,665                    652                    377
25-49 units         82,782         4,061                  1,494                    934
50+ units           83,420         1,274                    487                  1,311
Total Sample     4,522,722        10,257                  3,887                  3,887

 

The sampling process began with the assumption of a one-to-one relationship between BSAs and properties.  For those properties that comprise multiple BSAs, the probability of selection reflected the probability that any of the individual BSAs was selected into the sample. For more details on this calculation, please see the Estimation section.

Frequency of sample redesign: The RHFS sample is reselected approximately every three years.

 Sample maintenance: There are no sample maintenance procedures since the sample is selected from a new frame each iteration.

Data Collection

Data items requested and reference period covered: RHFS collects data on the financial, managerial, and physical characteristics of rental housing properties nationwide. The data for the 2015 RHFS were collected from May through December 2015. The reference period of the survey was all twelve months of 2014.

Key data items: Key data items for RHFS are the definition of the property and the presence of a mortgage.

Type of Request: Voluntary

Frequency and mode of contact: Prior to RHFS data collection, an Owner-Seeker operation to collect the contact information for the rental housing property owner was conducted. See the Special Procedures section for more details on the Owner-Seeker operation.

The RHFS data collection began with an advance letter to the rental housing property owner inviting participation in the survey through a web-based instrument.

Respondents who did not complete their survey after approximately six weeks continued into the telephone follow-up operation. The National Processing Center (NPC) clerical staff called either to remind the respondents to complete their survey online or to attempt to conduct the interview over the phone.

Respondents who did not complete their survey after approximately three months continued into the field follow-up operation. Field representatives (FRs) either called or visited the respondents in an attempt to conduct the interview.

Based on the available contact information, the NPC clerical staff and the FRs may have contacted either property owners or managers.  In some instances, the owner referred staff to the manager (or the manager referred the staff to the owner) due to one party not having sufficient knowledge to answer certain questions.

During the interview, respondents may have provided contact information for an additional respondent (for example, a property manager, owner, or accountant). This person was contacted if the data already provided did not qualify as a sufficient partial interview. In addition, an effort was made to determine whether different questionnaires were being referred to the same owner (which can occur if one individual owns multiple properties). In these situations, questionnaires were consolidated and administered during a single follow-up attempt.

Data collection unit:   Data were collected from owners and managers of rental housing properties.

Special procedures:  The Owner-Seeker operation was conducted by mailing a letter to a sample of renters from each property address, asking for the owner’s or management company’s contact information.

Compilation of Data

Editing: Respondent data were reviewed for consistency across related items.

Nonresponse: Nonresponse is defined as the inability to obtain requested data from an eligible survey unit. Two types of nonresponse are often distinguished. Unit nonresponse is the inability to obtain any of the substantive measurements about a unit. In most cases of unit nonresponse, the Census Bureau was unable to obtain any information from the survey unit after several attempts to elicit a response. Item nonresponse occurs when a question is unanswered or the answer is unusable.

Nonresponse adjustment and imputation:  A nonresponse adjustment factor, which is the ratio of the sample properties divided by the interviewed sample properties, is calculated and assigned to the interviewed sample properties.  Separate factors are computed for each stratum.

For details on the nonresponse adjustment factor, see the Estimation section.

Other macro-level adjustments: Weights of the sampled rental units are adjusted to represent rental housing properties based on the respondents' answers to a specific set of questions. For details on the weighting adjustments, see the Estimation section.

Tabulation unit:  Rental housing properties and units within rental housing properties.

Estimation: Estimates of total rental housing properties were calculated using the final sample weights which include sampling, nonresponse, and population adjustments.

This final weight is the product of the following components:

  • Sample Weight (SMPWGT)
  • Sampling Adjustment Factor (GWGT)
  • Nonresponse Adjustment Factor (NR_ADJ)
  • Single-family Unit Undercoverage Adjustment Factor
  • Total Rental Units Control Factor, based on 2015 AHS (PS_ADJ)

The factors are successively multiplied by the sample weight to obtain the final weight. The completed responses will receive a final weight greater than 0, and each ineligible sample unit will receive a weight of 0.
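The successive multiplication described above can be sketched as below. The function name `final_weight`, its argument names, and every numeric value are hypothetical; the sketch only shows the product structure and the zero weight for ineligible units stated in the text.

```python
from math import prod

def final_weight(smpwgt, gwgt, nr_adj, ps_adj, undercoverage=1.0, eligible=True):
    """Multiply the weighting components; ineligible units receive weight 0."""
    if not eligible:
        return 0.0
    return prod([smpwgt, gwgt, nr_adj, undercoverage, ps_adj])

# Hypothetical single-unit case: the 1.25 undercoverage factor applies.
w_single = final_weight(1000.0, 0.5, 1.5, 1.1, undercoverage=1.25)

# Hypothetical ineligible case: the weight is forced to 0.
w_ineligible = final_weight(1000.0, 0.5, 1.5, 1.1, eligible=False)
```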

Sample Weight (SMPWGT)

The basic weight is the inverse of the initial probability of selection of the address.  The sample address was selected from either the 2013 AHS sample (single-unit rentals) or from the January 2014 MAFX files (multiunits).  The sample weight (SMPWGT) is the product of the first stage weight (FWGT) and the second stage weight (SWGT) as described below.

First Stage Weight (FWGT)

The first stage (PSUs) of the RHFS design is the same as the 1985 AHS. To account for this stage in the sample, the PSU weight will be used in the calculation of the basic weight for the sample.  The value of the first stage weight will vary by frame and PSU. The first stage sample weight for the single-unit rental cases will be the AHS basic weight. This weight already incorporates the first and second stages of the AHS sample. The first stage weight of the sample MAF cases will be the AHS PSU weight.

Second Stage Weight (SWGT)

The second stage weight (SWGT) of the RHFS sample design accounts for the within-PSU sampling across the five building size strata shown in Table 1. It is the ratio of the number of addresses in each stratum to the number of addresses sampled from that stratum.

The sample weight is calculated as: SMPWGT = FWGT * SWGT
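As a rough illustration of these two stages, the sketch below computes a second stage ratio from the national frame and sample sizes in Table 1 and multiplies it by a first stage weight. The actual SWGT is computed within PSUs, so the national ratios here are only indicative, and the first stage weight of 2.0 is an invented value.

```python
# (frame size, sample size) by stratum, copied from Table 1.
TABLE_1 = {
    "1 unit": (6_493, 2_515),
    "2-4 units": (3_443_735, 742),
    "5-24 units": (906_292, 1_665),
    "25-49 units": (82_782, 4_061),
    "50+ units": (83_420, 1_274),
}

def second_stage_weight(frame_size, sample_size):
    """SWGT: addresses in the stratum divided by addresses sampled from it."""
    return frame_size / sample_size

def sample_weight(fwgt, swgt):
    """SMPWGT = FWGT * SWGT."""
    return fwgt * swgt

# National-level second stage ratios implied by Table 1 (illustrative only).
swgt = {stratum: second_stage_weight(*sizes) for stratum, sizes in TABLE_1.items()}

# Hypothetical first stage (PSU) weight of 2.0 for a 50+ unit address.
smpwgt_example = sample_weight(2.0, swgt["50+ units"])
```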

Sampling Adjustment Factor (GWGT)

The sampling adjustment factor adjusts the sample weights so that the final weight is representative of the complete rental housing property instead of the originally selected sample BSAID. Calculating the correct probability of selection and subsequent weight requires the assumption that each part (BSAID) of the property is selected independently.  The probability of selection for the property is calculated as follows:

$$\pi_{property} = 1 - \prod_{i=1}^{n} (1 - \pi_i)$$

Where:

πproperty     calculated probability of selection for the total property

πi            probability of selection for BSAIDi of the property

1 - πi        probability of not selecting BSAIDi

n             number of BSAIDs that make up the property

Thus,

Property Weight = 1 / πproperty

 

For example, suppose BSAID1 was originally selected for interview, but the property is made up of three more BSAIDs (BSAID2, BSAID3, BSAID4), each with an associated probability of selection (π1 = 0.5, π2 = 0.5, π3 = 0.26, π4 = 0.36). Using the above formula gives πproperty = 1-[(1-0.5)(1-0.5)(1-0.26)(1-0.36)] = 0.8816 and a new weight of 1/0.8816 ≈ 1.13.

This approach is based on this concept:  P(A or B or C or D selected) = 1- P(none of A or B or C or D are selected)

This approach differs from the one used in the 2012 weighting. In 2012, the probabilities of the individual BSAIDs were added together and a new weight was then calculated. In our example, the 2012 probability of selection would have been 0.5 + 0.5 + 0.26 + 0.36 = 1.62, which is not a valid probability because a probability of selection cannot be greater than 1.
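The worked example can be checked with a few lines of Python. This is a sketch under the independence assumption stated in the text; the function name `property_selection_probability` is illustrative and not part of any RHFS software.

```python
from math import prod

def property_selection_probability(bsaid_probs):
    """P(at least one BSAID selected) = 1 - P(no BSAID selected)."""
    return 1.0 - prod(1.0 - p for p in bsaid_probs)

probs = [0.5, 0.5, 0.26, 0.36]          # selection probabilities from the example
pi_property = property_selection_probability(probs)
property_weight = 1.0 / pi_property     # inverse of the property's selection probability

# The additive 2012 approach, shown for contrast: the sum exceeds 1.
additive_2012 = sum(probs)
```

The complement form always stays between 0 and 1, whereas the additive form can exceed 1 as soon as a property spans several BSAIDs with moderate selection probabilities.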

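The adjustment factor (GWGT) is then calculated as the ratio of the property weight to the sample weight:

$$GWGT = \frac{1 / \pi_{property}}{SMPWGT}$$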

When this factor is applied to the SMPWGT we get the property weight (PROPWGT= SMPWGT*GWGT). This factor was calculated by looking at the respondent’s property definition, which includes the number of units per building.

Nonresponse Adjustment Factor (NR_ADJ)

The purpose of the nonresponse adjustment is to inflate the weights of responses to account for eligible nonresponses. The calculations to compute the nonresponse adjustment are as follows:

Step 1. Assign both the complete response and the nonresponse cases to the appropriate cells as shown using region and sampling strata information.

Table 2. 2015 RHFS Nonresponse Adjustment Cell Assignment

 

 

           BUILDSTRAT=00   BUILDSTRAT=01   BUILDSTRAT=02   BUILDSTRAT=03   BUILDSTRAT=04
           (1 Unit)        (2-4 Units)     (5-24 Units)    (25-49 Units)   (50+ Units)
Region 1
Region 2
Region 3
Region 4

Step 2. For each of the cells above, obtain the following totals:

  • Weighted count of completed responses (WC)
  • Weighted count of nonresponses (WNR)

Step 3. Calculate the nonresponse adjustment as:

$$NR\_ADJ = \frac{WC + WNR}{WC}$$

Step 4. Apply the calculated nonresponse adjustment factor to all complete responses in the appropriate cell.  This results in a non-response adjusted property weight (NR_PROPWGT).
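Steps 1 through 4 can be sketched as follows, assuming the adjustment is the ratio of all eligible weighted cases to weighted respondents within each region-by-stratum cell. The tuple layout, function name, and all case values are hypothetical. Cells with no respondents are simply skipped here; how such cells were handled in production is not described in this document.

```python
from collections import defaultdict

def nonresponse_adjustments(cases):
    """Return NR_ADJ = (WC + WNR) / WC for each (region, buildstrat) cell.

    cases: iterable of (region, buildstrat, propwgt, responded) tuples.
    """
    wc = defaultdict(float)    # weighted count of completed responses
    wnr = defaultdict(float)   # weighted count of nonresponses
    for region, buildstrat, propwgt, responded in cases:
        cell = (region, buildstrat)
        if responded:
            wc[cell] += propwgt
        else:
            wnr[cell] += propwgt
    return {cell: (wc[cell] + wnr[cell]) / wc[cell] for cell in wc}

# Hypothetical eligible cases: (region, BUILDSTRAT, PROPWGT, responded).
cases = [
    (1, "01", 100.0, True),
    (1, "01", 100.0, True),
    (1, "01", 100.0, False),
    (2, "04", 50.0, True),
]
nr_adj = nonresponse_adjustments(cases)

# Step 4: apply the factor to respondents to obtain NR_PROPWGT.
nr_propwgt = [w * nr_adj[(r, s)] for r, s, w, responded in cases if responded]
```

In this toy example the respondents in cell (1, "01") absorb the weight of the one nonrespondent, so the weighted total for the cell is preserved.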

Single-family Unit Undercoverage Adjustment Factor

Units with a single-family basic street address on the MAFXs frames were not selected, to avoid overlap with the single units on the AHS frame. However, some units were listed on the AHS frame as multi-units but had a single unit basic street address on the MAFXs frame, so they were not eligible to be selected from either frame. To account for this undercoverage, a population adjustment factor of 1.25 was applied to all single units.

Total Rental Units Control Factor (PS_ADJ)

In addition to the above adjustments, a final adjustment to align the rental unit estimates from the 2015 RHFS with the 2015 AHS was made. To carry this out, a Total Rental Units Control Factor was applied to the 2015 RHFS rental unit estimates.

The application of the Total Rental Units Control Factor proceeded in four steps. In the first step, estimates of “AHS rental units by building size” were derived by assigning unit counts from the Master Address File (MAF) to addresses identified as containing in-scope rental units in the 2015 AHS. The controls do not include public housing units or condominiums. For each occupied rental unit or vacant rented/for rent unit in the 2015 AHS, the number of units at the unit’s basic street address was obtained from the MAF. If the number of units could not be assigned, a unit was designated as multifamily if the record contained an apartment or unit number in the unit description; otherwise, the unit was designated as single-family. Control totals for each of the five property size strata were determined by summing the number of units at each basic street address as designated on the MAF for each size group. The multifamily units whose size was unknown (432 thousand) were distributed to the four multifamily strata based on the overall distribution of known units in those strata. The final control total was 47,543 thousand units, including 18,745 thousand single-family units and 28,798 thousand multifamily units. Table 3 below displays the AHS/MAF control totals.

Table 3: 2015 AHS/MAF Control Totals

Number of AHS units in building/address   AHS Units in RHFS Scope (weighted) (in thousands)
1 unit                                    18,745
2 to 4 units                               5,989
5 to 24 units                             10,141
25 to 49 units                             3,300
50+ units                                  9,368
Total                                     47,543

 

In step 2, we created an original-to-final stratum matrix. Recall that the RHFS strata are initially defined by the number of units at a single address, or building. A property in RHFS is defined as all units owned under a single mortgage, which may, and often does, include more than one building or address. For example, an AHS unit in a 10-unit building could be part of a larger property with five 10-unit buildings, meaning the property has 50 units. Similarly, a single unit may actually be part of a multifamily property. Because of the “buildings may not equal properties” issue, it was necessary to create an original-to-final stratification matrix summarizing the number of units that were originally stratified in one category (for example, 2-4 units) but were determined to be in a different stratification category once the property was visited (for example, 5-24 units). Table 4 shows the original-to-final stratification matrix.

The third step was to multiply the “units by building/address” vector in Table 3 by the original-to-final stratification matrix in Table 4. The final AHS/MAF control totals are presented in Table 5. Note that there were 9.368 million rental units in 50+ unit buildings, but that there are 14.576 million rental units in 50+ unit properties. This means that about 5 million rental units in buildings with fewer than 50 units are actually part of larger multifamily properties with 50 or more units. Also note that the number of single-family units decreased slightly from 18.745 million to 18.686 million. This is partly due to the reclassification of about 16% of the units in addresses with 2-4 units into single unit properties and the loss of 6% of the single-unit addresses to larger property size strata.
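The matrix-vector product in this step can be reproduced approximately from the published tables. In the sketch below, each row of `TABLE_4` is a final (property size) stratum and each column an original (building size) stratum. Because the published shares are rounded, the product only approximates Table 5, within several percent per stratum; the variable names are illustrative.

```python
# Table 3: units by building size, in thousands (1, 2-4, 5-24, 25-49, 50+).
TABLE_3 = [18_745, 5_989, 10_141, 3_300, 9_368]

# Table 4: rows are final strata, columns are original strata.
TABLE_4 = [
    [0.94, 0.16, 0.01, 0.01, 0.00],
    [0.05, 0.69, 0.06, 0.00, 0.01],
    [0.005, 0.079, 0.514, 0.028, 0.004],
    [0.002, 0.019, 0.069, 0.584, 0.012],
    [0.004, 0.06, 0.34, 0.38, 0.98],
]

# Table 5: published final control totals, in thousands.
TABLE_5 = [18_686, 5_727, 5_580, 2_974, 14_576]

# Each final-stratum total is the share-weighted sum over the original strata.
approx_controls = [
    sum(share * units for share, units in zip(row, TABLE_3)) for row in TABLE_4
]

# Relative gap between the rounded-matrix product and the published totals.
relative_gap = [
    abs(approx - published) / published
    for approx, published in zip(approx_controls, TABLE_5)
]
```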

The fourth and final step was to apply the control totals from Table 5. The non-response adjusted property weight (NR_PROPWGT) for each case i is used to compute a weighted total of units within each RHFS property size stratum k. For the single-family property stratum (k=1), the single-family undercoverage adjustment factor of 1.25 is applied to the NR_PROPWGT before calculating the sum.

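The adjustment factor (PS_ADJ) for each stratum is then calculated as the ratio of each stratum’s control total to the weighted sum of units for that stratum:

$$PS\_ADJ_k = \frac{CONTROL_k}{\sum_{i \in k} NR\_PROPWGT_i \times UNITS_i}$$

where CONTROL_k is the Table 5 control total for stratum k and UNITS_i is the number of units in property i. For k=1, NR_PROPWGT_i is first multiplied by the undercoverage factor of 1.25.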
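A minimal sketch of this ratio is shown below. It assumes the weighted sum of units is the sum of each property's weight times its unit count; the function name `ps_adj`, the case tuples, and the control totals are hypothetical and chosen only so the arithmetic is easy to follow.

```python
SINGLE_FAMILY_UNDERCOVERAGE = 1.25  # factor stated in the text for single units

def ps_adj(control_total, cases, single_family=False):
    """Control total divided by the weighted sum of units in a stratum.

    cases: iterable of (nr_propwgt, units_in_property) tuples.
    """
    factor = SINGLE_FAMILY_UNDERCOVERAGE if single_family else 1.0
    weighted_units = sum(factor * weight * units for weight, units in cases)
    return control_total / weighted_units

# Hypothetical single-family stratum: two one-unit properties.
ps_adj_single = ps_adj(12_500.0, [(2_000.0, 1), (3_000.0, 1)], single_family=True)

# Hypothetical 50+ unit stratum: two multi-unit properties.
ps_adj_large = ps_adj(9_000.0, [(10.0, 60), (20.0, 120)])
```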

Table 4: Distribution of sampled cases from original to RHFS property size strata

 

 

                               Original Strata (building size)
Final Strata (property size)   1 Unit   2-4 Units   5-24 Units   25-49 Units   50+ Units
1 Unit                         0.94     0.16        0.01         0.01          0.00
2-4 Units                      0.05     0.69        0.06         0.00          0.01
5-24 Units                     0.005    0.079       0.514        0.028         0.004
25-49 Units                    0.002    0.019       0.069        0.584         0.012
50+ Units                      0.004    0.06        0.34         0.38          0.98

 

Table 5: Final AHS/MAF Control Totals

Number of AHS units in building/address   AHS Units in RHFS Scope (weighted) (in thousands)
1 unit                                    18,686
2 to 4 units                               5,727
5 to 24 units                              5,580
25 to 49 units                             2,974
50+ units                                 14,576
Total                                     47,543

Sampling Error: The sampling error of an estimate based on a sample survey is the difference between the estimate and the result that would be obtained from a complete census conducted under the same survey conditions. This error occurs because characteristics differ among sampling units in the population and only a subset of the population is measured in a sample survey. The particular sample used in this survey is one of a large number of samples of the same size that could have been selected using the same design. Because each unit in the sampling frame had a known probability of being selected into the sample, it was possible to estimate the sampling variability of the survey estimates.

Common measures of the variability among these estimates are the sampling variance, the standard error, and the coefficient of variation (CV), which is also referred to as the relative standard error (RSE). The sampling variance is defined as the squared difference, averaged over all possible samples of the same size and design, between the estimator and its average value. The standard error is the square root of the sampling variance. The CV expresses the standard error as a percentage of the estimate to which it refers. For example, an estimate of 200 units that has an estimated standard error of 10 units has an estimated CV of 5 percent. The sampling variance, standard error, and CV of an estimate can be estimated from the selected sample because the sample was selected using probability sampling. Note that measures of sampling variability, such as the standard error and CV, are estimated from the sample and are also subject to sampling variability. It is also important to note that the standard error and CV only measure sampling variability. They do not measure any systematic biases in the estimates.
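The relationship between these measures can be written out directly. The sketch below uses the example from the text (an estimate of 200 units with a standard error of 10 units); the function names are illustrative.

```python
from math import sqrt

def standard_error(sampling_variance):
    """The standard error is the square root of the sampling variance."""
    return sqrt(sampling_variance)

def cv_percent(estimate, se):
    """The CV expresses the standard error as a percentage of the estimate."""
    return 100.0 * se / estimate

se_example = standard_error(100.0)        # a variance of 100 gives a standard error of 10
cv_example = cv_percent(200.0, se_example)
```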

The Census Bureau recommends that individuals using these estimates incorporate sampling error information into their analyses, as this could affect the conclusions drawn from the estimates.

To estimate the variance of the 2015 RHFS survey estimates, a method of successive difference replication as outlined by Ash (2014)[4] was adapted for use by RHFS. This method uses replicate weights to compare the variation in the sampled units by cycling through a pattern of replicate factors. The method described by Ash (2014)[4] builds on the successive difference replication method developed by Fay and Train (1995).[5] The weighting procedure, including all adjustments, was repeated for each replicate r = 1 to 160 to produce 160 sets of replicate weights.
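Once an estimate has been recomputed under each set of replicate weights, a variance can be formed from the replicate estimates. The sketch below assumes the usual successive difference replication formula, in which the squared deviations from the full-sample estimate are scaled by 4/R; this document does not state the formula, so treat it as an assumption. The replicate estimates are invented values.

```python
from math import sqrt

def sdr_variance(full_estimate, replicate_estimates):
    """Variance = (4 / R) * sum of squared deviations of replicate estimates."""
    r = len(replicate_estimates)
    return (4.0 / r) * sum((rep - full_estimate) ** 2 for rep in replicate_estimates)

# Hypothetical full-sample estimate and 160 replicate estimates.
full_estimate = 100.0
replicates = [101.0] * 80 + [99.0] * 80

variance = sdr_variance(full_estimate, replicates)
std_error = sqrt(variance)
```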

Confidence Interval: The sample estimate and an estimate of its standard error allow us to construct interval estimates with prescribed confidence that the interval includes the average result of all possible samples with the same size and design. To illustrate, if all possible samples were surveyed under essentially the same conditions, and an estimate and its standard error were calculated from each sample, then:

1.        Approximately 68 percent of the intervals from one standard error below the estimate to one standard error above the estimate would include the average estimate derived from all possible samples.

2.        Approximately 90 percent of the intervals from 1.645 standard errors below the estimate to 1.645 standard errors above the estimate would include the average estimate derived from all possible samples.

In the example above, the margin of error (MOE) associated with the 90 percent confidence interval is the product of 1.645 and the estimated standard error.
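A short sketch of the 90 percent margin of error and the resulting interval follows. The first call uses the earlier illustration of a standard error of 10 units; the second uses the number-of-properties estimate and margin of error published in Table 6 (both in thousands). Function names are illustrative.

```python
Z_90 = 1.645  # multiplier for a 90 percent confidence interval

def margin_of_error(se, z=Z_90):
    """MOE is the product of the multiplier and the estimated standard error."""
    return z * se

def confidence_interval(estimate, moe):
    """Interval from one MOE below the estimate to one MOE above it."""
    return (estimate - moe, estimate + moe)

moe_example = margin_of_error(10.0)

# Table 6: 21,724 thousand properties with a margin of error of 221 thousand.
ci_properties = confidence_interval(21_724, 221)
implied_se = 221 / Z_90   # standard error implied by the published MOE
```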

A margin of error is provided for each survey estimate displayed in the tables. The sample was designed to result in an expected coefficient of variation of 6% at the national level. The key estimates and margins of error are displayed below in Table 6.

Table 6. 2015 RHFS Key Estimates

 

                               Number of Properties (000s)    Number of Units within Properties (000s)
Key Estimate                   Estimate   Margin of Error     Estimate   Margin of Error
Number of Properties           21,724     221
Number of Units                                               47,543     481
Properties with Mortgages      9,308      913                 27,322     2,708
Properties without Mortgages   12,416     904                 20,221     2,708

Nonsampling error: Nonsampling error encompasses all factors other than sampling error that contribute to the total error associated with an estimate. This error may also be present in censuses and other nonsurvey programs.  Nonsampling error arises from many sources: inability to obtain information on all units in the sample; response errors; differences in the interpretation of the questions; mismatches between sampling units and reporting units, requested data and data available or accessible in respondents' records, or with regard to reference periods; mistakes in coding or keying the data obtained; and other errors of collection, response, coverage, and processing.

The Census Bureau recommends that individuals using these estimates factor in this information when assessing their analyses of these data, as nonsampling error could affect the conclusions drawn from the estimates.

A potential source of nonsampling error in the estimates is nonresponse. Nonresponse is the inability to obtain all the intended measurements or responses about all selected units. Unit nonresponse is used to describe the inability to obtain any of the substantive measurements about a sampled unit.  For the 2015 survey, the average unit response rate was 65%. To mitigate the effect of nonresponse, a nonresponse adjustment was used to inflate the weights of responses to account for eligible nonresponses. For details on the nonresponse adjustment factor, see the Estimation section.

Disclosure avoidance: Disclosure is the release of data that reveals information or permits deduction of information about a particular survey unit through the release of either tables or microdata. Disclosure avoidance is the process used to protect each survey unit’s identity and data from disclosure. Using disclosure avoidance procedures, the Census Bureau modifies or removes the characteristics that put information at risk of disclosure. Although it may appear that a table shows information about a specific survey unit, the Census Bureau has taken steps to disguise or suppress a unit’s data that may be “at risk” of disclosure  while making sure the results are still useful.

Cell suppression (primary and complementary) is applied to estimates for the Rental Housing Finance Survey. Cell suppression is a disclosure avoidance technique that protects the confidentiality of individual survey units by withholding cell values from release and replacing the cell value with a symbol, usually a “D”. If the suppressed cell value were known, it would allow one to estimate an individual survey unit’s data too closely.

The cells that must be protected are called primary suppressions.

To make sure the cell values of the primary suppressions cannot be closely estimated by using other published cell values, additional cells may also be suppressed. These additional suppressed cells are called complementary suppressions.

The process of suppression does not usually change the higher-level totals. Values for cells that are not suppressed remain unchanged. Before the Census Bureau releases data, computer programs and analysts ensure primary and complementary suppressions have been correctly applied.

For more information on disclosure avoidance practices, see FCSM Statistical Policy Working Paper 22.

The Census Bureau has reviewed the estimates in Table Creator for unauthorized disclosure of confidential information and has approved the disclosure avoidance practices applied.  

History of Survey Program:  Click here for information regarding RHFS sampling methodologies.

Data users should exercise caution when making comparisons between the 2015 and 2018 Rental Housing Finance Survey estimates. The 2015 sample design used separate frames for single and multi-unit addresses. Single unit rentals were selected from a frame of eligible rental units identified in the 2013 American Housing Survey (AHS) sample and multi-unit addresses were selected from a frame based on a list of basic street addresses on the Master Address File (MAF) located in 2013 AHS sample Primary Sampling Units (PSUs).  The 2018 sample design used a single frame based solely on addresses of rental units identified in the 2017 AHS.  The 2017 AHS was based on the new sample that was redesigned in 2015, while the 2013 AHS was based on the previous AHS sample design.  Thus, the post-stratification of unit control totals had very different distributions across survey years. The differences between 2015 and 2018 RHFS estimates of rental properties can be largely attributed to the differences in the post-stratification of unit control totals and the increase in average property size as measured in units per property between the 2015 and 2018 RHFS sample designs.

[1].  The MAF includes identifiers for both multiunit BSAs and residential BSAs. These identifiers are assumed to be accurate for purposes of sample selection.

[2].  As noted on page 1, “Residential units in the same building typically have the same BSA, but a BSA can and often does include more than one building.”

[3].  CV is defined as the stratum standard error divided by the stratum total.

[4].  Ash, S. (2014). Using successive difference replication for estimating variances. Survey Methodology, 40(1), 47-59.

[5].  Fay, R.E., and Train, G.F. (1995). Aspects of survey and model-based postcensal estimation of income and poverty characteristics for states and counties. Proceedings of the Section on Government Statistics, American Statistical Association, 154-159.

Page Last Revised - October 8, 2021