Small Area Estimation - State Poverty Rate Model Research Data Files

Written by: William R. Bell and Carolina Franco
RRS2017-05

Abstract

This zip file contains input files for the CPS equations of the SAIPE state poverty rate models for income years (IYs) 1989-2005 (omitting 1994, a year for which no SAIPE estimates were produced). These files are intended for research, particularly when multiple years of data are of interest. The data are mostly those used in SAIPE production, though a few variables in a few years may reflect minor later revisions. Note also that for IY 2005 the CPS data were replaced in production by ACS data, and so the 2005 CPS equation models were run only for comparisons.

One difference from the production data concerns the variable fnlse, the final sampling std. errors obtained by iteratively updating the original gvf std. errors using results from REML estimation of the model. The fnlse values don't always agree with the sampling std. errors used in production because the values included here were obtained from iterative updates that used REML estimation of a common model across most years. (This model used census residuals as a regressor, while production used census poverty estimates as a regressor in some years. It also included the food stamp regressor for all years except 1998 and 1999, although the food stamp variable was dropped from the production state models in 1998 and not reinstituted until 2004.) This use of a "common model" facilitated the production of the fnlse values given here, and was thought to be more desirable for research purposes than was using the actual production sampling std. errors in all cases.

The files cpsxxp.txt and cpsxxt.txt contain data for models for poverty rates of ages 0-4, 5-17, 18-64, and 65+. There are two versions for age 5-17. One (in the cpsxxp.txt files) is for 5-17 year old children related in families. The other (in the cpsxxt.txt files) is for total 5-17 year old children. Data for age 0-4 is for total children age 0-4, and so is contained in the cpsxxt.txt files. For many of the years the same 0-4 data is replicated in the cpsxxp.txt files, but for other years some or all of the 0-4 columns in the cpsxxp.txt files are zeroes. For 18-64 and 65+ there is no distinction between total and related, and these data are available in both the cpsxxp.txt and cpsxxt.txt files.
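The file layout described above can be summarized in a short Python sketch. This is only an illustration, not part of the distributed files: it assumes that "xx" is the two-digit income year (as in cen89pov.txt), and the age-group labels and function names are hypothetical.

```python
# Income years covered by the archive: 1989-2005, with 1994 omitted.
INCOME_YEARS = tuple(y for y in range(1989, 2006) if y != 1994)

# Which file version(s) hold each age group, per the description above.
# "p" -> cpsxxp.txt, "t" -> cpsxxt.txt. The dictionary keys are hypothetical labels.
FILE_VERSIONS = {
    "0-4": ["t"],            # the copy in the p files is zeroes in some years
    "5-17 related": ["p"],
    "5-17 total": ["t"],
    "18-64": ["p", "t"],     # no total/related distinction
    "65+": ["p", "t"],       # no total/related distinction
}


def cps_file_name(year, version):
    """Build a cpsxxp.txt / cpsxxt.txt name (ASSUMES xx is the two-digit year)."""
    if year not in INCOME_YEARS:
        raise ValueError(f"no CPS file for income year {year}")
    if version not in ("p", "t"):
        raise ValueError("version must be 'p' or 't'")
    return f"cps{year % 100:02d}{version}.txt"


def files_for(age_group, years=INCOME_YEARS):
    """List one file per year for an age group, using the first listed version."""
    version = FILE_VERSIONS[age_group][0]
    return [cps_file_name(y, version) for y in years]
```

For example, files_for("0-4") always points at the cpsxxt.txt files, which avoids the years in which the 0-4 columns of the cpsxxp.txt files are zeroes.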

In each file, and for each age group, the following data columns are included (where xx denotes the Income Year):

cpsxx = direct CPS estimated poverty rate
irsprxx = pseudo-poverty rate tabulated from IRS tax data
irsnfxx = tax nonfiler rate
fsxx = food stamp participation proportion
cpsse = direct estimated std. error of cpsxx
gvfse = initial GVF estimate of std. error of cpsxx
fnlse = final GVF estimate of std. error of cpsxx after it is updated iteratively in conjunction with REML estimation of the model.

For age 65+ the SSI participation rate variable (SSIxx) replaces fsxx. The last column in each file contains:

smpsize = CPS sample size (number of interviewed households).

The file read-cpspov-files.r is an R program that will read the CPS cpsxxp.txt or cpsxxt.txt files for a specified set of years and a specified age group, and store the data in suitable arrays. See the program for details.
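As a small illustration of the naming convention, the following Python sketch expands the variable names for one age group and income year. It assumes that "xx" stands for the two-digit income year; the function name and age-group label are hypothetical, and the sketch says nothing about how the columns are physically arranged in the files (read-cpspov-files.r remains the reference for that).

```python
def column_names(year, age_group):
    """Variable names for one age group (ASSUMES xx is the two-digit income year)."""
    yy = f"{year % 100:02d}"
    # For age 65+ the SSI participation rate replaces the food stamp variable.
    program = f"SSI{yy}" if age_group == "65+" else f"fs{yy}"
    return [
        f"cps{yy}",     # direct CPS estimated poverty rate
        f"irspr{yy}",   # pseudo-poverty rate from IRS tax data
        f"irsnf{yy}",   # tax nonfiler rate
        program,        # food stamp (or SSI) participation
        "cpsse",        # direct estimated std. error
        "gvfse",        # initial GVF std. error
        "fnlse",        # final (iteratively updated) std. error
    ]
```

Each file also ends with the single smpsize column, which is shared across age groups and so is not repeated in this list.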

Census Poverty Rate Equation Data Files:

Data input files are also included for the census poverty rate equations. These files are cen89pov.txt (for estimates from the 1990 census for IY 1989) and cen99pov.txt (for estimates from the 2000 census for IY 1999). Variables are analogous to the first four from the CPS equation files; the final three are omitted since sampling error in the census long form estimates was negligible at the state level. The file cen89pov.txt contains data for modeling poverty rates of age 0-4 total, age 5-17 related, age 18-64, and age 65+. Production modeling used the age 5-17 related estimates, or the corresponding "census residuals," in both the CPS 5-17 related and 5-17 total models. The file cen99pov.txt contains data for both age 5-17 related and age 5-17 total (and so has 4 more columns than does cen89pov.txt). When the 2000 census estimates or residuals were used in the modeling, we made the distinction between 5-17 related and 5-17 total.

The files cen89res.txt and cen99res.txt contain the "census residuals" (the residuals from fitting the census poverty rate regression equations). Thus, if the interest is in the CPS equation, these can just be read in and there is no need to read in the census equation data and refit the census equation. The first of the files contains the residuals for 0-4t, 5-17r, 18-64, and 65+, while the second contains the residuals for 0-4t, 5-17r, 5-17t, 18-64, and 65+. This is analogous to the files cen89pov.txt and cen99pov.txt.
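To make the term "census residuals" concrete, the Python sketch below computes residuals from a least squares regression of a census poverty rate on three regressors. It is only an illustration: the use of ordinary least squares with an intercept is an assumption (the exact production specification is not given here), the function name is hypothetical, and the numbers are made up rather than taken from any SAIPE file.

```python
def ols_residuals(y, X):
    """Residuals from regressing y on the columns of X (an intercept is added)."""
    rows = [[1.0] + [float(v) for v in x] for x in X]
    k = len(rows[0])
    # Normal equations (A'A) b = A'y, stored as an augmented k x (k+1) matrix.
    aug = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda p: abs(aug[p][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("regressors are collinear")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                f = aug[r][col] / aug[col][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    beta = [aug[i][k] / aug[i][i] for i in range(k)]
    fitted = [sum(b * v for b, v in zip(beta, row)) for row in rows]
    return [yi - fi for yi, fi in zip(y, fitted)]


# Made-up percentages for six hypothetical states (NOT real SAIPE data).
census_rate = [12.0, 18.0, 9.0, 15.0, 21.0, 11.0]
regressors = [  # IRS pseudo-poverty rate, nonfiler rate, food stamp proportion
    [10.0, 8.0, 7.0],
    [16.0, 12.0, 11.0],
    [8.0, 9.0, 5.0],
    [13.0, 7.0, 10.0],
    [19.0, 14.0, 12.0],
    [9.0, 10.0, 8.0],
]
residuals = ols_residuals(census_rate, regressors)
```

Because the fit includes an intercept, the residuals sum to zero (up to rounding) and are uncorrelated with each regressor. The cen89res.txt and cen99res.txt files supply residuals of this general kind already computed, so the step can be skipped.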

Files of State Population Estimates:

The files poptot0_4.txt, poptot5_17.txt, poptot18_64.txt, and poptot65+.txt are ASCII text files of state population estimates for the four respective age groups. For 1990-1999 these are post-censal estimates, meaning that they were constructed by starting with the 1990 census counts (which refer to April 1, 1990) and updating them demographically (with birth, death, and migration data and estimates) through the decade. They were not modified to account for the Census 2000 counts, so the transition from 1999 to 2000 can be larger than it would be for the corresponding inter-censal population estimates, for which such modifications are made to smooth the transition to the next census results. The 2000-2005 population estimates are inter-censal estimates that were constructed starting from the Census 2000 results, but also taking into account the 2010 census results.

The post-censal pop estimates are closer to what is used in SAIPE production, because for production the inter-censal pop estimates are not yet available. Also note that the 5-17 pop estimates included here are for the total 5-17 population, not for the 5-17 related in families population. See the SAIPE web site for a discussion of the distinction.

The pop estimates were obtained from the following websites:
https://www.census.gov/data/tables/time-series/demo/popest/1980s-state.html
https://www2.census.gov/programs-surveys/popest/tables/1990-2000/state/asrh/
https://www2.census.gov/programs-surveys/popest/datasets/2000-2010/intercensal/state/st-est00int-agesex.csv

For further information on the SAIPE production models, input data, and estimation and prediction procedures and results at all geographic levels (state, county, and school district) see the SAIPE web site at https://www.census.gov/did/www/saipe/ or the following article:
Bell, William R., Basel, Wesley W., and Maples, Jerry J. (2016), "An Overview of the U.S. Census Bureau's Small Area Income and Poverty Estimates Program," Chapter 19 in Analysis of Poverty Data by Small Area Estimation, ed. Monica Pratesi, Wiley, pp. 349-378.

William R. Bell
Associate Directorate for Research and Methodology
U.S. Census Bureau
William.R.Bell@census.gov

Carolina Franco
Center for Statistical Research and Methodology
U.S. Census Bureau
Carolina.Franco@census.gov

First version: July 31, 2013

Revised: September 15, 2015; August 23, 2017; September 19, 2017; January 16, 2020

CITATION:

William R. Bell and Carolina Franco. (2017). Small Area Estimation - State Poverty Rate Model Research Data Files. Center for Statistical Research & Methodology Research Report Series (Statistics #RRS2017-05). U.S. Census Bureau. Available online at <www.census.gov/library/working-papers/2017/adrm/rrs2017-05.html>.

Page Last Revised - October 28, 2021