Demographic and Income Model Methodology (2001)

The 2001 SAHIE experimental estimates were created to demonstrate the feasibility of producing high-quality modeled estimates for uninsured low-income women at the state level. The experimental SAHIE model used the Current Population Survey (CPS) in addition to other data sources. The project was carried out in conjunction with the Centers for Disease Control and Prevention's (CDC) Division of Cancer Prevention and Control (DCPC). The CDC has a congressional mandate to provide screening services for breast and cervical cancer to low-income, uninsured, and underserved women through the National Breast and Cervical Cancer Early Detection Program (NBCCEDP). The program had an interest in using the SAHIE estimates for its program administration.

For 2001, SAHIE publishes experimental state-level estimates of the female population with and without health insurance coverage, along with measures of uncertainty, for the full cross-classification of:

  • 3 age categories: 18-64, 40-64, and 50-64
  • 2 income categories: 0-200% and 0-250% of the poverty threshold.
  • 4 races/ethnicities: all races/ethnicities, White not Hispanic, Black not Hispanic, and Hispanic (any race).

Estimates are adjusted so that, before rounding, state numbers sum to the national poverty universe from the 2002 CPS ASEC. The 2002 CPS ASEC asks about income during calendar year 2001.

The remainder of this page provides a summary of the demographic and income model methodology used for the SAHIE 2001 experimental estimates. Additional methodological detail is available at the link below. Technical papers that describe previous versions of the model are available on the Publications page.

Overview

We estimate the number of people with health insurance coverage by state within demographic groups and income categories. The number insured in a group is the product of the number in the group and the proportion in that group who are insured. Correspondingly, our model has two main parts: one for estimating the numbers of people in state demographic and income groups, and one for estimating the proportions with health insurance in these groups. Each part is a hierarchical two-level regression model. We use Bayesian methods to estimate the model. We estimate the number without insurance as the difference between the number of people in a category and the number with insurance. The demographic groups and income categories are described in the Model Details section.
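The arithmetic in this overview can be illustrated with a short Python sketch. It shows only the final step (count times proportion, then the difference), not the hierarchical regression models themselves. The function name and the input values are hypothetical and chosen only for illustration.

```python
def insured_and_uninsured(group_count, proportion_insured):
    """Split a group count into insured and uninsured counts.

    group_count: estimated number of people in a state demographic
        and income group (hypothetical input).
    proportion_insured: estimated proportion of that group with
        health insurance, between 0 and 1 (hypothetical input).
    """
    if not 0.0 <= proportion_insured <= 1.0:
        raise ValueError("proportion_insured must be between 0 and 1")
    # Number insured is the product of the count and the proportion.
    insured = group_count * proportion_insured
    # Number uninsured is the difference between the count and the insured.
    uninsured = group_count - insured
    return insured, uninsured


# Illustrative values only; these are not SAHIE estimates.
insured, uninsured = insured_and_uninsured(50_000.0, 0.8)
```

Because the uninsured count is computed as a residual, the two counts always add back to the group count.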

The dependent variables in the regression models are:

  • 3-year average estimates from the Annual Social and Economic Supplement (ASEC) of the Current Population Survey (CPS);
  • estimates from Census 2000;
  • numbers of IRS tax exemptions;
  • numbers of participants in the Supplemental Nutrition Assistance Program (SNAP), formerly known as the Food Stamp Program; and
  • numbers of Medicaid and Children's Health Insurance Program (CHIP) participants.

The CPS ASEC estimates of the number of people in a state demographic and income group, and of the proportion insured, are assumed to be unbiased. The other dependent variables are related to and indicative of these numbers or proportions but are not assumed to be unbiased estimates for them.

The universe for these health insurance estimates is the CPS poverty universe. Therefore, we use demographic estimates of the population adjusted to the CPS poverty universe.

For further information on the dependent variables and population estimates, see information about data inputs.

We control the estimates for states so the following conditions are met:

  • The numbers of insured sum to national CPS ASEC direct estimates of insured in the demographic groups. As a result, the numbers of uninsured sum to the national CPS ASEC direct estimates of uninsured.
  • The sum of the insured and uninsured over all income categories for a state demographic group equals the CPS poverty universe for that state demographic group.
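One simple way to meet the first condition is a proportional (ratio) adjustment, sketched below in Python. This page does not state which adjustment procedure is used, so the proportional method is an assumption, and the function name, state labels, and numbers are hypothetical.

```python
def control_to_national(state_insured, national_insured):
    """Scale state insured counts so they sum to a national total.

    state_insured: dict mapping a state label to its modeled number
        insured for one demographic group (hypothetical inputs).
    national_insured: national direct estimate of insured for the
        same demographic group (hypothetical input).
    """
    modeled_total = sum(state_insured.values())
    if modeled_total <= 0:
        raise ValueError("modeled state total must be positive")
    # Every state is multiplied by the same factor, so relative
    # differences between states are preserved.
    factor = national_insured / modeled_total
    return {state: value * factor for state, value in state_insured.items()}


# Illustrative values only; these are not SAHIE estimates.
controlled = control_to_national({"A": 100.0, "B": 300.0}, 440.0)
```

With the insured counts controlled in this way, taking the uninsured count as the poverty universe minus the insured count keeps the second condition satisfied as well.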

The reliability of the CPS ASEC estimates varies from state to state because sample sizes differ. The model accounts for this: estimates for states with larger samples tend to be closer to the direct CPS ASEC estimates.
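The effect of sample size can be illustrated with the textbook precision-weighted average from a simple normal-normal hierarchical model. This is a generic sketch of the idea, not the SAHIE model itself; the function name, the variance inputs, and the numbers are all assumptions made for illustration.

```python
def shrinkage_estimate(direct, prediction, sampling_var, model_var):
    """Combine a direct survey estimate with a regression prediction.

    direct: direct survey estimate for a state (hypothetical input).
    prediction: regression-based prediction for the same quantity.
    sampling_var: sampling variance of the direct estimate; it is
        smaller when the state sample is larger.
    model_var: variance of the model error around the prediction.
    """
    if sampling_var < 0 or model_var < 0 or sampling_var + model_var == 0:
        raise ValueError("variances must be non-negative and not both zero")
    # Weight on the direct estimate grows as its sampling variance shrinks.
    weight = model_var / (model_var + sampling_var)
    return weight * direct + (1.0 - weight) * prediction


# Illustrative values only: same direct estimate and prediction,
# but different sampling variances.
large_sample = shrinkage_estimate(0.80, 0.70, sampling_var=0.0001, model_var=0.0009)
small_sample = shrinkage_estimate(0.80, 0.70, sampling_var=0.0009, model_var=0.0009)
```

In this sketch the state with the smaller sampling variance ends up closer to its direct estimate, which mirrors the behavior described above.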

We provide a confidence interval for each estimate that represents uncertainty from both sampling and modeling. These confidence intervals are Bayesian credible regions calculated using posterior standard deviations and a normal approximation.
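An interval based on a posterior standard deviation and a normal approximation can be sketched as follows. The coverage level is not stated on this page, so the 90 percent default below is an assumption, as are the function name and the example values.

```python
from statistics import NormalDist


def normal_approx_interval(posterior_mean, posterior_sd, level=0.90):
    """Return (lower, upper) bounds of a normal-approximation interval.

    posterior_mean, posterior_sd: posterior summaries of an estimate
        (hypothetical inputs).
    level: coverage level; the 0.90 default is an assumption.
    """
    if not 0.0 < level < 1.0:
        raise ValueError("level must be strictly between 0 and 1")
    if posterior_sd < 0:
        raise ValueError("posterior_sd must be non-negative")
    # Standard normal quantile for a two-sided interval.
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * posterior_sd
    return posterior_mean - half_width, posterior_mean + half_width


# Illustrative values only; these are not SAHIE estimates.
lower, upper = normal_approx_interval(10_000.0, 500.0)
```

The interval is symmetric around the posterior mean, and its width grows in proportion to the posterior standard deviation.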

Page Last Revised - October 8, 2021