Small Area Estimation

Motivation:

Small area estimation is important in light of a continual demand by data users for finer geographic detail of published statistics and for various subpopulations. Traditional demographic sample surveys designed for national estimates do not provide large enough samples to produce reliable direct estimates for small areas such as counties and even most states. The use of valid statistical models can provide small area estimates with greater precision; however, bias due to an incorrect model or failure to account for informative sampling can result.
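As an illustration of how a model can improve on a direct estimate, the following is a minimal sketch of the composite (shrinkage) estimator under an area-level Fay-Herriot model. It is not the Census Bureau's production methodology. The function name `fay_herriot_composite`, the toy data, and the assumption that the model variance `A` is known are all hypothetical simplifications; in practice `A` is estimated and the sampling variances `D` are themselves estimates.

```python
import numpy as np

def fay_herriot_composite(y, D, X, A):
    """Composite estimates under an area-level Fay-Herriot model.

    y : direct survey estimates, shape (m,)
    D : sampling variances of the direct estimates, shape (m,), treated as known
    X : area-level covariates, shape (m, p)
    A : model (random-effect) variance, assumed known here for simplicity
    """
    y = np.asarray(y, dtype=float)
    D = np.asarray(D, dtype=float)
    X = np.asarray(X, dtype=float)

    # Generalized least squares for the regression coefficients,
    # weighting each area by the inverse of its total variance A + D_i.
    w = 1.0 / (A + D)
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)

    # Regression-synthetic estimate and shrinkage weight for each area.
    synthetic = X @ beta
    gamma = A / (A + D)

    # Areas with noisy direct estimates (large D_i) lean on the model;
    # areas with precise direct estimates (small D_i) stay near y_i.
    estimate = gamma * y + (1.0 - gamma) * synthetic
    return estimate, gamma, synthetic

# Toy inputs, made up purely for illustration.
y = np.array([10.0, 12.0, 20.0, 15.0])
D = np.array([0.1, 4.0, 9.0, 1.0])
X = np.column_stack([np.ones(4), np.array([1.0, 2.0, 3.0, 2.5])])
estimate, gamma, synthetic = fay_herriot_composite(y, D, X, A=2.0)
print(np.round(gamma, 3))
print(np.round(estimate, 3))
```

Each composite estimate lies between the direct and the synthetic estimate. When the model is correct, this weighting reduces variance for areas with small samples; when the model is wrong, the synthetic component pulls the estimate toward a biased value, which is the risk noted above.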

 

Research Problems:

  • Development of models that combine data across multiple surveys, or combine survey and observational data (non-probability samples), to improve survey estimates.
  • Development of model diagnostic and model comparison tools for small area models.
  • Development of small area share models for subarea estimates (e.g., school districts or tracts).
  • Development of a design-based simulation system which mimics the American Community Survey to use as a test-bed for area- and unit-level small area models, estimation (both model-based and design-based) methodology and estimation of uncertainty measures.
  • Study of measurement error in small area estimation models.
  • Development of temporal small area estimation techniques.
  • Development of spatial small area estimation techniques.
  • Development of more robust estimates of mean squared error of prediction by incorporating Bayesian and bootstrap methods.
  • Development of a unit-level model framework that appropriately takes into account the complex design of the survey.

 

Current Subprojects:

  • Using ACS Estimates to Improve Estimates from Smaller Surveys via Bivariate Small Area Estimation Models (Franco, Bell/R&M)
  • Bootstrap Mean Squared Error Estimation for Small Area Means under Non-normal Random Effects (Maples, Datta, Irimata, Slud)
  • Developing correlated small area share models to create estimates of school district child poverty and population (Maples)
  • Developing graphical methods to assess the assumption of constant parameter values across all domains (Maples, Dompreh)
  • Developing Bayesian pseudolikelihood models for unit-level data obtained from a complex sample survey (Janicki)
  • Assessment of mean squared errors of empirical best linear unbiased predictors for misspecified models (Datta, Slud)

 

Potential Applications:

  • Borrowing strength from ACS estimates using bivariate modeling has many potential applications, including improving estimates from smaller surveys such as SIPP, NHIS, and CPS, and improving the ACS one-year estimates themselves using the previous ACS five-year estimates.
  • Model diagnostic and comparison tools can be applied in any small area application, from SAIPE to SAHIE, to small area models applied to SIPP, AHS, etc.
  • The design-based simulation framework for evaluating models can be used for SAIPE, SAHIE, and other small area programs that use ACS data. The framework can also test the properties of design-based/assisted estimation procedures, such as improvements of sampling variance estimates, propensity score models, etc.
  • Temporal extensions of small area models will be potentially useful in the VRA Section 203B determinations, and can be applied to ACS data in general, as well as to other surveys that are repeated over time.
  • The evaluation of measurement error will help determine whether it is appropriate to use ACS estimates as covariates in models for the Section 203B determinations, and at what level of aggregation.
  • Small area share models may be a replacement for the current school district estimation procedures for SAIPE.
  • Spatial small area models can improve estimates and provide limited disclosure avoidance for some of the ACS special tabulations.

 

Accomplishments (October 2018-September 2020):

  • Developed empirical and theoretical evidence that shows the strong potential of borrowing strength from ACS estimates to improve estimates from smaller U.S. sample surveys using simple bivariate small area estimation models, including applications to NHIS and SIPP, and an application that improves ACS one-year estimates using previous five-year estimates.
  • Developed a small area share model to estimate the number of school-aged children in poverty for school districts given the official county-level poverty estimates.
  • Studied alternative models for SAIPE county estimates of school-aged children in poverty using a design-based simulation, and explored the role of sampling variance estimation in model selection, including how design-based and GVF-based variance estimates affect performance.
  • Derived several different mean squared error estimators, both analytical and bootstrap-based, which will be evaluated in a large simulation study.
  • Studied the impact of differential privacy noise infusion on voting district plans and evaluated measures of variability.

 

Short-Term Activities (FY 2021 – FY 2023):

  • Extend the Small Area Shares model to allow for dependence between sets of shares, e.g., allow the school district to county shares of school-aged children in poverty and not in poverty to be dependent.
  • Finish creating the Artificial Population which mimics the distribution of the U.S. population and implement an ACS-like survey design.
  • Improve predictions in ACS special tabulations using a mixture of spatial models.
  • Evaluate different mean squared error estimates under the Fay-Herriot model when the error distribution is not always correctly specified.
  • Study the impact of measurement error in covariates in small area models for the Voting Rights Act Section 203 determinations.
  • Explore time series extensions of the Multinomial Logit model and determine suitability for Voting Rights Act Section 203 determinations.
  • Develop multivariate spatial models which use differentially private measurements and auxiliary survey data for the purpose of predicting the number of persons in counties and AIAN areas for detailed race groups.

 

Longer-Term Activities (beyond FY 2023):

  • Develop graphical methods to test assumptions about constant model parameters across all areas.
  • Develop models that jointly model survey-weighted proportions and effective sample sizes.
  • Explore if a time series model can be applied to improve sampling variance estimates by borrowing strength from estimates from previous years.
  • Evaluate new models (county and school district) to update the official SAIPE methodology.
  • Deliver a set of 1000 independent survey samples from the Artificial Population with a design similar to the American Community Survey.

 

Selected Publications:

Datta, G.S. and Li, J. (In Press). “A Quasi-Bayesian Approach to Small Area Estimation Using Spatial Models,” Calcutta Statistical Association Bulletin.

Datta, G.S., Lee, J., and Li, J. (In Press). “Pseudo-Bayesian Small Area Estimation,” Journal of Survey Statistics and Methodology.

Franco, C. and Bell, W.R. (In Press). “Using American Community Survey Data to Improve Estimates from Smaller U.S. Surveys through Bivariate Small Area Estimation Models,” Journal of Survey Statistics and Methodology.

Parker, P.A., Janicki, R., and Holan, S. (In Press). “Bayesian Methods Applied to Small Area Estimation for Establishment Statistics,” in Bavdaž, M., Bender, S., Jones, J., MacFeely, S., Sakshaug, J.W., Thompson, K.J., and van Delden, A. (Eds.), Advances in Business Statistics, Methods and Data Collection, Wiley.

Parker, P., Holan, S., and Janicki, R. (2022). “Computationally Efficient Bayesian Unit-level Models for Non-Gaussian Data Under Informative Sampling with Application to Estimation of Health Insurance Coverage,” The Annals of Applied Statistics, Vol 16, No. 2, 887-904.

Ghosh, T., Ghosh, M., Maples, J., and Tang, X. (2022). “Multivariate Global-Local Priors for Small Area Estimation,” STATS, v5, 673-688. https://www.mdpi.com/2571-905X/5/3/40/htm.

Janicki, R., Raim, A.M., Holan, S.H., and Maples, J. (2022). “Bayesian Nonparametric Multivariate Spatial Mixture Mixed Effects Models with Application to American Community Survey Special Tabulations,” The Annals of Applied Statistics, Volume 16, Issue 1, 144-168.

Erciulescu, A., Franco, C., and Lahiri, P. (2021). “Use of Administrative Records in Small Area Estimation,” in Chun, A. Y. and Larsen, M. (Eds.), Administrative Records for Survey Methodology, New York, NY: Wiley Publishers.

Liu, B., Dompreh, I., and Hartman, A.M. (2021). “Small Area Estimation of Smoke-Free Workplace Policies and Home Rules in U.S. Counties,” Journal of Nicotine and Tobacco Research.

Parker, P. A., Holan, S. H., and Janicki, R. (2020). “Bayesian Unit-Level Modeling of Count Data under Informative Sampling Designs,” Stat, 9.

Maples, J. (2019). “Small Area Estimates of the Child Population and Poverty in School Districts Using Dirichlet-Multinomial Models,” 2019 Proceedings of the American Statistical Association, Section on Survey Research Methods, American Statistical Association, Alexandria, VA, 3150-3152.

Bell, W. R., Chung, H. C., Datta, G. S., and Franco, C. (2019). “Measurement Error in Small Area Estimation: Functional vs. Structural vs. Naïve Models,” Survey Methodology, 45, 61-80.

Chakraborty, A., Datta, G.S., and Mandal, A. (2019). “Robust Hierarchical Bayes Small Area Estimation for Nested Error Regression Model,” International Statistical Review, 87, S1, S158–S176, doi:10.1111/insr.12283.

Chung, H., Datta, G., and Maples, J. (2019). “Estimation of Median Incomes of the American States: Bayesian Estimation of Means of Subpopulations,” Opportunities and Challenges in Development, Simanti Bandyopadhyay and Mousumi Datta (ed.), New York: Springer, 505-518.

Franco, C., Little, R. J. A., Louis, T. A., and Slud, E. V. (2019). “Comparative Study of Confidence Intervals for Proportions in Complex Surveys,” Journal of Survey Statistics and Methodology, 7, 3, 334-364.

Datta, G.S., Rao, J.N.K., Torabi, M., and Liu, B. (2018). “Small Area Estimation with Multiple Covariates Measured with Errors: A Nested Error Linear Regression Approach of Combining Two Surveys,” Journal of Multivariate Analysis, 167, 49-59.

Arima, S., Bell, W. R., Datta, G. S., Franco, C., and Liseo, B. (2017). “Multivariate Fay-Herriot Bayesian Estimation of Small Area Means Under Functional Measurement Error,” Journal of the Royal Statistical Society--Series A, 180(4), 1191-1209.

Janicki, R. and Vesper, A. (2017). “Benchmarking Techniques for Reconciling Small Area Models at Distinct Geographic Levels,” Statistical Methods & Applications, 26, 557-581, https://doi.org/10.1007/s10260-017-0379-x.

Maples, J. (2017). “Improving Small Area Estimates of Disability: Combining the American Community Survey with the Survey of Income and Program Participation,” Journal of the Royal Statistical Society-Series A, 180(4), 1211-1227.

Chakraborty, A., Datta, G.S., and Mandal, A. (2016). “A Two-component Normal Mixture Alternative to the Fay-Herriot Model,” Joint issue of Statistics in Transition new series and Survey Methodology, Part II, 17, 67-90.

Janicki, R. (2016). “Estimation of the Difference of Small Area Parameters from Different Time Periods,” Research Report Series (Statistics #2016-01), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Datta, G.S. and Mandal, A. (2015). “Small Area Estimation with Uncertain Random Effects,” Journal of the American Statistical Association: Theory and Methods, 110, 1735-1744.

Franco, C. and Bell, W. R. (2015). “Borrowing Information over Time in Binomial/logit Normal Models for Small Area Estimation,” Joint issue of Statistics in Transition and Survey Methodology, 16, 4, 563-584.

Bell, W.R., Datta, G.S., and Ghosh, M. (2013). “Benchmarking Small Area Estimators,” Biometrika, 100, 189-202, doi:10.1093/biomet/ass063.

Franco, C. and Bell, W. R. (2013). “Applying Bivariate/Logit Normal Models to Small Area Estimation,” In JSM Proceedings, Survey Research Methods Section. Alexandria, VA: American Statistical Association. 690-702.

Datta, G., Ghosh, M., Steorts, R., and Maples, J. (2011). “Bayesian Benchmarking with Applications to Small Area Estimation,” TEST, Volume 20, Number 3, 574-88.

Janicki, R. (2011). “Selection of Prior Distributions for Multivariate Small Area Models with Application to Small Area Health Insurance Estimates,” JSM Proceedings, Government Statistics Section. American Statistical Association, Alexandria, VA.

Maples, J. (2011). “Using Small-Area Models to Improve the Design-Based Estimates of Variance for County Level Poverty Rate Estimates in the American Community Survey,” Research Report Series (Statistics #2011-02), Center for Statistical Research and Methodology, U.S. Census Bureau, Washington, D.C.

Slud, E. and Maiti, T. (2011). “Small-Area Estimation Based on Survey Data from Left-Censored Fay-Herriot Model,” Journal of Statistical Planning & Inference, 3520-3535.

Joyce, P. and Malec, D. (2009). “Population Estimation Using Tract Level Geography and Spatial Information,” Research Report Series (Statistics #2009-3), Statistical Research Division, U.S. Census Bureau, Washington, D.C.

Malec, D. and Maples, J. (2008). “Small Area Random Effects Models for Capture/Recapture Methods with Applications to Estimating Coverage Error in the U.S. Decennial Census,” Statistics in Medicine, 27, 4038-4056.

Malec, D. and Müller, P. (2008). “A Bayesian Semi-Parametric Model for Small Area Estimation,” in Pushing the Limits of Contemporary Statistics: Contributions in Honor of Jayanta K. Ghosh (eds. S. Ghoshal and B. Clarke), Institute of Mathematical Statistics, 223-236.

Huang, E., Malec, D., Maples J., and Weidman, L. (2007). “American Community Survey (ACS) Variance Reduction of Small Areas via Coverage Adjustment Using an Administrative Records Match,” Proceedings of the 2006 Joint Statistical Meetings, American Statistical Association, Alexandria, VA, 3150-3152.

Maples, J. and Bell, W. (2007). “Small Area Estimation of School District Child Population and Poverty: Studying Use of IRS Income Tax Data,” Research Report Series (Statistics #2007-11), Statistical Research Division, U.S. Census Bureau, Washington, D.C.

Slud, E. and Maiti, T. (2006). “Mean-Squared Error Estimation in Transformed Fay-Herriot Models,” Journal of the Royal Statistical Society-Series B, 239-257.

Malec, D. (2005). “Small Area Estimation from the American Community Survey Using a Hierarchical Logistic Model of Persons and Housing Units,” Journal of Official Statistics, 21 (3), 411-432.

 

Contact:

Jerry Maples, Ryan Janicki, Gauri Datta, Kyle Irimata, Bill Bell (ADRM), Eric Slud

 

Funding Sources for FY 2021-2025:

0331 – Working Capital Fund / General Research Project

Various Decennial, Demographic, and Economic Projects

Page Last Revised - January 3, 2024