
Spatial Analysis & Modeling

Motivation:

Data collected from large-scale surveys can often be used to produce high-quality estimates for large domains. However, data users are often interested in more granular domains or regions than the data can reasonably support: small samples can lead both to imprecise estimates and to unintended disclosure of respondent data. Indirect methods of inference, which utilize statistical models, latent Gaussian processes, and auxiliary data sources, have proven effective for improving the quality of published data products. In addition, these large data sets often exhibit a high degree of clustering and spatial correlation, which can be exploited to improve precision. Statistical modeling can incorporate spatial, multivariate, and temporal dependencies and integrate various data sources, both to improve quality and to produce new estimates in regions and sub-domains with sparse or no data.
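As a rough illustration of how an indirect, model-based estimate can borrow strength from auxiliary data, the sketch below uses a textbook area-level (Fay-Herriot-type) shrinkage estimator. It is not drawn from this page and is not presented as the methodology used in these projects. The function name, the example numbers, and the simplifying assumption that the random-effect variance `sigma2_v` is known are all hypothetical choices made for illustration.

```python
import numpy as np

def fay_herriot_shrinkage(y, X, D, sigma2_v):
    """Area-level shrinkage estimates, with the model variance treated as known.

    y        : direct survey estimates, shape (m,)
    X        : auxiliary covariates, shape (m, p)
    D        : sampling variances of the direct estimates, shape (m,)
    sigma2_v : variance of the area random effects (assumed known here)
    """
    y = np.asarray(y, dtype=float)
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)

    # Weighted least squares for the regression coefficients:
    # each area is weighted by the inverse of its total variance.
    w = 1.0 / (sigma2_v + D)
    beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

    # Regression-synthetic prediction from the auxiliary data alone.
    synthetic = X @ beta

    # Shrinkage weight: close to 1 when the direct estimate is precise,
    # close to 0 when its sampling variance is large.
    gamma = sigma2_v / (sigma2_v + D)

    estimate = gamma * y + (1.0 - gamma) * synthetic
    return estimate, gamma, synthetic

# Hypothetical data: four areas, the last two with noisy direct estimates.
y = np.array([10.0, 14.0, 9.0, 20.0])
X = np.column_stack([np.ones(4), [1.0, 2.0, 1.5, 3.0]])
D = np.array([0.5, 0.5, 4.0, 8.0])

est, gamma, syn = fay_herriot_shrinkage(y, X, D, sigma2_v=1.0)
```

In this sketch, areas with large sampling variance receive a small weight `gamma` and are pulled toward the regression prediction, while precisely measured areas stay close to their direct estimates. In practice the model variance would itself be estimated, and spatial or temporal dependence would enter through the random effects.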

 

Research Problems:

  • Statistical methodology for integration of data from various sources.
  • Development of unit-level models.
  • Incorporation of survey weights in statistical models.
  • Development of change-of-support methodology.
  • Development of computationally efficient methods for fitting models to non-Gaussian data.
  • Incorporation of spatially-correlated random effects in small area models.
  • Model-based methods for prediction at low geographic levels.
  • Mean-squared error, uncertainty, and interval estimation.
  • Synthesis of privacy protection and model-based inference.
  • Nonparametric covariance estimation.
  • Inference for irregularly spaced observations from locally-stationary random fields.

 

Current Subprojects:

  • Spatio-temporal methods for simultaneous shrinkage of both means and variances for small area estimation. (Holan, Janicki, Parker)
  • Developing Bayesian pseudolikelihood models for unit-level data obtained from a complex sample survey. (Holan, Janicki, Parker)
  • Development of unit-level models with temporal dependence. (Holan, Janicki, Parker)
  • Development of change-of-support methodology for inference on regions with no direct measurement, based on observations on a distinct geographic region or grid. (Holan, Janicki)
  • Incorporation of spatially-correlated random effects in small area models. (Datta, Janicki, Maples)
  • Development of model-based methods for improving survey estimates at low geographic levels, such as tract and block group. (Holan, Janicki, Parker)
  • Accurate measurement of the uncertainty associated with predictions from highly-complex models. (Holan, Janicki, Parker)
  • Integration of deep learning with spatial modeling. (Holan, Janicki, Parker)
  • Obtaining consistency results when observations are irregularly spaced. (Lahiri)
  • Generation of synthetic micro data from complex spatio-temporal models which preserves properties and dependencies found in the original data and can be published without disclosing confidential information. (Holan, Janicki, Parker)

 

Potential Applications:

  • Estimation of health insurance coverage by different demographic classifications at different geographic levels.
  • Creation of new custom tabulations of ACS data products.
  • Improvement of the precision of noisy measurements of census counts or other variables subject to disclosure avoidance techniques.
  • Methodology for producing public use synthetic micro data.

 

Accomplishments (October 2018-September 2020):

  • Developed a multivariate spatial mixture model for American Community Survey special tabulations which can be used to produce model-based predictions when the survey-specific sample size is insufficient, either due to privacy concerns or data quality concerns.
  • Developed spatial models for differentially private measurements of decennial census counts and ratios for improving precision and aggregating to marginal table cells.
  • Developed a spatial change-of-support model for predicting counts in regions where no direct response variable is available.

 

Short-Term Activities (FY 2021 – FY 2023):

  • Produce model-based estimates of 2010 decennial census counts using spatial models fit to differentially private measurements for nine target table shells.
  • Exploration of novel uses of auxiliary data and data integration for improved prediction and development of new data products.
  • Research the extent to which the use of spatial information and multivariate dependencies can reduce the impact of differential privacy on the precision of data products.
  • Development of software for efficiently fitting a variety of spatial, spatio-temporal, longitudinal, mixture, and other hierarchical Bayesian models.
  • Investigate new and efficient computational methods for fitting high-dimensional models.

 

Longer-Term Activities (beyond FY 2023):

  • Development of model-based methods for inference on very small domains, such as block groups, when the data are very sparse and are of sufficient quality for publication.
  • Development of efficient methods for producing special tabulations of survey data which meet the U.S. Census Bureau’s data quality standards.
  • Development of methodology for producing estimates at non-standard geographies such as American Indian and Alaska Native areas and school districts.
  • Methodology for producing synthetic microdata which can be made publicly available for data users.

 

Selected Publications:

Parker, P., Holan, S. H., and Janicki, R. (2023). “Comparison of Unit Level Small Area Estimation Modeling Approaches for Survey Data Under Informative Sampling,” Journal of Survey Statistics and Methodology, Vol 11, No. 4, 858-872.

Parker, P., Holan, S. H., and Janicki, R. (2023). “A Comprehensive Overview of Unit Level Modeling of Survey Data for Small Area Estimation Under Informative Sampling,” Journal of Survey Statistics and Methodology, Vol 11, No. 4, 829-857.

Parker, P., Holan, S. H., and Janicki, R. (In Press). “Conjugate Modeling Approaches for Small Area Estimation with Heteroscedastic Structure,” Journal of Survey Statistics and Methodology.

Janicki, R., Holan, S. H., Irimata, K. M., Livsey, J., and Raim, A. (In Press). “Spatial Change of Support Models for Differentially Private Decennial Census Counts of Persons by Detailed Race and Ethnicity,” Journal of Statistical Theory and Practice.

Parker, P., Holan, S. H., and Janicki, R. (2022). “Computationally Efficient Bayesian Unit-Level Models for Multivariate Non-Gaussian Data Under Informative Sampling,” Annals of Applied Statistics, 16, 887-904.

Janicki, R., Raim, A., Holan, S. H., and Maples, J. (2022). “Bayesian Nonparametric Multivariate Spatial Mixture Mixed Effects Models with Application to American Community Survey Special Tabulations,” Annals of Applied Statistics, 16, 144-168.

Parker, P., Holan, S. H., and Janicki, R. (2020). “Conjugate Bayesian Unit-Level Modeling of Count Data Under Informative Sampling Designs,” Stat, 9, e267.

 

Contact:

Ryan Janicki, Soumen Lahiri, Paul Parker, Scott Holan (ADRM), Serge Aleshin-Guendel

 

Funding Sources for FY 2021-2025:  

0331 – Working Capital Fund / General Research Project

Various Decennial, Demographic, and Economic Projects


Page Last Revised - January 3, 2024