Questions and Answers

Population Estimates Methods Conference - June 8, 1999

Post 2000 Census in France

Author: Michel Isnard
Presenter: Jean Dumais

Question(s): What are the criteria for defining metropolitan areas in France? How many metropolitan areas of one million or more population are there in France?

Response: An Urban Area captures the economic influence of an urban core. It comprises an urban core offering at least 5,000 jobs, together with all surrounding communities in which at least 40% of the working population is employed within the area (1990 Census definition).

There were 368 Urban Areas in 1990.

Urban Areas with at least 350,000 people in 1990 are listed below:

Urban Area - Household Population
PARIS - 10,293,096
LYON - 1,508,572
MARSEILLE-AIX - 1,345,521
LILLE - 1,079,390
BORDEAUX - 831,327
TOULOUSE - 797,313
NANTES - 609,231
NICE - 540,234
STRASBOURG - 519,203
GRENOBLE - 477,180
ROUEN - 459,177
TOULON - 456,186
RENNES - 430,160
NANCY - 393,152
MONTPELLIER - 378,150
VALENCIENNES - 370,128

Population Estimates for Small areas in the UK: Performance and Promise

Presenter: Stephen Simpson

Question(s): What kinds of specific difficulties are encountered in estimating the population of city areas?

Response: The result that city areas are more difficult to estimate is an empirical one: population estimates for small areas in United Kingdom city cores are less accurate than would be expected given their population size, their change over the past decade, and the presence of special student or armed forces populations. This is not a problem for city suburbs outside the city cores.

Our understanding is that the administrative records, which are involved to some extent in all estimates, are deficient in these city core areas more than in others and that this is the cause of the difficulty of estimation. They are deficient because of higher mobility of residents, and higher numbers of people who are hard to capture in administrative records - healthy young adults, adults who wish to avoid representation in official records, and no-income, no-residence adults who are not in the target population of some official records.

This result replicates the findings of many other studies from many other countries that urban areas are difficult to enumerate in censuses and to interview in surveys. In Australia, however, population estimates have been found on average to perform better in city areas (Andrew Howe, ABS), because administrative registers are incomplete in some rural areas. The specific weaknesses of the specific estimation data always have to be borne in mind.

Population and Housing Estimates for Census Blocks: The San Diego Experience

Presenter: Jeff Tayman

Question(s): How does your methodology account for boundary changes?

Response: We make adjustments to the geographic boundary files and apply them to the 1990 census base and the last year of the current estimates. If necessary, we partition the population and housing data into the new pieces, but often it just requires a change to the geographic code. In this way, we can compare to the census and have a base file for developing the next estimate. The downside is that data for any years between the census and the last estimate point are not based on the most current boundaries.
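As a rough illustration of the partitioning step, the sketch below splits one block's counts across new pieces in proportion to an assumed allocation weight (land-area or address shares, for example); the function and field names are hypothetical and this is not the actual San Diego processing system.

    # Hedged sketch: proportional partition of one block's counts into new pieces.
    # The weights might be land-area shares or address counts; that choice is an
    # assumption here, not the documented San Diego procedure.
    def partition_block(block_counts, piece_weights):
        """Allocate a block's population/housing counts to new geographic pieces."""
        total = sum(piece_weights.values())
        return {piece: {field: value * (weight / total)
                        for field, value in block_counts.items()}
                for piece, weight in piece_weights.items()}

    # Example: a 1990 block split into two pieces on a 60/40 basis.
    print(partition_block({"population": 250, "housing_units": 100},
                          {"0101A": 0.6, "0101B": 0.4}))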

Boundary changes are a problem. We realize that our current solution is not the best solution, but it is workable, albeit with a lot of effort when many boundary changes are involved. Fortunately, there have been relatively few boundary changes in San Diego County this decade. We are working on less labor-intensive solutions at present.

Population and Housing Estimates for Census Blocks: The San Diego Experience

Presenter: Jeff Tayman

Question(s): How often is the GQ list and long-term residents updated? Are there any problems with rapid change? Do you think that changes in occupancy rates can be modeled using changes in economic conditions (assuming that occupancy is the equilibrating factor when demand for labor declines), as a way of updating small area occupancy rates?

Response: We update the group quarters list as needed. We get good information on military, college, and prison group quarters. The military population, especially the shipboard population, can change significantly from year to year, but we can pick these changes up. I think we probably miss smaller changes in other kinds of group quarters, say changes of under 25 people, but large changes are usually reported by local agencies.

I think it is very possible to develop supply-demand models for vacancy rates for counties and higher levels of geography. In fact, our region-wide forecasting model includes such a model. Whether these relationships hold for subcounty areas is anyone's guess. A major problem in developing and maintaining such a model is the lack of subcounty economic information, especially current estimates. That said, I think it is bad practice to hold vacancy rates constant at their last-census values through major shifts in the economy, since those rates do not reflect current building trends. A start would be at least to adjust county-level rates and then use share or trend methods to estimate subcounty rates from the county-level change, as sketched below.
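A minimal numerical illustration of that share-style adjustment (my sketch, with made-up rates and area names, not the presenter's actual procedure):

    # Hedged sketch: carry a county-level change in the occupancy rate down to
    # subcounty areas by preserving each area's census relationship to the county.
    def update_subcounty_occupancy(census_area_rates, census_county_rate,
                                   current_county_rate):
        change_ratio = current_county_rate / census_county_rate
        return {area: min(rate * change_ratio, 1.0)  # cap at full occupancy
                for area, rate in census_area_rates.items()}

    # Example: the county occupancy rate fell from 0.95 to 0.92 since the census.
    print(update_subcounty_occupancy({"tract_1": 0.97, "tract_2": 0.90},
                                     census_county_rate=0.95,
                                     current_county_rate=0.92))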

Use of Property Tax Records and Household Composition Matrices to Improve the Housing Unit Method for Small Area Population Estimates

Presenter: Warren Brown

Question(s): I assume that your method calculates a projected PPH matrix by PPH * T, where T is the transition matrix. From where do you estimate the transition matrix? Or, which margin in the matrix gets updated over time?

Response: I am proposing the use of Household Composition matrices to improve small area estimates, not projections. They can be used in projections as well, but my focus in this project is on estimates.

I am working on two items:

  1. For small areas--such as tracts, block groups, and block group parts--are there a limited number of household composition patterns that can be identified using a data reduction technique such as cluster analysis?
  2. Are there typical paths by which these small areas change their household composition? For example, "college town" neighborhoods that do not change would be one type, and "Levittown" neighborhoods that change dramatically as they age in place would be another.

The transition matrices are calculated from decennial census data: the change in household composition that took place between 1980 and 1990 is used to derive the transition matrix.
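A minimal numerical sketch of that idea (illustrative counts and neighborhood types, not data from the study): cross-tabulate each small area's household-composition type in 1980 against its type in 1990, row-normalize to obtain the transition matrix, and apply it to a current distribution of types.

    # Hedged sketch: derive a household-composition transition matrix from two
    # census snapshots and apply it. All numbers and type labels are illustrative.
    import numpy as np

    types = ["college_town", "young_family", "aging_in_place"]

    # Cross-tab of small areas: rows = type in 1980, columns = type in 1990.
    crosstab = np.array([[40.,  5.,  5.],
                         [ 2., 30., 18.],
                         [ 1.,  4., 45.]])

    # Row-normalize to get transition probabilities T (1980 type -> 1990 type).
    T = crosstab / crosstab.sum(axis=1, keepdims=True)

    # Apply T to the current distribution of areas across types.
    current_shares = np.array([0.30, 0.40, 0.30])
    expected_shares = current_shares @ T
    print(dict(zip(types, expected_shares.round(3))))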

Use past paths as guides to expected changes in household composition, thereby informing the choice of the PPH multiplier in the Housing Unit Method for estimates. Investigate the use of the ACS as a source for monitoring changes in household composition for small areas, by type of neighborhood.



Design Alternatives for Building Block Estimates

Presenter: Ron Prevost

Question(s): Do you think that, by 2010, you will be able to use your matched administrative records for a nearly direct count? If not, what is the best use of these matched databases for estimates purposes-what methods do you favor: Synthetic, sample, ratio-correlation, or some other use of your records?

Response: We are creating a long-range research strategy to test administrative records as a design alternative for Census 2010. This strategy includes experimentation as part of Census 2000, in the form of a census simulation experiment, and the potential to enhance or expand the input data supporting the Intercensal Population Estimates Program. This strategy will be presented at the fall conference of the Federal Committee on Statistical Methodology.

Administrative records have been used in estimation operations for the majority of this century. It is difficult to say precisely which method will provide the best approach. The variety of potential applications could include the Housing Unit Method, Component Methods, Shift-Share Regression Models, or Micro-simulation.

My personal favorite is the Housing Unit Method because of its simple elegance and verifiability. The majority of customers employing and reviewing current estimates are not statisticians. Most individuals understand the concepts of housing units, vacancy, and household size. This type of method provides an avenue for developing lasting partnerships between the Census Bureau/FSCPE and local governments and data users, to understand, use, and improve the final products.
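For readers less familiar with it, the Housing Unit Method mentioned here reduces to a simple identity; the sketch below is a generic textbook version with illustrative inputs, not a specific Census Bureau or FSCPE implementation.

    # Hedged sketch of the generic Housing Unit Method identity:
    #   population = housing units x occupancy rate x persons per household
    #                + group quarters population
    # All input values are illustrative.
    def housing_unit_estimate(housing_units, occupancy_rate,
                              persons_per_household, gq_population):
        household_pop = housing_units * occupancy_rate * persons_per_household
        return household_pop + gq_population

    print(housing_unit_estimate(housing_units=12000, occupancy_rate=0.94,
                                persons_per_household=2.6, gq_population=800))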

Spatially Arrayed Growth Forces and Small Area Population Estimates Methodology

Presenter: Roger Hammer

Question(s): Are there any substantive assumptions or economic model behind your calculation of FG?

Response: The underlying assumption of our approach is that municipality-level population changes are influenced not only by the past behavior and contemporaneous indicators for the municipality of interest but also by the characteristics of neighboring areas. This constitutes something of a contagion model of population growth. The calculation of an adjusted set of estimates utilizing the characteristics of neighboring municipalities tests this assumption.

The formula for the "Force of Growth" incorporates several assumptions, although they are less substantive. First, the population density and the population growth rate are the relevant characteristics of neighboring municipalities with regard to population growth. Second, density and growth are specified as ratios comparing the neighboring area with the municipality of interest. Third, the "Force of Growth" is a multiplicative function of these ratios; and finally, density and growth are equally weighted.
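Based only on the assumptions listed above, the functional form would look something like the following (the notation is mine, not necessarily the authors' exact formula):

    % Hedged reconstruction from the stated assumptions (notation mine): for
    % municipality i and a neighboring municipality j, the density ratio and the
    % growth-rate ratio enter multiplicatively with equal weight.
    FG_{ij} = \left( \frac{D_j}{D_i} \right)^{1/2} \left( \frac{G_j}{G_i} \right)^{1/2}

The contributions from all of a municipality's neighbors would presumably then be combined into a single adjustment factor.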

In summary, I think that we are not making any strong assumptions and the assumptions that we are making can be empirically tested within the model in the preparation and comparison of a set of adjusted estimates.

Development of a National Accounting of Address and Housing Inventory: Baseline Information for Post-Censal Population Estimates

Presenter: Ching-Li Wang

Question(s): What are your recommendations for updating/geocoding remote areas, non-city-style, etc? Latitude/Longitude coordinates? Are there other forms of updating?

Response: Geocoding and updating remote, non-city-style addresses has been very challenging. The Bureau uses an address listing procedure and assigns an ID number to a spot on the map. Following the same procedure, we can do the following:

  1. Any new non-city-style address should be reported with a map spot;
  2. The assessor's offices provide the parcel code and other information associated with the parcel (such as coordinates and directions);
  3. Develop a locational system based on coordinates.

I think the coordinates will be a more precise locational reference.

Once this information is available, the Census Bureau can assign a temporary ID before the address is incorporated into the MAF and TIGER update.
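As a purely hypothetical illustration of the kind of interim record that could carry this information until the MAF/TIGER update, with field names of my own choosing:

    # Hedged sketch: a hypothetical interim record for a non-city-style address,
    # keyed by a temporary ID until it is incorporated into MAF/TIGER.
    from dataclasses import dataclass

    @dataclass
    class InterimAddressRecord:
        temporary_id: str     # Bureau-assigned placeholder ID
        parcel_code: str      # from the county assessor's office
        latitude: float       # coordinate-based locational reference
        longitude: float
        map_spot: str         # spot reference from the address listing map
        directions: str = ""  # directions supplied with the parcel, if any

    record = InterimAddressRecord("TMP-000123", "APN-52-118-07",
                                  44.9537, -93.0900, "Map sheet 12 / spot 7",
                                  "Second driveway past the county road junction")
    print(record)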

Applying Data from the American Community Survey (ACS) and the Master Address File (MAF) to the Intercensal Population Estimates Program

Presenter: Gregg Diffendal

Question(s): You talk about controlling ACS to ARSH estimates-which implies that ARSH estimates must be more "correct." Yet, a well-done survey should be good in and of itself-especially in capturing recent trends in migration. Shouldn't the updating be done in the other direction as well, letting ACS correct ARSH estimates? What are your thoughts on this?

Response: There are a variety of reasons that surveys use population controls in their weighting. I will attempt to describe a few of them.

Sampling theory tells us that if we know the total population (or any total for the universe) then the estimates will have smaller variances if we use the control in the weighting.
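A stripped-down illustration of that ratio adjustment (made-up weights and totals, not the actual ACS weighting procedure):

    # Hedged sketch: ratio-adjust survey weights so the weighted count hits a
    # known population control total within a weighting cell. Values are made up.
    def control_weights(weights, control_total):
        factor = control_total / sum(weights)
        return [w * factor for w in weights]

    # Example cell: five sampled persons whose weights sum to 480, against an
    # independent population control of 500.
    adjusted = control_weights([100, 95, 90, 105, 90], control_total=500)
    print(sum(adjusted))  # 500.0 by construction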

The ACS does not wish to be in conflict with the official estimates produced by the Census Bureau. It is best for us to be consistent, even if we are consistently wrong.

Part of our talk was to show how the ACS estimates could affect the population controls. For example, the results we saw for blacks in Rockland County would lead us to revise the estimates for blacks in future years.

When you start splitting the data by age, race, sex, and Hispanic origin, you may have very few sample cases in specific cells. If you are measuring the difference between migrants and non-migrants and looking at the residual, you are probably dealing with numbers that would not differ significantly from zero once you account for sampling error.

The ACS does ask about movers, and I think this could inform the migration component of the population estimates.

Surveys have to make a variety of assumptions that may not always be exactly correct. For example, there is probably undercoverage of new housing units, along with noninterviews and missing data for specific questions. Using population controls helps minimize the uncertainty and bias from these types of errors.

Presenters: Ron Prevost, Patty Becker, Ching-Li Wang, Gregg Diffendal, Warren Brown, Roger Hammer, Jean Dumais, Greg Williams

Question(s): It is well known and documented that a problem with LUCA updating was "conceptual differences across databases"-that is, the object being measured/described in one database is not exactly the object being measured in the other, causing significant matching difficulties. Now, this problem occurs in lots of databases-do you have any thoughts on how to attack this problem in general?

Responses:

Ron Prevost:

Some of the conceptual differences can only be resolved through an understanding of the data. An example might be converting a county assessor's file with plat numbers to housing unit numbers. We don't have a good way to perform this task without a file that could give us a "cross-walk." Another example is processing files containing business addresses, residential addresses, or a combination of the two. Our standard approach, discussed below, appears to accomplish that task, as well as handle a variety of address types and street aliases.

In general, ARRS's approach to address matching has been to run both the input and the target databases through an address sanitizer/standardizer that is CASS compliant. In our case we use Group 1/Code 1 software. In our latest national test, approximately 5% of all addresses did not match to this software's databases. We have started testing a process to enhance matching through the use of probabilistic matching software and a second run through Group 1/Code 1.

From early tests it appears that with this additional process we can accurately process about 50% of the addresses not handled in the first pass. We are analyzing output from these procedures and determining if these statistics are consistent across the nation. Furthermore, we plan to research what sort of biases might be inherent in the final 2% of all addresses.
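The general shape of that two-pass, standardize-then-match approach is sketched below in toy form; this is not the Group 1/Code 1 or ARRS software, and the abbreviation rules and similarity threshold are my assumptions.

    # Hedged toy sketch of a standardize-then-match pipeline: pass 1 is an exact
    # match on standardized addresses; pass 2 is a probabilistic (string-similarity)
    # match on the residuals. Not the actual ARRS/Group 1/Code 1 process.
    from difflib import SequenceMatcher

    ABBREVIATIONS = {"STREET": "ST", "AVENUE": "AVE", "NORTH": "N"}

    def standardize(address):
        tokens = address.upper().replace(".", "").replace(",", "").split()
        return " ".join(ABBREVIATIONS.get(t, t) for t in tokens)

    def match(input_addresses, target_addresses, threshold=0.9):
        targets = {standardize(a): a for a in target_addresses}
        matches, residuals = {}, []
        for addr in input_addresses:
            std = standardize(addr)
            if std in targets:                      # pass 1: exact match
                matches[addr] = targets[std]
                continue
            best = max(targets, key=lambda t: SequenceMatcher(None, std, t).ratio())
            if SequenceMatcher(None, std, best).ratio() >= threshold:
                matches[addr] = targets[best]       # pass 2: probabilistic match
            else:
                residuals.append(addr)              # left for manual review
        return matches, residuals

    print(match(["123 North Main Street", "45 Oak Av"],
                ["123 N MAIN ST", "45 OAK AVE"]))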

Once the basic street address processing has been completed, we will review unit identifiers. Staff have completed work for the MAF Quality Improvement Program (a national survey) and have tested probabilistic matching processes that convert the wide range of unit identifiers to a common identifier to improve matching. For example, "Upstairs"/"Downstairs" or "Front"/"Back" might be converted to 1 and 2.
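The unit-identifier conversion could look as simple as the mapping below; the specific conversions are just those named above, treated as hypothetical rules rather than an actual MAF standard.

    # Hedged sketch: map descriptive unit identifiers to a common numeric form so
    # that "Upstairs"/"Downstairs" or "Front"/"Back" compare as 1 and 2.
    UNIT_MAP = {"UPSTAIRS": "1", "DOWNSTAIRS": "2", "FRONT": "1", "BACK": "2"}

    def normalize_unit(unit_id):
        key = unit_id.strip().upper()
        return UNIT_MAP.get(key, key)

    print([normalize_unit(u) for u in ["Upstairs", "downstairs", "Apt 2"]])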

Note: The ability to process addresses through CASS-compliant software does not by itself guarantee that those records will match to the MAF. Presently we do not have a method to convert Post Office Boxes and General Delivery addresses to city-style street addresses or physical location addresses on the MAF. ARRS will be exploring techniques to accomplish this task in our 2000 experiments program (AREX2000).

Ching-Li Wang:

The basic problems in dealing with addresses are related to the address naming and numbering system and to how the addresses are reported or keyed into the database. In addition, addresses and street names change over time depending on local authority decisions. As a result, an address that could be matched in the past may no longer be matchable. Therefore, we need a data collection network through a Federal-State-Local Cooperative Program for Address Gathering. That is also why I would like to see the EAGLE (Enhance Address Gathering for Local Estimates) fly.

With the EAGLE, any information about changes or additions to addresses can be quickly transmitted to the Bureau through the network on a regular basis - like the "daily entry" in accounting practice. That is why I say it is important to have a National Accounting of Addresses and Housing Inventory.

Through the EAGLE and the data collection network, the Bureau will be in a better position to have constant local input. At the same time, the Bureau will have the opportunity to establish a nationally standardized address data system. With a standardized addressing system, address-level databases become comparable.

For the time being, it is necessary to analyze the addressing system in each database and develop various matching rules. Eventually, though, we need a new data input system to update MAF/TIGER. Struggling with different databases will not solve the addressing problems. I still feel we need a National Accounting System of Address and Housing Inventory.

Patty Becker:

Matching on street name can be very difficult when you don't know the area. It has taken us many years to develop a standard for the spelling of Detroit streets, and there are only about 2,000 of them.

In the LUCA context, it is necessary to standardize both the census MAF and the local MAF on a common basis in order to do the match. The easiest standard to use is CASS, the Post Office standard, although I often don't agree with what it does around here. In Detroit we have a routine to standardize other files to our standard street names.

In general, this is just one of the many pitfalls for LUCA which was not sufficiently anticipated ahead of time. Geography Division staff were surprised at how difficult it was to match, and they never gave local governments any guidance at all.

I should also note that commercial geocoding programs have ways of trying to get around this which may or may not result in an accurate geocode, and I've never been happy with them. For Detroit, of course, we have our own. The other thing that often happens with the commercial software is that there is a high fallout rate, and users don't know why it's there. So either they live with it (biasing their data in favor of the streets that don't cause problems) or else they just give up.
