The following is a text-only version of the paper "Sampling Error Modelling of Poverty and Income Statistics for States" by Mark C. Otto and William R. Bell. All mathematical equations are presented in two forms in the paper: a text-symbolic form and an English-only form.

SAMPLING ERROR MODELLING OF POVERTY AND INCOME STATISTICS FOR STATES

Mark C. Otto and William R. Bell, U.S. Bureau of the Census
Mark Otto, Bureau of the Census, Rm. 3000-4 SRD, Washington, DC 20233-9100

Key Words: Small area estimation, Repeated surveys, Wishart distribution

1 Introduction

This paper develops models for sampling error covariance matrices of estimates of age group poverty rates, median income, and per capita income from the Current Population Survey's (CPS) March Supplement. (Not part of the normal CPS estimation procedures, the covariances for 1989 to 1993 were estimated by Bob Fay using vplex, a variance estimation program (Fay 1989).) The ultimate objective is to use the sampling error models developed here, in combination with models for the time series of poverty rates and income measures, to improve estimates of the "true" (unobserved) state poverty rates and income measures. Our models account for three features of the CPS sampling error covariance structure: (1) differences in the variances by state (through random state effects); (2) dependence of variances on sample size and on the level of the estimates (through a generalized variance function, GVF); and (3) sampling error correlations over time (through an autoregressive-moving-average (ARMA) time series model). Section 2 describes our general modelling approach. Then, in Section 3, we discuss details of the model development for the CPS application.

2 General Approach

A general model used in both time series and small area applications starts by writing y sub i equals capital Y sub i plus e sub i, where the y sub i's are direct survey estimators, the capital Y sub i's are the population characteristics ("truth") being estimated, and the e sub i's are the sampling errors in the y sub i's. In our application the single index i would index both states and years. In matrix-vector notation and assuming normality, the general model for the observed data, y equals the vector y sub one through y sub n, is

Equation 2.1:
Text-Symbolic:
y = Y + e, Y = (Y_1,...,Y_n)', etc.
Y = X*beta + u
u ~ N(0, Sigma(psi))
e ~ N(0, V(eta)) independent of u.
English-Only: y equals capital Y plus e, where capital Y equals the vector Y sub one through Y sub n, etc. Then, capital Y equals X times beta plus u, where u follows a normal distribution with mean zero and covariance matrix capital Sigma of psi, and e follows a normal distribution with mean zero and covariance matrix V of eta, independent of u.

The data assumed available are both y and C, the latter being a direct estimate of the variance-covariance matrix of the sampling errors e. The parameters of the model (2.1) are the p by 1 vector of regression parameters beta, and the r by one and m by one vectors of parameters psi and eta that determine the n by n covariance matrices capital Sigma of psi and V of eta. Having postulated a model of form (2.1), the task is to use the data y and C to make inferences about the parameters (beta, psi, eta) and, ultimately, about the true population quantities Y. In this paper we focus on the use of C to make inferences about eta, which we shall call sampling error modelling.
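To make the structure of (2.1) concrete, here is a minimal simulation sketch. It is not from the paper; the dimensions, design matrix, and diagonal covariances below are illustrative assumptions, with Sigma(psi) and V(eta) reduced to simple diagonal stand-ins.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative dimensions: 51 states x 5 years stacked into one vector.
n, p = 255, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])   # hypothetical regressors
beta = np.array([10.0, 1.5])                             # hypothetical regression parameters

# For simplicity take Sigma(psi) = psi * I and V(eta) diagonal with
# heteroscedastic sampling variances (a stand-in for a real GVF).
psi = 4.0
eta = rng.uniform(0.5, 2.0, size=n)

u = rng.normal(0.0, np.sqrt(psi), size=n)   # model error, u ~ N(0, Sigma(psi))
e = rng.normal(0.0, np.sqrt(eta), size=n)   # sampling error, e ~ N(0, V(eta))

Y = X @ beta + u    # true population characteristics
y = Y + e           # direct survey estimates actually observed
```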
It has been fairly standard in small area estimation to simply assume V = C is known (so eta corresponds to all distinct elements of V). This is not really a "model" and has the disadvantage that it fails to acknowledge any uncertainty about V. We shall use an approach suggested by Bell and Otto (1993), which is based on assuming a Wishart distribution (DeGroot 1970, Section 5.5) for C as a working model:

Equation 2.2:
Text-Symbolic: nu*C ~ Wishart(nu, V(eta))
English-Only: nu times C is distributed Wishart with parameters nu and V of eta.

This model allows us to recognize uncertainty about V by recognizing uncertainty about the parameters eta that determine V = V(eta). Generally, the degrees of freedom parameter nu will also be unknown. In some cases nu can be set to an estimated value (see Bell and Otto 1993). Alternatively, we could let nu depend on model parameters in eta (expanding the definition of eta, and writing nu of eta). In either case, we shall assume eta defines the unknown parameters of the model (Equation 2.2). Though estimates of sampling error variances and covariances are rarely of the simple form that would lead exactly to a Wishart distribution, the model (Equation 2.2) may still prove useful since it provides an objective means of using the data C in making inferences about eta, and hence about V. Under a classical (non-Bayesian) approach, the model (Equation 2.2) can be estimated by maximizing the Wishart likelihood,

Equation 2.3:
Text-Symbolic: L(eta | C) = g(nu) * |V(eta)|^{-nu/2} * e^{(-nu/2)*tr(V(eta)^{-1}C)},
English-Only: the Wishart likelihood of eta given C equals g of nu times the determinant of V of eta raised to the negative nu over 2, all times e raised to the open parenthesis negative nu over two times the trace of open parenthesis the inverse of V of eta times C close parenthesis close parenthesis,

where g of nu includes those terms in the Wishart density not explicitly present in (Equation 2.3) -- these involve nu but not V of eta. Under a Bayesian approach, (Equation 2.3) can be multiplied by a (possibly noninformative) prior distribution p of eta, which yields something proportional to the posterior p of eta given C. The CPS sample design is such that, since 1985, samples for different states are drawn independently. By ordering the observations y so that all the y sub i for each state occur in succession, V of eta, and correspondingly C, will be block diagonal. Thus, the diagonal blocks C sub s of C will be assumed to have independent Wishart distributions analogous to (Equation 2.2). The next section develops this model using generalized variance functions (GVFs) and time series models to define V of eta.
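As a computational illustration (our own sketch, not code from the paper), the following function evaluates the eta-dependent part of the log of the Wishart likelihood (Equation 2.3), summed over the independent state blocks C sub s; the function name and the assumption that V(eta) is supplied block by block are ours.

```python
import numpy as np

def wishart_loglik(C_blocks, V_blocks, nu):
    """Log of (Equation 2.3) up to the g(nu) term, summed over state blocks:
    sum_s [ -(nu/2) log|V_s(eta)| - (nu/2) tr(V_s(eta)^{-1} C_s) ]."""
    loglik = 0.0
    for C_s, V_s in zip(C_blocks, V_blocks):
        _, logdet = np.linalg.slogdet(V_s)
        loglik += -0.5 * nu * logdet - 0.5 * nu * np.trace(np.linalg.solve(V_s, C_s))
    return loglik
```

A classical fit would maximize this quantity over eta (for example with scipy.optimize.minimize applied to its negative), while a Bayesian analysis would add a log prior for eta.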
3 Modelling the CPS Sampling Errors

In this section we develop models for the sampling errors in the state CPS estimates of age group poverty rates, median income, and per capita income. Section 3.1 analyzes variation of the sampling error variances over states and years. Section 3.2 examines the relation of the variances to the estimates using generalized variance functions (GVFs). Section 3.3 analyzes the correlations of the sampling errors over time. These are all preliminary analyses to Sections 3.4 and 3.5, which put the results together into a full model for the state sampling error covariance matrices C sub s.

3.1 Preliminary Analyses of the Variances

To check whether the sampling error variances vary by state and by year, we did an analysis of variance (ANOVA) with the weighted log variances for each statistic. The variances, v sub s t, were weighted by the state sample sizes, n sub s t, defined as the number of households in sample for state s and year t. The log transformation was used to make the data more normal.

Equation 3.1:
Text-Symbolic: log(n_{st} * v_{st}) = log({Y_{st}}^2) + State_s + Year_t + z_{st}, s = 1, ..., 51 and t = 1989, ..., 1993.
English-Only: log of open parenthesis n sub s t times v sub s t close parenthesis equals log of open parenthesis capital Y sub s t squared close parenthesis plus state sub s plus year sub t plus z sub s t, where s is an integer between and including 1 and 51 and t is an integer between and including 1989 and 1993.

Since we do not have the true values, capital Y sub s t, we substitute the corresponding direct estimates y sub s t in (Equation 3.1). The usual ANOVA F-statistics are suspect for (3.1) because the z sub s t's may not be independent over time, so we concentrated on simply examining the mean squares (MS). The mean squares for Years are an order of magnitude smaller than those for States for the age 0 to 4 poverty rate and both income statistics, but only marginally smaller for the other poverty rates. We suspect that some of the variation over years and states can be captured by a generalized variance function (GVF) that permits a more general dependence of the variances on the level of the data than is accounted for by log of capital Y sub s t squared only. This may lessen the need for a State effect and even eliminate the need for a Year effect.

3.2 Generalized Variance Functions

We modelled the relation of the variances to their estimates using GVF models (Wolter, 1985, p. 203) that extend (Equation 3.1) as follows:

Equation 3.2:
Text-Symbolic: log(n_{st} * v_{st}) = log(GVF(Y_{st})) + State_s + Year_t + z_{st}, s = 1, ..., 51 and t = 1989, ..., 1993.
English-Only: log of open parenthesis n sub s t times v sub s t close parenthesis equals the log of the generalized variance function of open parenthesis capital Y sub s t close parenthesis plus state sub s plus year sub t plus z sub s t, where s is an integer between and including 1 and 51, and t is an integer between and including 1989 and 1993.

The GVFs take on one of the forms in Table 1. (We show the GVFs in terms of the variance, whereas Wolter gives them in terms of the relative variance.) Note that for the poverty rates, which are proportions, the usual binomial distribution theory suggests a model of the form alpha plus beta times capital Y plus gamma times capital Y squared with alpha equals zero and gamma equals negative beta. We use a bias corrected version of Akaike's AIC (Hurvich and Tsai, 1991) to discriminate between our models (the model with minimum AIC being favored). The AIC results in Table 1 show that for median income the AICs are not very different (within 2 of each other), except for (v), which is significantly worse. For per capita income, however, models (iii) and (iv) are preferred over the others. For the poverty rates, the AIC results in Table 1 show that all the GVFs tried are a major improvement over the constant relative variance model, (i), with models (iv) and (vi) consistently being the two best models. AICs for models (ii), (iii), and (v) are not that much higher, however, so our general conclusion is that while use of a GVF more general than the constant relative variance model is important, the particular choice among (ii) to (vi) may not be essential.

Table 1: AICs of GVF models
------------------------------------------------------------------------
Income Statistics
                                                                AIC
GVF                                                          Med     P.C.
------------------------------------------------------------------------
(i)   gamma times capital Y squared
      (gamma equals e to the mu in (Equation 3.1))          197.5   224.4
(ii)  alpha plus beta times capital Y                       196.8   227.4
(iii) alpha plus beta times capital Y plus gamma times
      capital Y squared                                     195.5   211.2
(iv)  the inverse of open parenthesis alpha plus beta
      over capital Y close parenthesis                      196.9   223.3
(v)   the inverse of open parenthesis alpha plus beta
      over Y plus gamma over Y squared close parenthesis    204.1   214.6
(vi)  alpha times capital Y raised to the power beta        197.1   227.6
------------------------------------------------------------------------
Poverty Rates
                                  AIC for Age
GVF        0-4         5-17        18-64       65+
------------------------------------------------------------------------
(i)      315.9        314.8       231.1      406.2
(ii)     217.4 (3)    254.2       217.3      361.2
(iii)    220.6        255.6       218.6      363.0
(iv)     217.4 (2)    250.4 (2)   215.1 (1)  359.8 (1)
(v)      220.6        253.4       218.4      363.2
(vi)     217.3 (1)    250.2 (1)   215.8 (2)  360.4 (2)
------------------------------------------------------------------------
The superscripts (1), (2), and (3) show the first, second, and third best
fitting models. The third model is shown only when its AIC is close to
those of the first two.
------------------------------------------------------------------------
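To illustrate how a GVF such as (iii) can be fit under (Equation 3.2), here is a minimal sketch of our own (not the paper's code). It regresses log of n sub s t times v sub s t on the log GVF by nonlinear least squares, omitting the State and Year effects for brevity; the starting values and the small-sample AIC correction used are common choices, not necessarily those of Hurvich and Tsai (1991).

```python
import numpy as np
from scipy.optimize import least_squares

def fit_gvf_iii(y, v, n_households):
    """Fit GVF (iii), alpha + beta*Y + gamma*Y^2, to log(n*v); y holds the
    direct estimates standing in for the true Y, v the estimated variances."""
    target = np.log(n_households * v)

    def resid(params):
        alpha, beta, gamma = params
        gvf = np.clip(alpha + beta * y + gamma * y**2, 1e-12, None)
        return target - np.log(gvf)

    fit = least_squares(resid, x0=np.array([0.01, 1.0, -1.0]))  # illustrative start
    n_obs, k = len(y), 3
    rss = np.sum(fit.fun**2)
    aic = n_obs * np.log(rss / n_obs) + 2 * k
    aicc = aic + 2 * k * (k + 1) / (n_obs - k - 1)   # a common small-sample correction
    return fit.x, aicc
```

Competing GVFs would be fit the same way and compared by their corrected AIC values, as in Table 1.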
3.3 Analysis of the Correlations

As noted earlier, in recent years CPS samples for different states have been drawn independently, so that sampling error correlations between states are approximately zero. Correlations over time between sampling errors in CPS estimates for any given state are generated by correlations in individual responses over time and by the nature of the CPS sample design. The 4-8-4 CPS rotation pattern leads to autocorrelation in monthly sampling errors that has been investigated for labor force characteristics by Train, Cahoon, and Makens (1978), Dempster and Hwang (1993), and Adam and Fuller (1992). Time series models for sampling errors in monthly CPS estimates have been developed by Tiller (1992) and Bell and Hillmer (1994). (Sampling error autocorrelation in monthly CPS estimates is also affected by composite estimation, which is not done for the annual March supplement estimates.) Sampling error autocorrelation in the annual CPS estimates that we analyze here should follow a simpler pattern, since the 4-8-4 monthly rotation scheme produces a 50% sample overlap one year apart, and no overlap two or more years apart. If samples comprising different rotation groups were selected independently, this would mean that sampling errors more than one year apart would be approximately uncorrelated, which would correspond to a moving average model of order one (MA(1)). However, two aspects of the CPS design can lead to sampling error autocorrelation extending beyond the sample overlap. The first is the practice of replacing households that rotate out of the sample by neighboring households, since neighbors probably exhibit correlation in economic characteristics such as income. The second aspect is the fact that primary sampling units (PSUs) are redrawn only for CPS redesigns that occur about every 10 years, so that the between PSU component of sampling error probably contributes autocorrelation for many years due to PSU overlap. Train, Cahoon, and Makens (1978) estimated nonzero sampling error correlations between monthly CPS estimates at time points with no sample overlap.
We thus want to examine estimated sampling error autocorrelations for evidence that they are nonzero at lags beyond one year. We postulated an autoregressive-moving average model of order (1,1) (ARMA(1,1)) as a more general structure to allow for this. If the autoregressive parameter is zero, this model reduces to the MA(1) model that corresponds to sampling error autocorrelation only at the one year lag where sample overlap occurs. Before determining an appropriate time series model to account for sampling error autocorrelation, two other questions should be addressed. First, does the sampling error autocorrelation appear stationary, that is, does the correlation of e sub s t and e sub s open parenthesis t minus l close parenthesis depend only on the lag l separating the two estimates, not on the year t? Second, do the autocorrelations show essentially the same pattern for all the states? Notice that building a separate time series model for each state is impractical, given that we have estimated sampling error autocorrelations only for five years for each state. ANOVA models for each measure were fit to investigate the contributions of state, lag, and year within lag effects to the variation in the (transformed) correlation estimates. The lag effects were estimated to be the most important, followed by the state effects. The mean squares of the year effects nested within the lags were at most 8 percent of the lag effect mean squares. This suggested that stationarity may be a reasonable assumption, and that any true variation over states in autocorrelation is secondary to the lag effect. Lastly, we estimated the means by lag over states and years of the transformed autocorrelations for each statistic and backtransformed them. These are shown in the following table:

Table 2: CPS Correlations by Lag
------------------------------------------------------------------------
                              Lag
Statistic               1      2      3      4
------------------------------------------------------------------------
Per Capita Income     0.53   0.31   0.27   0.23
Median Income         0.52   0.30   0.27   0.20
------------------------------------------------------------------------
Poor 0-4              0.37   0.14   0.12   0.09
Poor 5-17             0.36   0.19   0.17   0.12
Poor 18-64            0.40   0.22   0.19   0.15
Poor 65 and over      0.30   0.09   0.06   0.07
------------------------------------------------------------------------

These results show, as expected, the highest correlation at lag 1, but with evidence of additional correlation at lags 2 through 4. The patterns are reasonably consistent with the postulated ARMA(1,1) model, although the fit would be better if the correlations at lags 3 and 4 showed faster decay towards zero. An ARMA(1,2) model could be used to capture this pattern.
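As a quick check of this reasoning (our own illustration, not from the paper), the theoretical autocorrelations of an ARMA(1,1) process can be compared with the lag pattern in Table 2; the phi and theta values below are illustrative guesses, not the paper's estimates.

```python
import numpy as np

def arma11_acf(phi, theta, max_lag=4):
    """Theoretical autocorrelations of e_t = phi*e_{t-1} + eps_t - theta*eps_{t-1}."""
    rho1 = (phi - theta) * (1.0 - phi * theta) / (1.0 + theta**2 - 2.0 * phi * theta)
    return np.array([rho1 * phi**(lag - 1) for lag in range(1, max_lag + 1)])

# Illustrative parameters chosen to roughly match the per capita income
# correlations at lags 1 and 2 in Table 2 (0.53 and 0.31).
print(arma11_acf(phi=0.60, theta=0.10))   # approx [0.53, 0.32, 0.19, 0.11]
```

The implied lag 3 and 4 correlations fall off faster than the estimates in Table 2, which is the geometric-decay limitation that motivates also trying an ARMA(1,2).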
3.4 A Model for the State Sampling Error Covariance Matrices C sub s

The analyses of sections 3.1 to 3.3 established that the sampling errors in the CPS estimates of annual state income and poverty characteristics (i) are autocorrelated, with autocorrelation extending beyond lag 1, and (ii) are heteroscedastic, with variances depending on sample size, level (of estimates), and state. The model we shall use for the C sub s accounts for autocorrelation with a time series model (e.g., ARMA(1,1)), and for heteroscedasticity through scaling by sample size, use of a GVF, and use of state effects on variances. It is also possible that variances depend on time, apart from their dependence on level, and that autocorrelation varies by state. But these effects appear secondary to those we shall account for in our model, and they also would be difficult to include in the model. Thus, we shall ignore these possible effects in the remainder of the analysis. As noted in Section 2, our model for the C sub s will be based on the Wishart distribution, with a parametric representation of the expected value of C sub s: the expected value of C sub s equals V sub s of eta. To account for the effects noted above, V sub s of eta can take the following form:

Equation 3.3:
Text-Symbolic: V_s(eta) = omega_s * D_s(alpha, beta, gamma) * R(phi, theta) * D_s(alpha, beta, gamma)
English-Only: V sub s of eta equals omega sub s times D sub s of alpha, beta, and gamma times R of phi and theta times D sub s of alpha, beta, and gamma,

where omega sub s is the effect of state s on the covariances, D sub s of alpha, beta, and gamma is a diagonal matrix with entries corresponding to square roots of GVFs divided by sample sizes n sub s t (e.g., the square root of open parenthesis open parenthesis alpha plus beta times y sub s t plus gamma times y sub s t squared close parenthesis divided by n sub s t close parenthesis, with estimates y sub s t replacing the true values capital Y sub s t), and R of phi and theta is a 5 x 5 correlation matrix corresponding to a time series model such as the ARMA(1,1) (e sub s t equals phi times e sub s open parenthesis t minus one close parenthesis plus epsilon sub s t minus theta times epsilon sub s open parenthesis t minus one close parenthesis). The full model can then be defined by assuming nu times C sub s has a Wishart (nu, V sub s of eta) distribution, with V sub s of eta given by (Equation 3.3) and nu the degrees of freedom (assumed the same for each state). A problem with the model (Equation 3.3) is the number of state effect parameters omega sub s (51) for the amount of data available (effectively 255 estimated variances, since the 510 estimated autocorrelations don't contribute information about the omega sub s). In fact, some convergence problems were experienced when fitting the models (Equation 3.3) (requiring tinkering with initial values to resolve these), and it was suspected that these problems were due to the high ratio of parameters to data. As an alternative, to reduce the number of model parameters while still allowing for differing state effects omega sub s, the omega sub s are assumed to be random effects, with tau sub s equals one over omega sub s coming from a Gamma distribution constrained so that the expected value of omega sub s equals one. This implies a Gamma(a+1, a inverse) distribution for tau sub s defined by the one parameter a. Thus, the model is

Equation 3.4:
Text-Symbolic:
nu*C_s = W_s / tau_s
W_s ~ independent Wishart(nu, {V tilde}_s(eta))
tau_s ~ i.i.d. Gamma(a+1, a^{-1})
English-Only: nu times C sub s equals W sub s divided by tau sub s, where W sub s is distributed independent Wishart with parameters nu and V tilde sub s of eta, and tau sub s is distributed independently and identically Gamma with parameters a plus one and a inverse, with V tilde sub s of eta given by dropping omega sub s from (Equation 3.3). The degrees of freedom parameter, nu, is assumed common across states s.
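The following sketch (ours, using scipy; the parameter values and function names are illustrative assumptions) assembles V tilde sub s of eta as in (Equation 3.3) from a GVF and an ARMA(1,1) correlation matrix, then draws one C sub s under (Equation 3.4), reading the Gamma(a+1, a inverse) as shape a plus one and scale one over a so that the expected value of omega sub s equals one.

```python
import numpy as np
from scipy.stats import wishart, gamma
from scipy.linalg import toeplitz

def arma11_corr(phi, theta, T=5):
    """T x T correlation matrix R(phi, theta) implied by an ARMA(1,1)."""
    rho1 = (phi - theta) * (1 - phi * theta) / (1 + theta**2 - 2 * phi * theta)
    rho = np.concatenate(([1.0], [rho1 * phi**(k - 1) for k in range(1, T)]))
    return toeplitz(rho)

def V_tilde(y_s, n_s, alpha, beta, gam, phi, theta):
    """Equation (3.3) without omega_s: D_s * R * D_s, with D_s built from the
    GVF (alpha + beta*y + gamma*y^2) divided by the sample sizes n_s."""
    d = np.sqrt((alpha + beta * y_s + gam * y_s**2) / n_s)
    return np.diag(d) @ arma11_corr(phi, theta, T=len(y_s)) @ np.diag(d)

def draw_C_s(Vt, nu, a, rng):
    """One draw of C_s under (Equation 3.4): nu*C_s = W_s / tau_s."""
    W = wishart.rvs(df=nu, scale=Vt, random_state=rng)       # E[W] = nu * Vt
    tau = gamma.rvs(a + 1, scale=1.0 / a, random_state=rng)  # shape a+1, scale 1/a
    return W / (nu * tau)

rng = np.random.default_rng(1)
y_s = np.array([0.18, 0.20, 0.17, 0.19, 0.21])   # illustrative poverty rates, 1989-1993
n_s = np.array([900, 950, 920, 980, 1000.0])     # illustrative household sample sizes
Vt = V_tilde(y_s, n_s, alpha=0.0, beta=1.0, gam=-1.0, phi=0.6, theta=0.1)
C_s = draw_C_s(Vt, nu=20, a=25.0, rng=rng)       # nu and a roughly in the range of Table 3
```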
The density of C sub s can be shown to be

Text-Symbolic: p(C_s) = g(nu, a) * [a + (nu/2)*tr(({V tilde}_s(eta))^{-1}*C_s)]^{-(a+1+(nu*k/2))} * |{V tilde}_s(eta)|^{-nu/2} * |C_s|^{(nu-k-1)/2}
English-Only: the density of C sub s equals g of nu and a times open bracket a plus nu over 2 times the trace of open parenthesis open parenthesis V tilde sub s of eta close parenthesis inverse times C sub s close parenthesis close bracket raised to negative open parenthesis a plus one plus nu times k over two close parenthesis, all times open parenthesis the determinant of V tilde sub s of eta close parenthesis raised to negative nu over two, all times open parenthesis the determinant of C sub s close parenthesis raised to open parenthesis one half times open parenthesis nu minus k minus one close parenthesis close parenthesis,

where

Text-Symbolic: g(nu, a) = [pi^{k*(k-1)/4} * product_{j=1 to k}(Gamma((nu-j+1)/2))]^{-1} * (a^{a}*(Gamma(a))^{-1}) * Gamma(a+1+(nu*k/2)) * (nu/2)^{nu*k/2}.
English-Only: g of nu and a equals open bracket pi raised to open parenthesis k times open parenthesis k minus one close parenthesis divided by 4 close parenthesis all times the product from j equals one to k of Gamma of open parenthesis open parenthesis nu minus j plus one close parenthesis divided by 2 close parenthesis close bracket inverse, all times open parenthesis a raised to a close parenthesis divided by Gamma of a, times Gamma of open parenthesis a plus one plus nu times k over two close parenthesis, all times open parenthesis nu over two close parenthesis raised to open parenthesis nu times k over 2 close parenthesis.

The density of C sub s is the likelihood function for eta that can be maximized in classical inference or used to develop the posterior of eta for Bayesian inference. Note that as a goes to zero the Gamma(a+1, a inverse) distribution becomes diffuse, essentially letting the omega sub s be fixed (unrelated) state effects. As a goes to infinity the Gamma(a+1, a inverse) distribution becomes degenerate at 1, implying no state effects (omega sub s equals one for all s). As another way to understand the model (Equation 3.4), note that the Gamma(a+1, a inverse) distribution for tau sub s in (Equation 3.4) is the same as that of open parenthesis a plus one close parenthesis over a, times a Gamma(a+1, (a+1) inverse) random variable (note DeGroot 1970, p. 39), and Gamma(a+1, (a+1) inverse) is the same as the distribution of a chi squared random variable with 2 times open parenthesis a plus one close parenthesis degrees of freedom, divided by 2 times open parenthesis a plus one close parenthesis. Also, in the univariate (k=1) case, the Wishart distribution for W sub s in (Equation 3.4) becomes that of V tilde sub s of eta (now a scalar) times a chi squared random variable with nu degrees of freedom. It is thus easy to see that the distribution for C sub s in the univariate case is that of a divided by open parenthesis a plus one close parenthesis, all times V tilde sub s of eta, times an F(nu, 2 times (a+1)) random variable. For k greater than one then, apart from the a over open parenthesis a plus one close parenthesis factor, the distribution of C sub s implied by (Equation 3.4) is something like a multivariate generalization of the F distribution. (Though the label multivariate F has been used for distributions related to the joint distribution of only the diagonal elements of C sub s; see Johnson and Kotz (1972, pp. 240-243).) Unless a is "large," the model (Equation 3.4) implies a longer tail in the distribution of C sub s than the Wishart (or chi squared). This is needed to accommodate the variation across states.
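For estimation, the log of this density is what gets summed over states and maximized. A minimal numerical sketch (our own translation of the formula above into scipy, with the leading bracketed constant expressed through the multivariate gamma function) is:

```python
import numpy as np
from scipy.special import gammaln, multigammaln

def log_p_Cs(C_s, V_tilde_s, nu, a):
    """Log of the density p(C_s) given above; k is the dimension (here 5 years)."""
    k = C_s.shape[0]
    _, logdet_V = np.linalg.slogdet(V_tilde_s)
    _, logdet_C = np.linalg.slogdet(C_s)
    trace_term = np.trace(np.linalg.solve(V_tilde_s, C_s))
    # log g(nu, a); multigammaln(nu/2, k) = log[pi^{k(k-1)/4} * prod_j Gamma((nu-j+1)/2)]
    log_g = (-multigammaln(nu / 2.0, k)
             + a * np.log(a) - gammaln(a)
             + gammaln(a + 1.0 + nu * k / 2.0)
             + (nu * k / 2.0) * np.log(nu / 2.0))
    return (log_g
            - (a + 1.0 + nu * k / 2.0) * np.log(a + (nu / 2.0) * trace_term)
            - (nu / 2.0) * logdet_V
            + 0.5 * (nu - k - 1.0) * logdet_C)
```

Summing log_p_Cs over the 51 states gives the log likelihood in the GVF, ARMA, a, and nu parameters, which could then be maximized numerically (for example with scipy.optimize.minimize on its negative), as is done for the model variants compared in Section 3.5.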
A related random effect variance model was proposed by Kleffe and Rao (1992), and studied further by Arora and Lahiri (1993). The model used was simpler than (Equation 3.4), being for the univariate case and assuming independent sampling errors for different small area estimators. It also differed from (Equation 3.4) in that a distribution was assumed directly for the small area variances, rather than allowing random small area effects on the variances (as was done in assuming omega sub s random to get from (Equation 3.3) to (Equation 3.4)). The random variance distribution was left unspecified by Kleffe and Rao, while Arora and Lahiri assumed a gamma distribution for the precisions (reciprocals of the variances).

3.5 Sampling Error Model Estimation

We estimated by maximum likelihood 24 different variants of the model (Equation 3.4) for the sampling error covariance matrices, C sub s, for each statistic. The variants corresponded to all combinations of 8 GVFs and 3 ARMA models. The GVFs included those in Table 1, plus a constant variance GVF, and the CPS GVF, beta times capital Y plus gamma times capital Y squared, which is (iii) with alpha equals zero. The ARMA models tried were the AR(1), ARMA(1,1), and ARMA(1,2). Table 3 summarizes some of the results. Consistent with the preliminary results of Section 3.2, for the poverty rates the constant relative variance model fit poorly. Its AICs were higher than those of the best fitting GVF by about 90 to 180, depending on the age group. The constant variance model also fit poorly, its AICs being about 140 or 150 higher than the best. Among the other GVF models, the AIC differences were at most 5. The results for three candidate GVFs are shown in Table 3. The inverse of open parenthesis alpha plus beta over capital Y plus gamma over capital Y squared close parenthesis GVF had the lowest AIC for each age group except 0-4, for which the CPS variance formula, beta times capital Y plus gamma times capital Y squared, was best. However, as the AIC differences are not great, for the poverty rates any GVF other than constant variance or constant relative variance might be used. In particular, the CPS variance formula might be picked for its relatively good fit, familiarity, and theoretical appeal.
Table 3: AIC Differences and a and nu Parameter Estimates
------------------------------------------------------------------------
Poverty Rates
------------------------------------------------------------------------
Age    GVF                                                                                                  change AIC      a     nu
------------------------------------------------------------------------
0-4    alpha plus beta times capital Y plus gamma times capital Y squared                                        -0.4    47.2   20.2
0-4    beta times capital Y plus gamma times capital Y squared                                                   -2.0    46.8   20.2
0-4    the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis         0.0    46.6   20.2
5-17   alpha plus beta times capital Y plus gamma times capital Y squared                                         0.8    28.5   18.7
5-17   beta times capital Y plus gamma times capital Y squared                                                    2.6    27.9   18.7
5-17   the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis         0.0    27.8   18.8
18-64  alpha plus beta times capital Y plus gamma times capital Y squared                                         1.9   130.8   20.1
18-64  beta times capital Y plus gamma times capital Y squared                                                    4.6   113.3   20.0
18-64  the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis         0.0   146.1   20.1
65+    alpha plus beta times capital Y plus gamma times capital Y squared                                         0.2    16.6   14.7
65+    beta times capital Y plus gamma times capital Y squared                                                    0.3    16.3   14.7
65+    the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis         0.0    16.6   14.7
------------------------------------------------------------------------
Income
------------------------------------------------------------------------
Statistic    GVF                                                                                                change AIC      a     nu
------------------------------------------------------------------------
Per Capita   gamma times capital Y squared                                                                           22.3    17.4   22.3
Per Capita   alpha plus beta times capital Y plus gamma times capital Y squared                                       1.7    16.2   23.0
Per Capita   the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis       0.0    17.6   23.0
Median       gamma times capital Y squared                                                                            8.7    22.3   24.0
Median       alpha plus beta times capital Y plus gamma times capital Y squared                                       1.5    24.8   24.2
Median       the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis       0.0    25.5   24.2
------------------------------------------------------------------------
change AIC is the difference between the AIC of the given model and that of the inverse of open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis model for that statistic. a is the random effect parameter, and nu is the degrees of freedom.
------------------------------------------------------------------------

For the income statistics, the open parenthesis alpha plus beta over Y plus gamma over Y squared close parenthesis inverse GVF again provided the best fit, though the AICs for the alpha plus beta times Y plus gamma times Y squared and open parenthesis alpha plus beta over Y close parenthesis inverse GVFs were very close. Any of these three GVFs might be used. (Results for the first two of these GVFs are given in Table 3.) The other GVFs tried did not fit very well. However, the fit of the constant relative variance model (gamma times Y squared; see Table 3) was not terrible, so if strong weight were given to simplicity of the model, this GVF might be used. As for the time series models, our preliminary analysis in Section 3.3 was borne out. The slow decay of the autocorrelations (Table 2) was fit best by the ARMA(1,2) model. This model had the lowest AIC for all the statistics.
(For the 65+ poverty rates the AR(1) model achieved approximately the same AIC.) On average over the other models, the AICs were 15 higher for the ARMA(1,1) and 27 higher for the AR(1). This makes the ARMA(1,2) the clear choice. Estimates of the random effects parameter, a, and the degrees of freedom parameter, nu, showed significant variation between statistics (a more so than nu), but little variation between alternative (reasonably fitting) models for a given statistic. This result is encouraging; we would not like to see dependence of a or nu on the GVF or ARMA model chosen. Future research will look more closely at the nature of the random effects, and will explore methods of allowing for uncertainty about them when using the model (Equation 2.1) to make inferences about the true population quantities, capital Y sub s t.

4 References

Adam, A., and Fuller, Wayne A. (1992). "Covariance Estimators for the Current Population Survey," in Proceedings of the Section on Survey Research Methods, American Statistical Association, Washington, DC, 586-591.

Bell, W.R., and S.C. Hillmer (1994). "Applying time series models in survey estimation," submitted for publication.

Bell, W.R., and M.C. Otto (1993). "Bayesian assessment of uncertainty in seasonal adjustment with sampling error present," Statistical Research Division Research Report Series 92/12, Bureau of the Census.

Binder, D.A., S.R. Bleue, and J.P. Dick (1993). "Time series methods applied to survey data," presented at the 1993 ISI Meeting.

DeGroot, M.H. (1970). Optimal Statistical Decisions. McGraw-Hill Book Company, New York.

Dempster, A.P., and J.S. Hwang (1993). "Component Models and Bayesian Technology for Estimation of State Employment and Unemployment Rates," in Proceedings of the 1993 Annual Research Conference, U.S. Bureau of the Census, 571-581.

Fay, R.E. (1989). "Theory and application of replicate weighting for variance calculations," in Proceedings of the Section on Survey Research Methods, American Statistical Association, Washington, DC.

Ghosh, M., and N. Nangia (1993). "Estimation of median income of four-person families: a Bayesian time series approach," in Proceedings of the 1993 Annual Research Conference, U.S. Bureau of the Census, 555-570.

Ghosh, M., and J. Rao (1994). "Small area estimation: an appraisal (with discussion)," Statistical Science, 9(1), 55-93.

Hanson, R.H. (1978). The Current Population Survey: Design and Methodology, Technical Paper 40, U.S. Department of Commerce, U.S. Government Printing Office, Washington, DC.

Hurvich, C.M., and Chih-Ling Tsai (1991). "Bias of the corrected AIC criterion for underfitted regression and time series models," Biometrika, 78(3), 499-509.

Isaki, C.T., E.T. Huang, and J.H. Tsay (1991). "Smoothing adjustment factors from the 1990 post enumeration survey," in Proceedings of the Section on Social Statistics, American Statistical Association, Washington, DC, 338-343.

Johnson, N.L., and S. Kotz (1972). Distributions in Statistics: Continuous Multivariate Distributions. John Wiley & Sons, Inc., New York.

Prasad, N., and J. Rao (1990). "The estimation of mean squared error of small area estimators," Journal of the American Statistical Association, 85(409), 163-171.

Schaible, W.L. (ed.) (1993). Indirect Estimators in Federal Programs, Working Paper 21, Subcommittee on Small Area Estimation, Federal Committee on Statistical Methodology, Statistical Policy Office, Office of Management and Budget, Washington, DC.
Tiller, R. (1992). "Time series modeling of sample survey data from the U.S. Current Population Survey," Journal of Official Statistics, 8, 149-166.

Train, G., L. Cahoon, and P. Makens (1978). "The Current Population Survey Variances, Inter-Relationships, and Design Effects," in Proceedings of the Survey Methods Research Section, American Statistical Association, Washington, DC, 443-448.

Wolter, K.M. (1985). Introduction to Variance Estimation. Springer-Verlag, New York.