In ecological studies, investigators often use existing
scientific knowledge to specify hypotheses or models, and then collect data at
a site of interest to test the hypotheses or fit the models. If collateral data
from nearby or similar sites exist, it is common practice to use this
information to make a judgmental assessment of the support for and against the
model/hypothesis, but otherwise not to incorporate these collateral data into
the analysis in a formal way.
For example, consider the situation in which a state
agency has maintained a statewide surface water quality monitoring network, and
a local community is interested in using some of these data to assess trends in
selected contaminants at sites within its jurisdiction. The common practice is
to use the data at each site for a site-specific trend analysis, while using
data from other nearby sites only in a comparative analysis or discussion. This
approach persists despite the fact that if variability in water quality at a
site is high, a long record of single-site observation is required to be
confident in a conclusion concerning change over time at that site.
A seemingly natural question of interest might be
whether collateral data at nearby sites can contribute to the site-specific
analysis other than in a comparative study. The answer often is “yes,” as a
consequence of exploiting the commonality (or exchangeability) among sites. On
the one hand, each field site has unique features associated with forcing
functions (e.g., watershed conditions and pollutant inputs) and with response
functions (e.g., water depth and hydraulic conditions). On the other hand, the
environmental sciences include common principles that should lead us to expect
similarity in ecosystem response to stresses, and implied in a discussion of
response at other nearby sites is often an expectation that these sites have
something in common with the site of interest.
As a result, it should often be possible to improve the single-site analysis
(i.e., reduce inferential error) by “borrowing strength” from other similar
sites. This may be accomplished using an empirical Bayes (or multilevel)
approach, in which collateral information (in the above example, the
assessment of trends at the other similar sites) is used to construct a
“prior” probability model that characterizes this information. Using Bayes'
theorem, the prior probability is then combined with a probability model for
the trend at the site of interest. In many instances, combining information
using empirical Bayes methods yields narrower interval estimates, and thus
stronger inferences, than would result if this information were ignored.
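The combination step can be sketched in a few lines of Python. This is a minimal illustration, not the method of any particular study: it assumes both the prior (summarizing trends at the other sites) and the estimate at the site of interest are normal with known variances, and the function name `combine_normal` and all numbers are hypothetical.

```python
def combine_normal(prior_mean, prior_var, site_est, site_var):
    """Combine a normal prior with a normal site-level estimate via Bayes' theorem.

    Assumes known, positive variances (an illustrative simplification).
    """
    if prior_var <= 0 or site_var <= 0:
        raise ValueError("variances must be positive")
    prior_prec = 1.0 / prior_var   # precision = 1 / variance
    site_prec = 1.0 / site_var
    # Posterior precision is the sum of the two precisions.
    post_var = 1.0 / (prior_prec + site_prec)
    # Posterior mean is the precision-weighted average of prior and site estimate.
    post_mean = post_var * (prior_prec * prior_mean + site_prec * site_est)
    return post_mean, post_var


# Hypothetical trend slopes: prior from nearby sites vs. a noisy single-site estimate.
post_mean, post_var = combine_normal(prior_mean=-0.5, prior_var=0.04,
                                     site_est=-1.0, site_var=0.16)
print(post_mean, post_var)
```

With these made-up inputs the posterior mean is about -0.6 and the posterior variance about 0.032, smaller than either input variance, which is the sense in which the interval estimate narrows.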
The strategy of “borrowing strength” from other
similar analyses is an attribute shared by several statistical methods. Bayesian inference, empirical
Bayes, and the classical method of random coefficients regression all have this
characteristic. Bayesian inference, of course, results from the application of
Bayes' theorem, which provides a logical framework for pooling information from
more than one source. Empirical Bayes (EB) methods also use Bayes' theorem, but
otherwise they are more classical (or frequentist) than Bayesian in that they
involve estimators and consider classical properties. In the typical parametric
empirical Bayes problem, we wish to simultaneously estimate parameters
µ_1, ..., µ_p (e.g., p means). The EB prior for this problem is often
exchangeable; that is, the prior belief for each of the parameters µ_i,
i = 1, ..., p, does not depend on the particular value of i (the prior belief
is the same for each parameter). With exchangeability, the prior model is
assumed to describe a simple underlying relationship among the µ_i, and Bayes'
theorem is used to define the EB estimators for the posterior parameters.
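As a rough illustration of the parametric EB problem just described, the sketch below shrinks p observed means toward their grand mean. It rests on assumptions not stated in the text: an exchangeable normal prior, a common and known sampling variance for every observed mean, and a simple method-of-moments estimate of the between-parameter variance. The name `eb_shrink` and the data are hypothetical.

```python
from statistics import mean, variance


def eb_shrink(estimates, sampling_var):
    """Parametric EB estimates of p means under an exchangeable normal prior.

    estimates: observed means y_1, ..., y_p (one per site or system), p >= 2.
    sampling_var: assumed common, known sampling variance of each y_i.
    """
    if sampling_var <= 0:
        raise ValueError("sampling_var must be positive")
    grand_mean = mean(estimates)  # estimate of the prior mean
    # Method of moments: the sample variance of the y_i is roughly
    # between-site variance plus sampling variance; truncate at zero.
    between_var = max(variance(estimates) - sampling_var, 0.0)
    # Shrinkage weight in [0, 1]: share of total variance due to sampling noise.
    shrink = sampling_var / (sampling_var + between_var)
    # Each estimate is pulled toward the grand mean by the same fraction.
    return [grand_mean + (1.0 - shrink) * (y - grand_mean) for y in estimates]


# Hypothetical observed means at four sites.
observed = [2.0, 4.0, 6.0, 8.0]
eb_estimates = eb_shrink(observed, sampling_var=1.0)
print(eb_estimates)
```

Because the prior is the same for every parameter, every observed mean is shrunk by the same fraction; when the observed means differ by little more than sampling noise, the shrinkage weight approaches 1 and the estimates collapse toward the grand mean.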
Exchangeability in the empirical Bayes set-up is a
particularly useful concept for simultaneous parameter estimation with a system
that has a hierarchical or nested structure. Examples of these systems are
plentiful. For instance, cross-sectional lake data may arise from individual
lakes (at the lowest level of the hierarchy) that are
located within ecoregions (at the next level of the hierarchy). Alternatively,
individual stream stations may be nested within a stream segment or nested
within a watershed. This nestedness implies a structure for the linkage of
separate sites or systems that could be exploited in a hierarchical model.
Empirical Bayes descriptions and applications
are less common than are Bayesian analyses in the statistics and ecology
literature. While most textbooks on Bayesian inference have sections treating
EB problems, these sections tend not to be emphasized, perhaps because EB
methods have frequentist attributes and do not require a “true” prior. See
Reckhow (1993, 1996) for ecological examples of empirical Bayes analysis.
Reckhow, K.H. 1993. A Random Coefficient Model for Chlorophyll-Nutrient Relationships in Lakes. Ecological Modelling. 70:35-50.
Reckhow, K.H. 1996. Improved Estimation of Ecological Effects Using an Empirical Bayes Method. Water Resources Bulletin. 32:929-935.