Many studies in
ecology, both experimental and observational, are designed to assess what may
be referred to as a “treatment effect.” The treatment effect can pertain to
such things as: the influence of various factors on growth rate of an organism,
the effect of a pollution control strategy on ambient pollutant concentrations,
or the effect of a newly-created herbicide on animal life. It is common
practice in these situations for the scientist to obtain data on the treatment
effect and use hypothesis testing to assess the statistical significance of the
effect.
In classical or
frequentist statistical analysis, hypothesis testing for a treatment effect is
often based on a point null hypothesis (which actually should be used only if it
is considered appropriate from a scientific standpoint). Typically, the point
null hypothesis is that there is no effect; it is often stated in this way as a
“straw man” that the scientist expects to reject on the basis of the data
evidence. To test the null hypothesis, data are obtained to provide a sample
estimate of the effect of interest and then to compute the observed value of the test
statistic. Following that, a table for the test statistic is consulted to
assess how unusual the observed value of the test statistic is, assuming that the null hypothesis is
true. If the observed value of the test statistic is unusual, that is, if it
is essentially incompatible with the null hypothesis, then the null hypothesis is
rejected.
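The procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are invented, the observations are assumed normal with a known standard deviation (so a z-test applies and no table lookup is needed), and the function name z_test_two_sided is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test_two_sided(sample, null_mean, sigma):
    """Return (z, p) for a two-sided z-test, assuming sigma is known."""
    n = len(sample)
    # Test statistic: standardized distance of the sample mean from the null value.
    z = (mean(sample) - null_mean) / (sigma / sqrt(n))
    # p-value: probability, under the null, of a statistic at least this extreme.
    p = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    return z, p


# Invented treatment-minus-control differences (illustrative only).
differences = [0.8, 1.9, -0.4, 1.2, 2.3, 0.1, 1.5, 0.9]

# Point null hypothesis: no treatment effect (mean difference of zero).
z, p = z_test_two_sided(differences, null_mean=0.0, sigma=1.0)
print(f"z = {z:.2f}, two-sided p = {p:.4f}")
```

A small p-value here plays the role of an "unusual" test statistic: the null hypothesis would be rejected at any conventional significance level larger than p.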
In classical
statistics, this assessment of the test statistic is based on the sampling
distribution for the test statistic. The sampling distribution is a probability
density function that is hypothetical in nature. In effect, it is a smoothed
histogram for the test statistic plotted for a large number of hypothetical samples
with the same sample size. Inference in classical statistics is based on the
distribution of estimators and test statistics in many (hypothetical) samples,
despite the fact that virtually all statistical investigations involve a single
sample. This hypothetical sampling distribution provides a measure of the frequency,
or probability, that a particular value, or range of values, for the test
statistic will be obtained across many such samples. In classical
statistics, we equate this long-run frequency to the probability for a
particular sample, before that sample is taken.
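The idea of a sampling distribution built from many hypothetical samples can be made concrete by simulation. The sketch below assumes normally distributed data with a known standard deviation and a true effect of zero; the sample size, the number of simulated samples, the observed value of 2.0, and the function name simulate_null_statistics are all illustrative assumptions.

```python
import random
from math import sqrt
from statistics import mean


def simulate_null_statistics(n, sigma, n_samples, seed=0):
    """Draw many hypothetical samples under the null and return their z statistics."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_samples):
        # One hypothetical sample of size n, generated with no treatment effect.
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        stats.append(mean(sample) / (sigma / sqrt(n)))
    return stats


null_stats = simulate_null_statistics(n=8, sigma=1.0, n_samples=20000)

# Hypothetical observed test statistic from the single real sample.
observed_z = 2.0

# Long-run frequency of statistics at least as extreme as the observed one.
tail_fraction = sum(abs(z) >= observed_z for z in null_stats) / len(null_stats)
print(f"fraction of hypothetical samples with |z| >= {observed_z}: {tail_fraction:.4f}")
```

A histogram of null_stats approximates the sampling distribution, and tail_fraction is the long-run frequency that classical inference treats as a probability for the one sample actually taken.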
There are two
problems with this approach that are addressed through use of Bayesian
statistical methods. The first is that the hypothesis test is based on a test
statistic that is at best indirectly related to the quantity of interest: the
truth (or probability of truth) of the null hypothesis. The p-value commonly
reported in hypothesis testing is the probability (frequency), given that the
null hypothesis is true, of observing values for the test statistic that are as
extreme as, or more extreme than, the value actually observed; in other words:
p(test statistic equals or exceeds k | H0 is true)
The scientist,
however, is interested in the probability of the correctness of the hypothesis,
given that he has observed a particular value for the test statistic; in other
words:
p(H0 is true | test statistic = k)
Classical
statistical inference does not provide a direct answer to the scientist's
question; Bayesian inference does.
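To see how the two quantities can differ, the following sketch computes both for the same hypothetical result. It rests on assumptions that are not part of the discussion above: a normal model with known standard deviation, a prior probability of 0.5 on the point null hypothesis, and a normal prior centered at zero with standard deviation tau for the effect under the alternative. The function name posterior_prob_null and all numbers are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def posterior_prob_null(xbar, n, sigma, tau, prior_null=0.5):
    """Posterior probability of H0: effect = 0, given the observed sample mean.

    Assumes xbar ~ Normal(effect, sigma^2 / n) and, under the alternative,
    effect ~ Normal(0, tau^2). Both are modeling assumptions of this sketch.
    """
    se = sigma / sqrt(n)
    # Density of the observed mean if the null hypothesis is true.
    like_null = NormalDist(0.0, se).pdf(xbar)
    # Marginal density of the observed mean under the alternative.
    like_alt = NormalDist(0.0, sqrt(tau**2 + se**2)).pdf(xbar)
    numerator = prior_null * like_null
    return numerator / (numerator + (1.0 - prior_null) * like_alt)


n, sigma, tau = 8, 1.0, 1.0
z_observed = 1.96                      # hypothetical "just significant" result
xbar = z_observed * sigma / sqrt(n)    # sample mean that yields this z

p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z_observed)))
post_null = posterior_prob_null(xbar, n, sigma, tau)
print(f"p-value = {p_value:.3f}, p(H0 | data) = {post_null:.3f}")
```

Under these particular prior choices the posterior probability of the null hypothesis comes out well above the p-value, so the two numbers should not be read interchangeably. Different priors would give different posterior probabilities.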
The second problem
relates to the issue of “conditioning,” which concerns the nature of the sample
information in support of the hypothesis. Bayesian hypothesis tests are
conditioned only on the sample taken, whereas classical hypothesis tests are
conditioned on other hypothetical samples in the sampling distribution (more
extreme than that observed) that could have been selected, but were not. The
Bayesian approach, of course, uses more than the sample as it also incorporates
prior information. However, the prior, while judgmental, does relate to the
hypothesis of interest, whereas the sampling distribution relates to logically
irrelevant, hypothetical samples. Clearly, the Bayesian approach is more
focused on the problem of interest to the ecologist.
Reckhow (1990) illustrates
the tendency of p-values to overstate the sample evidence against the null
hypothesis in an example concerning acidification of lakes.
https://www.researchgate.net/publication/249011176_Bayesian_Inference_in_Non-Replicated_Ecological_Studies?ev=prf_pub