Monday, August 5, 2013

Assessing Water Quality Standards Compliance – A Bayesian Approach

If a water quality management plan is developed and implemented to meet a water quality standard, monitoring is usually the basis for assessing compliance and determining if any management modifications are needed. Yet, we know that lags in implementation of plans, lags in pollutant concentration change and/or in biotic response, measurement uncertainty, and natural variability all may lead to errors in inferences based on measurements. This has led some in the water quality modeling community to recommend the use of models to assess progress. However, all water quality models have prediction uncertainties, some of which can be quite large. So, which assessment is more reliable – the model forecast or the monitoring data?
We believe that both assessments can, and should, be used to evaluate compliance and the adequacy of management actions. That is, even though the model is just that (a model), and even though it will always yield uncertain predictions, it has value in forecasting impacts (otherwise we would not be using it to develop the management plan). Likewise, lags, natural variability, and measurement uncertainty do not prevent useful inferences from being derived from measurements. Qian and Reckhow (2007) present a Bayesian approach for pooling pre-implementation model forecasts with post-implementation measurements to assess compliance with the relevant water quality standard. In its simplest form, this Bayesian approach involves a variance-weighted combination of the model forecast and the post-implementation monitoring data.
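To make the idea of a variance-weighted combination concrete, here is a minimal sketch in Python. It assumes that both the model forecast and the monitoring summary can be treated as normal distributions with known variances (for example, on the log-concentration scale); the function name and all numbers are hypothetical illustrations, not values from Qian and Reckhow (2007).

```python
def combine_normal(prior_mean, prior_var, data_mean, data_var):
    """Variance-weighted (precision-weighted) combination of two normal estimates.

    Each source is weighted by the inverse of its variance, so the less
    uncertain source pulls the combined estimate more strongly.
    """
    w_prior = 1.0 / prior_var   # weight of the model forecast
    w_data = 1.0 / data_var     # weight of the monitoring data
    post_var = 1.0 / (w_prior + w_data)
    post_mean = post_var * (w_prior * prior_mean + w_data * data_mean)
    return post_mean, post_var


# Hypothetical values on the natural-log scale (assumed for illustration only).
post_mean, post_var = combine_normal(
    prior_mean=3.0, prior_var=0.25,   # model forecast: uncertain
    data_mean=2.6, data_var=0.05,     # monitoring data: more precise
)
print(post_mean, post_var)
```

In this sketch the combined mean lands closer to the monitoring data because that source has the smaller variance, and the combined variance is smaller than either input variance.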


Bayes Theorem lies at the heart of Bayesian inference; it is based on the use of probability to express knowledge and the combining of probabilities to characterize the advancement of knowledge. The simple, logical expression of Bayes Theorem stipulates that, when combining information, the resultant (or posterior) probability is proportional to the product of the probability reflecting a priori knowledge (the prior probability) and the probability representing newly acquired data/knowledge (the sample information, or likelihood function). Expressed more formally, Bayes Theorem states that the posterior probability for “y” conditional on experimental outcome “x” (written p(y|x)) is proportional to the probability of y before the experiment (written p(y)) times the probabilistic outcome of the experiment (written p(x|y)):
$$p(y \mid x) \propto p(y)\, p(x \mid y) \tag{1}$$

Here, we are interested in whether chlorophyll concentrations in the Neuse River Estuary will be in compliance with North Carolina’s water quality standard of 40 μg/l chlorophyll, following implementation of management actions to reduce nitrogen input to the estuary. Many states in the US require that the probability of exceeding a water quality standard be less than 10%, so compliance with the chlorophyll standard is said to be achieved if there is less than a 10% chance (probability) of chlorophyll exceeding 40 μg/l.
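The 10% exceedance rule can be checked directly once a probability distribution for chlorophyll is available. The sketch below assumes, for illustration, that the natural log of chlorophyll concentration is normally distributed; the function names and the example mean and standard deviation are hypothetical, not estimates for the Neuse River Estuary.

```python
import math


def exceedance_probability(log_mean, log_sd, standard=40.0):
    """P(chlorophyll > standard), assuming ln(chlorophyll) ~ Normal(log_mean, log_sd)."""
    z = (math.log(standard) - log_mean) / log_sd
    # Upper-tail probability of the standard normal via the complementary error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def is_compliant(log_mean, log_sd, standard=40.0, max_prob=0.10):
    """Compliance means the exceedance probability is below max_prob (10%)."""
    return exceedance_probability(log_mean, log_sd, standard) < max_prob


# Hypothetical distribution: median of 15 ug/l, log-scale standard deviation of 0.6.
p = exceedance_probability(log_mean=math.log(15.0), log_sd=0.6)
print(f"P(chlorophyll > 40 ug/l) = {p:.3f}, compliant: {is_compliant(math.log(15.0), 0.6)}")
```

Under these assumed values the exceedance probability is roughly 5%, so the hypothetical distribution would be judged compliant; a wider distribution or a higher median would push the probability toward or past the 10% threshold.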
For the Neuse River Estuary, two models were developed to assess the impact of management actions to reduce nitrogen loading. The first, SPARROW (SPAtially Referenced Regressions On Watershed attributes), was used to predict nitrogen loading to the estuary; the second, NeuBERN (a Bayes network), was developed to predict chlorophyll a concentrations in the estuary, based on the SPARROW nitrogen load predictions. As a result of application of these models, a plan for nitrogen load reduction was developed that was expected to achieve compliance with the chlorophyll standard. Once the nitrogen load reduction plan was implemented, monitoring of chlorophyll in the estuary was initiated to help assess compliance.
As stated above, the pre-implementation predictions from the models were augmented with post-implementation chlorophyll a measurements from the estuary to better assess the probability of compliance. Qian and Reckhow (2007) illustrated the methods for combining model predictions and monitoring data using the Neuse River data as an example. The basis of their work is the repeated application of Bayes Theorem (Equation 1) as a vehicle to combine information from different sources. Prediction of chlorophyll concentration based on the linked SPARROW-NeuBERN model output constitutes the prior probability distribution of chlorophyll concentration in the estuary. Monitoring data for chlorophyll obtained after implementation of the management plan provide the likelihood function. The resulting posterior distribution (recall Equation 1) represents the combined information from model output and monitoring data.
Bayes Theorem was applied in a sequential manner on an annual basis beginning with data collected in 1992. When data from the next year (1993) became available, the posterior distribution developed in the previous time step (1992) became the prior distribution for assessing compliance during the next time step (1993). This iterative Bayesian updating process, presented sequentially in Figure 1 and summarized in Figure 2, represents a natural mechanism for information accumulation that can be effective in assessing changes in water quality status.
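The year-by-year updating can be sketched as a simple loop in which each year's posterior becomes the next year's prior. This example is a simplification that assumes normally distributed log-chlorophyll with a known observation variance, which is not necessarily the model used by Qian and Reckhow (2007); the `update` helper, the starting forecast, and the yearly observations are all made up for illustration.

```python
def update(prior_mean, prior_var, obs, obs_var):
    """Normal prior + normal data (known variance obs_var) -> normal posterior."""
    n = len(obs)
    sample_mean = sum(obs) / n
    post_precision = 1.0 / prior_var + n / obs_var
    post_mean = (prior_mean / prior_var + n * sample_mean / obs_var) / post_precision
    return post_mean, 1.0 / post_precision


# Made-up log-chlorophyll observations by year (NOT the Neuse River data).
yearly_log_chl = {
    1992: [2.9, 3.1, 2.7, 3.0],
    1993: [2.8, 2.6, 3.0, 2.9],
    1994: [2.7, 2.8, 2.5, 2.9],
}

# Hypothetical model forecast used as the first prior (log scale).
prior_mean0, prior_var0 = 3.4, 0.3 ** 2
obs_var = 0.5 ** 2  # assumed known observation variance

mean, var = prior_mean0, prior_var0
history = []
for year in sorted(yearly_log_chl):
    # The posterior from the previous year serves as this year's prior.
    mean, var = update(mean, var, yearly_log_chl[year], obs_var)
    history.append((year, mean, var))

for year, m, v in history:
    print(year, round(m, 3), round(v, 4))
```

With each added year the posterior variance shrinks and the posterior mean moves from the model forecast toward the observations. Under these assumptions, updating year by year gives the same final posterior as a single update using all the data at once.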
In Figure 1, the solid bell-shaped curve is the prior distribution estimated for each year, the dashed bell-shaped curve is the posterior distribution, and the vertical line is the North Carolina chlorophyll standard of 40 μg/l. The posterior distribution represents a weighted average of the prior and the monitoring data, with the weights reflecting the uncertainty in each. It is evident in the graph on the upper left of Figure 1 that our model-predicted chlorophyll a concentration distribution in the Neuse River Estuary differed from the 1992 observed chlorophyll concentrations (displayed in logarithmic scale in Figure 1), which are represented by the histogram. However, as the Bayesian updating analysis proceeds through the 1990s, the updating of each year’s prior with new data causes the prior and posterior distributions to gradually merge. The composite analysis presented in Figure 2 summarizes the gradual convergence of sequentially updated posterior distributions to the distribution represented by the data histogram.


Figure 1. Sequential updating of chlorophyll a concentration distributions in the Neuse River Estuary, presented in natural logarithm scale. Each panel represents one year. The solid bell-shaped lines are the prior distributions, the dashed bell-shaped lines are the posterior distributions, and the histograms are the annual monitoring data. The North Carolina chlorophyll water quality standard is shown by the vertical line segment.

Figure 2. The sequentially updated posterior distributions are shown to converge to the distribution represented by the data histogram.

In summary, the use of Bayes Theorem yields a model/data consensus on water quality in the Neuse River Estuary. Natural variability (exacerbated by the hurricanes that often strike this area of North Carolina) causes the chlorophyll observations (Figure 1) to fluctuate from year to year. The initial modeling effort, along with Bayesian updating, brings stability to the assessment. As a result, we can be confident that chlorophyll concentrations in the estuary achieved compliance with the water quality standard.

Qian, S., and K.H. Reckhow. 2007. Combining model results and monitoring data for water quality assessment. Environmental Science and Technology 41:5008-5013.