Monday, December 23, 2013

How “Mechanistic” are Mechanistic Water Quality Models?

Mechanistic surface water quality models have been developed by scientists and engineers as mathematical descriptions of hydrologic and ecologic processes. Mechanistic modelers have tended to concentrate on the mathematical expression of theory, probably as a consequence of: (1) scientific interest and challenge, (2) a belief that the theory was reasonably well understood and that this understanding could be expressed mathematically, (3) limited available data to fit and evaluate models, and (4) limited resources to collect additional data. For these reasons, model coefficients and reaction rates in mechanistic models generally are intended to characterize actual processes and are not (prior to model “tuning”) intended to be empirically fitted constants (which might be considered an “effective” value for a model parameter).

Since the parameters of mechanistic models are intended to describe real processes, it may be assumed that an experimental study of a particular process can yield a parameter estimate that can be inserted directly into the model. In some cases, it is acknowledged that a reaction rate or coefficient in a model is affected by certain conditions in a waterbody (e.g., turbulence), and thus adjustments must be made to the experimentally based value. However, if the model truly is a complete mechanistic description of the system of interest, then adjustment should be unnecessary; this is the underlying belief of modelers who advocate development of “physically-based” models.

However, given the relative simplicity of all simulation models in comparison to the complexity of nature, it seems reasonable to question the legitimacy of any "mechanistic" mathematical description of surface water quality. Further, given limitations in data and in scientific knowledge, it seems reasonable to question even the goal of striving for a model that need not be calibrated. The correctness of model structure, the knowledge of the model user, and the availability of experimental and observational evidence all influence parameter choice for mechanistic models. Unfortunately, knowledge and data are too often extremely limited, making the choice of parameters and of important model processes guesswork to a distressingly large degree. The example presented below is not reassuring with respect to these two issues: (1) scientific support for the selection of model parameters, and (2) scientific support for the specification of appropriate model functional relationships.

One of the basic functions in an aquatic ecosystem model is phytoplankton settling. An early example of its use is in the model proposed by Chen and Orlob (1972):

     V(dC1/dt) = QC1,in - QC1 + EA(∂C1/∂x) + (µ1 - R1 - s1 - M1)C1V - µ2F2,1C2V
  where:

          V = segment volume (m3)
          C1 = phytoplankton concentration (g/m3)
          C1,in = phytoplankton concentration in the inflow (g/m3)
          Q = flow rate (m3/t)
          E = diffusion coefficient (m2/t)
          A = segment surface/bottom area (m2)
          x = distance along the direction of transport (m)
          µ1 = phytoplankton growth rate (t-1)
          R1 = phytoplankton respiration rate (t-1)
          s1 = phytoplankton settling rate (t-1)
          M1 = phytoplankton mortality rate (t-1)
          µ2 = zooplankton growth rate (t-1)
          C2 = zooplankton concentration (g/m3)
          F2,1 = fractional feeding preference of zooplankton for phytoplankton

Other models are quite similar; a common alternative is to treat phytoplankton settling as a velocity term with an areal loss:

     phytoplankton settling (mass/time) = v1AC1

     where v1 = phytoplankton settling velocity (m/t)

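For a well-mixed segment, the rate form used by Chen and Orlob (s1, in t-1) and the velocity form (v1, in m/t) are interchangeable: dividing the settling velocity by the mean segment depth gives the equivalent first-order settling rate. The short Python sketch below illustrates that equivalence; the depth, area, velocity, and concentration values are hypothetical and chosen only for illustration.

    # Equivalence of the velocity (areal-loss) and first-order-rate forms of
    # phytoplankton settling for a well-mixed segment. All values hypothetical.
    A = 2.0e5    # segment surface/bottom area (m2)
    H = 4.0      # mean segment depth (m)
    C1 = 1.5     # phytoplankton concentration (g/m3)
    v1 = 0.5     # settling velocity (m/d)

    V = A * H        # segment volume (m3)
    s1 = v1 / H      # equivalent first-order settling rate (1/d)

    loss_velocity_form = v1 * A * C1   # settling loss (g/d), velocity form
    loss_rate_form = s1 * V * C1       # settling loss (g/d), rate form
    print(loss_velocity_form, loss_rate_form)   # both 150000.0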
To understand some of the problems with the current approach for parameter determination in mechanistic surface water quality models, it is useful to examine this process further. For that purpose, "phytoplankton settling velocity" provides a good example. Phytoplankton, or algae, are important in aquatic ecosystems, and thus one or more phytoplankton compartments are found in most mechanistic surface water quality models concerned with nutrient enrichment. Phytoplankton settling is one of the key mechanisms for removal of phytoplankton from the water column.

Stokes' law provides the starting point for the mathematical characterization of phytoplankton settling. Few models, however, employ Stokes' law; instead, a simple constant settling velocity (in units of length/time) is commonly used. To apply a model with this settling velocity term, a modeler must either measure phytoplankton settling directly or select a representative value from another study. Since field measurement of phytoplankton settling is a difficult task, use of literature-tabulated values is standard practice.
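For reference, Stokes' law gives the settling velocity of a small sphere in quiescent water as v = (2/9) g (ρcell - ρwater) r^2 / µ, with µ here denoting the dynamic viscosity of water. The sketch below evaluates it for a hypothetical cell; the radius and excess density are assumptions, and real phytoplankton are not ideal spheres, so this is only an idealized order-of-magnitude estimate.

    # Stokes' law settling velocity for a small sphere in quiescent water.
    # The cell radius and density are hypothetical illustration values.
    g = 9.81            # gravitational acceleration (m/s2)
    rho_water = 1000.0  # water density (kg/m3)
    rho_cell = 1100.0   # cell density (kg/m3), hypothetical
    r = 10.0e-6         # cell radius (m), i.e., 10 micrometers, hypothetical
    mu = 1.0e-3         # dynamic viscosity of water (kg/m/s)

    v = (2.0 / 9.0) * g * (rho_cell - rho_water) * r**2 / mu   # m/s
    print(v * 86400.0)  # roughly 1.9 m/d for these assumed values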

Probably the most thorough listing of suggested values for phytoplankton settling velocity continues to be Bowie et al. (1985), which presents an extensive table of reported values by algal type (see the table below). Bowie et al. note that under quiescent conditions in the laboratory, phytoplankton settling is a function of algal cell radius, shape, density, and special cell features such as gas vacuoles and gelatinous sheaths. In natural water bodies, water turbulence can be quite important. In two- or three-dimensional models with hydrodynamic simulation, turbulence is accounted for in the model equations; in zero- or one-dimensional models, the effect of turbulence on phytoplankton settling must usually be incorporated into the choice of settling velocity.

Such information is typically the full extent of the technical guidance that modelers consider when selecting this parameter from a reference like the Bowie et al. table. The range of values in the table is substantial, even within a single algal type category (e.g., diatoms). The cell size, shape, and other features mentioned in the previous paragraph can vary from species to species within a single type category, which may account for some of the variability in the table. However, even if the modeler who must choose a point estimate has data identifying the dominant species in a water body at a particular time and location, dominance is apt to change with time and location. Further, models contain at most a few distinct phytoplankton compartments, so a choice must still be made concerning the species to be modeled and their characteristics.

Examination of the original references from which the table was created does little to inform the parameter selection process. Most of the references summarized in the table do not present observational studies of phytoplankton; rather, they are simulation model studies, and the value for phytoplankton settling velocity listed in the table is simply the value chosen for the model. In some of the references checked, little or no basis was provided for the choice. When a rationale was given, it was usually to adopt or adjust the few values presented in the literature from experimental studies, or to adopt a value from another modeling study. In one way or another, virtually all of the values presented in the table appear to have some dependency on the early experimental work of Smayda and Boleyn (1965) and other work by Smayda.

Unfortunately, evaluation studies of simulation models have provided little insight into good point estimates for this parameter. Observational data on surface water quality are almost always inadequate for testing functional relationships and assessing parameter choices. Typical observational data sets are noisy, with few measurements of each of only a few variables. In the case of phytoplankton settling velocity, observational data are apt to consist of phytoplankton cell densities at various dates, times, and areal locations, but probably not at multiple depths. Since phytoplankton are also removed from the water column through consumption by higher food chain organisms, the observational data do not permit separate identification of the removal mechanisms.

Given this situation, modelers have relied almost exclusively on the few experimental laboratory studies and on their own judgment concerning adjustments to those values. For one-dimensional models without explicit modeling of hydrodynamics, the chosen value may be as much as an order of magnitude higher than the laboratory values. Two- or three-dimensional models with hydrodynamics may incorporate the unadjusted laboratory value. After early modeling studies presented chosen values, these values were sometimes adopted in subsequent studies without comment (in effect, "default" values were identified). Thus, there is probably much less information in the columns of the table than the number of reported values implies.

In summary, the choices for phytoplankton settling velocity appear to be based on ad hoc adjustments to a few values measured under controlled conditions. There is virtually no field confirmation of choices made for parameters individually (as opposed to collectively). This situation is fairly typical of the state of the art in mechanistic surface water quality simulation modeling.


Bowie, G.L., Mills, W.B., Porcella, D.B., Campbell, C.L., Pagenkopf, J.R., Rupp, G.L., Johnson, K.M., Chan, P.W.H., Gherini, S.A., and Chamberlin, C.E., 1985. Rates, Constants, and Kinetics Formulations in Surface Water Quality Modeling. U.S. Environmental Protection Agency, EPA/600/3-85/040.

Chen, C.W. and Orlob, G.T., 1972. Ecologic Simulation for Aquatic Environments. Office of Water Resources Research, U.S. Dept. of the Interior, Washington, DC.

Smayda, T.J. and Boleyn, B.J., 1965. Experimental observations on the flotation of marine diatoms. Part I: Thalassiosira cf. nana, T. rotula and Nitzschia seriata. Limnol. Oceanogr., 10:499-510.


Monday, December 9, 2013

Dealing Effectively with Uncertainty

Are we better off knowing about the uncertainty in outcomes from proposed actions? That is, will our decisions generally be better if we have some idea of the range of possible outcomes that might result? I have always thought so, and yet current practice in water quality modeling and assessment suggests that others feel differently or perhaps believe that uncertainty is small enough so that it can be safely ignored.

Consider my experience from many years ago. While in graduate school, I became involved in a proposed consulting venture in New Hampshire. As a young scientist, I was eager to “shake up the world” with my new scientific knowledge, so I suggested to my consulting colleagues that we add uncertainty analysis to our proposed 208 (remember the Section 208 program?) study. Everyone agreed; thus we proposed that uncertainty analysis be a key component of the water quality modeling task for the 208 planning process. Well, after we made our presentation to the client, the client’s first question was essentially, “The previous consultants didn’t acknowledge any uncertainty in their proposed modeling study, what’s wrong with your model?” This experience made me realize that I had much to learn about the role of science in decision making and about effective presentations!

While this story may give the impression that I’m being critical of the client for not recognizing the ubiquitous uncertainty in environmental forecasts, in fact I believe that the fault lies primarily with the scientists and engineers who fail to fully inform clients of the uncertainty in their assessments. Partially in their defense, water quality modelers may fail to see why decision makers are better off knowing the forecast uncertainty, and modelers may not want to be forced to answer an embarrassing question like the one posed to me years ago in New Hampshire.

For this situation to change, that is, for decision makers to demand estimates of forecast error, decision makers first need: (1) motivation – they must become aware of the substantial magnitude of forecast error in many water quality assessments, and (2) guidance – they must have simple heuristics that will allow them to use this knowledge of forecast error to improve decision making in the long run. Once this happens, and decision makers demand that water quality forecasts be accompanied by error estimates, water quality modelers can support this need through distinct short-term and long-term strategies.

Short-term approaches are needed because most existing water quality models are incompatible with complete error analysis as a result of overparameterization; thus, short-term strategies should be proposed for: (1) conducting an informative, but incomplete, error analysis, and (2) using that incomplete error analysis to improve decision making. In the long term, recommendations can be made to: (1) restructure the models so that a relatively complete error analysis is feasible, and/or (2) employ Bayesian approaches that are compatible with adaptive management techniques, which provide the best approach for improving forecasts over time.

In the short-term, if knowledge, data, and/or model structure prevents uncertainty analysis from being complete, is there any value in conducting an incomplete uncertainty analysis? Stated another way, is it reasonable that decision making will be improved with even partial information on uncertainties, in comparison to current practice with no reporting of prediction uncertainties? Often, but not always, the answer is “yes,” although the usefulness of incomplete uncertainty characterization, like the analysis itself, is limited.

Using decision analysis as a prescriptive model, we know that uncertainty analysis can improve decision making when prediction uncertainty is integrated with the utility (or loss, damage, net benefits) function to allow decision makers to maximize expected utility (or maximize net benefits). When the uncertainty analysis is incomplete (and, perhaps more likely, when the utility function is poorly characterized), the concepts of decision analysis may still provide a useful guide.

For example, triangular distributions could be assessed for uncertain model terms, and assuming that parameter covariance is negligible (which unfortunately may not be the case), limited systematic sampling (e.g., Latin hypercube) could be used to simulate the prediction error. The result of this computation may either overestimate or underestimate the error, but it does provide some indication of error magnitude. However, this information alone, while perhaps helpful for identifying research and monitoring needs, is not sufficient for informed decision making. The approximate estimates of prediction uncertainty need to be considered in conjunction with decision maker attitudes toward risk for key decision variables.
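As an illustration of the mechanics, the sketch below propagates triangular distributions for two uncertain terms through a deliberately simple, hypothetical steady-state model (predicted concentration = loading / (outflow + settling velocity × area), loosely in the spirit of simple lake nutrient budget models) using a basic Latin hypercube scheme. The model, the distributions, and all parameter values are assumptions chosen only to show the approach, not recommendations for any real application.

    # Sketch: propagating triangular parameter uncertainty through a simple,
    # hypothetical steady-state model using Latin hypercube sampling.
    # Assumed model: C = W / (Q + vs * A), with
    #   W  = mass loading (g/yr), Q = outflow (m3/yr),
    #   vs = effective settling velocity (m/yr), A = lake area (m2).
    import numpy as np

    rng = np.random.default_rng(1)
    n = 1000  # number of Latin hypercube samples

    def lhs_uniform(n, k, rng):
        """Latin hypercube sample of n points in k dimensions on (0, 1)."""
        u = np.empty((n, k))
        for j in range(k):
            u[:, j] = (rng.permutation(n) + rng.random(n)) / n
        return u

    def triangular_ppf(u, a, m, b):
        """Inverse CDF of a triangular distribution with min a, mode m, max b."""
        fm = (m - a) / (b - a)
        lower = a + np.sqrt(u * (b - a) * (m - a))
        upper = b - np.sqrt((1.0 - u) * (b - a) * (b - m))
        return np.where(u < fm, lower, upper)

    # Hypothetical triangular distributions (min, mode, max) for the two
    # uncertain terms; covariance between them is assumed negligible.
    u = lhs_uniform(n, 2, rng)
    vs = triangular_ppf(u[:, 0], 5.0, 10.0, 20.0)      # m/yr
    W = triangular_ppf(u[:, 1], 8.0e6, 1.0e7, 1.5e7)   # g/yr

    Q = 5.0e6   # outflow (m3/yr), treated as known here
    A = 1.0e6   # lake area (m2), treated as known here

    C = W / (Q + vs * A)   # predicted concentration (g/m3), one value per sample

    print("median prediction:", np.median(C))
    print("90% interval:", np.percentile(C, [5, 95]))

The resulting percentile interval, rather than the single point prediction, is what would be carried into the risk discussion that follows.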

Implicit in this attitude toward risk is an expression of preferences concerning tradeoffs. For example, are decision makers (or stakeholders, or other affected individuals and groups) risk averse with respect to ecological damage, such that they are willing to increase project costs in order to avoid species loss? If a reasonable quantification of prediction uncertainty were available for the decision attribute (loss of an endangered species), then the prediction might be expressed as “there’s a 40% chance of loss of this species with plan A, but only a 5% chance of loss with plan B.” When the costs of the plans are also considered, the tradeoff between species loss and cost is augmented by an awareness of risk that comes from the prediction uncertainty characterization. Risk is not evident from deterministic (point) predictions of the decision attributes, so the decision is likely to be better informed with the risk assessment that prediction uncertainty makes possible.
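To make that tradeoff concrete, the short sketch below combines those illustrative loss probabilities with assumed plan costs and an assumed monetized penalty for losing the species, and computes an expected loss for each plan. The dollar figures and the penalty are hypothetical; a strongly risk-averse decision maker would, in effect, assign a much larger penalty to species loss.

    # Sketch: comparing two plans when prediction uncertainty is expressed as a
    # probability of species loss. All costs and the species-loss penalty are
    # hypothetical values chosen only to illustrate the expected-loss calculation.
    plans = {
        # plan: (project cost in $M, probability of species loss from the model)
        "A": (10.0, 0.40),
        "B": (25.0, 0.05),
    }

    species_loss_penalty = 100.0  # assumed monetized penalty for species loss ($M)

    for name, (cost, p_loss) in plans.items():
        expected_loss = cost + p_loss * species_loss_penalty
        print(f"Plan {name}: cost = {cost:.0f} $M, P(loss) = {p_loss:.2f}, "
              f"expected loss = {expected_loss:.1f} $M")

    # With these assumptions, plan B (25 + 0.05*100 = 30) is preferred to
    # plan A (10 + 0.40*100 = 50), even though plan A is cheaper up front.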

In the long run, a better strategy is to restructure the models, emphasizing the development of models that are compatible with the need for error propagation and adaptive assessment/management. Bayesian (probability) networks are particularly suitable for this task (see http://kreckhow.blogspot.com/2013/07/bayesian-probability-network-models.html), as are simulation techniques that address the problem of equifinality resulting from overparameterized models (see: http://kreckhow.blogspot.com/2013/06/an-assessment-of-techniques-for-error.html).

No one can claim that scientific uncertainty is desirable; yet no one should claim that scientific uncertainty is best hidden or ignored. Estimates of uncertainty in predictions are not unlike the point estimates of predicted response: like the point predictions, the uncertainty estimates contain information that can improve risk assessment and decision making. The approaches proposed above will not eliminate this uncertainty, nor will they change the fact that, due to uncertainty, some decisions will yield consequences other than those anticipated. They will, however, allow risk assessors and decision makers to use the uncertainty to structure the analysis and present the scientific inferences in an appropriate way. In the long run, that should improve environmental management and decision making.