Tuesday, April 30, 2013

Is Conventional Water Quality Model Verification A Charade?


In the development and application of water quality models, it is standard practice to set aside data not used in calibration for model verification. The reasoning is that the set-aside data test the model under new conditions and thus indicate how the model will perform when applied for prediction. How plausible is this reasoning?

Consider the situation where a model is calibrated with data from 2010-2011, and then data from 2012 are used for verification. What is likely to be different between these calibration and verification data sets? Will these differences be sufficient to give us confidence that the calibrated model can be relied upon for predictions when important forcings/inputs (e.g., pollutant loadings to a waterbody) change?

In essentially all cases, the major differences between the 2010-2011 and 2012 data sets are likely to lie in natural forcing functions such as hydrology, temperature, and solar radiation. It is extremely unlikely that the forcing functions that are the focus of the model application, such as land use/land cover (LULC) changes in a watershed or point source pollutant discharges, will change very much over so short a period. To the extent that pollutant loads to a waterbody change over this time, it will largely be due to changes in hydrology.

So, conventional water quality model verification has become basically a charade. This situation is not the fault of modelers; rather, it is simply a consequence of limited available data. Nonetheless, water quality modelers who employ this approach need to be more candid about its limited value.

As an alternative, here is the basis for a statistical test that could provide a measure of the rigor in model verification. To begin, consider the figure below displaying histograms of dissolved oxygen (DO) data for model calibration and verification:

The next figure overlays the calibration and verification histograms for Case 1; notice how similar they are. The lack of difference between these two data sets indicates that “verification” lacks rigor; essentially, the model is being re-assessed with calibration-like data.

Now consider Case 2 below:

An overlay of the two histograms, shown below, indicates that the calibration and verification data sets are different, which suggests that verification is more rigorous than in Case 1. However, note that the verification data in Case 2 show DO to be lower than for model calibration. Since model applications are quite likely to address improved water quality and higher dissolved oxygen, the verification test may be rigorous, but it does not reflect conditions expected for model use.



Now consider Case 3 below:



In Case 3, the histogram of verification data is again different from the histogram of calibration data, and this time the verification DO values are higher than the calibration DO values, which better reflects the conditions the model will likely be asked to predict.
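Before turning to the test itself, here is a minimal sketch of how such an overlay can be produced. The dissolved oxygen values are purely hypothetical stand-ins for real calibration and verification data, and the Case 3 pattern (higher verification DO) is assumed:

```python
# Minimal sketch of a calibration/verification histogram overlay.
# The DO values below are hypothetical stand-ins, not real monitoring data.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(1)
do_calibration = rng.normal(loc=6.0, scale=1.0, size=200)   # hypothetical 2010-2011 DO (mg/L)
do_verification = rng.normal(loc=7.0, scale=1.0, size=100)  # hypothetical 2012 DO (mg/L), Case 3 pattern

bins = np.linspace(2.0, 11.0, 19)
plt.hist(do_calibration, bins=bins, alpha=0.5, label="Calibration (2010-2011)")
plt.hist(do_verification, bins=bins, alpha=0.5, label="Verification (2012)")
plt.xlabel("Dissolved oxygen (mg/L)")
plt.ylabel("Count")
plt.legend()
plt.title("Overlay of calibration and verification DO histograms")
plt.show()
```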

In conclusion, to evaluate the rigor of the verification exercise, I recommend that modelers apply a Kolmogorov-Smirnov test or a chi-square test to quantitatively assess the difference between the calibration and verification data sets. If this becomes routine practice, the accumulated results will provide a comparative basis for confidence that a water quality model can reliably predict water quality in response to management changes.
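As a concrete illustration, here is a minimal sketch of both tests using SciPy; the DO arrays are the same hypothetical stand-ins used in the plotting sketch above, not real data:

```python
# Minimal sketch of the two recommended comparisons between calibration and
# verification data, using hypothetical DO values in place of real data.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
do_calibration = rng.normal(loc=6.0, scale=1.0, size=200)   # hypothetical 2010-2011 DO (mg/L)
do_verification = rng.normal(loc=7.0, scale=1.0, size=100)  # hypothetical 2012 DO (mg/L)

# Two-sample Kolmogorov-Smirnov test: are the two samples drawn from the same distribution?
ks_stat, ks_p = stats.ks_2samp(do_calibration, do_verification)
print(f"K-S: D = {ks_stat:.3f}, p = {ks_p:.3g}")

# Chi-square alternative: bin both samples on a common grid and compare the counts.
bins = np.linspace(2.0, 11.0, 10)
calib_counts, _ = np.histogram(do_calibration, bins=bins)
verif_counts, _ = np.histogram(do_verification, bins=bins)
table = np.array([calib_counts, verif_counts])
table = table[:, table.sum(axis=0) > 0]   # drop bins with no observations in either sample
chi2_stat, chi2_p, _, _ = stats.chi2_contingency(table)
print(f"Chi-square: stat = {chi2_stat:.3f}, p = {chi2_p:.3g}")

# A small p-value indicates the verification data genuinely differ from the
# calibration data (Cases 2 and 3); a large p-value signals a Case 1 situation,
# where "verification" amounts to re-testing the model on calibration-like data.
```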



Friday, April 26, 2013

Scientific Uncertainty and Risk Assessment


One of the commonly mentioned approaches for policy analysis is "risk assessment." In brief, there is often a recommendation in state and federal governments that proposed regulations and policies be evaluated using risk assessment before approval. This has sparked a good deal of concern and suspicion - "What could those shady politicians be up to?" many people might wonder. Some people worry that a requirement for risk assessment is just a clever way to reduce environmental protection and to bog down regulatory efforts in seemingly endless scientific analysis and review.

Perhaps so. But I see a lesson and an opportunity in this. One thing that we gain from risk assessment is an appreciation of the magnitude of the uncertainty in the science surrounding environmental management and decision making. It is distressing that essentially all decisions affecting environmental management reflect incomplete or inaccurate science. For example, it is unfortunately true that scientists cannot predict with great confidence the effect of land use changes on water quality. Yet we generally rely on those predictions to guide total maximum daily load (TMDL) decisions. What should we do? Forget the scientific input because it's not terribly good?

No. The lesson from risk assessment is that we should demand from scientists an estimate of the goodness of their science. This means that we must ask scientists questions such as "How good is that prediction?" or request that scientists "Give us a range of numbers that reflects the scientific uncertainty." Then, as citizens or as decision makers, we need to use this information on scientific uncertainty to work toward improved environmental management. How do we do that?

Well, here's an example from everyday life. All of us have made decisions on outdoor activities in consideration of the forecast for rain. In deciding whether to hold or postpone an outdoor activity, we typically seek (scientific) information on such things as the probability (reflecting uncertainty) of rain. Further, it is not uncommon to hear the weather forecast on the evening news, but still defer a final decision on the activity until an updated weather prediction in the morning (in other words, get more sample information).

Beyond consideration of the scientific assessment in the weather forecast, we also think about how important the activity is to us. Do we really want to participate in the activity, such that a little rain will not greatly reduce our enjoyment? Or, is the activity of only limited value, such that a small probability of rain may be enough so that we choose not to participate?

Every day, we make decisions based on an interplay, or mix, of uncertainty in an event (e.g., rain) and value (enjoyment) of an activity. We are used to weighing these considerations in our minds and deciding. These same considerations--getting new information on the weather (which is analogous to supporting new scientific research, as in adaptive management), and deciding how valuable the activity is to us (which is what we determine through cost/benefit analysis)--are key features of risk assessment. So let us move from our informal, everyday risk assessment to formal, scientific risk assessment, and identify the lesson and the opportunity as they relate to environmental management.
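To make the interplay concrete, here is a toy, back-of-the-envelope version of the rain decision; all of the numbers are invented purely for illustration:

```python
# Toy expected-value version of the everyday rain decision.
# All values are hypothetical and chosen only to illustrate the idea of
# weighing uncertainty (probability of rain) against value (enjoyment).
p_rain = 0.40              # forecast probability of rain (the "scientific" uncertainty)
value_if_dry = 100.0       # enjoyment if we hold the event and it stays dry
value_if_rain = 20.0       # enjoyment if we hold the event and it rains
value_if_postponed = 60.0  # value of postponing to a sure-thing alternative

expected_value_hold = p_rain * value_if_rain + (1 - p_rain) * value_if_dry
decision = "hold the event" if expected_value_hold > value_if_postponed else "postpone"

print(f"Expected value of holding: {expected_value_hold:.1f} vs. postponing: {value_if_postponed:.1f}")
print(f"Decision: {decision}")
```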

To me, the lesson in risk assessment is to recognize that the science in support of environmental management is usually uncertain, and sometimes highly uncertain. But the opportunity that is provided by risk assessment should result in improved decision making. To accomplish this, we must first require scientists to quantify or estimate the scientific uncertainty. Then we must require our decision makers to use the estimate of uncertainty to properly weigh the scientific information (not unlike what we do in our informal, everyday risk assessment). In the long run, this should improve environmental management decisions by making better use of the available information.

Wednesday, April 24, 2013


Decisions and Scientific Research for Water Quality Management

Since scientists are trained to identify questions in need of additional research, it is natural for the scientific community to focus on gaps in understanding and talk about uncertainty when discussing the state of scientific knowledge concerning an issue of public concern. It is perhaps plausible, then, for the public and for decision makers to interpret that quest for better understanding as a declaration that the scientific basis for a decision is inadequate. This interpretation is premature, and in many instances, incorrect.

For years, federal, state, and university scientists have been engaged in research that addresses key scientific questions of concern for the management of water quality in Chesapeake Bay. There is now, and will continue to be, a need for the scientific or technical assessment of water quality impacts of proposed management actions in the Chesapeake. This scientific assessment will be uncertain, regardless of the confidence with which it is expressed. Unfortunately, uncertainty is likely to cause confusion, leading decision makers to wonder how a decision can be made on a proposed management option when the scientists are unsure. The result may be that the science is pronounced useless, irrelevant, or in need of improvement.

There are two key points I would like to briefly discuss - both dealing with the scientific understanding, or conversely the uncertainty, in water quality studies. The first point is:

There is almost always enough scientific knowledge to make an informed decision.

This is an important message because, as noted, scientists frequently emphasize issues that are not fully understood and are in need of more research. For example, what will be achieved with a 100 ft. riparian buffer strip as opposed to a 50 ft. buffer strip? Or, will two-acre lot zoning achieve water quality goals? These are questions that cannot be answered with certainty. However, just because scientific analysis cannot give a confident, precise answer to questions like these does not mean that decisions should be deferred pending the results of additional scientific study. There will almost always be scientific uncertainty about the expected water quality impacts of proposed management actions, but there is almost always sufficient information to act.

This leads to my second point:

Decision makers need to understand how to use the scientific uncertainty so that they can distinguish situations calling for new management actions from situations calling for more research.

While I just noted that we almost always know enough to make a decision and take action, there certainly are situations where the uncertainty is so great and the consequences of bad decisions so severe that it is wise to defer action and support more scientific study. We do this in "everyday life" as all of us have attitudes about risky actions that reflect the uncertainty in the outcome and the cost if we’re wrong. In addition, though, there are also situations where an immediate decision is prudent, while at the same time additional scientific study should be supported in expectation of "mid-course corrections." This is basically the approach for the Chesapeake Bay water quality problems - immediate actions are being implemented by state and local governments, while at the same time a number of research projects are being funded. Decision making will be more effective if decision makers can distinguish between these two situations; to do this, decision makers must request an understandable statement of the uncertainty in the scientific studies (e.g., in the water quality predictions).
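One hedged way to formalize this act-now-versus-study-more distinction is an expected value of information calculation. The sketch below uses entirely hypothetical probabilities and costs, and is meant only to illustrate the trade-off, not to represent any actual Chesapeake Bay analysis:

```python
# Toy expected-value-of-perfect-information (EVPI) sketch for the
# "act now vs. study more" decision. All probabilities and costs are hypothetical.
p_action_works = 0.7        # current (uncertain) science: chance the management action succeeds
cost_of_action = 10.0       # cost of implementing the action (arbitrary units)
loss_if_it_fails = 50.0     # additional damages if we act and the action does not work
loss_if_we_wait = 25.0      # damages from deferring action entirely

# Expected loss of each choice given today's uncertainty
expected_loss_act = cost_of_action + (1 - p_action_works) * loss_if_it_fails
expected_loss_wait = loss_if_we_wait
best_now = min(expected_loss_act, expected_loss_wait)

# With perfect information we would act only in the cases where the action works
expected_loss_perfect = p_action_works * cost_of_action + (1 - p_action_works) * loss_if_we_wait
evpi = best_now - expected_loss_perfect

print(f"Act now: {expected_loss_act:.1f}  Wait: {expected_loss_wait:.1f}  EVPI: {evpi:.1f}")
# If EVPI is large relative to the cost of additional research, deferring the
# decision and funding more study is defensible; if it is small, act now.
```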

Fortunately, there is an approach to decision making informed by uncertain science in Chesapeake Bay which should eventually improve decisions. This involves “adaptive management,” or “learning while doing.” In the Chesapeake Bay, this strategy begins with a properly designed water quality monitoring program to assess the response of the Bay and its watershed to the initial management actions. Careful observation of the water quality response can then lead to “mid-course” improvements in management that are more exactly tailored to the system. For example, monitoring may reveal that certain sources of nitrogen and phosphorus are more (or less) responsible for water quality degradation, leading to more focused management actions.

No one likes the fact that our science is imperfect, but no one should ignore this fact. If we acknowledge the limitations - the uncertainty - in scientific studies, we will end up with more informed, and better, decisions in the long run.