Thursday, May 2, 2013

Operational Water Quality Standards and Numeric Nutrient Criteria


Effective water quality management is built on a foundation of water quality standards. Recognizing this, most states have focused on making standards defensible from a scientific and socioeconomic perspective. However, standards must ultimately be protective, and for that we must consider the operational enforcement of the standard.

Standards become scientifically and socioeconomically defensible through careful determination of the designated use, an appropriate criterion, and an antidegradation policy. This basically means that the designated use should properly reflect regulatory requirements, societal preferences, and scientific assessments, while the criterion should reflect the science relating water quality indicators to use designation.

Standards become operationally enforceable when they are stated in a manner that makes compliance assessment clear and unambiguous. Most surface water quality standards are expressed and evaluated based on a single, point-valued chemical criterion (e.g., 50 ug/l arsenic for Class C Waters in North Carolina). This criterion is then used for two primary compliance assessments: (1) current water quality – based on a comparison of the criterion with measurements to determine if a waterbody is currently in compliance, and (2) future water quality – based on model forecasts to determine if proposed management actions will achieve compliance.

Consider the following examples of the two types of compliance assessments:
1.   The turbidity criterion for Class C Waters in North Carolina is 50 NTU (Nephelometric Turbidity Units). Given natural variability in precipitation and water runoff, changes in human activities in developed watersheds, and measurement error, a set of turbidity measurements over time at a single sampling station is going to vary, and some measurements may exceed 50 NTU even in waters that are generally of good quality.
2.   The chlorophyll a criterion is 40 ug/l for Class C Waters in North Carolina. Given the uncertainty in predictive model forecasts, it is highly likely that the upper tail of the probability distribution characterizing chlorophyll a model forecast error will exceed 40 ug/l for any feasible management strategy for most waterbodies that are currently out of compliance.
Based on the wording in the North Carolina water quality standards, compliance assessment will reflect a comparison of a precise fixed criterion with a distribution of measurements or forecasts. From a practical standpoint, how does this comparison proceed? In other words, is compliance achieved only if no observations/predictions exceed the numeric criterion (i.e., zero violations)? That strategy may be feasible when comparing a set of current water quality measurements with a fixed criterion. However, it is generally not practical for water quality model forecasts, which will yield a nonzero probability of exceeding a water quality criterion in most applications.
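To make the two comparisons concrete, here is a minimal Python sketch. The measurement values and the forecast-error distribution are hypothetical; only the 50 NTU turbidity and 40 ug/l chlorophyll a criteria come from the North Carolina standards discussed above, and the normal forecast-error assumption is purely illustrative.

```python
from math import erf, sqrt

# (1) Current water quality: fraction of a set of hypothetical
# turbidity measurements (NTU) exceeding the 50 NTU criterion.
turbidity = [12, 35, 48, 61, 22, 55, 30, 44, 70, 25]
frac_exceed = sum(t > 50 for t in turbidity) / len(turbidity)
print(f"Observed exceedance fraction: {frac_exceed:.0%}")  # 30%

# (2) Future water quality: probability that a chlorophyll a forecast
# exceeds the 40 ug/l criterion, assuming (illustratively) a normal
# forecast-error distribution with mean 32 ug/l and sd 6 ug/l.
mean, sd, criterion = 32.0, 6.0, 40.0
p_exceed = 1 - 0.5 * (1 + erf((criterion - mean) / (sd * sqrt(2))))
print(f"Forecast exceedance probability: {p_exceed:.1%}")
```

Note that even a strategy expected to bring the mean well below the criterion still carries a roughly 9% chance of exceedance under these assumed error statistics, which is the crux of the zero-violation problem for forecasts.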

For impaired-waters (303(d)) listing based on measurements of current water quality, the EPA and state agencies have tended to allow 10% exceedances of the numeric criterion, probably in acknowledgment of natural variability and measurement error. However, for TMDL forecasting, which requires compliance assessment with a water quality model, the EPA and state agencies tend to ignore model forecast uncertainty, even though this uncertainty may be quite large. As a result, EPA and state agencies lack practical experience in selecting an allowable percent exceedance of a criterion for the future pollutant loading in a TMDL that would account for model prediction uncertainty.
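The effect of an allowable-exceedance percentage on a listing decision can be sketched as follows. The sample values and the simple fraction-based rule are hypothetical; actual state listing methodologies differ in their details.

```python
def is_impaired(samples, criterion, allowed_frac):
    """Hypothetical listing rule: impaired if the fraction of samples
    exceeding the criterion is greater than the allowed fraction."""
    exceed = sum(s > criterion for s in samples) / len(samples)
    return exceed > allowed_frac

# Hypothetical arsenic samples (ug/l) against the 50 ug/l criterion;
# exactly 1 of 10 samples (10%) exceeds the criterion.
arsenic = [10, 20, 55, 30, 15, 25, 40, 35, 18, 22]
print(is_impaired(arsenic, 50, 0.0))   # zero-violation rule -> True
print(is_impaired(arsenic, 50, 0.10))  # 10% allowance -> False
```

The same dataset is listed as impaired under a zero-violation rule but compliant under the 10% allowance, which is why the choice of allowable exceedance percentage matters so much in practice.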

Allowing a selected percentage of exceedances of a numeric criterion does make sense. In principle, unless there is to be an infinite penalty associated with exceedance of a criterion, an analysis of benefits and costs would lead to probabilistically-based standards that include a nonzero chance of exceedance of the criterion. In practice, determining cost/benefit-based standards is a difficult task; hence, the arbitrary choice of 10% exceedances appears to be a pragmatic action by EPA.

Still, we should be able to do better. First, research could help guide the choice of allowable percent exceedances so that it bears some relation to the consequences of compliance and noncompliance. Second, research is needed on estimation of model forecast errors so that application of the standard in forecast scenarios incorporates a reasonable choice for percent exceedances. Finally, the language in the water quality standards needs to be expressed so that the standards are operationally enforceable.

An additional area of concern for operationally enforceable water quality standards relates to the recent push by EPA for numeric nutrient criteria, in part to remove the ambiguity of narrative criteria. However, numeric water quality criteria can also be ambiguous. Consider the North Carolina numeric dissolved oxygen criterion: “not less than an average of 5.0 mg/l with a minimum instantaneous value of not less than 4.0 mg/l.” We know that DO varies naturally with temperature in both time and space. So a dissolved oxygen criterion can be ambiguous and nonprotective unless it is operationally assessed based on: (1) the space/time variability of dissolved oxygen in a waterbody, and (2) the “region” of space/time that the DO standard is intended to protect. Otherwise, water quality monitoring to assess compliance with this criterion can result in compliance or noncompliance due solely to a sampling design that ignores natural variability.
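The sampling-design problem can be illustrated with a short sketch. The DO values are hypothetical, while the 5.0/4.0 mg/l thresholds come from the NC criterion quoted above.

```python
def meets_do_criterion(do_series, avg_min=5.0, inst_min=4.0):
    """NC Class C DO criterion: average not less than 5.0 mg/l, with
    no instantaneous value below 4.0 mg/l."""
    avg_ok = sum(do_series) / len(do_series) >= avg_min
    inst_ok = min(do_series) >= inst_min
    return avg_ok and inst_ok

# Hypothetical diel DO data (mg/l): DO is lowest near dawn, so a
# midday-only sampling design never observes the daily minimum.
midday = [6.8, 7.2, 6.5, 7.0]
dawn = [4.1, 3.9, 4.3, 4.0]
print(meets_do_criterion(midday))         # midday-only sampling -> True
print(meets_do_criterion(midday + dawn))  # full diel record -> False
```

The same waterbody passes or fails depending solely on when the samples were taken, which is exactly the ambiguity that an operational definition of the assessed space/time region would remove.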

The importance of the TMDL program and the 303(d) listing process has increased the need for operational water quality standards. By explicitly acknowledging variability and uncertainty through standards that allow for percent exceedances, the standards become less ambiguous and more enforceable.

1 comment:

  1. Ken,

    Have you seen Florida's nutrient criteria? The wording is something like this: the annual geometric mean of phosphorus shall not exceed XX more than once in three years. Kansas is also debating whether to use the magnitude-duration-frequency concept in setting nutrient criteria. My view is that we are confusing what Barnett and O'Hagan (1997) called an ideal standard and a realizable standard. A standard is set with respect to the mean concentration, which is impossible to measure directly. The ideal standard must then be translated into a realizable standard derived from monitoring samples. Unfortunately, the translation is a statistical exercise.
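    A Florida-style magnitude-duration-frequency rule like the one described above could be sketched as follows. The sample values and the 0.05 mg/l criterion are hypothetical stand-ins (the actual value, XX, is not given above).

```python
from statistics import geometric_mean

# Hypothetical annual phosphorus samples (mg/l) over three years,
# assessed against a hypothetical geometric-mean criterion that may
# be exceeded at most once in three years.
years = {
    2010: [0.03, 0.04, 0.05, 0.06],
    2011: [0.06, 0.07, 0.05, 0.08],
    2012: [0.03, 0.02, 0.04, 0.05],
}
criterion = 0.05
exceed_years = sum(geometric_mean(v) > criterion for v in years.values())
compliant = exceed_years <= 1
print(f"Exceedance years: {exceed_years}, compliant: {compliant}")
```

Even this "realizable" form still rests on a statistical translation: the annual geometric mean of a handful of grab samples is only an estimate of the mean concentration the ideal standard refers to.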
