Identification of uncertainty sources in distributed hydrological modelling: Case study of the Grote Nete catchment in Belgium

The quest for good practice in modelling merits thorough and sustained attention since good practice increases the credibility and impact of the information, and insight that modelling seeks to generate. This paper presents the findings of an evaluation whose goal was to understand the uncertainty in applying a distributed hydrological model to the Grote Nete catchment in Flanders, Belgium. Uncertainties were selected for investigation depending on how significantly they affected the model's decision variables. A Fault Tree was used to determine various combinations of inputs, mathematical code, and human error failures that could result in a specified risk. A combination of forward and backward approaches was used in developing the Fault Tree. Eleven events were identified as contributing to the top event. A total of 7 gates were used to describe the Fault Tree. A critical path analysis was carried out for the events and established their rank or order of significance. Three measures of importance were applied, namely the F-Vesely, the Birnbaum, and the B-Proschan importance measures. Model development of distributed models involves considerable uncertainty. Many of these dependencies arise naturally and their correct evaluation is crucial to the accurate analysis of the modelling system reliability.


Introduction
The quest for good practice in modelling merits thorough and sustained attention since good practice increases the credibility and impact of the information and insight that modelling seeks to generate (Jakeman et al., 2006). This paper presents the findings of an evaluation whose goal was to understand the uncertainty in applying a distributed hydrological model to the Grote Nete catchment in Belgium, which could be otherwise stated as the assessment of whether the model can do what is reasonably expected of it in representing the distributed hydrology of this catchment. The results would establish the degree of confidence to be placed in the model's representation of the catchment response. The evaluation was conducted as a first step in model development by a trial and error process whose aims ultimately included learning about the hydrologic characteristics of the study area, the information available for modelling this area, and MIKE SHE as the tool used in distributed modelling of the study area. Although the evaluation was site-specific, its outcome can be used in similar circumstances. This paper presents a case study of the problem of building distributed models with the quality characteristics necessary to represent the complex hydrology of a natural catchment.
The water resources of the Grote Nete catchment have been profoundly influenced by anthropogenic activities, including the construction of canals, agricultural and land drainage systems, and land use changes. Physical deterioration of rivers and their floodplains is common. In recent years, awareness and concern have increasingly been directed towards the potential adverse impacts that anthropogenic changes have had on river valley ecosystems. Scientists have come to realise that mankind's economic strides made over the last two centuries were at the expense of the earth's biodiversity, its environment, and the stability of its self-regulatory systems (Todd et al., 2003). The growing awareness of the value of natural ecosystems has resulted in various efforts being initiated to reverse past anthropogenic changes, and various methods for natural restoration are being considered. Quantitative and qualitative information on these intervention measures is known to vary with country or region, extent of ecological degradation, present land use within the region, and understanding of the accompanying hydrological processes (Kusler and Kentula, 1990; Mitsch and Wilson, 1996; Richardson, 1994).
However, changing people's views on water use and making them understand the meaning and necessity of good watershed management requires solid scientific arguments. Such arguments can be communicated to the stakeholders involved through the use of decision support systems (DSS). Model-based DSS are nowadays used frequently to assess the impacts of policies prior to their implementation (Andreu et al., 1996; De Kok et al., 2001; Mysiak et al., 2005). Examples include multiple criteria decision analysis (MCDA), a tool to support sustainable management of groundwater resources in South Africa (Pietersen, 2006); and the hydrological decision support framework (HDSF), intended as a tool for assessing and managing water resources (Clark and Smithers, 2006). Models are used to answer the research questions that arise with water retention, including the risks associated with water conservation, the required extent of basin restoration, and the existence of other options to address the primary problem of hydrological extremes. Scenarios are then used as a prerequisite for assessing the influence of potential land-use/land-cover changes on runoff generation (Niehoff et al., 2002). In the case of the Grote Nete catchment, to both fill the existing gap in knowledge of the hydrological influence of restorative intervention measures and provide a basis for undertaking watershed management with stakeholders, it was necessary to develop a physically-based, spatially distributed hydrological model as a first part of what would eventually be a holistic DSS for the Grote Nete catchment. The growing availability of inexpensive parallel computers for distributed modelling, and improvements in the visualisation of simulations, are increasingly removing computing power as a constraint on the application of models. With the increasing application of distributed models comes the need to understand potential risks or uncertainties related to their use.
Errors and uncertainties in their use are often substantial (Willems, 2005). It is easy to read too much into the output and model predictions, and there is a risk that the model gets used for a purpose other than what it was developed for, potentially rendering the conclusions invalid (Jakeman et al., 2006). It is very difficult to characterise all the processes in nature, and it is impossible to make predictions of future responses without acknowledging the inherent risk or uncertainty involved (Beven, 2000). Technological developments in distributed hydrological modelling have created a need for methods capable of analyzing their reliability. This is especially so in the areas of assessing model reliability, detecting weak links in the modelling process, modelling process optimisation, and provision of insight into the normal or abnormal behaviour of the models. There are plenty of uncertainties in the modelling process. Process simulation models are complex in nature and it is not easy to simplify them (Garen et al., 1999). A 'good' model strikes the balance between complexity and accuracy (Beck et al., 1997).

Distributed models
Distributed models are those which are able to explicitly represent the spatial variability of some, if not most, of the important land surface and climatic characteristics. Such models have important applications to the interpretation and prediction of the effects of land-use change and climate variability since they relate model parameters directly to physically observable land surface characteristics. Model development of distributed models involves challenges related to the validation of internal variables, and at multiple scales. Problems in code verification and model validation arise from the difficulties in obtaining complete input data (Grayson et al., 1992). A common problem is the existence of multiple optimal parameter sets and the presence of high interaction or correlation between subsets of fitted model parameters. The former problem has been discussed by Beven who suggests that there may be many acceptable parameter sets within a model structure, which may come from different regions of the parameter space (Beven, 1993;Beven and Freer, 2001). This then results in the possibility of having multiple calibrated parameter sets spanning a broad range of feasible parameter space, which produce virtually indistinguishable simulated river discharge (Kuczera and Franks, 2002). Given the observations available, there may be no rigorous basis for differentiating between the acceptable parameter sets. Beven introduced the term 'equifinality' to describe this phenomenon (Beven, 1993). The most important implication of the equifinality problem is the non-uniqueness of calibrated parameters. A reverse scenario is the risk that the global optimum is not found (Jakeman et al., 2006).
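The equifinality problem described above can be illustrated with a deliberately trivial sketch. The model, the parameter names (`runoff_coeff`, `area_factor`) and all numbers below are hypothetical and are not part of the Grote Nete model; the point is only that when two parameters act on the output through their product, different parameter sets give the same simulated discharge and the same objective function value.

```python
def simulate_discharge(rainfall, runoff_coeff, area_factor):
    """Toy model (hypothetical): discharge depends only on the product
    of the two parameters, so they cannot be identified separately."""
    return [runoff_coeff * area_factor * p for p in rainfall]


def sum_squared_error(observed, simulated):
    """Objective function used to compare a simulation with observations."""
    return sum((o - s) ** 2 for o, s in zip(observed, simulated))


# Hypothetical rainfall input and observed discharge (arbitrary units).
rainfall = [0.0, 12.0, 30.0, 5.0, 0.0]
observed = [0.1, 1.3, 2.9, 0.6, 0.0]

# Three parameter sets from different regions of the parameter space.
parameter_sets = [(0.2, 0.5), (0.5, 0.2), (0.4, 0.25)]

errors = [
    sum_squared_error(observed, simulate_discharge(rainfall, k, a))
    for k, a in parameter_sets
]

# All three sets fit the observations equally well.
for params, err in zip(parameter_sets, errors):
    print(params, round(err, 6))
```

With discharge as the only observation, nothing in the objective function distinguishes the three sets; additional internal variables would be needed to do so.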

Uncertainty
Uncertainty is interpreted differently by different disciplines (Mowrer, 2000). It encompasses many concepts (Morgan and Henrion, 1990). Beven describes the risk of a possible outcome as uncertainty (Beven, 2000). He writes that techniques for uncertainty or risk analysis are well developed, but are not widely used. Model outcomes are left vulnerable if the uncertainty associated with the modelling is not analysed (Beven, 2000). Terminology related to uncertainty includes variation, variability, ambiguity, heterogeneity, approximation, inexactness, vagueness, inaccuracy, subjectivity, imprecision, misclassification, misinterpretation, error, faults, mistakes, and artefacts (Dubus et al., 2003). Uncertainty assessment is increasingly being applied, with expected benefits including quantification of uncertainty, identification of factors most influential to model predictions, and generation of output most relevant to decision making. Uncertainty must be considered in developing any model, but is particularly important, and usually difficult, in the case of integrated models (Jakeman et al., 2006).

Uncertainty sources in distributed modelling
In distributed hydrological modelling, risk and uncertainty lie within the collection of possible outputs and their likelihoods. They are the sum of outcome, likelihood, significance, causal scenario, and the population affected (Kumamoto and Henley, 1996). It is very difficult to characterise all the processes in nature, or to make predictions of future responses without acknowledging the inherent risk or uncertainty involved.
Uncertainties are selected for investigation depending on how significantly they affect the decision variable (De Kort and Booij, 2007). Many studies have been carried out on uncertainties in hydrology, primarily based on the principles and criteria of classical statistics that lay emphasis on mean square errors of percentiles and on unbiasedness (Parent and Bernier, 2003). The importance of uncertainty may be determined by first-order uncertainty analysis (Melching et al., 1990), sensitivity analysis (Morris, 1991), Monte Carlo analysis (Seibert, 1997), Bayesian uncertainty (Tol and de Vos, 1998), parameter uncertainty investigation by validation, or by uncertainty frameworks including, among others, the Generalised Likelihood Uncertainty Estimation, GLUE (Beven and Binley, 1992), the Bayesian Forecasting System, and the Pareto Optimal Set procedure. Rather diabolically, however, the selection and implementation of techniques designed to account for risk and uncertainties are themselves subject to significant uncertainty. For instance, overall results from Monte Carlo-based probabilistic assessments will be influenced by the selection of input parameters to be included in the analysis (Nofziger et al., 1994), the type and parameterisation of probability distribution functions attributed to input parameters (Brattin et al., 1996), the absence or presence of correlation between variables, the extent of the correlations considered (Smith et al., 1992), the sampling scheme used (Saltelli et al., 2000) and the seed number used in the sampling (Dubus and Janssen, 2003).
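The sensitivity of a Monte Carlo assessment to the seed number and to the assumed probability distribution can be shown with a minimal sketch. The response function, the parameter range and the distribution shapes below are hypothetical choices made for illustration only; they do not come from the study.

```python
import random
import statistics


def toy_response(runoff_coeff, rainfall_mm=10.0):
    """Hypothetical model response: runoff depth for one rainfall event."""
    return runoff_coeff * rainfall_mm


def sample_uniform(rng):
    """Assumed uniform distribution for the runoff coefficient."""
    return rng.uniform(0.2, 0.6)


def sample_triangular(rng):
    """Assumed triangular distribution with the same range, mode at 0.3."""
    return rng.triangular(0.2, 0.6, 0.3)


def monte_carlo_mean(seed, n_samples, sampler):
    """Mean simulated response for a given seed, sample size and sampler."""
    rng = random.Random(seed)
    return statistics.fmean(
        toy_response(sampler(rng)) for _ in range(n_samples)
    )


# Same distribution, different seeds: the estimate changes with the seed,
# and the spread shrinks only as the sample size grows.
for seed in (1, 2, 3):
    print("uniform, seed", seed, round(monte_carlo_mean(seed, 200, sample_uniform), 3))

# Same seed, different distribution: the estimate shifts systematically.
print("triangular, seed 1", round(monte_carlo_mean(1, 200, sample_triangular), 3))
```

Reporting the seed, the sample size and the distributions used makes such an assessment reproducible and allows the spread between seeds to be checked.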

Fault Tree analysis, terms and techniques
A Fault Tree is a graphical representation of events in a hierarchical, tree-like structure (Fig. 5). It is used to determine various combinations of inputs, mathematical code, and human error failures that could result in a specified risk. From the time of its conceptualisation in the Bell Telephone Laboratories (Haasl, 1965), Fault Tree analysis has become an established tool used to analyse risk and the likelihood of failure of systems. The technique categorises risks as events. The 'top event' signifies the least desired event; an 'intermediate event' is the result of more primary events below; an 'undeveloped event' is one that is not developed further for lack of data or its relative insignificance; and a 'primary event' is a basic event for which failure data are available (Amendola and Bustamante, 1988; Kumamoto and Henley, 1996). Two gates of a Fault Tree are the 'OR' logic gate, whereby output occurs if any one of the input events occurs; and the 'AND' logic gate, whereby output occurs only if all the input events occur simultaneously. The gate type determines how the inputs to the gate are logically connected for the minimal cut set analysis process.
In Fault Tree analysis, system failure causality is well represented by a logic tree diagram which increases in its resolution as the diagram develops until primary events are encountered. This diagrammatic representation offers a clear representation of fault propagation through the system whilst representing a mathematical logic equation. The Fault Tree is a logical relationship between an event and its causes, and provides a logical framework for expressing combinations of event failures that can lead to the top event. Gates are used to describe the relationship between the input and output events in a Fault Tree. Fault Tree analysis has been used to support engineering and management decisions, trade-off analysis, and risk analysis (Kumamoto and Henley, 1996;Amendola and Bustamante, 1988;Haasl, 1965;Harms-Ringdahl, 1993).
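The logic of the two gate types can be written out as a short sketch. It assumes that the input events of each gate are statistically independent, and the small tree and its probabilities are hypothetical; they are not the Grote Nete Fault Tree.

```python
def or_gate(probabilities):
    """Probability that at least one input event occurs
    (inputs assumed independent)."""
    p_none = 1.0
    for p in probabilities:
        p_none *= 1.0 - p
    return 1.0 - p_none


def and_gate(probabilities):
    """Probability that all input events occur simultaneously
    (inputs assumed independent)."""
    p_all = 1.0
    for p in probabilities:
        p_all *= p
    return p_all


# Hypothetical primary-event probabilities, for illustration only.
p_input_failure = or_gate([0.10, 0.05])   # intermediate event under an OR gate
p_code_failure = and_gate([0.20, 0.50])   # intermediate event under an AND gate

# The top event occurs if either intermediate event occurs.
p_top = or_gate([p_input_failure, p_code_failure])
print(round(p_top, 4))
```

Because an OR gate passes a failure upward whenever any single input fails, a tree built only from OR gates has one minimal cut set per primary event.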

Model selection criteria
A source of uncertainty in modelling is the choice of model. An appropriate model may be difficult to choose (Garen et al., 1999), and model simulation with similar data can produce very different responses (Reed et al., 2004). The uncertainty evaluation presented in this paper was carried out in the context of a study on the effects of rewetting on extreme river discharge events in the Grote Nete catchment, and how these are likely to be affected by proposed catchment restoration measures (Rubarenzya et al., 2005). An important consideration in selecting the model was therefore its applicability for scenario analysis, which necessitated a realistic, physically based, fully distributed representation of the study area. MIKE SHE (Graham and Butts, 2006) was adopted. It is a distributed model that incorporates the different components of the hydrological cycle, and for which each process can be represented at different levels of complexity.

MIKE SHE
MIKE SHE is a distributed model that simulates the entire land phase of the hydrological cycle (Refsgaard and Storm, 1995). It includes all of the processes in the land phase of the hydrological cycle, including precipitation, evapotranspiration, canopy interception, overland sheet flow, channel flow, unsaturated sub-surface flow and saturated groundwater flow. Each of these processes can be represented at different levels of spatial distribution and complexity, according to the goals of the modelling study, the availability of field data and the modeller's choices (Graham and Butts, 2006). The discrete grids form the computational units and the vertical axis is discretised into layers (Yang et al., 2000). Figure 1 shows the model implementation for the Grote Nete catchment.

Figure 1 Schematic representation of the Grote Nete model in MIKE SHE
MIKE SHE was used to build a distributed hydrological model of the study area. Model testing was done using a combination of goodness-of-fit statistics and a multicriteria model refinement protocol as implemented in the tool for hydrological time series analysis, WETSPRO (Rubarenzya et al., 2006a). In verifying the model, both the split-sample approach and a graphical approach using validation plots were employed. The calibration period was taken from 1986 to 1988, and the validation period from 1990 to 1995. This avoided the period in-between, when river dredging activities are believed to have interfered with the river stage measurements.
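The split-sample approach can be sketched as follows. The calibration and validation periods are those stated above; the goodness-of-fit statistic is not named in the text, so the Nash-Sutcliffe efficiency is used here purely as an assumed example, and the discharge records are invented for illustration.

```python
def nash_sutcliffe(observed, simulated):
    """Nash-Sutcliffe efficiency: 1 is a perfect fit, 0 means the
    simulation is no better than the mean of the observations."""
    mean_obs = sum(observed) / len(observed)
    residual = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    variance = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - residual / variance


def select_period(records, first_year, last_year):
    """Return (observed, simulated) lists for records within the period."""
    chosen = [r for r in records if first_year <= r[0] <= last_year]
    return [r[1] for r in chosen], [r[2] for r in chosen]


# Hypothetical records: (year, observed discharge, simulated discharge).
records = [
    (1986, 3.0, 2.8), (1987, 5.0, 5.4), (1988, 4.0, 3.9),
    (1989, 9.0, 2.0),  # in-between period, excluded from both samples
    (1990, 2.0, 2.5), (1993, 6.0, 5.0), (1995, 4.0, 4.5),
]

calibration = select_period(records, 1986, 1988)
validation = select_period(records, 1990, 1995)

print("calibration NSE:", round(nash_sutcliffe(*calibration), 3))
print("validation NSE:", round(nash_sutcliffe(*validation), 3))
```

A validation score much lower than the calibration score would suggest that the calibrated parameters do not transfer well to the independent period.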

The study area
The Grote Nete catchment in Belgium is a middle-sized hydrological catchment located in the northeast of Flanders (Fig. 2). The soils are predominantly composed of sand, sandy loam in the southern and valley areas, and silt (Batelaan, 2006), and 49.6% of the area consists of sandy permeable soils. The topography is flat, ranging from 12 m in the west to 69 m in the east with an average value of 22 m (Batelaan, 2006), and has a shallow phreatic surface. Catchment slopes are in the range of 0% to 5%, with an average value of 0.3% (Batelaan, 2006). The Grote Nete catchment is composed of numerous river tributaries (Fig. 4), and a dense network of ditches and subsurface pipe drains that feed into the main Grote Nete, Molse Nete, and Grote Laak Rivers. The confluence of the Grote Nete and Grote Laak Rivers occurs just upstream of the Varendonk limnigraphic station. In addition, the catchment has numerous small lakes, the result of sand mining in the past for glass production. The catchment area is 385 km² at the outlet Varendonk limnigraphic station.

Model inputs
A brief description of the Grote Nete catchment model and its main inputs is given in the following sections.

Rainfall measurements
Seven rain gauges were used as sources of rainfall input and the spatial rainfall distribution was determined by the Thiessen polygon method (Fig. 3).
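The Thiessen polygon method assigns every point in the catchment the rainfall of its nearest gauge, so each gauge is weighted by the share of the area closest to it. A minimal grid-based sketch is given below; the two gauges, their coordinates, the rectangular domain and the rainfall depths are hypothetical and do not represent the seven gauges used in the study.

```python
import math


def nearest_gauge(x, y, gauges):
    """Name of the gauge closest to the point (x, y)."""
    return min(
        gauges,
        key=lambda name: math.hypot(x - gauges[name][0], y - gauges[name][1]),
    )


def thiessen_weights(gauges, width, height, cell):
    """Approximate Thiessen area weights by assigning each grid cell
    centre of a rectangular domain to its nearest gauge."""
    nx, ny = int(width / cell), int(height / cell)
    counts = dict.fromkeys(gauges, 0)
    for i in range(nx):
        for j in range(ny):
            cx, cy = (i + 0.5) * cell, (j + 0.5) * cell
            counts[nearest_gauge(cx, cy, gauges)] += 1
    return {name: n / (nx * ny) for name, n in counts.items()}


# Hypothetical gauges (x, y in km) on a 10 km x 10 km domain.
gauges = {"A": (2.5, 5.0), "B": (7.5, 5.0)}
weights = thiessen_weights(gauges, width=10.0, height=10.0, cell=1.0)

# Hypothetical rainfall depths (mm) for one time step.
rainfall = {"A": 4.0, "B": 8.0}
areal_rainfall = sum(weights[g] * rainfall[g] for g in gauges)
print(weights, areal_rainfall)
```

In a distributed model the per-cell assignment itself, rather than the areal average, is what supplies the rainfall input to each grid cell.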

Potential evapotranspiration
The potential evapotranspiration, Ep, was calculated for a closed, short-cut grass surface, optimally supplied with water, and using coefficients that were calibrated for Belgian conditions. In estimating the evapotranspiration, ETo, climatic data supplied by the Royal Meteorological Institute of Belgium (RMI) from the meteorological station at Geel (51°09'30" N, 4°59'30" E; at elevation 21 m) were used.

Saturated zone flow
The saturated zone model consisted of a 3-dimensional Darcy equation. This permitted three-dimensional flow in the heterogeneous aquifer with shifting conditions between unconfined and confined conditions. The flow was calculated using a maximum allowable time step of 1 h. The catchment geology was described in terms of four geological layers to which hydraulic properties were assigned through grid-code files. The 3-dimensional geological model described the extent, thicknesses and elevation of the layers. For each layer, distributed estimates were determined for the horizontal and vertical hydraulic conductivities. A separate well file was created to represent the abstraction from the five abstraction wells located within the catchment, and another three just outside the boundary. Included in this file were the coordinates of each well, the vertical location of the filter, and a time series of water abstraction. The drainage component of the MIKE SHE groundwater module was included. It described drainage using drainage codes (areas considered to be drained), drain levels (distributed maps of effective drainage levels, i.e., groundwater table elevation above which drainage flow occurs), and a drainage time constant.

Vadose zone flow
The unsaturated zone model was a vertical soil profile model that interacted with both the overland flow and the groundwater model. The lower boundary condition for this zone was defined by the location of the groundwater table. The Richards equation was used to represent flow in this zone. The study area is characterised by sandy to sandy loam soils, and has a high water table. The distributed soil map was broadly classified into six major soil classes. Vertical discretisation then followed from the ground surface level down to 20 m. The minimum discretised cell height was 0.025 m at the ground surface level. To each discretised layer, soil properties were assigned, including the retention curve parameters and Averjanov pedotransfer coefficients (Rubarenzya et al., 2006b). Vertical flow and water content of the unsaturated soil was calculated using a maximum time step of 30 min. MIKE SHE automatically updated the computational time steps during the simulation to avoid numerical instability following high rainfall inputs.
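For reference, a commonly used one-dimensional, pressure-head form of the Richards equation is reproduced below. The notation is an assumption and is not taken from this study: ψ is the pressure head, θ the volumetric water content, K(θ) the unsaturated hydraulic conductivity, C(ψ) the soil water capacity, S a sink term for root extraction, and z the vertical coordinate taken positive upward.

```latex
C(\psi)\,\frac{\partial \psi}{\partial t}
  = \frac{\partial}{\partial z}\!\left( K(\theta)\,\frac{\partial \psi}{\partial z} \right)
  + \frac{\partial K(\theta)}{\partial z} - S(z),
\qquad
C(\psi) = \frac{\partial \theta}{\partial \psi}
```

The retention curve parameters assigned to each discretised layer define the relation between θ and ψ, and hence C(ψ), while the conductivity function defines K(θ).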

Overland flow
The overland flow component was defined by the two-dimensional diffusion wave approximation of the St. Venant equations governing shallow water flow. To limit the amount of water that could flow over the ground, a parameter for water detention was introduced. The overland flow was calculated using a maximum time step of 30 min. The distributed surface roughness over the catchment was established after calibration of values from literature.

Channel flow and surface water features
The Grote Nete catchment is composed of numerous river tributaries (Fig. 4). In addition, the catchment has many small lakes, the result of sand mining in the past for glass production. Surface waters were represented as land use categories, along with the corresponding roughness and evapotranspirative parameters. The river network was represented in MIKE 11 (Havno et al., 1995), which is a hydrodynamic model that is coupled to MIKE SHE and simulates the one-dimensional river flows and water levels using the fully dynamic St. Venant equations. The maximum discretisation was 750 m distance (dx), with a fixed time step (dt) of 10 min.

Land use and vegetation
The final land-use map was based on the 1995 land-use map of Flanders. However, the latter map had several land-use classes for which there were insufficient data. This was solved by reclassifying the 1995 land-use map of Flanders to reduce the number of classes to 8 major land-use categories. The aim of this reclassification was to simplify the land-use input, to balance data availability with the detail required of spatially distributed modelling, and to merge very small land uses into similar categories. Each land use had an associated user-defined vegetation development file, which contained information on the annual growth cycle, and the progression of the Leaf Area Index (LAI) and crop coefficient (Kc) values.

Figure 3
The  , 1998). For example, the attention of Bayesian statisticians is returning to the quantification of expert opinion, with due consideration of their uncertainties (Parent and Bernier, 2003). This study, like others before, analysed uncertainties chosen because they were convenient, in the opinion of the researchers, and were relevant to the dependent variable of the model. Because this study calibrated a model against river discharge as the dependent variable, the top event or ultimate uncertainty was defined as simulation uncertainty, that is, the inaccuracy of predictions in the dependent variable. The study utilised logic gates, through which combinations of uncertainties could be grouped as they contribute to the top event. All gates were of the 'or' logic type, implying that the uncertainty output from the gate could occur if any one of the input events occurs.

Fault Tree and Critical Path analysis
A Critical Path analysis was carried out for the events and established their rank or order of significance. A Critical Path is a group of events that has the highest probability of occurrence among all possible sets of events. Events of larger rank represented those more critical, that is, more likely to cause a realisation of the top event. Three measures of importance were applied, namely the F-Vesely, the Birnbaum, and the B-Proschan importance measures (Meng, 2000; Dutuit and Rauzy, 2005). The F-Vesely (Fussell-Vesely) importance measure represents an event's contribution to the system unavailability. Increasing or decreasing the availability of events with a higher importance value will have the most significant effect on system availability. The Birnbaum measure for an event represents the sensitivity of system unavailability with respect to changes in the event's unavailability. The B-Proschan (Barlow-Proschan) event importance measure takes into consideration the sequence of event failures within its calculation. It is the probability that the system fails because a critical cut set containing the event fails, taking into consideration that the event fails last.
The uncertainty analysis was used to define events and link them to form a logic diagram, the Fault Tree (Fig. 5). The top event was described precisely. The resolution of the tree increased from the top event down to the primary events. A combination of forward and backward approaches (Kumamoto and Henley, 1996) was used in developing the Fault Tree. The backward approach began at a particular event and traced back its possible causes, while the forward approach began with a set of failure events and went forward to determine their possible effects. The Fault Tree served the purpose of directing the analysis to identify failure modes. The process of indicating aspects of the system responsible for system failure provided a graphic aid to define the progression of events leading to the top event, allowing for the concentration on one particular system failure at a time, and providing an insight into overall system behaviour.
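For a Fault Tree in which every gate is of the OR type, each primary event is a minimal cut set on its own, and two of the importance measures reduce to simple expressions. The sketch below computes the Birnbaum and Fussell-Vesely measures under the assumption of independent events; the event names echo some of the uncertainty sources, but the probabilities are hypothetical and are not results of this study. The Barlow-Proschan measure is omitted because it requires the time-dependent failure behaviour of the events.

```python
def top_event_probability(q):
    """Top-event probability for an all-OR tree with independent events.
    q maps event name -> probability of that event occurring."""
    p_none = 1.0
    for q_i in q.values():
        p_none *= 1.0 - q_i
    return 1.0 - p_none


def birnbaum(q, event):
    """Change in top-event probability when the event goes from
    certain not to occur (0) to certain to occur (1)."""
    certain = {**q, event: 1.0}
    impossible = {**q, event: 0.0}
    return top_event_probability(certain) - top_event_probability(impossible)


def fussell_vesely(q, event):
    """Share of the top-event probability attributable to cut sets
    containing the event; here each event is its own cut set."""
    return q[event] / top_event_probability(q)


# Hypothetical event probabilities, for illustration only.
q = {"human_error": 0.05, "precipitation_input": 0.10, "missing_data": 0.02}

ranking = sorted(q, key=lambda e: fussell_vesely(q, e), reverse=True)
for name in ranking:
    print(name, round(birnbaum(q, name), 3), round(fussell_vesely(q, name), 3))
```

In this simple case both measures produce the same ordering of events; with AND gates or shared events the two rankings can differ, which is one reason for applying more than one measure.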

Outcomes of the study

Importance evaluation of uncertainty sources
The first step in the evaluation was compilation of a list of sources of uncertainty. This uncertainty evaluation was being conducted for a specific catchment in Belgium, and as alluded to by O'Hagan, was based on the judgment of the researchers (O'Hagan, 1998). Thus, while some events emerged as being universal to distributed modelling, there were events that are unique to this case study. In addition, the decision on which uncertainties to classify as independent events and which to combine with others was arrived at based on the perceived severity of the uncertainty in the context of the distributed modelling study. In the case of the Grote Nete catchment, 11 events were identified as contributing to the top event (Fig. 5). The events included:
1. Temporal variability of inputs acknowledges the uncertainty that accrues when not modelling steady-state conditions. Most model inputs vary in time, and the significance depends on both the model and the specific parameter. It is important that a model be able to represent different spatial scales. An appropriate scale would then encompass spatial and temporal aspects (Blöschl and Sivapalan, 1995). For the Grote Nete model this significance was, for instance, related to processes in the vadose zone, including soil physical and hydraulic parameters and the related pedotransfer functions (Bouma, 1989). The model does not allow for variation of these inputs with time, which introduced uncertainty in the modelling outcome.

2. Costs and complexity of taking measurements are related to the type of input sought, and the required resolution. This results in sampling uncertainties on parameters due to the limited availability of information (Parent and Bernier, 2003). This also leads to 'measurement error', a term that refers to uncertainty arising from sampling in the field. Generally measurement regimes are costly, and uncertainty is introduced during the compromise between the inputs necessary for assembly of a model with acceptable results, and the available financial resources. While we must at any given time accept the data that are available (Jakeman et al., 2006), there is also a risk from using data obtained with different equipment and approaches. For the Grote Nete model, this aspect particularly limited any efforts to augment existing data sets of land use and river geometry with new field measurements.
3. Parameter heterogeneity or natural randomness integrated a combination of the parameter type, catchment size, spatial and temporal resolution, and degree of detail of available parameter values. For the Grote Nete catchment it was recognised that some parameters are more naturally variable than others, and it is known that different model developers are likely to come up with different optimal parameter sets, each set describing the system acceptably well. Models with too many degrees of freedom may then be fitted to irrelevant 'noise' or inconsistent components of the noise, have near-redundant parameter combinations, or obscure significant behaviour because of the spurious variation allowed by too much freedom (Jakeman et al., 2006). Studies have revealed that not accounting for parameter heterogeneity can exert a strong influence on the predictive capability of the model. For the Grote Nete model, this uncertainty was most evident when building the sub-surface model, where large sections of the earth's strata had to be assumed to be homogeneous, responding with similar water retention and transmission properties.
4. Human error may arise from unstable or biased experimental and measurement procedures, interpretation, typing error or the simple variation between persons (Stine and Hunsaker, 2001). Other examples of this source of uncertainty include the possibility of an incorrect or unrealistic model structure selection (Parent and Bernier, 2003), digitisation of data (Burrough, 1998), and upscaling models above the scale at which they were developed (Gaunt et al., 1997). This event is most prominent in measuring parameters, setting up the model, model parameterisation, and calibration and validation. It was among the hardest events to quantify since it involves numerous persons at different stages, but it presents significant uncertainty. With regard to model building, MIKE SHE incorporates some checks for gross errors, but is unable to detect more subtle errors. While elements of this uncertainty may be found in other identified uncertainties, numerous authors explicitly identify this uncertainty, and it is possible to have the occurrence of human error even in the absence of any of the other uncertainties identified.
5. Temporal variability of parameters represents the uncertainty from the fact that input data change with time. This is different from the first uncertainty source (temporal variability of inputs) in that while the former relates to physical inputs that define the catchment, like land use, this uncertainty source relates to the parameters that then mainly define how empirical relationships are solved by the model. In MIKE SHE, it is possible to enter some data as time series in order to represent this variability. Examples of these include river boundary data. However, there are other parameters, for instance horizontal and vertical conductivity, and Strickler's coefficients, which are considered static. This uncertainty is significant in the case of the Grote Nete, where, for instance, the seasonal growth of macrophytes in the rivers is believed to have an influence on recorded river stages and consequently, measured river discharges. The importance of this uncertainty is parameter-specific, and may depend on the purpose of the study.
6. Precipitation input uncertainty. Reliability of point measurements of precipitation is determined by measurement height, the presence of surrounding large features, evaporation losses, absence of heating facilities to allow for measurement of snowfall, differences in collector shapes, and inadequate calibration. Even assuming that precipitation measurements are reasonably precise, they still can only represent point measurements of a very distributed phenomenon. For instance, significant variability in rainfall data (Krajewski et al., 1998) will directly affect the water balance, and the modelling outcome has a large uncertainty if this variability is not considered (Chaubey et al., 1999). Uncertainty also lies in the approach of measuring rainfall. Tipping bucket rain gauges are, for instance, known for losing water during bucket movements; consequently rainfall intensities are underestimated and the underestimation increases with increasing rainfall intensity. Generally, this uncertainty becomes more significant for a decreasing number of pluviometers in the catchment.

7. Missing or unavailable data. In applying the complex, data-intensive MIKE SHE model to the Grote Nete catchment, missing or unavailable data was categorised as a distinct uncertainty source. Hypotheses and assumptions regarding missing data had to be made in building the model, and the uncertainty lay in how unrealistic these assumptions were. In the case of the Grote Nete model this uncertainty was categorised as an independent uncertainty source as it represented instances where there were no data at all. Thus, extrapolation and averaging methods are applicable here. Good examples were the crop parameters for the Kristensen-Jensen evapotranspiration method. This uncertainty generally reduces with the experience of the modeller.

8. Numerical approximations in code. This is one of the most basic forms of error in modelling, but is notoriously difficult to estimate. This uncertainty depends on the temporal and spatial resolution of the model setup, and the relative differences in size of adjacent spatial cells or time steps. It also emanates from the fact that the model is based on empirical relationships whose approximate solution is arrived at by a series of iterations. Structural error or conceptual error is the term given to a model's inability to simulate experimental observations even when an appropriate set of model inputs is used (Beck et al., 1997). Such errors may arise during conversion of a scientific concept into a set of equations or computer code (Addiscott, 2001), or through inappropriate or omitted representation of significant processes.
9. The definition of system boundaries. Boundaries are an integral part of the initial conceptual model and uncertainty lies in wrongly defining boundaries. The system being modelled should be clearly defined (Jakeman et al., 2006) but this is not always feasible. In the case of the Grote Nete catchment, this uncertainty referred to the definition of initial conditions, spatial catchment boundaries, sub-surface stratifications, and saturation conditions. While it was assumed that the catchment boundary followed the topography, the geological record of the area shows that the area was initially flat, and the relief features are the result of Aeolian sands. Thus, the catchment boundaries do not necessarily follow the relief. In addition, the common approach of approximating the saturated zone by a few homogeneous layers clearly introduced a measure of uncertainty in the modelling of the sub-surface region.
10. Scale approximations. This uncertainty originates from the inability to model at real scales which results in scale dependencies and resolution problems. In theory the appropriate scale and resolution are determined by the desired nature of outputs, and interpretations to be made thereof (Birkhead et al., 2007). However, in reality, the problem of specifying an optimal mesh resolution remains unbounded, and for mesh construction, objective a priori rules do not exist.

11. Unknown volume of industrial and domestic effluent. This source of uncertainty is particular to the Grote Nete catchment, where a number of industries discharge unknown quantities of effluent into the rivers. These ungauged inflows affect the interpretation that should be made from model simulation outputs and create uncertainty, especially during calibration and validation. Related to this is the seepage into the river streams of irrigation water that was initially drawn from outside the catchment, and the seepage and leakage of water from the canals that cross the catchment.

Discussion
A principal purpose of uncertainty evaluation is the derivation of uncertainty, which may then be managed by proposing alternative models and evaluating the uncertainty of each until a satisfactory alternative is obtained. When distributed models are used as decision support tools, the uncertainty of their inaccuracy or inadequacy is perceived to be high, especially if loss of life and property would result from their failure. Good modelling practice caters for both results and the accuracy of results. Simulation uncertainty (of river discharge) was established as the top event in this evaluation. A number of events are responsible for this state, all of which occur to varying degrees in any modelling. An initial observation was that all uncertainty sources are probable when modelling with MIKE SHE. However, the significance of each source to the modelling process is site-specific.

Uncertainty sources
Uncertainty sources or Events 1, 2, 3, 4 relate to the evaluation of land-use effects in distributed modelling. Here, the model relies on a physically based description of the rainfall-runoff processes, and the effects of different land covers defining the catchment's response to land-use change. Hence, for the attainment of reliable results, soil and land-cover properties have to be accurate and represent the heterogeneous nature of the catchment. However, the literature shows that this degree of accuracy is rarely met. In the case of the Grote Nete catchment, the uncertainty in the parameterisation of soils and land covers was expressed in some of the primary events. This uncertainty was made more apparent by the regionalisation of point measurements, as in the case of determining areal rainfall by the method of Thiessen polygons, because of the natural variation exhibited by the parameters. The choice of spatial resolution took into consideration the improved insight into spatial and temporal processes that accrues from an increased spatial resolution, and how the resolution affects MIKE SHE's solution of the non-linear partial differential equations to yield the simulation results.
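The regionalisation of point rainfall by Thiessen polygons mentioned above can be illustrated with a minimal sketch. It approximates the polygon areas by assigning each grid cell to its nearest gauge. The gauge names, coordinates, grid and rainfall depths are hypothetical and are not taken from the Grote Nete study.

```python
import math


def thiessen_weights(gauges, cells):
    """Approximate Thiessen weights: share of grid cells nearest to each gauge."""
    counts = {name: 0 for name in gauges}
    for cx, cy in cells:
        nearest = min(
            gauges,
            key=lambda n: math.hypot(cx - gauges[n][0], cy - gauges[n][1]),
        )
        counts[nearest] += 1
    return {name: counts[name] / len(cells) for name in gauges}


def areal_rainfall(weights, depths):
    """Weighted mean of the gauge depths that carry a Thiessen weight."""
    return sum(weights[name] * depths[name] for name in weights)


# Hypothetical 10 x 10 grid of unit cells, represented by their centres.
cells = [(i + 0.5, j + 0.5) for i in range(10) for j in range(10)]
gauges = {"A": (2.0, 5.0), "B": (8.0, 5.0)}   # hypothetical gauge positions
depths = {"A": 4.0, "B": 8.0}                 # hypothetical depths in mm

weights = thiessen_weights(gauges, cells)
two_gauge_mean = areal_rainfall(weights, depths)

# Dropping gauge B shows how the areal estimate shifts with fewer gauges.
one_gauge_mean = areal_rainfall(thiessen_weights({"A": gauges["A"]}, cells), depths)
```

Comparing the two estimates gives a rough feel for why the precipitation input uncertainty grows as the number of pluviometers decreases.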

Gates
Seven gates were used to describe the uncertainties, and they are discussed below. In Fault Tree analysis, gates are used to represent the upward clustering of lower events in their progression towards the top event.
Gate 7 (G7) (Fig. 5) represents model structure uncertainty or model inadequacy, which is the uncertainty associated with the modeller's limited understanding of the system. The behaviour of hydrological systems is very difficult to describe (Beven, 2000), and some responses are probably unknown. MIKE SHE, like many other engineering models, tries to describe the natural hydrological system. Unfortunately, the model does not accurately represent nature, for instance, in the solutions to the river component based on the Saint-Venant equations (Kazezyilmaz-Alhan et al., 2005) subject to the known assumptions, or the approximations in solving the Richards equation for unsaturated flow. Events through this gate included the scale approximations, unknown industrial and domestic effluent being discharged into the rivers, and the definition of system boundaries.
Gate 6 (G6) represents model input uncertainty. This gate is the uncertainty from missing or unavailable data. For a complex distributed model like MIKE SHE, it is always likely that input data will be limited. Good examples are the soil hydraulic properties, which are subject to large spatial variability and depend on measurement technique. Here, pedotransfer functions are used to express the relationships between basic soil properties and parameters which are difficult to measure. Pedotransfer functions, however, increase the uncertainty during parameterisation (Tietje and Tapkenhinrichs, 1993). Events through this gate included precipitation input uncertainty, missing or unavailable data, and numerical approximations in the code.
Gate 5 (G5) represents parameter uncertainty. This is the uncertainty arising from limited amounts of data for use in the calibration and validation of the model. The gate also encompassed the use of fitting parameters to describe some processes that could not be otherwise described. Examples included field drains, where a fitting parameter was used to describe elements of both overland and saturated zone flow. Events through this gate included human error and the temporal variability of parameters.
Gate 4 (G4) represents the spatial variability of parameters. This is the uncertainty resulting from the model's inability to represent the variation of parameters in space as occurs in nature. Parameters are also represented with distinct boundaries and yet are gradually varying in nature, leading to misclassification of locations (Tarantola et al., 2002). The costs and complexity of taking measurements, and parameter heterogeneity within the catchment were encompassed by this gate.
Gate 3 (G3) represents the epistemic uncertainties, which are those due to either inadequate data for building the model or limitations in knowledge of the processes within the system. This gate then brought together events through Gates 5, 6, and 7 as they progress towards the top event.
Gate 2 (G2) represents the inherent uncertainties, which result from the stochastic or random character of the natural system. In principle, these uncertainties are unrelated to the model implementation. Of all the gates so far classified, this is the one for which it may not be possible to reduce the uncertainty even with long historical data records. Dedicated procedures such as GLUE or the Pareto Optimal Set procedure (Yapo et al., 1998) may be used to provide a confidence interval for each optimised parameter, but the uncertainty estimates provided will depend on subjective choices, such as the selection of objective function and the limit at which the model is considered to be not calibrated. 'Parameter lumping' may also prevent a decrease in uncertainty following calibration (Dubus and Brown, 2002). This gate brings together Gate 4 and the temporal variability of inputs. Finally, Gates 2 and 3 lead up to the top event, simulation uncertainty, through Gate 1 (G1).
A possible drawback observed during the evaluation was that, since not all possible events were considered, several potentially important events may not have been selected for the evaluation. In addition, the possible influence of climate change on the modelling was not explicitly included on the Tree. It is known that climate change research is both complex and uncertain (Van Wageningen and Du Plessis, 2007). However, since climate change is a form of temporal variability in model input parameters, it may be inferred to be included under Events 1 and 5.
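The gate layout described in this section can be written down as a small data structure. The sketch below encodes which events and gates feed each gate as stated in the text, and evaluates the top event and the Birnbaum importance of one event. Two things are assumptions made only for illustration: every gate is treated as an OR gate with independent events, and the event probabilities (0.05 each) are hypothetical values, not results of this study.

```python
# Gate membership follows the text; the OR logic is an assumption.
TREE = {
    "G1": ["G2", "G3"],
    "G2": ["G4", "temporal variability of inputs"],
    "G3": ["G5", "G6", "G7"],
    "G4": ["costs and complexity of measurements", "parameter heterogeneity"],
    "G5": ["human error", "temporal variability of parameters"],
    "G6": ["precipitation input uncertainty", "missing or unavailable data",
           "numerical approximations in code"],
    "G7": ["scale approximations", "unknown effluent volume",
           "definition of system boundaries"],
}


def basic_events(node):
    """List the basic events found below a gate (or the event itself)."""
    if node not in TREE:
        return [node]
    found = []
    for child in TREE[node]:
        found.extend(basic_events(child))
    return found


def probability(node, p):
    """Occurrence probability of a node, assuming OR gates and independence."""
    if node not in TREE:
        return p[node]
    none_occur = 1.0
    for child in TREE[node]:
        none_occur *= 1.0 - probability(child, p)
    return 1.0 - none_occur


def birnbaum(event, p):
    """Birnbaum importance: P(top | event occurs) - P(top | event absent)."""
    return probability("G1", {**p, event: 1.0}) - probability("G1", {**p, event: 0.0})


events = basic_events("G1")
p = {name: 0.05 for name in events}  # hypothetical probabilities
top = probability("G1", p)
importance = birnbaum("human error", p)
```

With equal probabilities and OR gates throughout, every event receives the same Birnbaum importance; a ranking such as the one reported in the study only emerges once event-specific probabilities and the actual gate logic are supplied.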

Results from the Grote Nete catchment model
The outcome of the uncertainty assessment was then taken into account in developing a physically-based, fully distributed model of the Grote Nete catchment. A multi-criteria modelling protocol involving both the split-sample approach and a graphical approach using validation plots was employed to test the model during the calibration and validation stages. Hourly river discharge data were used, with a calibration period from 1986-1988, and validation from 1990 (Rubarenzya et al., 2006a). The resulting model met three predetermined statistical criteria over both the calibration and the validation periods. At Varendonk limnigraphic station, located at the outlet of the catchment, the mean square error (MSE) of discharge was less than 0.10 (m³/s)², and the Nash and Sutcliffe efficiency coefficient (EF) and the dimensionless coefficient of determination (R²) were both greater than 0.7. Hydrographs of the measured and simulated river discharge showed good fits during both the calibration (Fig. 6) and the validation periods (Fig. 7). Figure 8 shows an analysis of the performance of the model in representing extreme high values. The extremely high flow values were obtained from a Peak Over Threshold analysis carried out on the discharge values. Here, the individual extremely high flows were analysed for two reasons: firstly, to rule out any possibility of rejecting the model as a consequence of a possible shift (backwards or forwards) in the output hydrograph when compared to the measured hydrograph; and secondly, to assess how well the model simulates high extremes. Studying Fig. 8 leads to the conclusion that there is good agreement between measured and simulated values. It was observed that the scatter of points about the bisector was good for both the model validation and calibration.
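The three statistical criteria can be checked with a few lines of code. The sketch below uses the standard definitions of the MSE and the Nash and Sutcliffe efficiency; for R² it assumes the squared Pearson correlation between measured and simulated discharge, since the exact formula used in the study is not stated here. The discharge values are hypothetical.

```python
def mse(obs, sim):
    """Mean square error between measured and simulated discharge."""
    return sum((o - s) ** 2 for o, s in zip(obs, sim)) / len(obs)


def nash_sutcliffe(obs, sim):
    """Nash and Sutcliffe efficiency: 1 - residual sum / variance sum of obs."""
    mean_obs = sum(obs) / len(obs)
    residual = sum((o - s) ** 2 for o, s in zip(obs, sim))
    spread = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - residual / spread


def r_squared(obs, sim):
    """Squared Pearson correlation (assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    mean_sim = sum(sim) / len(sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    var_obs = sum((o - mean_obs) ** 2 for o in obs)
    var_sim = sum((s - mean_sim) ** 2 for s in sim)
    return cov ** 2 / (var_obs * var_sim)


def meets_criteria(obs, sim):
    """Thresholds as reported in the text: MSE < 0.10, EF > 0.7, R2 > 0.7."""
    return (mse(obs, sim) < 0.10
            and nash_sutcliffe(obs, sim) > 0.7
            and r_squared(obs, sim) > 0.7)


# Hypothetical discharge series in m3/s, for illustration only.
measured = [1.0, 2.0, 3.0, 4.0]
simulated = [1.1, 1.9, 3.2, 3.8]
accepted = meets_criteria(measured, simulated)
```

Because the MSE carries units while EF and R² are dimensionless, applying all three together guards against a model that correlates well with the measurements but is biased in magnitude.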

Conclusions
This paper outlined a methodology for uncertainty evaluation in the context of the distributed hydrological modelling of the Grote Nete catchment in Belgium. Although there have been some attempts at putting variability at the heart of modelling itself, deterministic models are likely to remain the primary means of representing the response of the hydrological system for the foreseeable future. However, model development of distributed models involves considerable uncertainty. Many of these dependencies arise naturally and their correct evaluation is crucial to the accurate evaluation of the modelling system reliability.
The goal in this evaluation was to understand the uncertainty in applying a distributed hydrological model to a specific catchment, or stated otherwise, to assess whether the model can do what is reasonably expected of it by the user. The results would establish the degree of confidence to be placed in the model's representation of the catchment response. This paper presented a case of the solution to the problem of building distributed models with the quality characteristics necessary for representation of the complex hydrology of a natural catchment.
Eleven uncertainty events were identified as contributing to the top event. A total of 7 gates were used. The study found that only a few uncertainties could be well quantified, and many of these were quantified with difficulty. Literature reports some attempts at differentiating between contributions of the different sources of uncertainty to the overall uncertainty. This process allows for explicit definition of the events to be included in the evaluation, and a definition of primary and undeveloped events. The results and conclusions of this uncertainty evaluation have been used to inform an integrated study of the response of the Grote Nete catchment to river valley rewetting.