MIKE-SHE integrated groundwater and surface water model used to simulate scenario hydrology for input to DRIFT-ARID: the Mokolo River case study

A fully integrated, physically-based MIKE SHE/MIKE11 model was developed for the Mokolo River basin flow system to simulate key hydraulic and hydrologic indicator inputs to the Downstream Response to Imposed Flow Transformation for Arid Rivers (DRIFT-ARID) decision support system (DSS). The DRIFT-ARID tool is used in this study to define environmental water requirements (EWR) for non-perennial river flow systems in South Africa to facilitate ecosystembased management of water resources as required by the National Water Act (Act No. 36 of 1998). Fifty years of distributed daily climate data (1950 to 2000) were used to calibrate the model against decades of daily discharge data at various gauges, measurements of Mokolo Dam stage levels, and one-time groundwater level measurements at hundreds of wells throughout the basin. Though the calibrated model captures much of the seasonal and post-event stream discharge response characteristics, lack of sub-daily climate and stream discharge data limits the ability to calibrate the model to event-level system response (i.e. peak flows). In addition, lack of basic subsurface hydrogeologic characterisation and transient groundwater level data limits the ability to calibrate the groundwater flow model, and therefore baseflow response, to a high level. Despite these limitations, the calibrated model was used to simulate changes in hydrologic and hydraulic indicators at five study sites within the basin for five 50-year land-use change scenarios, including a present-day (with dam), natural conditions (no development/irrigation), and conversion of present-day irrigation to game farm, mine/city expansion, and a combination of the last two. Challenges and recommendations for simulating the range of non-perennial systems are presented.


INTRODUCTION
Physically-based, distributed-parameter integrated hydrologic codes, such as MIKE SHE/MIKE11, that simulate fully coupled groundwater and surface water flows, represent the best available tools to simulate hydrologic flow systems for EWR studies and water management, because they can readily reproduce results obtained using simpler methods (i.e., analytic, lumped), but also offer the most rigorous physically-based equations which solve for hydrologic and hydraulic variables at spatially and temporally variable distributions of interest. The success of implementing a fully-integrated and distributed model, as in this study, however, can only be gauged by matching expectations based on specific requirements of the EWR study with the quantity, quality, and type of available data, and the complexity and conceptual understanding of the surface/subsurface water flow dynamics.
In data-limited systems, it is important to recognise it is not so much the integrated hydrologic codes that fall short of producing desired output accuracy, but rather the inability to adequately characterise often complex surface and subsurface hydrologic parameter distributions, or processes which typically result in greater uncertainty in conceptualisation of flows within a system. It is the greater uncertainty in flow conceptualisation that directly translates into higher degrees of error/uncertainty in calibration and model predictions. In such instances, developing simpler, more 'conceptual' flow models (i.e., fewer parameters, simpler spatial distributions, etc.) is probably a much better way to first simulate hydrologic and hydraulic indicators, until more data of the right type, location and frequency of collection become available, allowing more complex and reliable distributions to be developed for physically-based integrated models, which can produce more accurate and reliable outputs. The level of this study is at a detailed level, though some inputs such as subsurface data are limited, which results in the subsurface part of the model being more conceptual than, for example, the surface hydraulic part, where stream locations and cross-sectional profiles are known reasonably well.

385
Hydrological data are usually the start and end points in EWR methods. Most methods start with a description of the present day (PD) and the surface flow regime at key points along the river's length. With the present condition of the river ecosystem described, simulated flow response to any potential water-related management intervention of interest can then be interpreted in terms of the predicted change in physical, chemical, and biological responses relative to the PD conditions. The final hydrological output of an EWR is a description of flows needed to maintain a range of possible future ecosystem conditions that would be brought about by the different management interventions (King and Pienaar, 2011;Seaman et al., 2010Seaman et al., , 2013.
The above process relies heavily on adequately reproducing the movement of water within the catchment through calibration against available data. In this respect, non-perennial systems pose challenges to hydrological modellers that are unique or more severe than those faced with perennial rivers: • Typically as river systems become more non-perennial, aridity increases and this implies that: storm durations decrease, intensities increase, and storm area decreases. Without detailed information on localised, short-duration (i.e. minutes to hour) storms, and equivalent frequency of response in both stream flow and groundwater levels, it is very challenging, if not impossible, to calibrate models in these conditions at a high level.
• Available flow information used to calibrate the integrated model typically decreases as systems become more non-perennial -this also makes calibrating a non-perennial system more challenging (more so as aridity increases).
Three general categories of data needed to produce an acceptable flow model of system for the EWR are: • External stresses data such as climate data, irrigation, diversions, • System framework data such as topography, soils, geologic layers, structures, hydraulic properties, etc.
• System response data such as stream discharge and stage, groundwater flows/level fluctuations, etc.
One of the main requirements of the DRIFT-ARID method (Seaman et al., 2013;2016a,b) was that it include an integrated or coupled groundwater and surface water hydrological model to produce a daily time series of hydrologic and hydraulic indicators to pass to the DRIFT DSS for each development scenario. After consultation with prominent hydrologists in South Africa and abroad, the research team decided to use the MSHE code (MIKE SHE/MIKE11; Graham and Butts, 2005), developed by DHI (www.dhigroup.com). A key benefit of using MSHE is that with limited data/conceptualisation, simplified models can be developed initially and then made more complex later as additional information is made available, similar to an iterative Bayesian type decision-making process. For flow systems like the South African Mokolo Basin, with limited subsurface characterisations resulting in simple subsurface flow conceptualisations, the subsurface flow model component in the coupled surface water-groundwater MSHE model could only be developed at a conceptual level, which increased uncertainty in simulated output. Despite increased uncertainty, the model represents the best available management and decision-making tool, given available data, to assess coupled surface water-groundwater response to distributed climate data and future changes in land use. The conceptual-level MSHE model also provides considerable insight into where additional data, characterisations and conceptualisation are needed to produce more accurate model results.
This article summarises: • Available data types, quantity and quality throughout the basin • Discussion of data gaps and their implications for modelling • Conclusions and recommendation for further work Because the size, complexity, and limited data of the Mokolo River Catchment prevented developing highly-detailed characterisations and conceptualisation of the flow system, efforts focused on developing key modelling datasets and iteratively determining a reasonable model based on evaluation of multiple alternative conceptual flow models of the subsurface system. The value in developing a solid and defensible conceptual flow model of the coupled surface water-groundwater flow system cannot be understated as it permits determination of key parameters governing system flows, which help focus later data acquisition programs and fieldwork aimed at eventually reducing uncertainties in model prediction.
Prucha and Graham (2012) should be consulted for the full report including the results of the MIKE SHE/MIKE11 simulation completed on the Mokolo River. The complete MSHE Mokolo River model is also available from the Centre for Environmental Management (CEM), University of the Free State (contact: cem@ufs.ac.za). This is the third in a series of three papers, which should be read in sequence, which present the structure and activities of DRIFT-ARID (Seaman et al., 2016a); a test application on the Mokolo River System (Seaman et al., 2016b) and the integrated groundwater-surface water hydrology component for input into the DRIFT-ARID method (this paper).

OBJECTIVES
The objectives of the study are outlined on Fig. 1 below.

Study area
The Mokolo River lies in the Limpopo Water Management Area (WMA1) (Fig. 2), and is a tributary of the Limpopo River which flows along the border between Botswana and South Africa. Originating in the western Waterberg Mountains, the Mokolo River includes upper reaches of the Sand River, the Mokolo Dam, and several tributaries, and confluences with the Limpopo River (African Development Bank, 2009). The total area of the Mokolo River catchment is approximately 8 437 km 2 (Prucha and Graham, 2012).
The geology of the upper catchment is dominated by sandstone of the Waterberg group, with local outcrops of conglomerate and diabase. In the lower catchment, the surface rocks belong to the more recent Karoo Supergroup of Permian to Triassic age. These include both sandstones and mudstones. Superficial deposits in the lower catchment are of quaternary age and consist of sandy alluvium (Esterhuyse, 2012). Given the predominance of sandstone in the catchment, sediment entering the channels from hillslopes is likely to be dominated by sand with low silt content (Rowntree and Van der Waal, 2012). The northern section of the study area is heavily faulted. It is composed of sediments of the Karoo Sequence and forms a graben structure, bounded in the north by the Zoetfontein fault and in the south by the Eenzaamheid fault. The Daarby fault subdivides the coalfield into the shallow open castable western part and the deeper north-eastern part of the Waterberg coalfield (a displacement of some 400 m) (Vermeulen et al.,2011).
In spite of variation in soil distribution patterns in the catchment, the largest part (> 80%) of the Mokolo River catchment topsoil is sand to loamy sand (6-15% clay), and the balance is pure sand (< 6% clay) (Van Tol and Le Roux, 2012). Rainfall in the catchment is largely from summer rain (October to March), and ranges from 400 to 700 mm per annum with most rain falling in the southern section of the catchment. The mean annual potential evapotranspiration (PET) is more than twice the amount of rainfall over most of the area. It varies across the Mokolo Catchment from about 2 200 mm/a in the south to about 2 450 mm/a in the north. Average temperature ranges from 14-20°C with the low-lying areas in the north having the highest temperatures. The Mokolo catchment falls within the savanna biome with a small section, mostly surrounding the Sandriver Mountains, in the grassland biome. Most of the catchment is covered by woodland, with some grassland as well as thicket and bushland in the upper reaches, and a large area of thicket and bushland in the lower reaches.
Commercial cultivated farmland is found along the upper and middle reaches of the river (Kemp, 2012). Some 661 farms (2 516 subdivisions) are located within the catchment, of which 272 farms are riparian to public streams. Actual use of water for irrigation is estimated to have decreased by some 35 million m 3 /a between the mid-1980s and 1998/99, probably because assurance of supply was impacted by increased water use. Higher agricultural input costs also contributed, leading to a marked change in catchment agricultural practices where farmers converted from irrigation farms to cattle ranches or game farming (DWAF, 2007).
Five sites were selected on the Mokolo River ( Fig. 2) for detailed DRIFT analyses. Sites 1, 2, and 3 are situated upstream of the Mokolo Dam and Sites 4 and 5 downstream. For complete site descriptions, see Seaman et al. (2013).

Available data types, quantity and quality
The characterisation of system hydrology and hydrogeology is critical to the development of a sound conceptual and numerical flow model. For an integrated hydrologic flow model, both the surface and subsurface flow systems must be characterised. Characterisation generally involves two key steps: • Raw data are synthesised in a database, and then analysed spatially, typically with GIS software, such as ESRI's ArcGIS software. Temporal analysis is often done with standard spreadsheet software. GIS software allows modellers to spatially correlate different types of data (e.g. mapped surface geology with subsurface borehole data).
• The raw data are interpreted to characterise the hydrology. Standard practices for characterisation in hydrogeology, as defined by ASTM D5979 (2002) or Kolm (1993), are equally applicable to fully integrated hydrologic systems.
In Fig. 3, typical datasets are listed under key integrated hydrologic model components 'External Stresses', 'Flow System Structure' and 'System Response'.
It is important to understand the differences between 'raw' and 'interpreted' data in describing data. Raw data typically refer to information derived from the field, such as borehole logs that have not yet been interpreted, for example, geologic cross-sections (Kolm, 1993), so that more detailed characterisations of the 3-dimensional hydrogeologic aquifer/aquitard   387 structure can be defined throughout the basin. Unfortunately, in the Mokolo catchment much of the data is raw, particularly related to the subsurface flow system (see Prucha and Graham, 2012). This limited development of a more accurate subsurface flow model for the basin.

Hydrologic conceptualisation
The conceptualisation is based on the characterisation of available data. The integrated hydrologic flow modelling depends heavily on the conceptual flow model (Neuman and Wieranga, 2003).

Conceptual flow model
A generalised hillslope flow model with key structural features and flow processes that explains much of the integrated groundwater-surface water flow processes was developed (Fig. 4). Typically in large arid/semi-arid basin flow systems, most rainfall either evaporates at the soil surface, or infiltrates and is then returned to the atmosphere via plant transpiration. Larger rainfall events lead to increased infiltration, and eventually basin recharge, though most is still lost through evapotranspiration. High intensity rainfall events can increase the ratio of runoff to infiltration when the surface infiltration capacity is exceeded (Horton flow), or when soil saturates from below (saturation excess), which typically occurs near streams. The configuration of subsurface geologic contacts can cause surface discharge of groundwater (i.e. springs), or can strongly influence how groundwater flow interacts with surface flows. Deeper, lower permeability layers provide more groundwater storage, and shallow layers can cause faster, more flashy responses, with rapid recession in surface discharge. Only three natural sources of water in the integrated hydrologic system contribute to flow in rivers; overland flow, groundwater baseflow, and direct precipitation. It is our experience, particularly in regional-scale models, that interflow (build-up of saturation in a shallow hillslope near a river) can simply be combined with groundwater baseflow (from the regional, connected aquifer system), because the model grid size in a regional model is typically larger than the scale of this local, near-river flow process. To correctly simulate near-river lateral unsaturated flow, a model must be capable of simulating variable saturation, and at very small scales (e.g. a grid size of 10 m or less), to capture these small contributing areas. Figure  5 shows a map of surficial geology within the Mokolo Basin. A syncline is required to account for the spatial distribution of the geologic formations based on the surficial geology. With numerous faults and dykes, groundwater flows through the system are expected to be influenced by the larger-scale structural features and bedding contacts. Only a simple 3-layer model could be developed for the subsurface saturated zone portion of the integrated model due to lack of any available subsurface characterisation (i.e., 3-dimensional variations in geologic units shown).
A local conceptual flow diagram to explain coupled groundwater-surface water flows related to the numerous linear drainage features in the Mokolo River catchment that are aligned with river segments is presented in Fig. 6. However, the integrated flow model developed in this study assumed no influence of such geologic features (faults and dykes) on flow conditions. This assumption was made because of the lack of hydraulic data for dykes and faults (including extent and depth). It is clear that large-scale groundwater flow in the deep fractured and faulted In addition, it is also likely that these features strongly influence river-groundwater flow interaction where they intersect rivers. High permeability and likely low storage along these features may strongly influence peak stream flow, recession, baseflow, and non-perennial conditions.

Integrated hydrologic model development
Simulation of the integrated hydrologic flow system of the Mokolo River requires a numerical code that accounts for spatially distributed, time-varying data such as climate, soils, vegetation, and geology across the catchment. Additionally, the code must also simulate coupled surface runoff, channel flow, unsaturated zone flow, groundwater flow, and effects of irrigation, to account for the dynamic and integrated nature of the hydrologic system.
The MIKE SHE/MIKE 11 software code was selected for use in this study because of its broad world-wide use and application in similar hydrologic systems, and because it simulates all relevant hydrologic processes (see Fig. 7) using rigorous, physicallybased, fully-coupled flow equations that allow specification of physical, rather than empirical inputs. This facilitates simulation of physically-realistic future scenarios where various land-use, or subsurface parameters (i.e., surface topography, vegetation, soil, climate etc.) can be changed using measured or expected values. The development of the MSHE model datasets and associated assumptions are presented in this section.

Model boundary and grid discretisation
The MSHE model boundary coincides with the study area boundary (see Fig. 2). Surface and groundwater divides represent good boundaries in integrated models because, despite fluctuations in groundwater levels at the boundaries, no flows are expected across them. Simulation of subsurface and overland flows requires specification of a regularly-spaced, square finite difference grid across the model domain. A 500-m grid was selected because it reasonably accounts for most spatial variability of model input and system response, but also to keep scenario computation times reasonable.

Unsaturated zone flow
The full Richards Equation method was used to simulate unsaturated zone flow in the Mokolo basin, mainly because groundwater depths are relatively deep, averaging 20 to 25 m. The general soils distribution shown on Fig. 8 was provided by Van Tol and Le Roux (2012) and used as the basis for input into the MSHE model. Only two Terrain Morphological Units (TMU) were defined in the model; one zone is adjacent to major rivers, and the other represents all uphill soils zones. Initial modelling indicated that a better stream discharge response could be obtained by allowing macropore bypass. Higher macropore bypass was assigned to alluvial soils (30%), compared to uphill soils (5%). The vertical discretisation of each soil column is the same throughout the model, and starts with a 1 cm thick layer at the ground surface to account for the nonlinear soil evaporation and transpiration. Vertical layers were smoothly increased from ground surface to a constant 0.5 m below the groundwater table.
Simulated results show recharge averages about 5.1% of mean annual rainfall (MAR), and ranges from 0.8% to 7.4% of MAR in catchments A42J and A42B, respectively, resulting in about 3.8 mm/a to 47.4 mm/a. Increased MAR in A42B (640 mm/a versus 470 mm/a for A42J), along with higher recharge rates, were required to reproduce stream gauge data and available groundwater levels in this area.

Saturated zone flow
Vertical aquifer delineation illustrated on Fig. 9 shows the vertical layering and hydraulic property values specified in subsurface saturated groundwater portion of the final calibrated integrated flow model. Groundwater flow in the saturated zone is simulated using three different vertical zones: alluvial, or unconsolidated deposit overlying weathered bedrock, overlying more competent bedrock.

Overland flow
Surface resistance controls the rate of runoff from overland areas. A single surface resistance value (Manning M) is specified for the entire Mokolo catchment. Although site-specific data were unavailable, the value was set to 10 m (1/3) /s based on dense brush in summer or a heavy stand of trees with a few downed trees (Chow 1959) in floodplains.
Another parameter affecting overland flow is a threshold value controlling the amount of surface depression storage. In the model, this was set at 2 mm depth. Once ponding exceeds this depth, overland flow can occur. Boundary and initial conditions also need to be specified in the model for overland flow. Since overland flow is a rapid process relative to subsurface flows, the initial depths of overland flow water were set at 0 mm. The Mokolo model has a grid resolution that at 500 m is too large to capture individual springs, but instead the model simulates the combined effect of these inflows to streams as groundwater baseflow, or as groundwater discharge to the ground surface and then to streams, via overland flow.

Stream flow network and hydraulics
Stream flow was simulated using the DHI MIKE11 code. Flow in 29 rivers (see Fig. 2 − main stream and tributaries) was simulated for the Mokolo Catchment. Cross-sections were placed every 2 km in rivers longer than 20 km, and every 1 km for all others. Upstream boundary conditions in all streams were set as noflow, and the only downstream boundary condition required was along the Mokolo outlet to the Limpopo River. There, the surface water elevation was set at about 1 m above the streambed to allow for continual outflow of discharge from the system. The construction of the Mokolo Dam by early 1980 effectively breaks the Mokolo River system into two parts, and two separate Mike 11 network configurations needed to be created; one prior to 1980 (no dam), and one after 1980 (with a dam).
In the post-dam configuration, the following features were added: • A dam spillway weir was added, developed based on the dam spillway elevation (911.98 m) and width; this includes the stage-storage and stage-discharge specification • Discharge from the dam based on flows in the A4H010 gauge, 1.8 km downstream of the dam (this accounted for both controlled and uncontrolled releases) • Evaporation from the dam surface area (variable in time) Flows were simulated using the fully hydrodynamic St Venant option in MIKE 11 so that backwater effects and flows in steeper slopes could be simulated. An option to use automatic time-steps was also specified. Cross-section bank and invert elevations were adjusted based on the regional-scale topography and the need to have banks lower than adjacent model cells, to allow surface inflows from overland areas. About 1 038 km of stream were simulated with the Mike 11 network.

Climate
Daily precipitation and potential evapotranspiration (PET) specified for various subcatchments based on the available 1950 to 2000 data (Schulze and Maharaj, 2007) was provided.

Vegetation
A vegetation distribution was specified in the integrated model based on available data. Seven different zones were defined throughout the basin, including subtropical alluvial vegetation along river reaches which influences ET loss. Timevarying leaf area index (LAI) values were obtained from NASA MODIS 8-day averages across the basin for the different vegetation classes.

Model simulations
To simulate present day or future flow conditions within the Mokolo River catchment requires calibrating model input parameters such that the model adequately reproduces available surface water flow and groundwater level measurements throughout the catchment. Details on the calibration approach, calibration results, and subsequent future scenario assumptions and results are presented in the following sections.

Integrated hydrologic model calibration
Before the integrated flow model can be used to simulate future conditions, it must be calibrated against hydrologic data that describes the response of the system to variations in climate.
The calibration process attempted to reproduce a range of system responses, including: • Timing/duration of no-flow along gauged streams (dry periods) • Duration and magnitude of stream flows (low and high flow periods) • Flow duration • Groundwater baseflow (end of non-rainy season) • Average and transient groundwater levels (wet/dry season) • Areas/rates of spring discharge • Gaining/losing reaches along the Mokolo and tributaries Calibration performance was assessed using standard statistical measures, such as correlation coefficient, ME (mean error), MAE (mean absolute error), and RMSE (root mean square error) of river discharge, and ME and RMSE of monitored groundwater well water levels (< 2 m for the catchment).

Available calibration data
Two key datasets were used to calibrate the model, namely, mean daily river discharge and groundwater level data. The river discharge data represent the priority calibration dataset because daily data are available for decades at key locations throughout the watershed, and because in coupled stream-aquifer watersheds different characteristics of the stream discharge response to single rainfall events or seasonal and annual variations in flow can be used to calibrate both surface water and groundwater parameters. Streamflow discharge data compared to groundwater level data are almost always rated higher for calibration targets, because they show the cumulative effects of changes to both groundwater and surface water parameter adjustments, while groundwater water-level data are more localised.
Key calibration characteristics of the 10 river discharge hydrographs include: • Peak flow rates (some gauges, for example at A4H002, do not include peak flows as the gauge is drowned above a certain level) • Duration and shape of ascending/receding hydrograph • Early recession response indicates near-stream subsurface storage (which in turn reflects subsurface structure and hydraulic properties) • Late recession response reflects more of the subsurface flow system characteristics and hydraulic properties further from the stream (e.g. uphill areas) • Baseflow rates during low precipitation periods (typically September) • Flow rates (peak) and volumes Single reported measurements from 2 000 reported groundwater well water levels, used to calibrate the groundwater portion of the MSHE model, were assumed representative of unconfined, non-pumped conditions given lack of details on measurements. Simulated groundwater levels (1950 to 2000) were compared directly to these single measurements in time, effectively assuming that this level represented long-term static local groundwater conditions. Clearly, this assumption would be violated in irrigated and mined areas, where active groundwater pumping would bias the values.

Calibration approach
The approach used to calibrate the distributed-parameter integrated hydrologic flow model of the Mokolo Catchment is outlined in Refsgaard et al. (2007). A step-wise, iterative approach to develop and calibrate the hydrologic model developed by Prucha (2002) and Kaiser-Hill (2002) was also used to guide the model calibration. Ultimately the calibration approach is dictated by available data, complexity of flow conditions (natural and anthropogenic), and required accuracy needed for the EWR. It is important to appreciate that if the integrated hydrologic/hydraulic flow model is unable to provide the level of accuracy required by the DRIFT-ARID DSS, more data must be collected to reduce model calibration error and predictive uncertainty.

Calibration parameters
The most important model parameters adjusted during the modelling in the context of the hillslope conceptual flow model are illustrated in Fig. 10. In a given river drainage basin, this is a useful way to understand how adjustments to specific parameters affect the integrated hydrologic response, or unsaturated zone infiltration, recharge, groundwater level fluctuations, groundwater flows in the respective 3-layer system, baseflow into or out of the river, and river discharge.

Calibration results
The model was calibrated mostly to surface water flow data, because the quality of available groundwater level data was uncertain. It was unclear whether available groundwater level data reflected static or pumped conditions, which is important because reported levels might show significant bias to the low side if they were taken while or shortly after wells were pumped. The average mean error (ME) across the entire Mokolo Basin indicates simulated groundwater levels were higher than observed (one-time measurements) by about 5.5 m, well within an accepted range of calibration given the range of elevation change across the entire model. However, the notable uncertainty in observed groundwater levels, combined with limited subsurface characterisation and conceptualisation of a complex subsurface flow system, meant that the subsurface portion of the integrated model could only really be calibrated in a semi-quantitative or even qualitative sense.
Calibration of the surface flow system shows that the model was largely able to capture the general characteristics of discharge time series at available gauges (see Figs 11 and 12), including peak flows, relatively rapid ascension curves and relatively slow recession curves due to the slow, delayed effect of continued discharge of subsurface storage into the river following rainy seasons. The model however failed to capture low-flow seasonal baseflow and over-simulated river discharge, which is likely due mostly to unaccounted-for river diversions during periods of low river flow. The inability to reproduce observed low-flow conditions well is also probably due to uncertainty associated with conceptualisation of the subsurface flow system (i.e., variation of streambed thickness, material and hydraulic properties in all catchment streams, effects of faulting/dykes etc). The model was also unable to simulate levels in the reservoir above the Mokolo River Dam, due to unavailable operations and incomplete diversion records.

Future scenario simulations -approach and set-up
Despite the challenges in calibrating the MSHE model, it was considered acceptable for assessing changes in future scenario hydrologic and hydraulic indicators described below. To help improve the effect of unaccounted for surface water diversions (or from shallow riverbed wells), especially during low-flow periods, shallow riverbed groundwater was removed using a depth-dependent 'drain' function that reduced baseflow. This improved modelled low-flow discharge somewhat, but probably didn't account for direct surface water diversions and other uncertainties.

Hydrologic and hydraulic indicators
A full list of indicators chosen is available in Seaman et al. (2013;2016a,b). Most of the values for each indicator could be derived from a small set of model outputs for each DRIFT-ARID study site: • Daily river discharge • Daily river stage • Daily groundwater depth • Daily groundwater flow beneath the river • Daily net groundwater baseflow to the river

Future scenarios
The five chosen scenarios are summarised in Table 1. With the exception of Scenario 1 (separated into two models; pre-dam and post-dam), each of the scenarios simulates a continuous future 50-year period, using climate data from 1950 to 2000. For simplicity, the future model runs are assumed to start at climate year 1/1/1950. This time period was chosen because it corresponds with available climate data. It is also important to note that the future simulations were conducted using the 1950-2000 dates and times, rather than projecting future dates, which allowed a direct comparison between the future and PD scenarios. The PD scenario is simply an extension of the calibrated model, except that it assumes the Mokolo Dam is present during the entire 1950-2000 period. The following assumptions (Prucha and Graham, 2012) were made for future scenarios.

Mokolo Dam
The Mokolo Dam is simulated for all 50 years in all scenarios except for the Natural scenario (Scenario 2). For the Natural scenario, a natural stream profile was specified through the existing reservoir and dam, based on the upstream and downstream profile. All flow upstream of the Mokolo Dam was routed downstream to the outlet of the model (through Gauge A4H014 near the Limpopo River). The post-dam simulation period was 20 years . The simulation period for future scenarios is 50 years . Therefore, the conceptualisation of the Mokolo Dam was modified in the future scenarios. The flow released through the Mokolo Dam gate was set based on the monthly averaged flows at Station A4H010. The rate of water extraction from the Mokolo Dam was assumed constant at 2.5 m 3 /s. This value is lower than the average extraction in the calibrated post-dam model  which is 2.8 m 3 /s. However, a constant extraction above 2.5 m 3 /s in the scenarios could cause the upstream water levels to decrease unrealistically during dry years.

Farm dams
For the future scenarios, 25 local farm dams were simulated in the Present Day (Scenario 1), Game Farm (Scenario 3), External Water (Scenario 4), and Combined (Scenario 5) scenarios.
These were placed along streams, to account for additional storage and evaporation losses to the regional flows.

Observation weirs
Weirs were simulated in Mike 11 at all stream gauge locations to simulate discharge at these locations.

Cross-sections
To simulate stream stage at the five study sites, it was necessary to include cross-sections measured in the field. Ideally, sections should have been measured upstream, downstream, and through each study site to produce more accurate stage levels. Lacking surveyed cross-sections upstream and downstream of the five study sites, measured sections were repeated upstream and downstream of each site, changing only the thalweg elevations based on the stream profile already in Mike 11.

Inter-basin water transfer from Crocodile River
In the External Water (4) and Combined (5) scenarios, the effect of the inter-basin transfer from the Crocodile River into the Mokolo Basin was simulated. The transfer was allowed to support expansion of the Exxaro mine, Eskom Powerplants (Medupi and Matimba), and Lephalale town water supply. Water was applied in these scenarios using the irrigation module. The only information provided for this scenario was an estimate of the future amount of water required by the mine; the power plant and the town expansion. Application rates or how the additional water might be used in the expansions were not provided and therefore assumed in the scenario.

Irrigation
Irrigation was turned off over the entire model extent for the Natural scenario, and upstream of the Mokolo Dam for the GameFarm and Combined scenarios. Irrigated vegetation was changed to native vegetation in the GameFarm scenario. Irrigation remained active downstream of the Mokolo Dam for the GameFarm and Combined scenarios, and for the entire model in the PD and Expansion scenarios.  (Table 2).

Future scenario simulations -summary of results
Scenario simulation results show the following key points: • Changes in the vegetation associated with conversion of irrigated land to game farming significantly increased the mean flow simulated at the sites upstream of the Mokolo Dam (simulated values for these scenarios were 200 to 800% more than PD). Increases in maximum flow were even higher (simulated values up to 70 times PD maximum flow).
• Changing irrigated areas to game farming (i.e., to 'native vegetation') resulted in larger changes compared to those for Natural. The changes were larger because modified native vegetation (Acocks veldtypes using ACRU method to determine LAI) was used in the GameFarm and Combined scenarios, whereas, in the Natural scenario the existing (Mucina and Rutherford, 2006) vegetation LAI was based on NASA MODIS data. The modified native vegetation (GameFarm and Combined scenarios) uses much less water than the natural vegetation in the Natural scenario. Though better LAI data is needed to produce more realistic differences between these scenarios, results demonstrate how sensitive water balance outputs can be to changing these vegetation parameters in a fully integrated model.
• The External Water scenario had virtually no effect on the selected indicators, attributed mostly to the specific way inputs were modified in the scenario, relative to the indicator locations, and difficulties in knowing how the External Water scenario should be implemented.
• The flow conditions downstream of the Mokolo Dam are strongly controlled by releases from the dam. Given the lack of data associated with operating rules and strategies, the simulated changes downstream of the dam are only indicative of potential changes. Different strategies or rules would significantly impact the results.
The following daily time-series of model output for all five study sites and the five MSHE scenarios were provided for input into the DRIFT-ARID DSS (For detailed data see Prucha and Graham, 2012 and an example in Seaman et al., 2016a,b): • River discharge • River stage • Depth to groundwater beneath river • Baseflow to river • Groundwater flow beneath river

DISCUSSION AND CONCLUSIONS
A fully integrated, distributed parameter MSHE model was developed for the Mokolo Basin and used to simulate several land-use change scenarios using available information. Though the model successfully produced reasonable hydrologic and hydraulic indicator results for use in subsequent DRIFT-ARID analyses, results revealed important challenges at various stages of model development and application that must be considered in future applications in South Africa. Much of the difficulty of the MSHE model reproducing observed non-perennial conditions and peak flows at most surface water gauges throughout the catchment is attributed to known data gaps, and limited characterisation and conceptualisation of the subsurface flow system.

Characterisation and conceptualisation
Key data gaps identified in the study that influence results include: • Surface water diversions: The locations, rates and nature (i.e., from shallow riverbed wells, or direct surface diversions) were not available for this study.
• Irrigation: Although some data are available on annual irrigation rates from surface water and groundwater sources, it was not possible to correlate irrigation rates and sources to actual irrigation areas.
• Subsurface aquifer data: Basic subsurface aquifer data were unavailable, including aquifer depths, properties, and transient water level observations. This significantly hinders the calibration of groundwater flow and groundwater surface water interaction. In particular, the stream-aquifer interactions are important in the EWR determination.
• Soils: Basic soils data are available, but need to be correlated to the subsurface aquifer data when it becomes available as it directly affects the spatial and temporal distribution of recharge throughout the model.
• Topography: The available topography data are relatively coarse, which is suitable for a regional integrated model. However, the available topographic resolution is insufficient for the creation of appropriate cross-sections to accurately describe the surface water flows and levels. The limitations of the input data translate directly into uncertainty associated with the conceptualisation, calibration, and scenario results.

Calibration
The calibrated model largely reproduced the long-term, regional-scale flow behaviour observed in the Mokolo catchment. However, the system is sensitive to irrigation and nearstream vegetation, soils, and sub-surface hydraulic properties.
Particularly in the groundwater model, the lack of observations and field data means that the simulated groundwater response is uncertain and of little help to calibration of the integrated model. The network of dykes and faults probably compartmentalises the regional groundwater flow system, which was not simulated in the current model due to lack of hydraulic characterisation of how these features (i.e. depths, lateral extents, permeabilities) actually influence subsurface flow from upland areas and into the surface water drainage network. This may partly explain the difficulty in simulating non-perennial flows, since groundwater baseflow is likely a local process. Peak flow responses were also difficult to reproduce in the model, which is attributed to the lack of sub-daily precipitation data; detailed river bed topography (cross-sections); and in-stream weir and farm dam information.

Future scenarios
The model generally reproduced the expected directions of changes in flows associated with the five scenarios. However, the absolute magnitudes were uncertain, given the lack of available data, and necessary relatively simplistic subsurface characterisations and flow conceptualisation.
Differences in the initialisation of each of the scenarios make exact comparisons difficult. For example, in one of the scenarios, different numerical convergence parameters had to be used to make the simulation stable, which affected the results. Also, scenario results suggest that the removal of the Mokolo Dam impacts the catchment flow for more than 50 years afterwards. This is a valuable tool that can be used for ecological Reserve determinations at the 'comprehensive level' as required by the National Water Act.

KEY CHALLENGES AND RECOMMENDATIONS
• Several noteworthy challenges were encountered in the project. These, along with recommendations for addressing them, are presented below: The data sources used in the models need to be clearly documented (metadata, date stamps, origins). A matrix should be developed for data input to keep track of large datasets.
• Rivers with a large dataset can be difficult to model as the combination of dataset errors can produce errors that are compounded, or are superimposed at critical locations of interest, making it difficult to interpret results and assess uncertainty in predictions. Basin watersheds with less data could possibly be easier to model but calibration is difficult if not impossible if system response data are lacking and the subsurface flow system is complex like the Mokolo Basin, or historical or current use is not well documented. In systems where climate data is unavailable, modellers may instead rely on available sub-daily climate data (http://disc.sci.gsfc. nasa.gov/hydrology/data-holdings).
• Data is likely sparse for most catchments in South Africa and internationally, especially in non-perennial systems, resulting in varying degrees of uncertainty in results. This favours the development of multiple conceptual models that match the observations reasonably well. The different conceptualisations can then be ranked in terms of their ability to reproduce observed conditions. The range of results from models that are similarly calibrated is a measure of the uncertainty in the results. With new data, unrealistic conceptual alternatives can be omitted. Ultimately, it may be better to start with very simple conceptual flow models of the system (i.e., analytical or lumped conceptual) and increase the complexity only when new data become available. Even in these instances, the integrated MSHE code can still be used to build even the simplest model, though typically unique characteristics of key indicators (i.e., low flow periods in non-perennial systems) can only be captured at an 'adequate' level when more physically realistic characteristics of the basin system are included in the model.
• If higher levels of accuracy in simulating basin hydrologic and hydraulic indicators are required, then the right types, quantity, quality, frequency, and locations of data must be collected to meet the needs. This should be assessed at the beginning of any modelling project to set the expectations of the integrated modelling output, and effects of associated uncertainty translated into subsequent DRIFT-ARID DSS analyses. Ultimately the models themselves can be used to focus field data collection and monitoring.

397
• Surface water and groundwater divides do not have to match one another. Surface water hydrologic divides can easily be based on available topographic data, but groundwater hydraulic divides could be dictated almost entirely based on complex subsurface hydrostratigraphy. It is possible to have groundwater flow across the external model boundary, based on surface water divides. Within the Mokolo Basin, the complex but poorly defined 3-dimensional geology and a lack of water level data across surface water divides and within specific hydrostratigraphic units (i.e, Mokolo Basin watershed) limit the ability to just use water level data to confirm correlation with surface water divides. Groundwater data should be collected outside the basin of interest to help determine appropriate model boundaries.
• The lack of detailed cross-sections, irrigation, surface diversions, and hydrogeologic characterisation of near-stream alluvial subsurface conditions limits the ability of the model to reproduce periods of zero flow.
• In arid/semi-arid environments, it is usually necessary to use 'event-level' rainfall because rainfall events are typically short (<1 h), intense, and localised. If soils are generally permeable (e.g. sandy), then it is possible to use daily data to simulate integrated flows, as most rainfall will infiltrate. However, if the rainfall rate exceeds the capacity of the soil to infiltrate water, daily rainfall data can promote infiltration at the expense of the surface runoff generated during short, high-intensity events that are averaged out over a day. Ultimately, use of daily climate data must be evaluated during model calibration to assess whether it adequately simulates surface runoff and infiltration/recharge to groundwater. Ideally, it is always better to use sub-daily rainfall and aggregate up if needed.
• The calibration strategy should carefully consider the needs of the DSS. That is, the strategy needs to include the hydrological and hydraulic indicators used in the DSS, along with the available data. In particular, the strategy needs to consider the: -Available calibration data prior to the start of modelling, including spatial distribution and length of record. If appropriate calibration data are unavailable (e.g. depth to water table beneath the river) then the early identification of the missing data could motivate for its collection.
-Appropriate calibration target locations, types, and values. Instead of the traditional discharge and water level targets, the indicators themselves could become critical calibration targets. For example, if non-perennial conditions are a critical indicator, then the annual period of non-perennial flow could be a calibration target.
-Priority of the calibration targets. The indicators chosen by the team need to be prioritised, and the modeller can then make sure the model produces accurate information for the most important indicators. It is generally not feasible to calibrate a model that reproduces disparate responses equally well. For example, it is challenging to create a model that is very well calibrated to river peak flows, river low flows, and groundwater levels.
-Appropriate accuracy of the calibration. The accuracy of the calibration is fundamental to the modelling effort required. It may be that a high level of calibration accuracy is not required to produce the same ecological response results. If the DSS response is not very sensitive to the exact model results, then considerable time, effort, and money could be saved in the modelling. Similarly, if the ecological results were sensitive to the difference between two responses, then only the differences need to be accurate (which is generally easier to simulate accurately).
• Scenarios chosen must be carefully designed. Scenarios that simulate the required response are not always easy to design. There is a need to pre-evaluate scenario responses and adjust the final scenarios accordingly. The pre-evaluation of the scenarios can be done on a shorter simulation, or a simpler model setup.
• Modelling uncertainty should be propagated to the DSS. In any modelling project, model predictions are always inherently uncertain. Some of this uncertainty can be quantified through formal analysis (sensitivity analysis, multiple conceptual models etc.). The uncertainty in the hydrological responses should be communicated to the DSS team as a range of expected values. The range of expected values could be incorporated into the DSS to determine if the DSS outcomes are sensitive to the range of hydrologic uncertainty. If so, then this could motivate for additional data collection in the catchment to reduce the model range. This could also motivate for changes to DRIFT-ARID or the DSS to make it more robust, with respect to the expected uncertainty of the hydrologic modelling.
• Focus on changes in hydrology rather than on a specific hydrologic state (actual real-time results, as this is only possible at high confidence when considerable amounts of data are available). MSHE modelling predicts a change in a system rather than a specific state in time. It is very difficult to defensibly simulate real system flow data when input data are inaccurate or unverified.
• A future project should focus on coming up with integrated groundwater-surface water methodology for these systems -assuming data are always limited -instead of attempting to calibrate at the highest levels. At the basin scale, it may be more productive to (a) assume limited data, (b) come up with realistic indicators which do not require difficult-toachieve accuracies, and (c) demonstrate the methodology.