Prediction of groundwater levels from lake levels and climate data using an ANN approach

There are many environmental concerns relating to the quality and quantity of surface water and groundwater. Estimating the quantity of water from readily available climate data is very important for managing the water resources of the natural environment. As a case study, an artificial neural network (ANN) methodology is developed for estimating groundwater levels (upper Floridan aquifer levels) as a function of monthly averaged precipitation, evaporation, and measured levels of Magnolia and Brooklyn Lakes in north-central Florida. Groundwater and surface water are highly interactive in the region due to the characteristics of the geological structure, which consists of a sandy surficial aquifer and a highly transmissive, confined limestone aquifer known as the Floridan aquifer system (FAS), separated by a leaky clayey confining unit. In a lake/groundwater system that is typical of many karst lakes in Florida, a large part of the groundwater outflow occurs by means of vertical leakage through the underlying confining unit to a deeper, highly transmissive upper Floridan aquifer. This provides a direct hydraulic connection between the lakes and the aquifer, which creates fast and dynamic surface water/groundwater interaction. Relationships among lake levels, groundwater levels, rainfall, and evapotranspiration were determined using ANN-based models, multiple-linear regression (MLR) models and multiple-nonlinear regression (MNLR) models. All the models were fitted to the monthly data series and their performances were compared. ANN-based models performed better than MLR and MNLR models in predicting groundwater levels.


Introduction
Assessment of the quality and quantity of both surface water and groundwater is important in hydro-environmental management to sustain the natural systems and a safe, liveable environment on and under the earth's surface. Groundwater and surface water are fundamentally interconnected. This interconnection should be well understood to effectively and safely manage the precious groundwater and surface-water resources while benefiting from them. Determining groundwater/surface-water interactions is therefore crucial in water resources planning and management. The main factors affecting groundwater/surface-water interaction are the climate inputs (rainfall, evaporation demand), the surface characteristics of the basin (soil, vegetation and topography) and the underlying geological structure, including the depth of the groundwater table below the surface.
Although parametric statistical rules and deterministic models have been the traditional approaches to forecasting water resources variables, many recent efforts have shown that when explicit information on hydrological sub-processes, such as infiltration, rainfall and runoff, is not needed, an artificial neural network (ANN) model can be more efficient and effective (Maier and Dandy, 2000). Therefore, the ultimate objective of this research was to develop an ANN model for predicting groundwater levels from readily available observations without needing any information about the hydrological sub-processes.
An ANN model was developed in which lake levels, rainfall, and evapotranspiration data available for north central Florida were used as inputs, and the groundwater level around the lakes was used as output. The sensitivity of the prediction accuracy to the content and length of training data was investigated. The multiple-linear regression (MLR), multiple-nonlinear regression (MNLR) and ANN models were fitted to the monthly data series and their performances were compared. ANN was found to model groundwater levels using limited data better than a statistical regression model for different lengths of training data.

Description of the study area
The study area is located in the Upper Etonia Creek Basin (UECB) of north central Florida, which has an area of 446 km² and is noted for numerous lakes and karst features (Watson et al., 2001); it is part of the lower St. Johns River Basin. It lies between 29°37' and 29°53' north latitude and 81°51' and 82°04' west longitude (Sousa, 1997). Figure 1 shows the location of the lakes; in downstream order they are Lowry Lake, Magnolia Lake, Lake Brooklyn, and Lake Geneva (Merritt, 2001).

Hydrogeology
The surficial deposits in the study area consist of unconsolidated to semi-consolidated sand and clayey sand marl of the Holocene, Pleistocene, and Pliocene ages (Clark et al., 1964). The deposits are underlain by the Hawthorn Group, a marine deposit of the Miocene age that consists of clay, quartz sand, carbonate, and phosphate. The Hawthorn Group is underlain by Ocala limestone of the Eocene age.

The groundwater system in the area generally consists of three hydrogeological units. The surficial aquifer is the uppermost water-bearing unit. It occurs in the unconsolidated and semi-consolidated deposits, and it is hydraulically connected to streams and lakes throughout the basin. The underlying Hawthorn Group acts as a leaky confining unit overlying the limestone formation (Clark et al., 1964). The upper Floridan aquifer occurs in the Ocala limestone, which is part of the Floridan aquifer system (FAS), a major aquifer system that underlies Florida and parts of the adjacent states (Miller, 1986). Many of the lakes in the basin coincide with karst features that exist in the underlying limestone formation. Lake and groundwater levels in the surficial aquifer generally are higher than hydraulic heads in the upper Floridan aquifer, and the UECB is a major recharge area for the underlying upper Floridan aquifer (Watson et al., 2001).

Climate
The climate in the study area is classified as humid subtropical. The basin lies about 80 km north of the division between the tropical climate of the lower latitudes and the subtropical climate of the south-eastern United States. The average annual temperature is approximately 22 °C. The area receives more than half of its annual rainfall between June and September. Precipitation in the winter and early spring is typically widespread and associated with frontal activity (Sousa, 1997). Most of the rainfall in the summer is in the form of local showers and thunderstorms. A notable feature is that the average rainfall for June is about double the average rainfall for May (Clark et al., 1964).

Groundwater, surface-water and climate interaction
The sources of water for human activities are surface waters, which include all the lakes, streams and rivers that eventually flow into oceans. These waters are depleted by evapotranspiration and replenished by precipitation as a part of the hydrological cycle. In recent years, because of droughts, the lake levels dropped, adversely affecting the fish and fauna of the lakes. In the region, the drinking water supply is obtained from groundwater, and public awareness has therefore grown of the need to understand the exact relationship between groundwater pumping, lake levels, precipitation and evapotranspiration, and to take the necessary precautions to protect this vulnerable and valuable environment.
Groundwater is the major source of drinking and irrigation water in Florida. It also interacts closely with streams, sometimes discharging water into streams or lakes and sometimes receiving water from them. In fact, groundwater can be responsible for maintaining the hydrological balance of streams, springs, lakes, wetlands, and marshes.
If the interaction between groundwater and surface water is not well considered, quantity and quality problems may occur in both surface and groundwater resources. An increasing quantity of groundwater is being withdrawn to meet the demands of a growing population, which may cause typical threats such as overdraft, drawdown and subsidence. Overdraft occurs when groundwater is removed faster than recharge can replace it. Drawdown lowers the lake levels and dries up the wetland areas and even some streams that are fed by groundwater. Subsidence is also one of the dramatic results of over-pumping. A basic threat to the quality of the groundwater is contamination, which may be caused by over-pumping, which results in saltwater or brackish water intrusion, or by not protecting the natural recharge areas of groundwater basins. Therefore, groundwater recharge areas and surface-water bodies interacting with groundwater should be well protected against any kind of contamination.
The fluctuation of the lake levels in the study area is a function of the balance between the inflow and outflow components of the lakes. In a lake that interacts with both the surface-water and groundwater systems, the inflow components are precipitation, surface-water inflow, groundwater inflow, and overland flow, and the outflow components are lake evaporation, groundwater outflow, and surface-water outflow. In a lake/groundwater system that is typical of karst lakes in Florida, a large part of the groundwater outflow from lakes occurs by means of vertical leakage from the lake through an underlying semi-permeable confining unit to a deeper, highly transmissive limestone Floridan aquifer system (Motz, 1998; Watson et al., 2001).
The relationship between a lake and groundwater is shown below in the water budget equation and in Fig. 2 for a lake such as Magnolia Lake in this study. The change in storage over a period of time is balanced by the sum of the inflows and outflows that occur for a given time period:
$$\Delta S = P + I_s + R + I_g - E - O_s - O_g - L \qquad (1)$$
where:
$\Delta S$ is change in storage
$P$ is precipitation
$I_s$ is surface-water inflow
$R$ is overland flow into lake
$I_g$ is groundwater inflow from the surficial aquifer
$E$ is lake evaporation
$O_s$ is surface-water outflow
$O_g$ is groundwater outflow to the surficial aquifer
$L$ is vertical leakage through the confining unit to the underlying aquifer.
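As a minimal sketch of how the water budget terms combine, the following Python snippet takes the change in storage as the listed inflows minus the listed outflows. The class name `LakeBudget`, the function name `storage_change` and the numbers in the example are hypothetical illustrations, not data from this study; all terms are assumed to share one unit (for example mm per month).

```python
from dataclasses import dataclass


@dataclass
class LakeBudget:
    """Water-budget terms for one time step, all in the same units."""
    P: float    # precipitation
    I_s: float  # surface-water inflow
    R: float    # overland flow into the lake
    I_g: float  # groundwater inflow from the surficial aquifer
    E: float    # lake evaporation
    O_s: float  # surface-water outflow
    O_g: float  # groundwater outflow to the surficial aquifer
    L: float    # vertical leakage through the confining unit


def storage_change(b: LakeBudget) -> float:
    """Change in lake storage: sum of inflows minus sum of outflows."""
    inflows = b.P + b.I_s + b.R + b.I_g
    outflows = b.E + b.O_s + b.O_g + b.L
    return inflows - outflows


# Hypothetical monthly values, for illustration only.
example = LakeBudget(P=120.0, I_s=30.0, R=5.0, I_g=10.0,
                     E=110.0, O_s=20.0, O_g=5.0, L=40.0)
print(storage_change(example))
```

A negative result indicates a month in which the lake loses storage, for instance when evaporation and leakage exceed rainfall and inflow.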
Groundwater levels around the lake are functions of $I_g$, $O_g$ and $L$ in Eq. (1). To calculate the groundwater levels, the water budget equation (Eq. (1)) first needs to be solved; then $I_g$, $O_g$ and $L$ should be substituted in the partial differential form of the groundwater flow equation (Dogan and Motz, 2005): (2)
where:
$h$ is the pressure head of the aquifer
$K$ is the hydraulic conductivity of the aquifer in the x-, y-, and z-directions
$S_s$ is the specific storage coefficient.
The combined solution of Eqs. (1) and (2) is complex and requires the use of numerical models depicting the groundwater/surface-water interaction. An ANN is an alternative, easy and fast way to model the parameters without using any complex mathematical model. Measurements of lake levels, groundwater levels, evapotranspiration and rainfall can be obtained easily and cost effectively when compared to measurements of the soil characteristics, initial soil moisture, infiltration, and other groundwater characteristics, which are required for numerical models. Therefore, a model that uses only available real-time data, i.e. an ANN model, would be more easily applied in the operational forecast system.

Artificial neural network (ANN)
In general, the architecture of the multilayer feed-forward neural network can have many layers, where a layer represents a set of parallel processing units (or nodes). The three-layer ANN (Fig. 3) used in this study contains only one intermediate (hidden) layer. A multilayer ANN can have more than one hidden layer; however, many experimental results have shown that a single hidden layer may be enough for most forecasting problems. It is the hidden layer nodes that allow the network to detect and capture the relevant patterns in the data, and to perform complex non-linear mapping between the input and the output variables. The sole role of the input layer of nodes is to relay the external inputs to the neurons of the hidden layer. Hence the number of input nodes corresponds to the number of input variables. The outputs of the hidden layer are passed to the last (or output) layer, which provides the final output of the network. A network with very few hidden nodes will have difficulty learning the data, while an overly complex network tends to over-fit the training samples and thus has a poor generalisation capability. Finding a parsimonious model for accurate prediction is particularly critical since there is no formal method for determining the appropriate number of hidden nodes prior to training. A trial-and-error method commonly used for network design (Tokar and Johnson, 1999) is used in this study.

Back-propagation training
In the prediction context, multilayer feed-forward neural network training consists of providing input-output examples to the network, and minimising the objective function (i.e. error function) using either a first-order or a second-order optimisation method. Training can be formulated as minimising, as a function of the weights, the sum of the squared differences between the observed and the predicted outputs, defined by:
$$E = \sum_{i=1}^{n} \left(Y_{o,i} - Y_{p,i}\right)^2 \qquad (3)$$
where:
$n$ is the number of patterns (observations)
$Y_o$ represents the observed response (target output)
$Y_p$ represents the model response (predicted output).
In the back propagation training, minimisation of the error function $E$ in Eq. (3) is attempted using the steepest descent method and computing the gradient of the error function by applying the chain rule on the hidden layers of the feed forward neural network. Considering a typical multilayer feed forward neural network whose hidden layer contains $M$ neurons, the network computes:
$$\mathrm{net}_{pj} = \sum_{i=1}^{n} W_{ji}\, x_{pi} + W_{jo} \qquad (4)$$
$$g(\mathrm{net}_{pj}) = \frac{1}{1 + e^{-\mathrm{net}_{pj}}} \qquad (5)$$
where:
$\mathrm{net}_{pj}$ is the weighted input into the $j$th hidden unit
$n$ is the total number of input nodes
$W_{ji}$ is the weight from input unit $i$ to the hidden unit $j$
$x_{pi}$ is the value of the $i$th input for pattern $p$
$W_{jo}$ is the threshold (or bias) for neuron $j$
$g(\mathrm{net}_{pj})$ is the $j$th neuron's activation function, assuming that $g$ is a logistic function.
Note that the input units do not perform any operation on the information but simply pass it on to the hidden nodes. The output unit receives a net input of:
$$\mathrm{net}_{pk} = \sum_{j=1}^{M} W_{kj}\, g(\mathrm{net}_{pj}) + W_{ko}, \qquad y_{pk} = g(\mathrm{net}_{pk}) \qquad (6)$$
where:
$M$ is the number of hidden units
$W_{kj}$ represents the weight connecting the hidden node $j$ to the output $k$
$W_{ko}$ is the threshold value for neuron $k$
$y_{pk}$ is the $k$th predicted output.
Recall that the ultimate goal of the network training is to find the set of weights $W_{ji}$, connecting input units $i$ to the hidden units $j$, and $W_{kj}$, connecting the hidden units $j$ to output $k$, that minimise the objective function (Eq. (3)). Since Eq. (3) is not an explicit function of the weights in the hidden layer, the first partial derivatives of $E$ in Eq. (3) are evaluated with respect to the weights using the chain rule, and the weights are moved in the steepest-descent direction. This can be formulated mathematically as:
$$\Delta W_{kj} = -\eta\, \frac{\partial E}{\partial W_{kj}} \qquad (7)$$
where $\eta$ is the learning rate, which scales the step size. The usual approach in back propagation training comprises choosing $\eta$ according to the relation $0 < \eta < 1$ (Zealand et al., 1999).

Figure 2
Water-budget components for a typical lake interacting with groundwater
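The forward pass and the steepest-descent weight update described in this section can be sketched in plain Python as follows. This is an illustrative sketch, not the authors' implementation: it assumes one output node, logistic activations in both the hidden and output layers, a sum-of-squared-errors objective, and batch updates. The function names, the weight layout (index 0 of each weight row holds the threshold), the toy patterns and the value of the learning rate are all hypothetical.

```python
import math
import random


def logistic(z):
    """Logistic activation g(net) = 1 / (1 + exp(-net))."""
    return 1.0 / (1.0 + math.exp(-z))


def init_weights(n_inputs, n_hidden, seed=0):
    """Small random weights; index 0 of each row is the threshold (bias)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
    return w_hidden, w_out


def forward(x, w_hidden, w_out):
    """Return (hidden activations, network output) for one input pattern."""
    hidden = [logistic(w[0] + sum(wi * xi for wi, xi in zip(w[1:], x)))
              for w in w_hidden]
    net_k = w_out[0] + sum(wj * hj for wj, hj in zip(w_out[1:], hidden))
    return hidden, logistic(net_k)


def sum_squared_error(patterns, w_hidden, w_out):
    """Objective: sum over patterns of (observed - predicted)^2."""
    return sum((target - forward(x, w_hidden, w_out)[1]) ** 2
               for x, target in patterns)


def train_step(patterns, w_hidden, w_out, eta=0.05):
    """One batch steepest-descent update of all weights, in place."""
    g_out = [0.0] * len(w_out)
    g_hid = [[0.0] * len(w) for w in w_hidden]
    for x, target in patterns:
        hidden, y = forward(x, w_hidden, w_out)
        # Chain rule at the output node: dE/dnet_k.
        delta_k = -2.0 * (target - y) * y * (1.0 - y)
        g_out[0] += delta_k
        for j, hj in enumerate(hidden):
            g_out[j + 1] += delta_k * hj
            # Chain rule back to hidden node j: dE/dnet_j.
            delta_j = delta_k * w_out[j + 1] * hj * (1.0 - hj)
            g_hid[j][0] += delta_j
            for i, xi in enumerate(x):
                g_hid[j][i + 1] += delta_j * xi
    # Move every weight against its gradient, scaled by the learning rate.
    for j in range(len(w_out)):
        w_out[j] -= eta * g_out[j]
    for j, row in enumerate(w_hidden):
        for i in range(len(row)):
            row[i] -= eta * g_hid[j][i]


# Hypothetical toy patterns: ([inputs], target), already scaled into (0, 1).
patterns = [([0.0, 0.0], 0.2), ([0.0, 1.0], 0.6),
            ([1.0, 0.0], 0.5), ([1.0, 1.0], 0.8)]
w_hidden, w_out = init_weights(n_inputs=2, n_hidden=3)
for _ in range(500):
    train_step(patterns, w_hidden, w_out)
print(sum_squared_error(patterns, w_hidden, w_out))
```

The gradients are accumulated over all patterns before any weight is changed, so each call to `train_step` corresponds to one batch update.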

Selection of training and testing set
ANNs are data-intensive and learn the underlying physics of the system of interest from the training samples, which are basically cause-effect samples. Therefore, the number of training samples significantly influences a network's predictive performance. Increasing the number of training samples provides more information about the shape of the solution surface and thus increases the potential level of accuracy that can be achieved by the network. Having too few data samples will lead to poor generalisation by the network. An optimal data set for training would be one that represents the modelling domain and has the minimum number of repetitive samples in training (ASCE Task Committee, 2000).
A training and a test sample are typically required for building an ANN forecaster. The training sample is used for ANN model development and the test sample is adopted for evaluating the forecasting ability of the model. Sometimes a third one, called the validation sample, is also utilised to avoid the overfitting problem. It is common to use one test set for both validation and testing purposes, particularly with small data sets.
There is no general rule for dividing the data into training and test sets; several factors, such as the structure of the problem, the data type and the size of the available data, should be considered in making the decision. It is critical to have both the training and test sets representative of the population or underlying mechanism. This has particular importance for time series forecasting problems. Inappropriate separation of the training and test sets will affect the selection of the optimal ANN structure and the evaluation of the forecasting performance (Zhang et al., 1998).
The literature offers little guidance in selecting the size of the training and the test sample. Most authors select the ratio of training data vs. testing data depending on their particular problems. Garr et al. (1994) employed a bootstrap re-sampling design method to partition the whole sample into 10 independent sub-samples. The estimation of the model is implemented using 9 sub-samples and then the model is tested with the remaining sub-sample.
The accuracy of a particular forecasting problem may also be affected by the sample size used in the training and/or test set. Nam and Schaefer (1995) tested the effect of different training sample sizes and found that as the training sample size increases, the ANN forecaster performs better.
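A partition of the kind described, with 10 disjoint sub-samples of which 9 are used for estimation and 1 for testing, can be sketched as follows. This is a generic illustration rather than the cited authors' procedure: the random shuffling, the fixed seed, the rotation of the held-out sub-sample and the function name `ten_subsample_splits` are assumptions made for the example.

```python
import random


def ten_subsample_splits(samples, k=10, seed=0):
    """Yield (train, test) pairs; each of k disjoint sub-samples is held out once."""
    indices = list(range(len(samples)))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k disjoint sub-samples.
    folds = [indices[i::k] for i in range(k)]
    for held_out in range(k):
        test = [samples[i] for i in folds[held_out]]
        train = [samples[i]
                 for f, fold in enumerate(folds) if f != held_out
                 for i in fold]
        yield train, test


# Hypothetical data: 364 placeholder observations.
data = list(range(364))
for train, test in ten_subsample_splits(data):
    print(len(train), len(test))
```

Random partitioning of this kind ignores temporal order, so for a time series a chronological split may be preferable.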

Limitations of ANN
ANN modelling should be used with care and an understanding of its strengths and limitations. ANNs are capable of mapping relations within the range of values comprising the training space used in the training data set, which is a problem of interpolation. The ability of an ANN to extrapolate is limited when the input values in the prediction phase are far from the domain of the training data set. In this sense an ANN is not very capable when it comes to extrapolation.
An ANN model has a major drawback compared to physically based models, in that a new input variable that was not used in the training phase cannot be introduced to the model in the prediction phase, i.e. the number of input variables should be the same during the training and prediction phases. If the scope of the problem changes, such as the addition of a new groundwater extraction well or a change in the land use, training must be repeated with this new information. The weight space in ANN cannot remain static after completion of the training.
If the relationship between the input and output variables is very simple or nearly linear, there is no need to use an ANN. In such cases, although an ANN may perform well, it is quite possible that it may have a mathematically indeterminate neural network structure (Sha, 2007).
The lack of physical concepts and relationships in ANNs is sometimes a limitation and sometimes an advantage, and it is frequently the source of sceptical attitudes towards this methodology. It is an advantage when the physical relationships between variables cannot be modelled mathematically but can easily be modelled by an ANN; the limitation is that the ANN cannot explain the physics behind those relationships.
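One simple safeguard against unintended extrapolation is to check whether each input of a new sample lies inside the range seen during training. The sketch below illustrates this idea only; the helper names `training_ranges` and `out_of_range`, the variable names in the dictionaries and the numbers are hypothetical and are not taken from this study.

```python
def training_ranges(rows):
    """Per-variable (min, max) over training rows; each row maps name -> value."""
    names = rows[0].keys()
    return {n: (min(r[n] for r in rows), max(r[n] for r in rows))
            for n in names}


def out_of_range(sample, ranges):
    """Names of inputs in `sample` that fall outside the training range."""
    return [n for n, (lo, hi) in ranges.items()
            if not lo <= sample[n] <= hi]


# Hypothetical training rows and a new sample, for illustration only.
train_rows = [{"RF": 50.0, "ET": 80.0},
              {"RF": 210.0, "ET": 130.0},
              {"RF": 120.0, "ET": 100.0}]
ranges = training_ranges(train_rows)
print(out_of_range({"RF": 260.0, "ET": 95.0}, ranges))
```

A non-empty result flags a prediction that relies on extrapolation and should be treated with caution.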

ANN application for prediction of groundwater levels in north central Florida
The variables monthly variation in Magnolia lake level (MLL), in Brooklyn lake level (BLL), rainfall (RF), and evapotranspiration (ET) were selected to describe the physical phenomena of the groundwater/surface-water interaction process, in order to forecast groundwater levels (GWL) using an ANN.
The first critical decision is to determine the appropriate network architecture, that is, the number of layers and the number of nodes in each layer. In this study, the input layer has 4 neurons representing MLL, BLL, RF, and ET; the output layer has 1 neuron representing GWL.
The hidden layer and nodes play very important roles in many applications of ANN. It is the hidden nodes in the hidden layer that allow neural networks to detect the feature, to capture the pattern in the data, and to perform complicated non-linear mapping between input and output variables. There are few guidelines on how many hidden nodes are needed to approximate any given function, but it is widely recognised that a single hidden layer is often sufficient for an ANN to approximate any complex non-linear function with any desired accuracy. Here, the 3-layer ANN with 1 hidden layer and the commonly used trial-and-error method to select the number of hidden nodes were used. The trial-and-error procedure started with 2 neurons initially and the number of hidden neurons was increased up to 10 with a step size of one in each trial. For each set of hidden neurons, the network was trained in batch mode to minimise the mean square error at the output layer. In this study, the tangent sigmoid, logarithmic sigmoid and pure linear transfer functions were tried as activation functions for hidden and output layer neurons to determine the best network model.
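The trial-and-error search over 2 to 10 hidden neurons can be sketched as a simple loop. The sketch assumes a user-supplied callable, here called `train_and_score`, that trains a network with a given number of hidden nodes and returns an error measure such as the mean square error; that callable, the function name `select_hidden_nodes` and the stand-in scoring function in the example are hypothetical, and the choice of error measure used for selection is an assumption.

```python
def select_hidden_nodes(train_and_score, candidates=range(2, 11)):
    """Try each hidden-layer size and return (best size, all errors).

    train_and_score(m) must train a network with m hidden nodes and
    return its error (lower is better). Ties go to the smaller network.
    """
    errors = {m: train_and_score(m) for m in candidates}
    best = min(errors, key=errors.get)
    return best, errors


# Stand-in scoring function for illustration; a real one would train an ANN.
def fake_score(m):
    return (m - 5) ** 2 + 1.0


best, errors = select_hidden_nodes(fake_score)
print(best, errors[best])
```

Because the dictionary is filled in increasing order of size, a tie in error resolves to the smaller network, which favours the more parsimonious model.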
Preprocessing of the data is usually required before presenting the data samples to the network model when the neurons have a transfer function with a bounded range. The reasons for scaling the data samples are to initially equalise the importance of variables and to improve the interpretability of network weights (Goh, 1995). In this study, the data were scaled by using the following equation;
In this study, the Levenberg-Marquardt optimisation technique was employed, which is more powerful than the conventional gradient descent techniques (Hagan and Menhaj, 1994; Cigizoglu and Kisi, 2005; Alp and Cigizoglu, 2007). Information related to the theory and applications of ANNs may be found in Rumelhart et al. (1986), Sudheer et al. (2003), Cigizoglu and Kisi (2005) and Fang and Wu (2007).
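A linear scaling of this kind can be sketched as follows, assuming the commonly used form x_scaled = a (x - x_min) / (x_max - x_min) + b. The functional form is an assumption made for illustration; the default values a = 0.6 and b = 0.2 are the scaling factors reported for this study, and under the assumed form they map the data into the interval from 0.2 to 0.8. The function names `scale` and `unscale` and the sample values are hypothetical.

```python
def scale(values, a=0.6, b=0.2, x_min=None, x_max=None):
    """Linearly scale values with factors a and b (assumed min-max form)."""
    x_min = min(values) if x_min is None else x_min
    x_max = max(values) if x_max is None else x_max
    span = x_max - x_min
    if span == 0:
        raise ValueError("x_max and x_min must differ")
    return [a * (x - x_min) / span + b for x in values]


def unscale(scaled, x_min, x_max, a=0.6, b=0.2):
    """Invert the scaling to recover values in the original units."""
    return [(s - b) / a * (x_max - x_min) + x_min for s in scaled]


# Hypothetical lake levels in metres, for illustration only.
levels = [10.0, 20.0, 30.0]
scaled = scale(levels)
print(scaled)
print(unscale(scaled, x_min=10.0, x_max=30.0))
```

Passing `x_min` and `x_max` explicitly allows the extremes of the overall data set to be reused when scaling the training and testing subsets, so that both are mapped consistently.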

Results and discussion
The observed values of monthly variations in GWL depend on MLL, BLL, RF, and ET measurements. Hence, the statistical models of multiple linear regression (MLR) and multiple non-linear regression (MNLR) were employed to estimate the groundwater levels from these measurements and to compare them with the ANN results. In the MLR and MNLR models, GWL is the dependent variable and MLL, BLL, RF, and ET are the independent variables; $a$ and $b_i$ are constants, and $\varepsilon$, the 'noise' variable, is a normally distributed random variable with a mean of zero.
The best-fit network for each model was selected based on various statistical goodness-of-fit indices. The goodness-of-fit statistics used in the selection of the best-fit networks are mean absolute error (MAE), root mean square error (RMSE) and coefficient of determination ($R^2$), expressed as:
$$\mathrm{MAE} = \frac{1}{p} \sum_{i=1}^{p} \left| x_{m,i} - x_{s,i} \right|$$
$$\mathrm{RMSE} = \sqrt{\frac{1}{p} \sum_{i=1}^{p} \left( x_{m,i} - x_{s,i} \right)^2}$$
$$R^2 = 1 - \frac{\sum_{i=1}^{p} \left( x_{m,i} - x_{s,i} \right)^2}{\sum_{i=1}^{p} \left( x_{m,i} - \bar{x} \right)^2}$$
where:
subscripts $m$ and $s$ represent the measured and simulated outputs, respectively
$p$ is the total number of events considered
$\bar{x}$ is the mean value of the measured data.
ANN predictions become more precise as the $R^2$ value approaches 1 and the RMSE and MAE approach zero. The values of the performance measures for the five training and testing sets are given in Tables 2 and 3. From Tables 2 and 3, the ANN-1 model has the best performance measures in the training and testing phases. $R^2$ values range from 0.918 to 0.905, meaning that good predictions with relatively small errors have been achieved in both training and testing phases.
The ANN-1 model gives the minimum RMSE and MAE values (0.250 and 0.184) in the testing phase. The worst predictions occur in the ANN-4 and ANN-5 models in the testing phase because of the relatively smaller number of training data.
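The three goodness-of-fit statistics can be computed with a few lines of Python. The sketch assumes the usual definitions, with the coefficient of determination taken as one minus the ratio of the residual sum of squares to the total sum of squares about the mean of the measured data; the function names and the sample numbers are hypothetical.

```python
import math


def mae(measured, simulated):
    """Mean absolute error between measured and simulated values."""
    return sum(abs(m - s) for m, s in zip(measured, simulated)) / len(measured)


def rmse(measured, simulated):
    """Root mean square error between measured and simulated values."""
    return math.sqrt(
        sum((m - s) ** 2 for m, s in zip(measured, simulated)) / len(measured))


def r_squared(measured, simulated):
    """Coefficient of determination, relative to the mean of the measured data."""
    mean_m = sum(measured) / len(measured)
    residual = sum((m - s) ** 2 for m, s in zip(measured, simulated))
    total = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - residual / total


# Hypothetical measured and simulated groundwater levels.
measured = [1.0, 2.0, 3.0, 4.0]
simulated = [1.5, 2.0, 2.5, 4.0]
print(mae(measured, simulated), rmse(measured, simulated),
      r_squared(measured, simulated))
```

MAE and RMSE carry the units of the groundwater level, while the coefficient of determination is dimensionless, which makes it easier to compare across data sets.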

Tables 2 and 3 also compare the performance of the ANN models with the MLR and MNLR models. In general, the ANN models performed better than the MLR and MNLR models. Among the MLR and MNLR models, the models trained on 300 months resulted in a good degree of correlation between the measured and estimated monthly groundwater levels for the testing phase. The RMSE obtained from the MNLR-1 model was equal to 0.420, which is smaller than the MLR-1 model value (0.423).
Comparisons of observed and calculated results of the MLR-1, MNLR-1 and ANN-1 models for monthly groundwater level around the lakes in the training and testing phases are shown in Figs. 4 and 5, respectively. The ANN-1 model matches the observed monthly groundwater levels more closely than the MLR-1 and MNLR-1 models, while the 2 regression models tend to underestimate high values and overestimate low values in both the training and testing phases. Figure 5 also shows that the ANN-1 model, in the testing phase, overestimates the very low groundwater level changes and estimates the medium groundwater levels accurately. In spite of this, the ANN-1 model tends to have no larger deviations from the observed levels than those of the MLR-1 and MNLR-1 models.
Time series and scatter plots of the observed vs. predicted results of the MLR-5, MNLR-5 and ANN-5 models, which have 100 training data and 263 testing data, are shown in Figs. 6 and 7 for the training and testing phases, respectively. As illustrated in Fig. 6, the MLR-5 and MNLR-5 models tend to show larger deviations than the ANN-5 model during the training phase and underestimate high groundwater levels. Although the ANN-5 model has larger deviations than those of the MLR-5 and MNLR-5 models during the testing phase (Fig. 7), the results are subject to less systematic error. The MLR-5 and MNLR-5 models tend to fit the lower groundwater levels, but overestimate the medium and high levels in the testing phase. In contrast to the training phase, ANN-5 model performance in the testing phase is poor (Fig. 7). The lower accuracy of the ANN-5 model compared with that of the ANN-1 model can be attributed to the larger amount of data used for training the ANN-1 model.

Conclusions
ANNs are relatively new computational tools that have found extensive utilisation in solving many complex real-world problems and are very attractive due to their remarkable learning and generalisation capabilities, even for highly non-linear problems. Monthly variations in the Magnolia and Brooklyn Lakes levels, rainfall and evapotranspiration were selected to describe the physical phenomena of groundwater/surface-water interaction in order to forecast monthly groundwater levels. The sensitivity of the prediction accuracy to the length of training and testing data was investigated. In order to ensure that the network has properly mapped input training data to the target output, it is essential that the set of patterns presented to the network is appropriately selected to cover a good sample of the training domain. A well-trained network is one which is able to respond to any unseen pattern within an appropriate domain. There are no acceptable rules to determine the optimum size of the training data set. The results show that the networks are not very sensitive to the number of training data, but very sensitive to the number of testing data. Attempts at reducing training size resulted in poor generalisation capabilities in the testing phase.
MLR, MNLR and ANN models were fitted to the monthly data series and their respective performances were compared. The 5 ANN models, built with different lengths of training and testing data, modelled groundwater levels better than the MLR and MNLR models in either the training phase or the testing phase, except for the last ANN model. This exception can be explained by the training data containing fewer minimum and maximum GWL values than the testing data, so that in the training phase the ANN could not learn the pattern of maximum and minimum GWL values. The validations of the developed networks show that, with respect to predicting groundwater levels, the ANN model performs better than the conventional statistical MLR and MNLR models. The comparison of the results shows that the MAE and RMSE statistics for the ANN-1 model (0.184 and 0.250) are smaller than those obtained by the MLR (0.354 and 0.420) and MNLR (0.358 and 0.423) models. It may be concluded that if the length of training data is sufficient, then the ANN performs best in modelling the groundwater levels. The ANN model can therefore be used as a very good tool to predict groundwater levels using easily measurable climate data and lake levels.
Although the ANN is a very good tool in hydrologic applications, it has some limitations compared to a physically based hydrologic model. It can never replace a physically based model when considering future simulation scenarios involving the addition of new parameters in the model, such as groundwater withdrawals, changes in land use and irrigation patterns, etc. The ANN model in this study is trained without using groundwater abstraction information. Assuming that patterns of groundwater abstraction in the training period remain the same in the future, this model can estimate GWL reasonably well as a function of MLL, BLL, RF, and ET only. On the other hand, the ANN cannot estimate GWL for different future groundwater abstraction scenarios. In that case, the training should be repeated including groundwater abstraction data in the training data set, which were not available for this study. Unfortunately this is one of the major drawbacks of an ANN model. Nonetheless, ANN models are still very attractive tools to estimate GWL using limited available data, when it is not possible to apply a physically based hydrologic model due to the unavailability of complete hydrological and hydrogeological parameters.

Figure 1
Location of the study area

Figure 3
Typical ANN configuration with one hidden layer

Available on website http://www.wrc.org.za ISSN 0378-4738 = Water SA Vol. 34 No. 2 April 2008 ISSN 1816-7950 = Water SA (on-line)

In the scaling equation, $x_{min}$ and $x_{max}$ denote the minimum and maximum values of the overall experimental data. Different values can be assigned for the scaling factors $a$ and $b$. There are no fixed rules as to which standardisation approach should be used in particular circumstances (Dawson and Wilby, 1998). The factors $a$ and $b$ were taken as 0.6 and 0.2 herein, respectively. Data from 364 months (May 1968 to August 1998) were used for ANN training and testing. Monthly MLL and BLL were obtained from lake level measurements of Magnolia and Brooklyn lakes, respectively. RF and ET values were obtained from the monthly records of Gainesville, FL, 32 km southwest of the study area. Monthly GWL values were obtained from head values measured in the 'Keystone well' (SJRWMD identifier C-0120) by the Saint Johns River Water Management District (SJRWMD). Well C-0120 is located on the eastern shore of Lake Brooklyn. Since the selection of training and testing data has a potentially large impact on the model accuracy, different lengths of training and testing data were assessed in this study. The total data set of 364 monthly values was divided into 5 training and testing sets. The first consisted of 300 training values (64 testing values), while the others consisted of 250, 200, 150 and 100 training values, respectively. The range of the studied variables is presented in Table 1.
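The five training and testing sets can be sketched as slices of one monthly series. The sketch assumes that each split is chronological, with the first values used for training and the remainder for testing; the text gives only the training lengths, so this ordering, the function name `chronological_splits` and the placeholder series are assumptions.

```python
TRAIN_LENGTHS = (300, 250, 200, 150, 100)


def chronological_splits(series, train_lengths=TRAIN_LENGTHS):
    """Map each training length to (train, test), splitting in time order."""
    return {n: (series[:n], series[n:]) for n in train_lengths}


# Placeholder for a monthly record; real inputs would be the observed values.
monthly = list(range(364))
for n, (train, test) in chronological_splits(monthly).items():
    print(n, len(train), len(test))
```

Keeping the testing data after the training data in time avoids using later observations to predict earlier ones.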

Figure 5
Observed and predicted monthly groundwater levels for MLR-1, MNLR-1 and ANN-1 models in the testing phase.