Groundwater vulnerability mapping of Witbank coalfield in South Africa using deep learning artificial neural networks

This study highlights the use of deep learning artificial neural networks in assessing the groundwater vulnerability of a coalfield. The network uses the parameters of the DRIST model (depth to water level, recharge, impact of the vadose zone, soils and topographic slope) as training inputs and borehole sulphate concentration as the training output. The technique was applied to the Witbank coalfield, where acid mine drainage emanating from coal mining operations is a major concern for the surrounding environment and groundwater resources. The generated groundwater vulnerability model was validated with a separate sulphate dataset not used during model training. The deep neural network model with dropout and decaying learning rate regularisers correlated very well with this independent sulphate dataset compared with the index and overlay DRIST model. The approach differentiated areas in terms of their vulnerability to acid mine drainage, which can help policy and decision makers to make scientifically informed decisions on land use planning. The approach developed in this research can be applied to other coalfields in order to evaluate its robustness under different hydrogeological and geological conditions.


Introduction
South Africa has nineteen coal provinces, and current mining activities are largely focused on the coalfields of Mpumalanga Province (Banks et al., 2011). The coal mining industry has been a fundamental catalyst in the development of the South African economy for over a century (Bell et al., 2001). However, coal mining operations, particularly in the Witbank Coalfield, have seriously affected the surrounding environment through massive deterioration of groundwater quality by Acid Mine Drainage (AMD) (Bell et al., 2001). AMD forms when sulphide-bearing material found in coal discard or ore bodies reacts with water in the presence of oxygen. The generated product is characterised by high concentrations of toxic heavy metals and sulphate (Sakala et al., 2018). In the study area, Sakala et al. (2018) used a regional knowledge-driven fuzzy expert system, which correlated only slightly with the concentration of sulphate in boreholes. Because of the large number of boreholes with sulphate concentration data, coupled with the need to improve this correlation, a data-driven deep neural network approach was used. Producing a more accurate groundwater vulnerability model of the coalfield would go a long way towards demarcating AMD-sensitive areas for appropriate land use planning, while at the same time testing the applicability of deep learning to groundwater vulnerability assessments over large areas.
Assessment of groundwater vulnerability is divided into two types: intrinsic and specific vulnerability (NRC, 1993). The intrinsic type defines only the ease with which pollutants migrate from the surface to groundwater. The specific type, on the other hand, takes into consideration the properties of the pollutant and its interaction with the subsurface (Vrba and Zaporozec, 1994; Huan et al., 2012). According to Huan et al. (2012), intrinsic vulnerability is nowadays considered less meaningful than the specific type, because the factors affecting intrinsic vulnerability, such as depth to water table, soil and recharge, are changing as human impacts increase. This research therefore focuses on specific vulnerability assessment in a Geographic Information System (GIS) environment, taking advantage of the good learning and generalisation capabilities of deep artificial neural networks to establish the complex relationship between the groundwater vulnerability inputs and AMD indicators.

Study Area
The study area (~8 000 km²) is located between latitudes 25°30′ and 27°45′ south and longitudes 28°30′ and 30°30′ east (Figure 1), and extends from Delmas to Wonderfontein in the west-east direction and between Kromdraai and Hendrina in the north-south direction. The study area is bounded to the north by the edge of the Karoo rocks and to the south by the Smithfield Ridge, a palaeohigh. Several abandoned and current mining areas are scattered throughout the coalfield. The Olifants River and its tributaries form the main drainage system within the study area.
Figure 1. Geological setting and location of the study area

Geology and Hydrogeology
Geologically, the study area falls on the northern tip of the Main Karoo Basin, a sedimentary basin whose first-order depositional sequence is generally placed in the Late Carboniferous, around 300 Ma. The study area consists of six major lithologies (Figure 1). The geological descriptions of the rocks are given in Table 1. The geology was extracted from the 1:250 000 scale geological map. du Toit and Sonnekus (2014) subdivided the lithologies into four aquifer types. The intergranular-fractured aquifers consist of the Rooiberg Group, Loskop Formation, Bushveld Complex, Ecca Group, Karoo dolerite dykes and sills and the Pretoria Group quartzite.
The intergranular aquifers correspond to the Quaternary alluvium along the major rivers. The dolomite of the Chuniespoort Group of the Transvaal Supergroup forms karst aquifers found around Delmas Town. The fractured rock aquifers consist of the Black Reef Formation, Dwyka Group, Magaliesberg Formation, Wilger River Formation and the sedimentary rocks of the Pretoria Group, with the bulk belonging to the intergranular-fractured aquifer type (Xws and Götz, 2014). The aquifers within the Witbank Coalfield are generally shallow (less than 12 m to the water table), making them highly vulnerable to pollution and warranting a groundwater assessment to help manage and protect the water resources (Vrba and Zaporozec, 1994).
Table 1. Geological description of rocks in the study area (Sakala et al., 2018)

DRIST Method
In this study, the assessment of groundwater vulnerability is based on the overlay and index DRIST method (Chenini et al., 2015), an adaptation of the popular DRASTIC method of Aller et al. (1987). The method is based on five parameters of the DRASTIC method, namely depth to water level, net recharge, impact of the vadose zone, soil media and topography. The aquifer media and hydraulic conductivity parameters of the DRASTIC method are not used in the DRIST method (Chenini et al., 2015). The vulnerability values are calculated in the same way as in the DRASTIC method.

Input - Depth to water level layer (D)
This refers to the distance between the ground surface and the water table, which determines the path length that water and pollutants have to travel to reach the aquifer (Aller et al., 1987).
The longer the distance, the greater the opportunity for natural attenuation of the pollutants compared with shorter distances. The layer was calculated from monitoring borehole data obtained from the South African Department of Water and Sanitation. The water level data within the study area were interpolated using the kriging technique to generate a 30 m resolution gridded raster layer, which was reclassified into three classes as proposed by Chenini et al. (2015) (Figure 2a). The depth to water level varies between 4 and 21 m and increases towards the west.

Input - Rainfall (R)
The rate of pyrite oxidation in mining waste, tailings or the subsurface is mainly controlled by the availability of oxygen and water at the mineral grain surface (Anawar, 2013). Mining operations expose pyritic material to the atmosphere, and hence to water and oxygen. Rainfall is often a major factor: it supplies water for AMD reactions and acts as a transport medium for the products of those reactions, either as surface runoff or as infiltration. From a hydrogeological point of view, the greater the amount of rainfall, the greater the amount of groundwater recharge, and thus the greater the potential for aquifer pollution. The average long-term rainfall data for the study area (Figure 2b) were obtained from historical data published by the South African Weather Services (SAWS, 2016).

Input - Impact of the vadose zone (I)
A laboratory column-leach experiment was conducted as part of this study to differentiate the rocks of the study area in terms of their reactivity with, and removal of pollutants (toxics) from, an AMD solution under unsaturated conditions. The results show that dolomite had the highest reactivity, followed by mudrock, shale, dolerite and diabase, then felsite, rhyolite and quartzite, with alluvium, sandstone and diamictite being the least reactive. The results are in agreement with the findings of Lapakko (1994) and Yager et al. (2006). The vadose zone impact layer (Figure 2c) classifies the rocks in terms of their ability to lessen or remove the AMD pollutant load. Areas whose lithologies have a lesser ability to remove pollutants could be more vulnerable to AMD, as little or no reactivity can mean that the AMD passes through the vadose zone chemically unaltered and reaches the groundwater.
Generally, the largest part of the study area is underlain by rocks with low reactivity, which can translate to a lack of attenuation of AMD pollution.

Input - Soil layer (S)
Soil data at a scale of 1:250 000 from the Agricultural Research Council - Institute for Soil, Climate and Water (ARC-ISCW) were reclassified according to clay content. Clay materials are known to create a barrier zone that restricts the migration of water and pollutants; thus, the higher the clay content, the better the barrier effect and the lower the chances of groundwater pollution. Figure 2d shows the clay content layer of the study area, where the northern part is marked by low clay content. The clay content ranges from 9 to 51%.

Input - Topographic slope layer (T)
SRTM data of 30 m resolution were used to generate the percentage slope layer (Figure 2e).
Flat surfaces increase the residence time available for water and pollutants to react and infiltrate.
Generally, the slope varies between 0 and 18%, with the largest portion of the study area being flat to gently sloping, as marked by the grey colour (Figure 2e).
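As a rough illustration of how a percentage slope layer can be derived from gridded elevation data, the sketch below applies central differences to a small elevation grid with a 30 m cell size. The grid values and the function name `percent_slope` are hypothetical, and GIS packages typically use more elaborate neighbourhood operators than this.

```python
import math

def percent_slope(dem, cell_size=30.0):
    """Return percent slope for the interior cells of a 2D elevation grid.

    `dem` is a list of rows of elevations in metres. Border cells are left
    as None because central differences need a neighbour on both sides.
    """
    rows, cols = len(dem), len(dem[0])
    slope = [[None] * cols for _ in range(rows)]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            dz_dx = (dem[i][j + 1] - dem[i][j - 1]) / (2.0 * cell_size)
            dz_dy = (dem[i + 1][j] - dem[i - 1][j]) / (2.0 * cell_size)
            slope[i][j] = 100.0 * math.hypot(dz_dx, dz_dy)
    return slope

# Hypothetical 3x3 elevation grid (metres), rising 3 m per 30 m cell eastwards
dem = [[1500.0, 1503.0, 1506.0],
       [1500.0, 1503.0, 1506.0],
       [1500.0, 1503.0, 1506.0]]
centre_slope = percent_slope(dem)[1][1]  # 6 m rise over 60 m, i.e. about 10%
```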

Output - DRIST model
The five layers (Figure 2) were combined in a GIS environment using equation (2) to generate the DRIST groundwater vulnerability model. The equation is given by Chenini et al. (2015) as:
$$\mathrm{DRIST\ index} = D_r D_w + R_r R_w + I_r I_w + S_r S_w + T_r T_w \qquad (2)$$
where r is the rating of the parameter and w is the importance weight of the parameter shown on Figure 3.
The results were divided into four zones (very low, low, moderate and high) using the natural breaks method (Ioannou et al., 2010). The output is a map (Figure 3) of the DRIST model, in which the grey areas are the least vulnerable and the red areas the most vulnerable. The eastern areas are marked by moderate to high groundwater vulnerability, whereas the central and western areas are very low to moderate.
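The index-and-overlay combination described in this section amounts to a weighted sum of parameter ratings for each raster cell. The sketch below shows that arithmetic for two cells. The weights are an assumption (the standard DRASTIC weights for the five retained parameters) and the ratings are hypothetical; neither is taken from the study.

```python
# Assumed importance weights: standard DRASTIC values for the five retained
# parameters. The weights actually used in the study may differ.
WEIGHTS = {"D": 5, "R": 4, "I": 5, "S": 2, "T": 1}

def drist_index(ratings, weights=WEIGHTS):
    """Weighted sum of parameter ratings for one raster cell."""
    return sum(ratings[p] * weights[p] for p in weights)

# Hypothetical ratings (1 = low contribution, 10 = high) for two cells
shallow_cell = {"D": 9, "R": 6, "I": 8, "S": 7, "T": 10}
deep_cell = {"D": 3, "R": 6, "I": 4, "S": 2, "T": 5}

print(drist_index(shallow_cell))  # 9*5 + 6*4 + 8*5 + 7*2 + 10*1 = 133
print(drist_index(deep_cell))     # 3*5 + 6*4 + 4*5 + 2*2 + 5*1 = 68
```

Applied to every cell of the five rasters, the same sum yields the index surface that is then split into vulnerability classes.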

Deep Artificial Neural Network
Artificial neural networks (ANNs) belong to the data-driven branch of Artificial Intelligence (AI), which is inspired by the biological neural system and in which the computer is trained to perform functions that, at the moment, are best handled by humans, such as learning (Shigidi and Garcia, 2005). ANNs form a class of non-linear, parallel, distributed and adaptive information processing systems originally based on studies of the brains of living species (McCulloch and Pitts, 1943). An ANN consists of a layer of neurons that accepts various inputs (the input layer) and feeds them into further hidden layers of neurons and ultimately into the output layer, which produces an output response (Figure 4). The aim of the technique is to train the network such that its response to a given set of inputs is as close as possible to a desired output (McCulloch and Pitts, 1943).
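The flow from input layer through hidden layers to an output response can be sketched as a plain forward pass. The example below is a minimal illustration, not the network used in the study. The tiny 2-2-1 network, its weights and the sigmoid output activation are all assumptions chosen so that the arithmetic is easy to follow.

```python
import math

def relu(x):
    return max(0.0, x)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(w . x + b) for each neuron."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def forward(inputs, hidden_layers, output_layer):
    """Propagate inputs through the hidden layers to a single output value."""
    activations = inputs
    for weights, biases in hidden_layers:
        activations = dense(activations, weights, biases, relu)
    out_weights, out_biases = output_layer
    return dense(activations, out_weights, out_biases, sigmoid)[0]

# Hypothetical network: 2 inputs -> 2 hidden neurons (ReLU) -> 1 output (sigmoid)
hidden = [([[0.5, -0.25], [-1.0, 1.0]], [0.0, 0.0])]
output = ([[1.0, -1.0]], [0.0])
y = forward([2.0, 1.0], hidden, output)  # hidden activations are [0.75, 0.0]
```

Training then consists of adjusting the weights and biases so that `y` moves closer to the desired label for each input.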

Results and Discussion
The five DRIST parameters (depth to water level, rainfall, impact of the vadose zone, soils and topography) were formulated and combined using the traditional DRIST method in the previous section. In this section, the building of the ANN and its results are discussed, and the resulting model is compared with the DRIST model.

ANN Training and validation
To build an ANN, the network has to be fed with labelled data (a set of inputs and an output label).
In this study, 350 boreholes with sulphate concentration values were extracted from the South African national groundwater archive database for building the ANN model. Boreholes with high sulphate values (> 200 mg/l) were used as polluted training sites (output label 1) and those with values below 200 mg/l as non-polluted training sites (output label 0). The DRIST input values of the five parameters at each borehole locality, together with the associated borehole output label, were fed into the ANN. The input-output pairs were partitioned into training, validation and test sets using a stratified 10-fold cross-validation approach (Kohavi, 1995).
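The labelling rule and the stratified partitioning can be sketched as follows. This is a minimal illustration under stated assumptions: the sulphate values are randomly generated placeholders, a value of exactly 200 mg/l is treated as non-polluted, and the helper names `label_borehole` and `stratified_folds` are hypothetical.

```python
import random

THRESHOLD_MG_L = 200.0  # sulphate threshold quoted in the text

def label_borehole(sulphate_mg_l):
    """1 = polluted (> 200 mg/l), 0 = non-polluted."""
    return 1 if sulphate_mg_l > THRESHOLD_MG_L else 0

def stratified_folds(labels, k=10, seed=0):
    """Assign sample indices to k folds, keeping class proportions balanced."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)  # deal indices round-robin within a class
    return folds

# Placeholder sulphate concentrations (mg/l) for 350 boreholes
rng = random.Random(42)
sulphate = [rng.uniform(5.0, 600.0) for _ in range(350)]
labels = [label_borehole(s) for s in sulphate]
folds = stratified_folds(labels, k=10)
```

Each fold can then be held out in turn while the remaining folds are used for fitting, so every borehole is used for evaluation exactly once.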
In this paper, the widely used Rectified Linear Unit (ReLU) activation function (Epelbaum, 2017) was adopted because it does not activate all the neurons at the same time, which makes the network sparse and in turn improves efficiency compared with other functions such as the sigmoid and tanh (Gupta, 2017). Using ReLU as the activation function with a gradient descent optimiser (Lv et al., 2017), several experiments with various parameter settings (learning rate, number of training iterations, optimisation function and architecture, i.e. the number of neurons and hidden layers) were run to choose the optimum parameters with the highest training and validation accuracy. Overfitting is a problem in building an ANN model; in this paper, early stopping and regularisers were used to minimise it. Early stopping involves stopping the training process once performance on the validation dataset stops improving (i.e. the validation cost begins to increase steadily instead of decreasing). In this study, the training process was stopped early at 430 epochs (Figure 5a).
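The early stopping rule can be illustrated with a simple patience counter over a sequence of validation losses. The loss values, the `patience` setting and the function name below are hypothetical; the sketch only shows the stopping logic, not how the 430-epoch figure was obtained.

```python
def early_stopping_epoch(val_losses, patience=3):
    """Return the 1-based epoch with the best validation loss seen before
    training stops, i.e. before `patience` consecutive epochs pass without
    improvement (or before the losses run out)."""
    best_loss = float("inf")
    best_epoch = 0
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best_loss:
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_epoch

# Hypothetical validation losses: improving, then rising steadily
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.56, 0.61, 0.67]
stop_at = early_stopping_epoch(val_losses, patience=3)  # best epoch is 4
```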
Figure 5b presents the number of hidden layers with the performance measured using mean-squared error (MSE) at 430 iterations. The results of this evaluation show that the best number of hidden layers, which corresponds to the smallest MSE and good ANN outputs, is three. Figure 5c presents the corresponding evaluation for the number of neurons in each hidden layer. The ANN trained without regularisation at a constant learning rate of 0.5 reached 83% accuracy in training and 81% in validation, but the training results show erratic changes in accuracy (Figure 6a). When the dropout technique (Gupta, 2017), which involves stochastically dropping some of the hidden neurons, was used together with an exponentially decaying learning rate (Brownlee, 2016), the accuracy improved drastically to 92.3% for training and 95.7% for validation and the erratic variations disappeared (Figure 6b). Thus, adding regularisers greatly improved the training process: the network generalises better and the effects of overfitting are considerably reduced. Compared with the fuzzy model (Sakala et al., 2018), the correlation results of the ANN model are higher. Thus, the deep ANN, purposefully built from the complex relationships the network learnt from the input-output pairs, improves the correlation.
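The two regularisers mentioned here can be sketched in a few lines. The example assumes the common exponential decay form, in which the rate is multiplied by a fixed factor every `decay_steps` updates, and the common "inverted" variant of dropout. The starting rate of 0.5 is borrowed from the unregularised run quoted in the text. The decay rate, decay steps and drop probability are hypothetical.

```python
import random

def exponential_decay(initial_lr, decay_rate, decay_steps, step):
    """Learning rate after `step` updates:
    initial_lr * decay_rate ** (step / decay_steps)."""
    return initial_lr * decay_rate ** (step / decay_steps)

def dropout(activations, drop_prob, rng):
    """Inverted dropout: zero each activation with probability drop_prob and
    rescale the survivors so the expected total activation is unchanged."""
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]

# Hypothetical schedule: halve the rate every 100 steps, starting from 0.5
lr_start = exponential_decay(0.5, decay_rate=0.5, decay_steps=100, step=0)
lr_later = exponential_decay(0.5, decay_rate=0.5, decay_steps=100, step=200)

# Hypothetical drop probability applied to one layer of 35 unit activations
rng = random.Random(0)
dropped = dropout([1.0] * 35, drop_prob=0.2, rng=rng)
```

A shrinking step size damps the oscillations that a large constant rate can cause, while dropout discourages the network from relying on any single hidden neuron.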

Conclusions
The purposefully built deep ANN, using the ReLU activation function and a gradient descent optimiser with dropout and an exponentially decaying learning rate, was able to generate a groundwater vulnerability model that correlates very well with physically measured field data. The results significantly improve the quality of the output model compared with the index and overlay DRIST method and the knowledge-driven model from the literature. The results can be used to help policy and decision makers to make scientifically informed decisions on land use planning. Based on the findings of this study, the following recommendations can be made for the management and protection of groundwater resources:
• Land use activities that generate sulphates, such as coal mining, should be avoided in highly vulnerable zones or carried out in a strict manner that minimises the entry of pollutants into the subsurface.
• Abandoned coal mining areas that are still generating AMD within or near the highly vulnerable zones should be rehabilitated.
The datasets used in this study are readily available from various governmental agencies, making the approach cost-effective for evaluating larger areas. The AMD groundwater vulnerability approach developed in this research needs to be applied to other pollutants and to areas with similar or different hydrogeological settings in order to determine the robustness of the methodology.

Figure 2. DRIST inputs (a) Depth to water level (b) Recharge (c) Impact of the vadose zone (d) Soils and (e) Topographic slope

A number of algorithms are available for training a neural network, of which back propagation with gradient descent is the most popular (Manoj and Nagarajan, 2003); it was used in the present study. The five DRIST parameters were used as training datasets. ANN training, validation and classification were done using the TensorFlow® library in the Python® programming language.

Figure 3. DRIST model showing the four groundwater vulnerability classes within Witbank coalfield

In Figure 5c, the MSE shows a minimum value when the number of hidden neurons is 35, indicating that 35 is the optimum number of neurons per hidden layer; increasing the number beyond this has no effect on the training process. Therefore, an ANN with architecture 5-35-35-35-1 (five inputs, 35 hidden neurons in each of the three hidden layers and one output) was used.
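To give a sense of the size of the 5-35-35-35-1 architecture, the short sketch below counts its trainable parameters. It assumes fully connected layers with one bias per neuron, which the text does not state explicitly.

```python
def count_parameters(layer_sizes):
    """Number of weights and biases in a fully connected network."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))

architecture = [5, 35, 35, 35, 1]  # 5-35-35-35-1 as quoted in the text
n_params = count_parameters(architecture)
print(n_params)  # (5*35+35) + 2*(35*35+35) + (35*1+1) = 2766
```

Under these assumptions the network has several times more parameters than the 350 labelled boreholes, which is one reason regularisation matters for a model of this size.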

Figure 5. Parameters setting (a) number of epochs (b) number of hidden layers (c) number of neurons

Figure 6. Training process (a) without regularisers (b) with dropout and learning rate

Figure 8. Correlation scatter plot for testing dataset for (a) DRIST (b) ANN models