Class frequency distribution for a surface raw water quality index in the Vaal Basin

A harmonised in-stream water quality guideline was constructed to develop a water quality index for the Upper and Middle Vaal Water Management Areas, in the Vaal basin of South Africa. The study area consisted of 12 water quality monitoring points: V1, S1, B1, S4, K9, T1, R2, L1, V7, V9, V12, and V17. These points are part of a Water Board’s extensive catchment monitoring network but were re-labelled for this paper. The harmonised guideline was made up of 5 classes for NH₄⁺, Cl⁻, EC, DO, pH, F⁻, NO₃⁻, PO₄³⁻ and SO₄²⁻ against in-stream water quality objectives for ideal catchment background limits. Ideal catchment background values for Vaal Dam sub-catchment represented Class 1 (best quality water), while those for Vaal Barrage, Blesbok/Suikerbosrand Rivers and Klip River represented Classes 2, 3 and 4, respectively. Values above those of Klip River ideal catchment background represented Class 5. For each monitoring point, secondary raw data for the 9 parameters were cubic-interpolated to 2 526 days from 1 January 2003 to 30 November 2009 (7 years). The IF-THEN-ELSE function then sub-classified the data from 1 to 5 while the daily index was calculated as a median of that day’s sub-classes. Histograms were constructed in order to distribute the indices among the 5 classes of the harmonised guideline. Points V1 and S1 were ranked as best quality water (Class 1), with percentage class frequencies of 91% and 60%, respectively. L1 ranked Class 3 (34%) while V7 (54%), V9 (53%), V12 (66%) and V17 (46%) ranked poorly as Class 4. B1 (76%), S4 (53%), K9 (41%), T1 (53%) and R2 (61%) ranked as worst quality (Class 5). The harmonised in-stream water quality guideline resulted in class frequency distributions. The surface raw water quality index system managed to compare quality variation among the 12 points which were located in different sub-catchments of the study area.
These results provided a basis to trade pollution among upstream-downstream users, over a timeframe of 7 years. Models could consequently be developed to reflect, for example, quality-sensitive differential tariffs, among other index uses. The indices could also be incorporated into potable water treatment cost models in order for the costs to reflect raw water quality variability.


INTRODUCTION
Understanding complex systems involves constructing models, comparing their predictions with observations and improving them by using feedback mechanisms from continuous assessments (Even et al., 2007). For water quality management purposes, assessments are done based on the prevailing guidelines. This approach assumes that proper identification of contamination sources for individual parameters that are assessed can be done to provide a basis for environmental and legal compliance. However, the approach does not readily offer a holistic view of the spatial and temporal trends in water quality expressed in a single value, especially for catchments that are perturbed by various pollutant types. More importantly, options for restoring heavily degraded catchments are limited, hence assessment tools that are supported by robust water quality data should be continuously developed (Bohensky, 2008).
Limitations, though, exist where compliance with water quality objectives proves to be prohibitively expensive or technically impossible (Mey and van Niekerk, 2009). Further, even in catchments where data are aligned with specific sampling objectives, data for a required parameter, for example, might be unavailable, rendering the dataset inadequate for use with a specific water quality index (WQI). Yet indices are expected to provide simplified interpretation of results since in their various forms they summarise, in one value or concept, a series of parameters (Abrahão et al., 2007; Couillard and Lefebvre, 1985; DWAF, 1996). This is desirable, especially in cases where decisions require interpretation of the severity or extent of pollution impacts.
An index can be limited if it requires data of longer duration than is available (most models are built with retrospective data). In addition, a model can predefine its input parameters or even the number of input parameters, both of which might not be available. Further, if a model requires a subjective constant that relates to a particular water body at a specific time of sampling, but which was not captured then, it renders the historical dataset inadequate for use with that model (Abrahão et al., 2007; Pesce and Wunderlin, 2000). Thus many indices have been developed since the first index was suggested by Horton (1965), to try to satisfy various conditions within ecological system boundaries, which constantly shift in time and space. By 1985 more than 100 scientists had already developed indices for specific water-related settings (Couillard and Lefebvre, 1985). Some of the indices were based on statistical or planning approaches while others represented trophic states of
specific ecosystems. Some researchers have gone a step further and adapted indices that were developed for specific environments, for example the salinity index which is mainly used in agriculture and soil applications (Katerji et al., 2000; Slavich et al., 1999), for water quality trending, in order to satisfy specific objectives (Bohensky, 2008; DWAF, 2007; Mey and van Niekerk, 2009).
Apart from WQI limitations, another challenge is choosing the parameters which are most significant for describing spatial and temporal quality variations. According to Abrahão et al. (2007), these parameters should indicate how quality has evolved over time, in addition to allowing for comparison between different watercourses or different locations along the same watercourse. Some indices use fixed numbers and specific input variables because they were objectively designed for comparison using some specific expert opinion. Examples are the indices by Horton (1965), which uses 10 parameters, Dunnette (1979), which uses 6, and Brown et al. (1970), which uses 9, among many others. In the end, an index should still provide a simple way of representing information by using a single numerical quality value (Couillard and Lefebvre, 1985). Where costs and other challenges may limit water quality evaluation, selection of a streamlined list of the most appropriate quality parameters is, however, fundamental (Abrahão et al., 2007).
A WQI that is based on parameters which represent the broader pollution sources is a vital tool for assessing a water body's spatial and temporal quality trends within a boundary system which spans different sub-catchments. In South Africa, Wepener et al. (2006) documented an extensive literature review of water quality indices. The research highlighted 2 pre-requisites for useful water quality indices: that they should be readily derived from available monitoring data and that they should impart an understanding of the significance of the data represented. The data should preferably be of long duration to minimise short-term ecosystem noise and should produce new knowledge from old data (Hawkins et al., 2013). It is therefore expected that results from this paper will, for example, if applied in raw water pricing structures, provide equity on tariffs among surface raw water users, in addition to incorporating a water quality variability factor when modelling potable water treatment costs. The actual application of the models is beyond the scope of this paper.
The focus of this study was to model and compare pollution trends in the Upper and Middle Vaal Water Management Areas (WMAs) of South Africa (Fig. 1).
The upper parts of both WMAs are covered by the hydrological C2 secondary catchment (Fig. 2), the significance of which is explained later.
Historical (retrospective) data from 1 January 2003 to 30 November 2009 and for parameters NH₄⁺, Cl⁻, EC, DO, pH, F⁻, NO₃⁻, PO₄³⁻ and SO₄²⁻ were used. The parameters were selected after a series of data reduction and manipulation operations to reflect, among other attributes, the major sources of pollution in the study area (Dzwairo, 2011), in addition to the availability of a consistent dataset.
The study area consisted of 12 surface raw water quality monitoring points with the aim of fulfilling 3 objectives. The first objective was to construct a harmonised in-stream water quality guideline (HIWQG) by combining guideline values for the 2 water management areas (Upper and Middle Vaal) to create 5 classes. At the time of writing this paper, in-stream water quality objectives (IWQOs), which various stakeholders use as guidelines for pollution trending, had different values for specific sub-catchments located within the two WMAs (Rand Water, 2012). This scenario made it impossible to objectively compare water quality variability from the same baseline. The second objective was to model sub-classes for the 9 parameters. The sub-class values served as inputs for the WQI, based on the constructed HIWQG. Maximum sub-class contribution towards each corresponding index value was also assessed for all monitoring points. The third objective was to determine class frequency distribution based on daily indices for each of the monitoring points.

STUDY AREA
Study site monitoring points were chosen to represent upstream-downstream relationships on the Vaal River (which flows from east to west) as well as possible pollution entry points into the Vaal River. The points and spatial relationships are shown in Fig. 3.
The Vaal basin, which is made up of Upper, Middle and Lower basins, is the economic hub of South Africa because it contributes about 60% of the country's economic activity and also supports approximately 12 million people (Dzwairo, 2011). The Upper Vaal alone accounts for about 20% of the country's gross domestic product (GDP). This WMA is where 11 of the 12 monitoring points are located. A zoomed-in map in Fig. 4 shows the river network and monitoring points within the C2 secondary catchment as well as V1 on the Vaal Dam.
The land use map in Fig. 5 indicates that waste effluent from mining activities in the north-western part of the Upper Vaal, and around V17 in the Middle Vaal, drains into the Vaal River via a network of tributaries.
This sinking of pollution exerts tremendous pressure on the water resource and its various treatment processes aiming to meet receiving water quality objectives. The majority of the pollution sources are mines, mine dumps and sewage treatment plants (DWAF, 2004b; Dzwairo et al., 2010; Gouws and Coetzee, 1997; Jack et al., 2006; Steÿn et al., 1976), where specific sections are highly impacted and surrounding land is degraded (DWAF, 2004a; DWAF, 2004b; DWAF, 2004c; Van Steenderen et al., 1987). These sources have been creating pollution impacts for hundreds of years, downstream of the Vaal Dam and into the Middle Vaal. The mining impacts are shown to occur downstream of Vaal Dam.
Three decades ago, scientists already warned that securing sufficient supplies of good quality water in the Vaal basin would become increasingly difficult, mainly due to pollution of its upper reaches (Grobler et al., 1983). Yet, even with successful development of models such as WQ2000 for that basin (Herold et al., 2006), mitigation measures are yet to influence positive change. Pollution trends continue as shown by a more recent and comprehensive study by DWAF (2007), which used salinity values mapped against parameter acceptable management target values for receiving water quality objectives (Fig. 6).
The process of deriving quality objectives and guidelines for surface raw water has a long history in South Africa, and one of the main challenges continues to be the non-suitability of guidelines for specific use. Several different approaches have been proposed and evaluated for specific quality values (City of Tshwane 2055, 2012; DWAF, 1996; Roux et al., 1996; Slaughter, 2005). However, researchers still experience limitations (data, technological, etc.) and thus keep developing new approaches to suit particular requirements. This paper provides additional simplified assessments of the basin's pollution trends and possible sources, based on past trends. The tools can model future scenarios and parameter-targeted mitigation measures. Class frequency distribution and a basin-specific WQI are two such tools, and both are determined in this paper.

Constructing the harmonised in-stream water quality guideline
Parameter selection, data reduction and pre-processing were performed on datasets that ranged from 1 January 2003 to 30 November 2009. The procedures were adapted from those used in earlier studies by the same researchers (Dzwairo, 2011; Dzwairo et al., 2011a). It is the norm to sample at various [...] Since the index was to be developed for daily intervals, the input data had to be value- and date-filled to 2 526 days. Therefore raw data representing the 12 points and 9 parameters were cubic-interpolated in MATLAB R2012b in order to create the missing dates and corresponding data for all dates between 1 January 2003 and 30 November 2009. Although there are several interpolation techniques, cubic interpolation was chosen for the time-series dataset because the method is shape-preserving. The final dataset comprised 9 parameters per day for 2 526 days, for each of the 12 monitoring points.
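The date-filling step can be illustrated with a minimal Python sketch. It assumes that the shape-preserving cubic method referred to above is a piecewise cubic Hermite interpolant of the Fritsch-Carlson type; the sample dates and values, the function names and the simplified end-slope rule are all hypothetical and are not taken from the study's MATLAB workflow.

```python
from bisect import bisect_right
from datetime import date, timedelta

def pchip_slopes(x, y):
    """Node slopes for a shape-preserving piecewise cubic Hermite interpolant."""
    h = [x[i + 1] - x[i] for i in range(len(x) - 1)]
    delta = [(y[i + 1] - y[i]) / h[i] for i in range(len(h))]
    d = [0.0] * len(x)
    # Assumption of this sketch: end slopes are simply the end secant slopes.
    d[0], d[-1] = delta[0], delta[-1]
    for k in range(1, len(x) - 1):
        if delta[k - 1] * delta[k] > 0:  # same sign: weighted harmonic mean
            w1 = 2 * h[k] + h[k - 1]
            w2 = h[k] + 2 * h[k - 1]
            d[k] = (w1 + w2) / (w1 / delta[k - 1] + w2 / delta[k])
        # otherwise the slope stays 0, so a local extremum in the data is kept
    return d

def pchip_eval(x, y, d, xq):
    """Evaluate the cubic Hermite polynomial on the interval containing xq."""
    k = min(max(bisect_right(x, xq) - 1, 0), len(x) - 2)
    h = x[k + 1] - x[k]
    t = (xq - x[k]) / h
    h00 = 2 * t**3 - 3 * t**2 + 1
    h10 = t**3 - 2 * t**2 + t
    h01 = -2 * t**3 + 3 * t**2
    h11 = t**3 - t**2
    return h00 * y[k] + h10 * h * d[k] + h01 * y[k + 1] + h11 * h * d[k + 1]

START, END = date(2003, 1, 1), date(2009, 11, 30)
n_days = (END - START).days + 1  # both end dates are included

# Hypothetical, irregularly spaced samples for one parameter at one point.
samples = [
    (date(2003, 1, 1), 20.0),
    (date(2003, 2, 1), 35.0),
    (date(2006, 6, 15), 30.0),
    (date(2009, 11, 30), 50.0),
]
x = [(day - START).days for day, _ in samples]
y = [value for _, value in samples]
slopes = pchip_slopes(x, y)

# One interpolated value for every calendar day in the study period.
daily = [(START + timedelta(days=i), pchip_eval(x, y, slopes, i))
         for i in range(n_days)]
print(len(daily))  # 2526
```

Because the interpolant does not overshoot between neighbouring samples, every filled value stays within the range of the observed data, which is the property that motivates a shape-preserving method for water quality time series.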
The HIWQG was constructed for NH₄⁺, Cl⁻, EC, DO, pH, F⁻, NO₃⁻, PO₄³⁻ and SO₄²⁻ against ideal catchment background limit values for Vaal Dam (Class 1), Vaal Barrage (Class 2), Blesbokspruit/Suikerbosrand river system (Class 3) and Klip River (Class 4) (see Table 1). Class 5 represented values which ranged above those of Klip River limit values (see Table 1). The rationale for using Vaal Dam limit values is that this sub-catchment is considered to have good quality water, especially for treating to potable standard. Thus mitigation or remediation measures for the basin as a whole could seek to return impacted environments to Vaal Dam quality equivalent, where applicable and possible.

Models for parameter-specific sub-classification of daily water quality
IF-THEN-ELSE statements with tree-depth of up to Level 4 were constructed in Microsoft Excel in order to sub-classify each of the 9 parameters using the HIWQG class limits provided in Table 1. The rule-sets for EC sub-classification are given in Eqs (1) to (4). Similar rule-sets were constructed for NH₄⁺, Cl⁻, DO, pH, F⁻, NO₃⁻, PO₄³⁻ and SO₄²⁻. The indices for each of the 2 526 days were calculated as medians of the 9 parameter sub-classes for each monitoring point.
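A nested rule-set of this kind, followed by the median step, might be sketched in Python as shown here. The class limits are placeholder numbers and not the Table 1 values, the function names are hypothetical, and the sketch assumes a parameter such as EC for which higher values mean poorer quality, with each upper limit treated as inclusive. Parameters such as DO or pH would need differently oriented rules.

```python
from statistics import median

# Placeholder upper limits for Classes 1 to 4 (NOT the Table 1 values).
EC_UPPER_LIMITS = (10.0, 20.0, 30.0, 40.0)

def sub_class(value, upper_limits):
    """Four nested IF-THEN-ELSE tests returning a sub-class from 1 to 5."""
    if value <= upper_limits[0]:
        return 1
    elif value <= upper_limits[1]:
        return 2
    elif value <= upper_limits[2]:
        return 3
    elif value <= upper_limits[3]:
        return 4
    else:
        return 5  # above the Class 4 limit

def daily_index(sub_classes):
    """Daily index as the median of that day's parameter sub-classes."""
    return median(sub_classes)

# Hypothetical sub-classes for the 9 parameters on one day.
one_day = [1, 1, 2, 1, 1, 5, 1, 2, 3]
print(sub_class(25.0, EC_UPPER_LIMITS))  # 3
print(daily_index(one_day))              # 1
```

With 9 sub-classes the median is always one of the observed integer classes, so a single extreme parameter (the 5 in this example) does not by itself move the daily index.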

RESULTS AND DISCUSSION
Using the harmonised in-stream water quality guideline, Microsoft Excel frequency distribution curves were drawn, and temporal and spatial patterns of water quality were calculated from ranges, means, medians, and 1st and 3rd quartiles (5-number summary) of the indices. Table 2 gives the sub-classes for Monitoring Point V1, from 1 January 2003 to 7 January 2003. The same treatment was applied to datasets for the other 11 monitoring points and for data up to 30 November 2009. The median of the 9 sub-classes represented the WQI for that day for each monitoring point (refer to Table 2). The daily water quality indices (medians) were incorporated into Table 3 for all 2 526 days, to calculate the median which represented the 7-year water quality indices. Medians were also calculated for each sub-class for the 2 526 days at each monitoring point (see Table 4).
The 7-year indices from Table 3 were compared with the 7-year median sub-classes in Table 4 in order to determine the parameter (using its sub-class), if any, which contributed maximally towards the 7-year index for each point. The results are given in Table 5.
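The class frequency distribution and the long-term index for one monitoring point could be computed along the following lines. This is a sketch only: the short series of daily indices is invented for illustration, the function name is hypothetical, and ranking a point by its most frequent class is an assumption about how the percentage class frequencies were used.

```python
from collections import Counter
from statistics import median

def class_frequencies(daily_indices):
    """Percentage of days falling in each of the 5 classes."""
    counts = Counter(daily_indices)
    n = len(daily_indices)
    return {c: 100.0 * counts.get(c, 0) / n for c in range(1, 6)}

# Invented daily indices for one point (the study used 2 526 days per point).
indices = [4] * 5 + [5] * 3 + [3] * 2

freq = class_frequencies(indices)
modal_class = max(freq, key=freq.get)  # class with the highest frequency
long_term_index = median(indices)      # analogue of the 7-year index

print(freq[modal_class], modal_class)  # 50.0 4
```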
Values are omitted where there was no maximum contribution. C_SO₄²⁻ (the SO₄²⁻ sub-class) contributed maximally to the WQI at all monitoring points except S1, V1, T1 and R2. This theoretically means that if all other factors were held constant and non-interactive, mitigation and rehabilitation measures would target the upstream activities from which the SO₄²⁻ at a monitoring point was emanating. Where sub-classes equalled or were higher than the WQI, this indicated a positive influence on the WQI. The corresponding parameter could be viewed as a target for future impact mitigation measures. C_DO contributed maximally at S1, K9 and L1; C_Cl⁻ at R2; B1, S4 and T1; C_PO₄³⁻ at all points except V1, S1 and S4; and C_NH₄⁺ at R2. NO₃⁻ pollution attenuation measures could target V1, S1, K9, V7, V9 and R2. EC did not have maximal influence on indices of V1, S1, L1 and R2 while the rest of the points indicated a significant influence. C_F⁻ was Class 5 for all points but provided the maximum contribution to water quality indices for only B1, S4, T1 and R2.
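The comparison behind this kind of result can be expressed compactly. The sketch below assumes one reading of the text, namely that a parameter is flagged when its 7-year median sub-class equals or exceeds the 7-year index of the point. The function name and all values are hypothetical and do not reproduce Tables 3 to 5.

```python
def flagged_parameters(index_7yr, median_sub_classes):
    """Parameters whose 7-year median sub-class is >= the 7-year index."""
    return [name for name, sub in median_sub_classes.items()
            if sub >= index_7yr]

# Hypothetical 7-year median sub-classes for one monitoring point.
median_sub_classes = {"EC": 4, "SO4": 5, "DO": 2, "F": 5, "pH": 1}
index_7yr = 4  # hypothetical 7-year index for the same point

targets = flagged_parameters(index_7yr, median_sub_classes)
print(targets)  # ['EC', 'SO4', 'F']
```

Under this reading, the flagged parameters are the candidates for targeted mitigation upstream of the point.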
For Vaal Dam at V1, the concern would be nitrate, which, if not strictly regulated, could contribute to nutrient loading in the dam. Vaal Dam is a strategic water resource in Gauteng as it supports wide sectoral needs, both domestic and industrial. Dam water also assists with flushing pollution which drains into the river via tributaries at confluences located downstream of the dam wall. B1 effluent is characteristic of acid mine drainage (high sulphate and electrical conductivity), thus mitigation measures could target such pollution sources. R2 exhibits characteristics typical of pollution emanating from sewage (phosphate, nitrate and ammonium contribution). Sulphate pollution patterns exhibited in Table 5 point to the conservative nature of that parameter across the two sub-catchments. The pattern of indices and direction of flow suggest that sulphate enters the Vaal system via tributaries at B1 and K9 and does not attenuate significantly downstream to V17 in Middle Vaal.
While salinity (DWAF, 1996) and conductivity (Dzwairo et al., 2011b) provide an overall indication of the pollution status of sites along the Vaal River and its tributaries, the water quality indicators and rule-sets described in this paper offer more detail and allow for the identification of problem parameters.
Using data manipulations like those in Fig. 7, further analysis could assist with establishing possible future states of an impacted ecosystem. Among the monitoring points, V1 and S1 were the least impacted. However, individual sub-classification indicated that some of the parameters contributed 1% towards Class 5 frequencies at S1. The water passing through points B1, S4, T1 and R2 was highly impacted although overall indices were influenced by different combinations of variables (refer to median sub-classes in Table 4). Pollution through R2 (it is a Class 5 tributary of the Vaal River which flows into the Vaal Barrage) highly impacted the Vaal River at V12, which is at the barrage wall. Water flowing through S4 (Class 5) and K9 (Class 4-5) impacted the Vaal River at V7 (Class 4). Both points are located on Vaal River tributaries. This shows that the Vaal River water was being polluted by rivers that were being monitored at S4 and at K9. The pollution effects changed the Vaal River's index from Class 1 at V1 to Class 4 at V7. The initial pollution shock at V7 was felt all the way downstream at V9 (Class 4), V12 (Class 4) and V17 (Class 4). Polluted water passing through T1, L1 and R2 (tributaries of Vaal River) maintained the Class 4 perturbation in the main Vaal River channel.
The superimposed graph in Fig. 7 (note that Class 4 and Class 5 for K9 were both included) serves to provide a platform for optimising rehabilitation efforts in the Upper Vaal sub-catchment to the section just before the barrage wall.
Restoring the Upper Vaal sub-catchment to a pristine state means matching Vaal Dam conditions. These would entail raising the line graph to match 91% and shortening the bars to Class 1 across all Upper Vaal monitoring points given in Fig. 7. This would be the desired future state of the Upper Vaal sub-catchment. See Fig. 4 for geo-referenced positions of monitoring points.
The tools developed here, together with the results, were assessed for validity against results that were recorded by DWAF (2007), using the Salinity index (see Fig. 6). The general pollution trend evident in DWAF (2007) indicates good quality water for Vaal Dam, which is in agreement with the results reported in this paper. Figure 6 also indicates that barrage pollution values were higher than those for Vaal Dam, which is also in agreement with the results of the current study. Data available from Rand Water (2012; identifiers in brackets are those used for these data) show that, for example, water passing through points K9 (K19), B1 (B10) and S4 (S2) is highly impacted while that passing through S1 (S1) and V1 (VD1i) is good quality water. This comparison indicates that the index system developed in this paper is valid for the conditions prevailing in the study area.

CONCLUSIONS
Four tools were used in this paper to consolidate pollution trends and to make a case for strengthening practices that target heavily impacting sources of pollution in the basin, most of which lie in the East Rand. These are: (i) the 9-parameter harmonised in-stream water quality guideline which provided a baseline for comparing pollution loads across boundaries, (ii) parameter sub-classification which characterised pollution types at specific points in order to tailor mitigation strategies, (iii) the water quality index which facilitated pollution assessment at any time-step using the median value of the chosen time-step, and (iv) class frequency distribution which assessed parameter contribution towards the overall water quality index.
Rehabilitation of upstream impacted environments would theoretically lower indices downstream, as indicated by the desired future state model (Fig. 7). This approach does not ignore practical pollutant interactions within an aqueous/sediment medium, which could result in combined phenotypic behaviour. Combined interactions are, however, beyond the scope of this paper. The methodology offers a simple approach to assess severity of pollution from sub-classes and the overall indices, across sub-catchments, without necessarily using expensive and sometimes complicated commercial models. Sub-classification and class frequency distribution indicate possible sources of pollution, from among those existent in the impacted spatial boundary. The classification system, however, is a preliminary first step; a catchment manager, for example, would need to examine the raw data before making further decisions.
The developed classification system could be used to predict future median classes, based on the specific parameters used. Given the classification frequencies of the monitoring points, a water regulator could charge more to a user abstracting Class 1 water, which represents best quality, while a user abstracting Class 5 water could be charged much less. In addition, a tariff system based on these indices would equitably cross-subsidise upstream-downstream raw water use.

ACKNOWLEDGEMENT
The financial assistance of the Department of Science and Technology (DST) is hereby acknowledged. Opinions expressed and conclusions presented are those of the authors and are not necessarily to be attributed to the DST. The authors would also like to sincerely thank Durban University of Technology for hosting and co-funding the Post-Doctoral Fellowship. The Department of Water Affairs and the Water Boards, Rand Water (co-funding), Midvaal Water and Sedibeng Water, are acknowledged for providing datasets and other relevant documentation, without which this research could not have been conducted.

Figure 1 Location of Upper Vaal WMA and Middle Vaal WMA in South Africa
According to Table 1, the lower and upper limits of the proposed water quality classes set the ranges of acceptable values in each class. The values were extracted from Rand Water (2012) and adapted for this paper. The guideline represents a uniform baseline against which to compare the indices for monitoring points located in different sub-catchments.