Performance of regional flood frequency analysis methods in KwaZulu-Natal, South Africa

Estimates of design floods are required for the design of hydraulic structures and to quantify the risk of failure of the structures. Many international studies have shown that design floods estimated using a regionalised method result in more reliable estimates of design floods than values computed from a single site or from other methods. A number of regional flood frequency analysis (RFFA) methods have been developed, which cover all or parts of South Africa. These include methods developed by Van Bladeren (1993), Mkhandi et al. (2000), Görgens (2007) and Haile (2011). The performance of these methods has been assessed at selected flow-gauging sites in the province of KwaZulu-Natal (KZN), South Africa. It is recommended that the limitations of available flow records to estimate extreme flow events need to be urgently addressed. From the results for KZN the JPV method, with a regionalised GEV distribution with the veld zone regionalisation, generally gave the best performance when compared to design floods estimated from the annual maximum series extracted from the observed data. It is recommended that the performance of the various RFFA methods needs to be assessed at a national scale and that a more detailed regionalisation be used in the development of an updated RFFA method for South Africa.


INTRODUCTION
The design of hydraulic structures (e.g. dams, flood attenuation structures, culverts) requires the estimation of a design flood, which is the magnitude of the flood associated with a given probability of exceedance or return period in years. Practitioners in South Africa generally estimate design floods by performing a frequency analysis of gauged flow data at a given location, if flow data are available at the site of interest, or by using an event-based rainfall-runoff model, for example, using the rational method, unit hydrograph method, or Soil Conservation Services method adapted for conditions in South Africa (SCS-SA). However, limitations of event-based methods include the assumption that the exceedance probability of the flood event is the same as the exceedance probability of the rainfall event, i.e., the 100-year return period flood event is assumed to result from a 100-year return period rainfall event, and the antecedent soil moisture condition in the catchment prior to extreme rainfall events is not taken into account.
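The relationship between return period, annual exceedance probability and the risk of a design flood being exceeded during the life of a structure can be sketched as follows. This is a minimal illustration that assumes annual maxima are independent from year to year; the function names and the design life used in the example are hypothetical and do not come from the methods assessed in this paper.

```python
def exceedance_probability(return_period_years):
    """Annual exceedance probability for a given return period (p = 1/T)."""
    if return_period_years < 1:
        raise ValueError("return period must be at least 1 year")
    return 1.0 / return_period_years


def risk_of_exceedance(return_period_years, design_life_years):
    """Probability of at least one exceedance during the design life.

    Assumes independent years: risk = 1 - (1 - 1/T)^n.
    """
    p = exceedance_probability(return_period_years)
    return 1.0 - (1.0 - p) ** design_life_years


# Hypothetical example: a 100-year design flood over a 100-year design life.
risk_100 = risk_of_exceedance(100, 100)
```

Under these assumptions the 100-year flood has roughly a 63% chance of being exceeded at least once in 100 years, which is why the choice of return period is tied to the acceptable risk of failure.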
When observed flow data are available, design floods can be estimated by performing a frequency analysis of the data, which generally involves fitting probability distributions to the annual maximum series (AMS) extracted from the data. The selected probability distribution is assumed to represent the population of all extreme events from the site. Hence, the longer the period of record, the better the assumption that the selected distribution represents the distribution of the population of all extreme events at the site.
A limitation of using a single-site approach to flood frequency analysis is that relatively few gauging stations in South Africa have long record lengths (e.g. > 50 years) and this limits the confidence in design floods estimated using data from a single site, particularly when using shorter record lengths and when estimating design values for longer return periods (e.g. 100 years). In addition, design floods generally need to be estimated at sites where observed flood data are not available and thus rainfall-based methods or regionalised methods need to be used to estimate design floods at ungauged sites.
Given the relatively short flow-record lengths generally available it is necessary to use data from similar and nearby locations to improve the reliability of design flood estimates (Stedinger et al. 1993). This approach is known as regional flood frequency analysis (RFFA) and utilises data from several sites to estimate the frequency distribution of floods at each site. As summarised by Smithers (2012), many studies have shown that RFFA will result in more accurate and consistent estimates than at-site analyses (e.g. Cordery and Pilgrim, 2000; Hosking and Wallis, 1997; Smithers and Schulze, 2000a; Smithers and Schulze, 2000b).
RFFA usually assumes that relatively homogenous flood regions can be identified where the frequency distributions of floods at different sites are similar after site-specific scaling. Generally, growth curves (ratio of design flood/index flood vs. return period), or regionalised scaled distribution parameters, are developed for each region. Regionalised relationships are then developed to estimate the index flood/scaling value (e.g. mean annual flood) at ungauged sites in a region. A critical aspect of RFFA is the identification of relatively homogenous flood response regions.
A number of RFFA studies have included parts of or the entire area of South Africa in their analyses. These include Van Bladeren (1993), Mkhandi et al. (2000), Görgens (2007) and Haile (2011). None of these RFFA methods are currently widely used in practice to estimate design floods. The objective of this paper is to give a brief background to the methods and to compare the performance of these RFFA methods in the province of KwaZulu-Natal (KZN) in South Africa.

REGIONAL FLOOD FREQUENCY ANALYSIS METHODS FOR KWAZULU-NATAL
The following sections provide a brief background to the RFFA methods used in this study.

Mkhandi Method
Mkhandi et al. (2000) performed a regional frequency analysis of annual maximum flood data using data from 407 screened stations in southern Africa. As shown in Fig. 1, 13 flood regions were identified in South Africa. The Pearson Type 3 (P3) distribution fitted by probability weighted moments (PWM) was found to be the best distribution in all regions in South Africa, with the exception of SAF13 where the Log-Pearson Type 3 (LP3) distribution fitted by the Method of Moments (MM) was used.
The Mean Annual Flood (MAF) was used as an index flood to scale the data. Equation 1 was used to estimate the MAF at ungauged sites (Mkhandi et al., 2000).
$$\mathrm{MAF} = \mathrm{CONSTANT} \times \mathrm{AREA}^{\mathrm{EXPONENT}} \tag{1}$$
where: MAF = mean annual flood (m³·s⁻¹), CONSTANT = regionalised parameter, AREA = catchment area (km²), and EXPONENT = regionalised parameter.
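The index-flood calculation can be illustrated with a short sketch, assuming Eq. 1 has the power-law form MAF = CONSTANT × AREA^EXPONENT suggested by its parameter definitions. The parameter values and the growth factor below are hypothetical placeholders chosen for easy arithmetic; they are not the regionalised values published by Mkhandi et al. (2000).

```python
def mean_annual_flood(area_km2, constant, exponent):
    """Eq. 1 (assumed power-law form): MAF in m^3/s from catchment area in km^2."""
    if area_km2 <= 0:
        raise ValueError("catchment area must be positive")
    return constant * area_km2 ** exponent


def design_flood(index_flood, growth_factor):
    """Index-flood scaling: design flood = regional growth factor x index flood."""
    return growth_factor * index_flood


# Hypothetical regional parameters and growth factor (illustration only).
maf = mean_annual_flood(area_km2=100.0, constant=2.0, exponent=0.5)
q_design = design_flood(maf, growth_factor=3.0)
```

With these placeholder values the MAF is 20 m³·s⁻¹ and the scaled design flood is 60 m³·s⁻¹. In practice the growth factor would be read from the regional growth curve for the return period of interest.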

Van Bladeren Method
Van Bladeren (1993) derived growth curves using both continuously recorded data and discharges derived from historical flood information from the Natal and Transkei regions. Design floods were estimated using the General Extreme Value (GEV) distribution fitted by PWM. The MAF was used as the index flood in the derivation of the growth curves and regionalised regressions were derived to estimate the MAF as a function of catchment area. Regionalisation was initially based on the Regional Maximum Flood (RMF) regions identified by Kovács (1988) and was further refined within the RMF regions based on the skewness of the data. The method developed by Van Bladeren (1993) is applicable for RMF Regions 5.0 to 5.6 in Natal and Transkei.

Haile Method
A RFFA in southern Africa was undertaken by Haile (2011) who used data from 459 gauging stations in 5 countries (Namibia, Malawi, Zambia, Zimbabwe and South Africa). After screening of the data, only 122 stations were included in further analyses (Fig. 2) with 92 stations from South Africa, of which 8 stations were used for independent testing of the method. Nine homogenous regions were identified, with 5 of these regions in South Africa, as shown in Fig. 3. The generalised Pareto (GPA), Pearson Type 3 (P3), three-parameter log-normal (LN3) and the GEV distributions were found to be suitable to model the distributions of the AMS of floods in southern African catchments.
The median of the AMS (MEF) was used as the index to scale the values. From independent assessment of design floods estimated in the 9 regions using the regionalised flood frequency relationships, it was concluded that the regional approach was satisfactory (Haile, 2011). Both linear and exponential relationships were developed to estimate the MEF as a function of catchment area. In some regions where the negative constant in the linear relationship resulted in a negative MEF, exponential relationships were developed in this study to estimate the MEF from catchment area using information from Haile (2011).

JPV Method
As part of the development of the Joint Peak-Volume (JPV) methodology, Görgens (2007) developed a regionalised index flood approach to design flood estimation for South Africa.

Figure 1
Flood regions and stations used in the analysis (Mkhandi, Kachroo and Gunasekara 2000)

Pooling of statistical parameters
Pooled values for the coefficient of skewness (g) and coefficient of variation (CV) were weighted according to the record length and the inverse of a similarity distance ($\mathrm{Dist}_{i,j}$), computed using Eq. 2 (Görgens, 2007).
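The pooling idea can be sketched as a weighted mean. The exact weighting equations of Görgens (2007) are not reproduced in this text, so the weight used below (record length divided by similarity distance) is an assumption made purely for illustration, as are the function name and the sample values.

```python
def pooled_statistic(values, record_lengths, distances):
    """Weighted mean of a site statistic (e.g. CV or skewness).

    Assumed weight for donor site j: record_length_j / distance_j, so that
    longer records and more similar sites contribute more. This form is an
    assumption, not the published JPV equation.
    """
    if not (len(values) == len(record_lengths) == len(distances)):
        raise ValueError("inputs must have the same length")
    if any(d <= 0 for d in distances):
        raise ValueError("similarity distances must be positive")
    weights = [n / d for n, d in zip(record_lengths, distances)]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


# Hypothetical CV values at two donor sites.
pooled_cv = pooled_statistic(
    values=[0.5, 1.0], record_lengths=[20, 40], distances=[1.0, 2.0]
)
```

In this example both sites receive the same weight (20/1 and 40/2), so the pooled CV is the simple average of 0.75.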

DATA USED IN THE STUDY
Stream flow-gauges located in KZN were selected for inclusion in the analysis based on the attributes of the gauges. The criteria used were length of record, with gauges included for record lengths > 20 years, start and end date of flow record, the percentage of values in the AMS where the recorded stage exceeded the limits of the rating curve for the flow-gauging station, the number of values in the AMS which were flagged as having missing data during the year, and the period of the year when the missing data occurred.
After the initial selection of gauges, additional gauges with 15-20 years of record were investigated for inclusion in the analysis in areas which did not have any gauges included in the initial selection. The location of all of the gauges in KZN is displayed in Fig. 7. The distribution of the length of record of the selected gauges used in the analysis is shown in Fig. 8. The record lengths ranged from 13 to 83 years with a median value of 40 years.
The reliability of design values estimated from the observed data is dependent on the quality of the observed data and length of available record. As shown in Fig. 9, 67.4% of the gauges in KZN had a single unique maximum value in the AMS, whereas 15.7% of the gauges had the same maximum value in more than 20% of the years, which is an indication that the rating table used to convert the observed river stage into discharge had been exceeded. For the selected gauges used in this study, a single unique maximum value in the AMS was found at 87.8% of the gauges, and 12.2% of the sites had up to 15% of the years with the same maximum value. In these cases, the exceeded values were treated as missing data.
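The screening for rating-table exceedance described above can be sketched as follows. This is a simplified illustration that assumes an exceeded rating table shows up as the series maximum repeated in more than one year; the function name is hypothetical, the 20% rejection threshold follows the exclusion rule stated in the discussion section, and the sample series is invented.

```python
def screen_ams(ams, max_fraction=0.20):
    """Screen an annual maximum series for rating-table exceedance.

    Returns the series unchanged if the maximum is unique, None if the
    repeated maximum occurs in more than max_fraction of the years (gauge
    rejected), and otherwise the series with repeated maxima replaced by
    None (treated as missing).
    """
    peak = max(ams)
    count = ams.count(peak)  # capped values are recorded identically
    if count == 1:
        return list(ams)
    if count / len(ams) > max_fraction:
        return None
    return [None if q == peak else q for q in ams]


# Hypothetical series: the value 50 repeats in 2 of 11 years (about 18%).
screened = screen_ams([10, 50, 50, 20, 30, 15, 25, 35, 40, 45, 12])
```

Here the gauge is retained and the two capped years are flagged as missing; a series such as [50, 50, 50, 10] would be rejected outright.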

RESULTS
This section contains the results from the application of the JPV, Haile, Van Bladeren and Mkhandi RFFA methods at the 41 selected flow-gauging sites and a comparison of the estimated design floods to the design floods computed from the observed flow data at the sites. Alexander (1990, 2001) recommended the use of the LP3 probability distribution for design flood estimation in South Africa, while Görgens (2007) used both the LP3 and GEV distributions and, according to Van der Spuy and Rademeyer (2010), both distributions are applicable in South Africa. Hence both the LP3 and GEV distributions, fitted by L-moments (Hosking, 1990; Hosking and Wallis, 1990), were used to estimate the design floods based on the statistics of the AMS at each selected gauge.
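An at-site GEV fit by L-moments can be sketched in a few lines. The sketch below uses unbiased sample probability weighted moments and Hosking's widely used approximation for the GEV shape parameter; it is not the code used in this study, the LP3 fit is not shown, and the AMS values are hypothetical.

```python
import math


def sample_l_moments(data):
    """First three sample L-moments from unbiased probability weighted moments."""
    x = sorted(data)
    n = len(x)
    b0 = sum(x) / n
    b1 = sum(j * x[j] for j in range(n)) / (n * (n - 1))
    b2 = sum(j * (j - 1) * x[j] for j in range(n)) / (n * (n - 1) * (n - 2))
    return b0, 2 * b1 - b0, 6 * b2 - 6 * b1 + b0


def fit_gev(data):
    """GEV location (xi), scale (alpha) and shape (k), Hosking's convention."""
    l1, l2, l3 = sample_l_moments(data)
    c = 2.0 / (3.0 + l3 / l2) - math.log(2.0) / math.log(3.0)
    k = 7.8590 * c + 2.9554 * c * c  # approximation for the shape parameter
    if abs(k) < 1e-8:  # Gumbel limit
        alpha = l2 / math.log(2.0)
        return l1 - 0.5772156649 * alpha, alpha, 0.0
    g = math.gamma(1.0 + k)
    alpha = l2 * k / ((1.0 - 2.0 ** (-k)) * g)
    return l1 - alpha * (1.0 - g) / k, alpha, k


def gev_quantile(params, return_period):
    """Design flood for a return period T (non-exceedance F = 1 - 1/T)."""
    xi, alpha, k = params
    y = -math.log(1.0 - 1.0 / return_period)
    if k == 0.0:
        return xi - alpha * math.log(y)
    return xi + alpha * (1.0 - y ** k) / k


# Hypothetical annual maximum series (m^3/s).
ams = [120, 85, 300, 150, 95, 410, 60, 220, 180, 130, 75, 260]
params = fit_gev(ams)
floods = {t: gev_quantile(params, t) for t in (2, 50, 100)}
```

Note that in this convention a negative k gives the heavy upper tail typical of flood data; some libraries use the opposite sign for the shape parameter.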
For each RFFA method, site and distribution, a mean absolute relative error (MARE) was computed as shown in Eq. 4.
The results from an analysis of the performance of the RFFA methods are shown in Table 1, which includes both the $\mathrm{MARE}_{M,D}$ values for each method and the average slope between the 2- to 100-year return period floods computed at each site using the selected method (Estimated) and estimated from the observed data at the site (Observed). While the Haile method resulted in the smallest $\mathrm{MARE}_{M,D}$ value, the average slope of the estimated vs. observed floods is considerably less than 1, indicating that the Haile method generally underestimates the floods computed from the observed data. This general underestimation by the Haile method is confirmed by the performance of the RFFA methods in estimating the 50-year return period floods shown in Fig. 10. The results in Fig. 10 also confirm the poor performance of the JPV method when the regionalised LP3 distribution is used. Based on the results in Table 1 and the typical results shown in Fig. 10, the JPV method using the regionalised GEV distribution and veld-zone regionalisation performed the best of the RFFA methods considered in this study.
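The two performance measures can be sketched as follows. Because Eq. 4 is not reproduced in this text, the sketch assumes the common definition of MARE as the mean of |estimated − observed| / observed over the return periods considered, and it assumes the slope is the ordinary least-squares slope of estimated on observed floods at a site. The function names and flood values are hypothetical.

```python
def mare(estimated, observed):
    """Mean absolute relative error over paired design floods (assumed form)."""
    errors = [abs(e - o) / o for e, o in zip(estimated, observed)]
    return sum(errors) / len(errors)


def slope(estimated, observed):
    """Least-squares slope of estimated vs. observed design floods."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    sxy = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    sxx = sum((o - mean_o) ** 2 for o in observed)
    return sxy / sxx


# Hypothetical design floods (m^3/s) at one site for three return periods.
observed = [100.0, 200.0, 400.0]
estimated = [80.0, 160.0, 320.0]
site_mare = mare(estimated, observed)
site_slope = slope(estimated, observed)
```

In this example the MARE is 0.2 and the slope is 0.8. The MARE alone does not show the direction of the error, whereas a slope below 1 indicates systematic underestimation, which is why both measures are reported.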

DISCUSSION, CONCLUSIONS AND RECOMMENDATIONS
The availability of gauged flows for large events with extreme discharges remains a challenge in South Africa. For many flow-gauging stations investigated in this study, exceedance of the rating table by observed river stage was evident by values in the AMS which are constant and equal to the maximum rated discharge for the flow-gauging structure. Stations which have more than 20% of the values in the AMS which exceed the maximum rated discharge were excluded from the study and, for the retained stations, the years with values which exceed the maximum rated discharge were assumed to be missing. This analysis did not account for different rating tables covering different periods of record, which would result in the rating tables being exceeded more frequently than indicated in this analysis.
The frequency with which recorded flow stages exceed the maximum rated level needs to be quantified and the impact of not including these extreme events in the estimation of design floods in South Africa must be quantified. Methods to extend the rating tables, and thus provide an estimate of discharge for all observed stage levels, need to be urgently developed. It is expected that the estimation of discharge for all of the observed stage levels will impact both on the volume of runoff measured and the design floods estimated from the observed peak discharge data.
The results presented for the 50-year return period illustrate the differences in the design floods estimated from the observed data when using the GEV and LP3 distributions, particularly for longer return periods. Despite the quality screening of the stations included in the study, the design floods estimated at a few stations appear to be inconsistent with those at other stations in the region. Hence, consistent screening and checking of the flow data is required in order to identify reliable data records that can be used for design flood estimation.
A number of RFFA studies have been developed which include all of South Africa (Haile, Mkhandi and JPV methods), and regions of South Africa (Van Bladeren). Despite the advantages of a regional approach to design flood estimation, RFFA methods are not widely used in South Africa. Of the RFFA methods assessed in KZN, the Haile method gave the best performance in terms of the MARE, but consistently underestimated the design floods computed from the observed data when using either the GEV or LP3 distribution. The poor performance of the JPV method with the regionalised LP3 distribution needs to be investigated. However, the JPV method with the regionalised GEV distribution generally performed well, with the veld zone regionalisation giving better results than the RMF K-region regionalisation. The results from this study are applicable only to KZN and the performance of the various RFFA methods at a national scale needs to be investigated.
Only the studies reported by Mkhandi et al. (2000), Haile (2011) and Görgens (2007) encompass the whole of South Africa. Ideally, regionalisation in a RFFA should be performed using site characteristics as this enables independent testing of the regions for homogeneity using the at-site data, and the independent allocation of a site to a region based on the site characteristics. Discontinuities at regional boundaries need to be investigated and the alternative approach of transferring hydrological information from gauged to ungauged sites within a region should be evaluated.
Both Mkhandi et al. (2000) and Haile (2011) utilised the statistics of the at-site data with various homogeneity tests to identify homogenous flood regions in their study areas. Görgens (2007) did not update flood regions in South Africa and used both the RMF K-regions (Kovács, 1988) and the veld type zones (HRU, 1972) in his regionalisation, and it is recommended that a more detailed regionalisation should be used in the development of an updated RFFA method for South Africa.


Figure 10
Performance of RFFA methods for the 50-year return period event