Comparing available rainfall gridded datasets for West Africa and the impact on rainfall-runoff modelling results, the case of Burkina-Faso

Monthly rainfall data in Burkina-Faso, West Africa, over a period of 77 years are extracted from three different gridded data sets, available either on the web, namely CRU (Climatic Research Unit, Norwich, UK) and SIEREM (HydroSciences Montpellier, France), or from the National Meteorological Center of Burkina-Faso. With a view to modelling the rainfall-runoff relationship at the monthly time step, these data are used at the 0.5° × 0.5° scale. Despite mean, minimum, standard deviation and inter-annual variability being very similar for the period 1922 to 1998, the three gridded data sets show an important spatial variability of values with time, and some differences are observed which lead to significantly different rainfall-runoff simulations. Comparison of the rainfall grids has shown that differences between the precipitation grids are more pronounced during years when the rainfall is lower; this also applies to areas where the rainfall is lower. The three different rainfall grids produce differences in mean rainfall of 4 to 11%, depending on the grids that are compared. While these results are obviously specific to the station networks and interpolation method used, they provide an indication of the differences that can arise.
It is recommended that as many stations as possible are used to better assess areal rainfall. These biases have a strong influence on the results of the rainfall-runoff modelling (using the GR2M conceptual model): the Nash criteria show differences of about 20% and the calculated flows differ by 30% to 40%. This study illustrates the levels of uncertainty when using available rainfall gridded data sets for rainfall-runoff studies in West African developing countries, which is important in the context of predicting future water resources from the GCM outputs for the 21st century.


Introduction
Accurate simulation/estimation of river flows is important for numerous applications in water resource planning and management. If we exclude the problem of gathering runoff data from African countries, where the observations within the national networks were often interrupted during the 1980s (Mahe and Olivry, 1999), in most catchments there are two major sources of uncertainty that limit accuracy: hydrological modelling (structure and parameterisation) and the climate data (especially precipitation) used to drive hydrological models. The latter is a particular problem in most developing countries, where meteorological networks are relatively sparse in time and space. For large-scale hydrological modelling, a common approach is to use gridded climate data as input to a gridded rainfall-runoff model, where the hydrological grids are linked to a drainage network. For example, in West Africa Ouedraogo et al. (2001) suggest a semi-distributed conceptual modelling of rainfall-runoff at the monthly time step to study climatic variation and its impacts on hydrological variability (Paturel et al., 2003a; 2003b). Such an approach allows the prediction of the depth of runoff from latitude/longitude grids of rainfall, potential evapotranspiration and soil-water capacity. Many publications on this topic have appeared for Southern Africa, among them Hughes (1995) and Lee and Oh (2006), which use different kinds of rainfall-runoff models for water resource assessment purposes. In West Africa rainfall-runoff models are most sensitive to rainfall, and it is therefore important to understand the biases of rainfall data before their use in a water resource model (Paturel et al., 2003b), and the effect of these biases on predicted runoff and river flow (Lee and Oh, 2006; Rubarenzya et al., 2007). However, access to data is a serious problem that limits research.
This paper is not about the relationships between rainfall distribution and runoff modelling (others, such as St-Hilaire et al., 2003; Dong et al., 2005; and Lee and Oh, 2006, have done this), nor about how well the existing data sets reproduce the 'real' (or most probable) rainfall fields. The purpose of this paper is to evaluate three gridded data sets of monthly rainfall available for Burkina-Faso over the period 1922 to 1998, and to analyse the consequences of choosing each one of them for the simulated river flows in 5 basins across Burkina-Faso. The aim is to process the data as a non-specialist rainfall data-set user would, using only one source of data and ignoring the consequences.

Data and rainfall grids

Access to rainfall data in West Africa
Raw data: Raw data are rarely available free of charge from the National Meteorological Services, and costs are sometimes very high, although WMO Resolution 40 asks for the free exchange of data for research and educational purposes (WMO, 2001). In their own countries, African researchers often have great difficulty obtaining access to rainfall data, especially recent data (e.g. after 1990).

Several international institutions and laboratories have gathered raw data for internal research purposes, but they have not made these data available on the web, to respect the limitations placed on distribution by the National Meteorological Services. The exception is the GHCN (Global Historical Climatology Network, NOAA), which provides web access to monthly climate data for synoptic stations supplied by the National Meteorological Services. The IRD (ex-ORSTOM) gathered raw data up to 1980 for a number of African countries, under a contract with CIEH (Comite Interafricain d'Etudes Hydrauliques) and ASECNA (Agence pour la SECurite de la Navigation Aerienne en Afrique et a Madagascar), but is not allowed to release them. This IRD database served as a basis for the FRIEND-AOC dataset (Flow Regimes from International and Experimental Network Data, UNESCO/PHI, Afrique de l'Ouest et Centrale). The FRIEND-AOC dataset has been set up along with the FRIEND-AOC UNESCO program since 1994. This is a research network involving about 200 hydrologists and climatologists from West and Central Africa who can access the database via their research theme leaders (FRIEND-AOC, 2007). However, there are only a few data after 1990. This database is managed and updated by the University Cheikh Anta Diop of Dakar (Senegal).
Another regional database for West Africa is located at the AGRHYMET Centre of Niamey (regional centre for Agro-Hydro-Meteorology), Niger. This database has been compiled for the CILSS countries (Comite inter-africain de lutte contre la secheresse au Sahel), with the contribution of each CILSS country, but it is basically not accessible outside AGRHYMET or CILSS purposes, even to African researchers.
Gridded data sets: Research laboratories like CRU (Climatic Research Unit, UEA, Norwich, UK) or HSM (HydroSciences Montpellier, France), or international institutions like NOAA, through local and restrictive partnership research contracts, release 'elaborated' data sets, like gridded rainfall, most often at the 0.5° × 0.5° scale. Gridded data sets, most of them covering the world at the monthly time step, are available for research purposes, free of charge via the internet, like the NCDC-GHCN (Peterson and Easterling, 1994), CRU (New et al., 1999, 2000) or SIEREM (SIEREM, 2007; Boyer et al., 2004, 2006; Dieulin et al., 2006) data sets. They are often the only way for researchers to obtain rainfall data for remote areas of the world.
Satellite-derived rainfall data sets: The techniques for deriving rainfall amounts from satellite data are improving (Rouault et al., 2001); however, the quality and the reliability of these data have yet to improve with respect to compatibility with data derived from ground measurements (Hughes, 2006; Grimes and Diop, 2003).

The data sources
The three databases used to describe the monthly rainfall between the years 1922 and 1998 in Burkina-Faso are CRU, SIEREM and METEO, the latter from the National Meteorological Center of Burkina-Faso (Table 1). The CRU dataset has a markedly unequal distribution over the Burkinabe territory, with a much lower density in the eastern half of the country. For the rest of the study, we distinguished between the western and eastern parts of the country, divided along longitude 2°W, because of the great difference in density of the CRU network on either side of this divide. The SIEREM and METEO datasets contain the same stations as CRU, but they also include a number of stations with shorter time series; only the national protected areas of the extreme eastern and south-eastern regions remain poorly covered. This provides a more even spatial distribution over the Burkinabe territory. The differences between SIEREM and METEO are primarily due to a few station-years between 1970 and 1990, where SIEREM has improved coverage from various sources, and to greater quality control leading to the removal of several highly unreliable data series.

We then analysed the differences between the grids over the entire territory and throughout the period of record.

Results of the comparative analysis
The main sources of error in the construction of gridded climate data from station records are the quality of the data (instrument and information management errors in the databases), the interpolation method used, and the network density and distribution (New et al., 2000; Briggs and Cogley, 1996; Wahba, 1990). Given the number of stations and the geometry of the station network, the resolution of 0.5° is a good choice for the majority of countries in West Africa to correctly represent the monthly rainfall fields. In the sections that follow we quantify the differences between the 3 gridded datasets, and make an assessment of the relative contribution of station density and interpolation method to these differences.

Annual rainfall
Annual rainfalls calculated from each data set CRU, SIEREM and METEO are very similar (Fig. 3). Mean, standard deviation and minimum values are very close (Table 1), and only maximum values show some differences between SIEREM and the two other grids. It is therefore surprising that the study found that there were a number of other differences arising from the use of these three grids.

Time-evolution of differences
We first look at the time-evolution of differences between grids over the study area as a whole. Figure 4 shows the area-averaged differences between grids for annual rainfall: for each of the three possible grid pairings, grid-point differences (Eq. (1)) for annual rainfall were calculated, and the mean of all 86 grid-point values was then determined; the three area-averaged annual values were then combined into a single mean annual difference. The mean differences show marked inter-annual variability, as well as an increase after the mid-1970s, coincident with the Sahelian drought. In general, drier years are associated with larger differences between grids. There are only slight differences between the SIEREM and METEO grids, while the differences of both SIEREM and METEO from CRU are much greater (Fig. 5).
The temporal evolution of the differences between the CRU and SIEREM grids in the western part of Burkina-Faso clearly shows the influence of the number of stations (Fig. 6). Similar results arise from other combinations of grids. As might be expected, differences remain small when the networks are similar.
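
The area-averaging procedure described above can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the percentage difference is assumed here to be the absolute difference normalised by the mean of the two grid values (the denominator actually used in Eq. (1) may differ), and the function names and the tiny three-point grids are hypothetical.

```python
from itertools import combinations
from statistics import mean


def pct_difference(val_gr1, val_gr2):
    """Absolute difference between two grid values, in percent.

    Assumption: normalised by the mean of the two values; the
    denominator of Eq. (1) in the source may differ.
    """
    denom = (val_gr1 + val_gr2) / 2.0
    if denom == 0:
        return 0.0
    return 100.0 * abs(val_gr1 - val_gr2) / denom


def area_mean_difference(grid_a, grid_b):
    """Mean of the grid-point differences for one year and one grid pair."""
    return mean(pct_difference(a, b) for a, b in zip(grid_a, grid_b))


def mean_annual_difference(grids):
    """Average the area-mean differences over every possible grid pairing."""
    pair_means = {
        (name_a, name_b): area_mean_difference(grids[name_a], grids[name_b])
        for name_a, name_b in combinations(sorted(grids), 2)
    }
    return mean(pair_means.values()), pair_means


# Hypothetical annual rainfall (mm) at three grid points for one year.
example = {
    "CRU": [500.0, 800.0, 1000.0],
    "SIEREM": [550.0, 820.0, 1000.0],
    "METEO": [550.0, 810.0, 1000.0],
}
overall, by_pair = mean_annual_difference(example)
```

Repeating `mean_annual_difference` for every year of the record would give a time series of the kind plotted against area-averaged rainfall, while `by_pair` keeps the three pairings separate for a pair-by-pair comparison.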

Analysis
We analysed the spatial and temporal distribution of annual rainfall amounts for each of the 86 grid points located in the Burkinabe territory, during the period 1922 to 1998. We calculated, mesh by mesh and year by year, the deviations between grids according to a simple but robust formula, Eq. (1), in which val_gr1 is the value in grid 1, val_gr2 is the value in grid 2, and ABS denotes the absolute value.
Figure 7 shows the spatial distribution of the mean differences calculated from monthly rainfall. The CRU grid shows some large differences from SIEREM, up to 20% at grid points in the border areas. The large differences at the borders reflect the different networks. CRU also shows larger differences in the eastern parts of the territory, where the network of stations used by CRU is less dense than that of SIEREM or METEO. The SIEREM and METEO grids show only minor differences from each other (less than 4%), except for 5 grid points in the south-eastern part of the country where the network is less dense. These observations are confirmed by grid-point Student's t-tests for differences in the mean, showing no differences between SIEREM and METEO data but several grid points with differences between the CRU and SIEREM databases (Fig. 8). In the western part of the country, the incorporation of more stations into the recent CRU dataset (Fig. 2) leads to a reduction of 30% of the difference between rainfall values of CRU and the other data sets (results not shown). This is not the case in the eastern part of the basin, where the recent CRU dataset has not incorporated more stations. These differences are higher than those cited by Ali et al. (2005), who refer to studies in southern Niger which assess the effect of station network density on precipitation estimates. They report a bias of 10 to 15% when using a degraded network of 5 stations over 10 000 km², compared to a network of 10 stations over the same area.
It is most probable that the rainfall data of the SIEREM and METEO grids are more representative of the ground truth than the data of the CRU grid. This means that, in the study area at least, it is better to use as many stations as possible to compute rainfall grids. The runoff-rainfall modelling we use hereafter in this study has two objectives. Firstly, it assists in assessing the significance of increasing the number of stations, assuming that the rainfall data which are closer to the ground truth will lead to better runoff simulations, described by higher Nash criteria (all other variables being identical). Secondly, it is used to assess the impact of using different gridded-rainfall data sets on runoff simulation accuracy, in the context of water resource availability estimation.

Annual variability of the mean differences between the three rainfall grids over Burkina-Faso (line) and their relation to area-averaged rainfall totals (bars)

Figure 6
Annual variability of the mean differences between rainfall grids CRU and SIEREM: difference in the number of stations (bars) and in the rainfall amount in the western Burkina-Faso (line) (%)

Impacts on the river flow modelling
We have shown that different existing precipitation grids, nominally describing the same surface climate element, vary significantly across the study area, depending on the station network used to generate the gridded data. This is a confirmation of what has been found in other cases, but it is important to emphasise this point before evaluating the effect of such differences on river flows simulated with a hydrological model.

Data and method
Previous work has evaluated two different hydrological models (GR2M, described by Makhlouf, 1994, and WBM, described by Vorosmarty et al., 1989 and Conway, 1997) over 50 river basins in West Africa, using several different sets of data for potential evapotranspiration (PE) and soil-water holding capacity (WHC) (Paturel et al., 2003a; Dezetter et al., 2008). For West African river regimes the results indicated that the best combination of model, PE, WHC and calibration/optimisation was:
• GR2M hydrological model (better than the WBM)
• Standard Penman PE formulae (better than reference crop PE), calculated after Shuttleworth (1993)
• WHC calculated from the FAO digital soil map of the world (FAO, 1981; Girard et al., 2002), considering the maximum value of the range given by the FAO (better than the Dunne and Willmott (1996) WHC dataset)
• Calibration using a series of the Rosenbrock (Guilbot, 1971; Servat, 1986) and Nelder and Mead (Servat, 1993) methods.
We chose 5 catchments in Burkina-Faso (Fig. 9 and Table 3). These catchments are between 7 500 and 20 000 km² in size, and are distributed over Burkina-Faso such that they encompass the North-South gradient of rainfall, which varies from 400 to 1 100 mm. The mean annual runoff coefficient varies between 2 and 8% and runoff is not perennial, except for the Nwokuy River, in the Mouhoun basin (71% of runoff in the July to October period, Table 3). We simulate runoff in each catchment using each of the three rainfall grids: CRU, METEO, SIEREM. A split-sample procedure is followed, running a calibration on two-thirds of the available time series of flows (18 to 30 years depending on the specific catchment) and a validation on the last third of the time series. Our comparison covers data from the calibration period, and the performance of the model is evaluated using the Nash criterion (Nash and Sutcliffe, 1970; Perrin, 2000), which compares the errors of the simulated flow (S) with the deviations of the observed discharges (O) from their mean, and is effectively a standardised MSE, with a maximum possible value of 100%:

$$\mathrm{Nash} = 100 \left[ 1 - \frac{\sum_{i} (O_i - S_i)^2}{\sum_{i} (O_i - \bar{O})^2} \right] \qquad (2)$$

where $O_i$ and $S_i$ are the observed and simulated flows at time step $i$, and $\bar{O}$ is the mean of the observed flows.
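
The evaluation step described above can be sketched as follows. This is a hedged illustration only: it uses the standard Nash-Sutcliffe form scaled to percent, the split is a plain two-thirds/one-third cut in time, and the function names and flow values are hypothetical. No hydrological model is run here; the simulated series simply stands in for model output.

```python
def nash_criterion(observed, simulated):
    """Nash-Sutcliffe efficiency in percent (maximum possible value 100)."""
    if len(observed) != len(simulated) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    obs_mean = sum(observed) / len(observed)
    sq_err = sum((o - s) ** 2 for o, s in zip(observed, simulated))
    sq_dev = sum((o - obs_mean) ** 2 for o in observed)
    if sq_dev == 0:
        raise ValueError("observed series has zero variance")
    return 100.0 * (1.0 - sq_err / sq_dev)


def split_sample(series, calibration_fraction=2 / 3):
    """Split a time series into calibration (first part) and validation (rest)."""
    cut = round(len(series) * calibration_fraction)
    return series[:cut], series[cut:]


# Hypothetical monthly flows (m3/s), for illustration only.
observed = [0.0, 2.0, 10.0, 30.0, 12.0, 1.0, 0.0, 3.0, 14.0]
simulated = [0.5, 3.0, 8.0, 27.0, 14.0, 2.0, 0.5, 2.0, 11.0]

obs_cal, obs_val = split_sample(observed)
sim_cal, sim_val = split_sample(simulated)
nash_cal = nash_criterion(obs_cal, sim_cal)
nash_val = nash_criterion(obs_val, sim_val)
```

A simulation that reproduces the observations exactly scores 100%, while one that is no better than the observed mean scores 0%, which is why the criterion reads as a standardised MSE.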

Results for the validation period are similar to those of the calibration period, but with lower values most of the time, and are not shown here. In addition, for each combination of grids, we compared the differences between the monthly flows, expressed both as percentages (Eq. (1)) and as absolute values. As before, we distinguish between the eastern and western parts of the country, on each side of the 2°W meridian.

Results
Simulations using the METEO and SIEREM grids produce the best results for the Nash criteria for 3 catchments (Folonzo, Wayen and Nobere), and the mean of the Nash criteria is 57% over the 5 catchments (Fig. 10); for these catchments the Nash criteria are higher for simulations using SIEREM or METEO than for those using the CRU precipitation grid, between 50% and 80%. For the two other catchments (Nwokuy and Dolbel), the Nash criteria are similar with the 3 grids, around 40%. Table 4 shows, for each pair of grids, the percentage difference between the Nash criteria (calculated according to Eq. (1)). The relative differences in the criteria between the METEO and SIEREM grids are very low (<1%); they are highest between METEO/SIEREM and CRU, and are twice as high for the catchments of the eastern half of the country (9.2 to 9.9%) as for catchments in the western half (5.9 to 6%). The differences in the Nash criteria are comparable to the mean differences in the rainfall data, which are about 4% in the west and from 10.7 to 11.2% in the east (Table 2). When the actual flows simulated in the 5 catchments are compared (Fig. 11), the largest differences are observed between simulations using the CRU and SIEREM/METEO grids: up to 90%. The smallest differences are between the SIEREM and METEO grids, as would be expected from the similarity of the precipitation data in these two grids. As seen in Table 5, the highest differences are for the catchments located in the east (about 70%), where the measurement network is less dense. The differences between the rainfall grids are amplified in the hydrological response, with differences four times higher in the east than in the west in absolute values (m³·s⁻¹). Figure 12 shows the difference in mean monthly flow using CRU and SIEREM. Relative differences are largest during periods of low flow (up to 80%), but have relatively low absolute values (less than 1 m³·s⁻¹).
Conversely, during high-flow months (from July to October), which represent between 70 and 97% of the annual discharge, the relative differences in flow are smaller (less than 40%), but with a greater impact on absolute values, up to 5 m³·s⁻¹. The eastern catchments show the greatest percentage differences, particularly in the dry periods. This is due to the combined effect of larger differences between the rainfall grids in the east and lower rainfall amounts in the eastern catchments than in the western ones. For the Wayen catchment, which is in the centre of the country, with half of the catchment in each of the western and eastern parts of the country, the results for the simulated runoff are between those observed in the eastern and western parts (Table 5).
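
The contrast between relative and absolute flow differences can be made concrete with a small sketch. It is only an illustration under stated assumptions: the monthly flows are invented, the relative difference is assumed to be normalised by the mean of the two simulations (the source's Eq. (1) may use another denominator), and the function names are hypothetical.

```python
HIGH_FLOW_MONTHS = (7, 8, 9, 10)  # July to October, as in the text


def monthly_differences(flow_a, flow_b):
    """Return {month: (relative difference in %, absolute difference in m3/s)}.

    Assumption: the relative difference is normalised by the mean of the
    two simulated flows; the source's Eq. (1) may use another denominator.
    """
    result = {}
    for month in sorted(flow_a):
        a, b = flow_a[month], flow_b[month]
        absolute = abs(a - b)
        denom = (a + b) / 2.0
        relative = 100.0 * absolute / denom if denom > 0 else 0.0
        result[month] = (relative, absolute)
    return result


def high_flow_share(flow):
    """Percentage of the annual total carried by the high-flow months.

    Monthly mean flows are summed directly, so month lengths are ignored.
    """
    total = sum(flow.values())
    return 100.0 * sum(flow[m] for m in HIGH_FLOW_MONTHS) / total


# Hypothetical mean monthly flows (m3/s) simulated from two rainfall grids.
flow_grid_a = {1: 0.2, 2: 0.1, 3: 0.1, 4: 0.2, 5: 0.5, 6: 2.0,
               7: 8.0, 8: 20.0, 9: 15.0, 10: 5.0, 11: 1.0, 12: 0.4}
flow_grid_b = {1: 0.1, 2: 0.05, 3: 0.05, 4: 0.1, 5: 0.3, 6: 1.5,
               7: 6.0, 8: 16.0, 9: 12.0, 10: 4.0, 11: 0.7, 12: 0.2}

diffs = monthly_differences(flow_grid_a, flow_grid_b)
share_a = high_flow_share(flow_grid_a)
```

With these invented values, a dry-season month shows a large relative but small absolute difference, while a flood month shows the reverse, which is the pattern reported for the catchments.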

Synthesis and conclusions
Our analysis has enabled us to investigate the sensitivity of river flows simulated by gridded hydrological modelling, to the quality of available gridded monthly rainfall data sets.

Figure 10
Nash criteria (%), during calibration (left) and validation (right), for simulated-observed flow in each catchment using the three rainfall grids (ranged from West (left) to East (right))

Figure 12
Difference in % and m³·s⁻¹ between calculated runoff from CRU and SIEREM grids, for western basins (Folonzo and Nwokuy) and eastern basins (Dolbel and Nobere)

Despite mean, minimum, standard deviation and inter-annual variability being very similar, the three gridded data sets used show an important spatial variability of values with time. The three different rainfall grids produce differences in mean rainfall of 4 to 11%, depending on the grids that are compared. While these results are obviously specific to the station networks and interpolation method used, they provide an indication of the differences that can arise.
Comparison of the rainfall grids has shown that differences between the precipitation grids are more pronounced during years when the rainfall is lower; this also applies to areas where the rainfall is lower. These differences have several causes including the quality of the data (instrumental errors, recording error, point or systematic errors) and the interpolation method which are not discussed here. This study has focused on the differences related to the variability in the density of the measurement network and its impact on the calculation of the hydrological balance. Our comparison has provided insight into the relative influence of these biases on the quality of the grids. The improvement of the data quality (SIEREM grid compared to METEO grid) has not had a significant influence on the quality of the grid, but the improved coverage in the east in both SIEREM and METEO has produced quite large differences from CRU, about 4% in the west where the networks are more similar, and about 11% in the east where the CRU network is much sparser.
The maximum differences observed between the CRU and SIEREM/METEO grids, of about 11%, represent about 60 mm of annual precipitation in northern Burkina-Faso and about 120 mm in the south. These differences in precipitation input are amplified in the hydrological responses simulated by the GR2M model.
The impact of differences in the rainfall grids on simulated river flow is considerable: a mean difference of 10% between grids induces variations of the Nash criteria of about 10% and variations of calculated flows of about 30% during the flood period and about 60% for the rest of the year. Comparisons between the west and east of the country show that precipitation network density is very important for accurate flow prediction. In the eastern part of Burkina-Faso, where the number of stations is reduced, the differences between Nash criteria are about twice those in the west, and differences between simulated flows are 5 times higher in the east.
Ideally, rainfall gridded data sets should include as many stations as possible, whereas a quality review (SIEREM) of the raw data from National Services (METEO) seems to have little effect on the results of runoff-rainfall modelling, despite the fact that there is a significant difference between grid-point rainfall values of SIEREM and METEO.
These results show the major impact of the quality of the rainfall dataset used for further purposes (runoff-rainfall modelling in this study) in countries where data are scarce, not fully quality controlled and difficult to access. It is of particular interest to be aware of these differences when considering the prediction of future runoff from GCM outputs. The hydrological models are calibrated and validated on existing gridded rainfall datasets. The model parameters that are obtained are used to run the model with future rainfall and PE data from GCM outputs (Ardoin- . This increases the uncertainty on the runoff prediction for future decades. It would be a great help, in particular for the scientific community, if one reference rainfall gridded data set could be updated for West and Central Africa, and be made freely available via an internet website. The AGRHYMET Centre (CILSS) and the FRIEND-AOC network of hydrologists (UNESCO) could collaborate for this purpose, for example on the basis of the FRIEND-AOC rainfall database which is coordinated by UCAD in Senegal. For now, the SIEREM gridded database has been put on the web for the purpose of African rainfall studies and will be maintained and updated as well as possible, with the help of all willing national services, guaranteed by the fact that only gridded data are available through this database. Quick communication of monthly data for synoptic rain gauges would allow rapid updating of rainfall indices as presented by L'Hote et al. (2003) over the Sahel.