The characterisation of rainfall in the arid and semi-arid regions of Ethiopia

Planning effective agricultural and water resource projects requires an understanding of the spatial and temporal variability of rainfall. Although Ethiopia is one of the most drought-hit countries in the world, almost no study has characterised the rainfall pattern of its arid and semi-arid regions. In this study, rainfall data from the past 50 years were used to study the basic statistical characteristics of rainfall in this region. Annual and monthly rainfall was fitted to theoretical probability distributions, and the distribution that best describes the data at each station was determined. The probabilities of wet days and of dry periods of different durations were also determined. Both annual and monthly rainfall were described by different probability distributions at different stations, and the rainfall pattern varies strongly among the stations. Heavier rainfall events are infrequent but make up a significant percentage of the total rainfall. In arid and semi-arid regions, where both the amount and the frequency of rainfall are low, it is essential to take these unique rainfall characteristics into account.


Introduction
Rainfall is the most important environmental factor limiting agricultural activities in the arid and semi-arid regions of the tropics. Although irrigation is believed to be an important strategy in alleviating the current food crisis, rain-fed agriculture is still the dominant practice in most developing countries. Soil moisture management in semi-arid and arid areas of the tropics is faced with limited and unreliable rainfall and high variability in rainfall pattern (Kipkorir, 2002).
It is difficult for hydrologists to measure, collect and store hydrological data such as rainfall and runoff. In most cases, the available data are limited and the series may contain gaps. The gaps can be filled, or the series extended to a longer period, using mathematical equations. It is generally assumed that a hydrological variable follows a certain type of distribution. Some of the most common and important probability distributions used in hydrology are the normal, lognormal, gamma, Weibull and Gumbel distributions (Aksoy, 1999). The normal distribution generally fits annual rainfall and annual river flows. The lognormal distribution is also used for the same purpose. In hydrology, the gamma distribution has the advantage of taking only positive values. The Weibull and Gumbel distributions are used for extreme values of hydrological variables.
Only a few studies of the rainfall characteristics of arid and semi-arid regions of the tropics have been conducted. A review of research on tropical rainfall reveals that most detailed studies have been concerned with the more humid areas, a reflection of the distribution of both population and rainfall stations (Jackson, 1977; Oguntoyinbo and Akintola, 1983; Rowntree, 1988). The few published studies available from semi-arid areas tend to be from outside the tropics (Sharon and Kutel, 1986), and their results are not necessarily representative of tropical areas. The Ethiopian arid and semi-arid region is no exception, with almost no study to characterise the climatic pattern of this area. A recent study by Segele et al. (2005) analysed the onset of Kiremt, the rainy season of Ethiopia, over the Ethiopian highlands.
This study tries to characterise daily, monthly and annual rainfall distributions of the arid and semi-arid region of Ethiopia. The resulting information is essential for several research programmes, rehabilitation projects, irrigation scheduling, and hydrological studies in the area.

Data and method of data analysis
The study area and data
The study area encompasses the arid and semi-arid region of Ethiopia found in the southern, south-eastern, eastern, and north-eastern parts of the country (Fig. 1). The selection was restricted to eight stations owing to the unavailability of other stations with complete data. Daily data of rainfall, temperature, humidity, wind speed, and sunshine hours were obtained from the National Meteorological Services Agency (NAMSA) of Ethiopia. As presented in Table 1, the length of the data record for all the stations was greater than the 30 years of climatic data needed for accurate climatic analyses in the tropics (Stewart, 1988; Aldabadh et al., 1982).

Method of data analysis
Monthly reference evapotranspiration was calculated using the FAO Penman-Monteith equation (Allen et al., 1998), given as:

$$ET_o = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T + 273}\,u_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,u_2)} \qquad (1)$$

where:
$ET_o$ is the reference evapotranspiration (mm·d$^{-1}$)
$R_n$ is the net radiation at the crop surface (MJ·m$^{-2}$·d$^{-1}$)
$G$ is the soil heat flux density (MJ·m$^{-2}$·d$^{-1}$)
$T$ is the air temperature (°C)
$u_2$ is the wind speed at 2 m height (m·s$^{-1}$)
$e_s$ is the saturation vapour pressure (kPa)
$e_a$ is the actual vapour pressure (kPa)
$(e_s - e_a)$ is the saturation vapour pressure deficit (kPa)
$\Delta$ is the slope of the vapour pressure curve (kPa·°C$^{-1}$)
$\gamma$ is the psychrometric constant (kPa·°C$^{-1}$)

The agro-climatic zonation of the meteorological stations was determined using the UNESCO aridity index (AI) given in Rodier (1985) as:

$$AI = \frac{P}{ET_o} \qquad (2)$$

where:
$P$ is the mean annual rainfall
$ET_o$ is the mean annual reference evapotranspiration.
According to this classification, P/ET_o < 0.03 is a hyper-arid zone, 0.03 < P/ET_o < 0.20 is an arid zone, and 0.20 < P/ET_o < 0.50 is a semi-arid zone. Mean annual rainfall P was calculated from the rainfall data for each station. Monthly average data of temperature, humidity, wind speed and sunshine hours were used in Eq. (1).
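The calculation described above can be illustrated with a short Python sketch. It evaluates the standard FAO Penman-Monteith expression of Allen et al. (1998) and then applies the UNESCO thresholds quoted in the text. The function names, the treatment of values lying exactly on a class boundary, and all input values are illustrative assumptions; they are not station data or procedures taken from this study.

```python
def reference_et(delta, rn, g, gamma, t, u2, es, ea):
    """FAO Penman-Monteith reference evapotranspiration (mm/d).

    delta: slope of vapour pressure curve (kPa/degC)
    rn, g: net radiation and soil heat flux density (MJ/m2/d)
    gamma: psychrometric constant (kPa/degC)
    t: air temperature (degC); u2: wind speed at 2 m (m/s)
    es, ea: saturation and actual vapour pressure (kPa)
    """
    numerator = 0.408 * delta * (rn - g) + gamma * (900.0 / (t + 273.0)) * u2 * (es - ea)
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator


def aridity_index(mean_annual_rain_mm, mean_annual_eto_mm):
    """UNESCO aridity index AI = P / ETo (both in mm per year)."""
    return mean_annual_rain_mm / mean_annual_eto_mm


def classify_aridity(ai):
    """Map an aridity index to the classes quoted in the text.

    Assumption: a value exactly on a boundary is placed in the wetter
    class, because the text gives strict inequalities only.
    """
    if ai < 0.03:
        return "hyper-arid"
    if ai < 0.20:
        return "arid"
    if ai < 0.50:
        return "semi-arid"
    return "outside the arid and semi-arid classes"


if __name__ == "__main__":
    # Hypothetical daily inputs, for illustration only.
    eto = reference_et(delta=0.19, rn=14.0, g=0.0, gamma=0.06,
                       t=25.0, u2=2.0, es=3.2, ea=1.6)
    print(f"ETo = {eto:.2f} mm/d")

    # Hypothetical annual totals, for illustration only.
    ai = aridity_index(mean_annual_rain_mm=300.0, mean_annual_eto_mm=2000.0)
    print(f"AI = {ai:.2f} -> {classify_aridity(ai)}")
```

The two steps are kept in separate functions so that the classification can be reused with evapotranspiration estimates obtained by any other method.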

Probability distributions of annual and monthly rainfall
For predictive purposes, it is often desirable to understand the shape of the underlying distribution of the population. Since certain types of variables follow specific distributions, it is common to fit a theoretical distribution to the observed data by comparing the frequencies observed in the data with the expected frequencies of the theoretical distribution. Two tests were used to identify which theoretical probability distribution function best fits the rainfall data: the chi-square goodness-of-fit test and the Kolmogorov-Smirnov test. The chi-square goodness-of-fit test compares the observed frequencies with the expected frequencies from the hypothesised distribution. To apply it, the data are grouped into suitable classes, and the chi-square statistic is calculated as the sum over the classes of the squared difference between the observed and expected frequencies, divided by the expected frequency of the class. This test can be applied to discrete as well as continuous distributions; however, being a large-sample test, it requires a fairly large sample to generate a reasonable frequency distribution.
The Kolmogorov-Smirnov test compares the observed distribution function to the hypothesised distribution function. The test statistic is based on the maximum absolute difference between these two distribution functions.
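The two test statistics can be sketched in a few lines of Python. The example below computes only the statistics themselves, not the critical values or p-values needed to accept or reject a fit, and it uses a normal distribution as the hypothesised distribution purely for illustration. The function names and the sample values are assumptions and do not come from the study.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def chi_square_statistic(observed, expected):
    """Sum over classes of (observed - expected)^2 / expected."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def ks_statistic(sample, cdf):
    """Maximum absolute difference between the empirical distribution
    function of the sample and the hypothesised cdf."""
    ordered = sorted(sample)
    n = len(ordered)
    d_max = 0.0
    for i, x in enumerate(ordered, start=1):
        f = cdf(x)
        # The empirical function jumps from (i - 1)/n to i/n at x,
        # so both sides of the step are compared with the cdf.
        d_max = max(d_max, i / n - f, f - (i - 1) / n)
    return d_max


if __name__ == "__main__":
    # Hypothetical annual rainfall totals (mm), for illustration only.
    annual_mm = [412.0, 530.0, 615.0, 478.0, 690.0, 555.0, 501.0, 640.0]
    mu = sum(annual_mm) / len(annual_mm)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in annual_mm) / (len(annual_mm) - 1))

    d = ks_statistic(annual_mm, lambda x: normal_cdf(x, mu, sigma))
    print(f"KS statistic against the fitted normal: {d:.3f}")

    # Two classes split at the fitted mean; a normal distribution
    # places half of the sample in each class.
    observed = [sum(1 for x in annual_mm if x < mu),
                sum(1 for x in annual_mm if x >= mu)]
    expected = [len(annual_mm) / 2.0, len(annual_mm) / 2.0]
    print(f"Chi-square statistic: {chi_square_statistic(observed, expected):.3f}")
```

With a sample this small only the Kolmogorov-Smirnov statistic would be meaningful in practice, which mirrors the large-sample requirement of the chi-square test noted above.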
In this study, five commonly used probability distributions were fitted to annual and monthly rainfall data of eight stations in arid and semi-arid parts of Ethiopia. The five distributions are briefly described in the following section.

Normal distribution
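The most important distribution for continuous variables is the normal distribution, also called the Gaussian distribution, which is commonly applied to symmetrically distributed data. The probability density function n(x; μ, σ) of a continuous random variable x reads:

$$n(x; \mu, \sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left[-\frac{(x - \mu)^2}{2\sigma^2}\right]$$

where μ is the mean and σ is the standard deviation of x.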

Lognormal distribution
Many continuous hydrological random variables tend to be asymmetrically distributed. It is computationally advantageous to transform such a distribution to a normal distribution. In many cases the transformation can be achieved reasonably well by taking the logarithms of the events. If the natural logarithms of a variable x are normally distributed, the variable x is said to follow a logarithmic normal (lognormal) probability distribution. With y = ln x, the probability density function of such a variable is:

$$f(x) = \frac{1}{x\,\sigma_y\sqrt{2\pi}} \exp\left[-\frac{(\ln x - \mu_y)^2}{2\sigma_y^2}\right], \qquad x > 0$$

where:
$\mu_y$ is the mean of ln x
$\sigma_y$ is the standard deviation of ln x.
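A minimal sketch of how the log transformation can be used in practice is given below. It estimates the two lognormal parameters from the mean and sample standard deviation of the logarithms, and then reads off the rainfall amount that is equalled or exceeded with a chosen probability. This estimation method is an assumption made for illustration, since the text does not state how the parameters were fitted, and the function names and rainfall values are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def fit_lognormal(values):
    """Return (mu_y, sigma_y): mean and sample standard deviation of ln x."""
    logs = [math.log(v) for v in values]
    return mean(logs), stdev(logs)


def exceedance_rainfall(mu_y, sigma_y, exceedance_prob):
    """Rainfall amount equalled or exceeded with the given probability.

    P(X >= x) = p is equivalent to P(ln X <= ln x) = 1 - p, so the
    quantile is taken on the normal scale and transformed back.
    """
    z = NormalDist().inv_cdf(1.0 - exceedance_prob)
    return math.exp(mu_y + sigma_y * z)


if __name__ == "__main__":
    # Hypothetical annual rainfall totals (mm), for illustration only.
    annual_mm = [210.0, 340.0, 180.0, 520.0, 290.0, 260.0, 410.0, 230.0]
    mu_y, sigma_y = fit_lognormal(annual_mm)
    for p in (0.20, 0.50, 0.80):
        x = exceedance_rainfall(mu_y, sigma_y, p)
        print(f"{int(p * 100)}% exceedance rainfall: {x:.0f} mm")
```

Working on the logarithms keeps the whole calculation on the normal scale, which is the computational advantage referred to above.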

Gamma distribution
A random variable x is said to have a gamma distribution if the probability density function is given by: where: α is the scale parameter β is the shape parameter of the distribution.
The normalising factor Γ(α) is defined such that the total area under the density function is unity as:

$$\Gamma(\alpha) = \int_0^{\infty} t^{\alpha - 1} e^{-t}\,dt$$

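Weibull distribution
The probability density function of the Weibull distribution is given by:

$$f(x) = \frac{\beta}{\alpha}\left(\frac{x}{\alpha}\right)^{\beta - 1} \exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right], \qquad x \ge 0$$

where:
α is the scale parameter
β is the shape parameter.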

Gumbel distribution
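The probability density function of the Gumbel distribution is given by:

$$f(x) = \frac{1}{\alpha} \exp\left[-\frac{x - \beta}{\alpha} - \exp\left(-\frac{x - \beta}{\alpha}\right)\right]$$

where:
α is the scale parameter
β is the location parameter of the distribution.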
In order to determine the probability of a wet day, P_wet, the number of wet days (n_i) was counted out of the total number of days (N_s) for the station:

$$P_{wet} = \frac{n_i}{N_s}$$

Since the climatic records for all the stations are longer than the minimum recommended length of data record (i.e., 30 years), the number of wet days in the record is assumed to give a reasonably close sample estimate of the population's probability of wet days. A day was considered to be wet when there was more than 1 mm of rainfall and dry when rainfall was 1 mm or less. The probability of a wet day was plotted against time to identify the periods when the station is likely to be dry or wet.

The probability of a dry spell of a given duration was determined on a monthly basis. To obtain this probability, the number of dry spells of the given duration was counted and divided by the total number of periods of that duration in the data series for a given month. For example, to determine the probability of a dry spell of 2 d, the number of occurrences of two consecutive dry days was counted and divided by the total number of periods of two consecutive days in the recorded historical rainfall data for a given month. Similarly, to determine the probability of three consecutive dry days, the number of occurrences of three consecutive dry days was counted and divided by the total number of periods of three consecutive days in the recorded historical rainfall data.
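The counting procedure described above can be sketched in Python as follows. The 1 mm wet-day threshold is taken from the text. The use of overlapping windows of consecutive days, the function names and the short rainfall series are assumptions made for illustration; in particular, the text does not say whether the consecutive-day periods overlap, and the sketch does not separate the series by month.

```python
WET_THRESHOLD_MM = 1.0  # a day is wet when rainfall exceeds 1 mm (from the text)


def wet_day_probability(daily_rain_mm):
    """Fraction of days in the series with rainfall above the threshold."""
    n_total = len(daily_rain_mm)
    n_wet = sum(1 for r in daily_rain_mm if r > WET_THRESHOLD_MM)
    return n_wet / n_total


def dry_spell_probability(daily_rain_mm, duration_days):
    """Fraction of windows of `duration_days` consecutive days that are
    entirely dry. Assumption: windows overlap, i.e. they advance by one day."""
    n_windows = len(daily_rain_mm) - duration_days + 1
    if n_windows <= 0:
        raise ValueError("series is shorter than the requested duration")
    n_dry = 0
    for start in range(n_windows):
        window = daily_rain_mm[start:start + duration_days]
        if all(r <= WET_THRESHOLD_MM for r in window):
            n_dry += 1
    return n_dry / n_windows


if __name__ == "__main__":
    # Hypothetical daily rainfall (mm), for illustration only.
    rain = [0.0, 0.0, 5.0, 0.0, 2.0, 0.0, 0.0, 0.0]
    print(f"P(wet day) = {wet_day_probability(rain):.2f}")
    for d in (2, 3):
        print(f"P(dry spell of {d} d) = {dry_spell_probability(rain, d):.2f}")
```

To reproduce a monthly analysis, the same functions would be applied to the days of one calendar month pooled over all years of record.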
The distribution of daily rainfall totals by amount and frequency was obtained using a frequency analysis of historic daily rainfall data. This was achieved by counting the number of times a daily rainfall of specified amount occurred during the recorded period for the station.
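A frequency analysis of this kind can be sketched as below. The first function counts rain-days in classes of daily rainfall, and the second reports what percentage of rain-days, and what percentage of the total rainfall depth, come from days at or above a chosen amount. The class limits, the function names, the restriction to days with more than 1 mm and the sample values are illustrative assumptions.

```python
from bisect import bisect_right

WET_THRESHOLD_MM = 1.0  # only days with more than 1 mm are treated as rain-days


def count_by_class(daily_rain_mm, class_lower_bounds_mm):
    """Count rain-days per class; class i covers amounts from its lower
    bound up to (not including) the next lower bound, the last class
    being open-ended. Lower bounds must be sorted in increasing order."""
    counts = [0] * len(class_lower_bounds_mm)
    for r in daily_rain_mm:
        if r <= WET_THRESHOLD_MM:
            continue
        idx = bisect_right(class_lower_bounds_mm, r) - 1
        if idx >= 0:
            counts[idx] += 1
    return counts


def share_at_or_above(daily_rain_mm, amount_mm):
    """Return (% of rain-days, % of total depth) contributed by
    rain-days with at least `amount_mm` of rainfall."""
    rain_days = [r for r in daily_rain_mm if r > WET_THRESHOLD_MM]
    heavy = [r for r in rain_days if r >= amount_mm]
    freq_pct = 100.0 * len(heavy) / len(rain_days)
    depth_pct = 100.0 * sum(heavy) / sum(rain_days)
    return freq_pct, depth_pct


if __name__ == "__main__":
    # Hypothetical daily rainfall (mm), for illustration only.
    rain = [0.0, 2.0, 3.0, 5.0, 10.0, 20.0, 40.0, 1.0]
    print(count_by_class(rain, [0.0, 5.0, 10.0, 20.0, 40.0]))
    freq, depth = share_at_or_above(rain, 15.0)
    print(f"{freq:.0f}% of rain-days supply {depth:.0f}% of the depth")
```

Comparing the two percentages for the same amount shows directly how strongly a few heavy days dominate the total depth.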

Agro-climatic zonation
The aridity index (AI) calculated using Eq. (2) is presented in Table 2, together with the corresponding agro-climatic classification of the stations based on the UNESCO classification criteria. Assaita and Gode are relatively arid as a result of low rainfall and high evapotranspiration in these areas. Some stations, such as Dire Dawa, are classified as semi-arid despite high evapotranspiration, owing to their relatively high rainfall.

Probability distributions of annual and monthly rainfall
Annual rainfall recorded at eight rain-gauge stations in the arid and semi-arid regions of Ethiopia was fitted to five probability distribution functions. The respective parameters of the distribution functions were determined and are presented in Table 3. The values of the two goodness-of-fit tests, chi-square (χ²) and Kolmogorov-Smirnov (KS), are also presented in the table. The annual rainfall data of three of the stations (Gode, Assaita, Zeway) were not sufficient to calculate the chi-square goodness-of-fit statistic, as this method requires a large sample to be properly applied.
Based on the chi-square goodness-of-fit value, the annual rainfall of the other five stations is best described by the theoretical probability distributions indicated in parentheses: Negele Borena (Gumbel), Dire Dawa (Weibull), Mekele (normal), Jijiga (Gumbel), and Assebe Teferi (Weibull) (Table 3). Theoretical probability distributions superimposed on the respective frequency histograms of annual rainfall are presented in Fig. 2 for these stations. Goodness-of-fit for Gode, Assaita, and Zeway was evaluated using the Kolmogorov-Smirnov test statistic, according to which the Gumbel, Gumbel and lognormal distributions, respectively, best describe the annual rainfall at these stations. Of the eight stations considered, the number of stations with annual rainfall following each distribution is: normal (1), lognormal (1), Weibull (2), and Gumbel (4).
Monthly rainfall data were fitted to the theoretical distributions, and the parameters of the respective distributions and the goodness-of-fit values are given in Table 4. Rainfall data of the relatively dry months could not be analysed because there were too few non-zero values to fit the frequency distributions. Monthly rainfall frequency histograms with the fitted theoretical probability distributions superimposed are presented in Figs. 3, 4, 5, and 6 for the wet months of Negele Borena, Dire Dawa, Mekele, and Jijiga. Of the 12 months for which probability distributions were plotted, the number of months best described by each distribution is as follows: normal (2), lognormal (5), gamma (3), Weibull (1), and Gumbel (1). While no single distribution fits all the monthly or annual rainfall data, the Gumbel distribution fits the annual rainfall at the largest number of stations and the lognormal distribution fits the largest number of months. The gamma distribution has also been found to describe monthly rainfall in arid regions (Sen and Eljadid, 1999).
Studies in various parts of the world indicate a general consensus that, while annual rainfall in wet areas and the rainfall of wet months can be fitted to a normal distribution, rainfall in arid and semi-arid areas is skewed. Manning (1956) assumed that the distribution of annual rainfall in Uganda was statistically normal. Jackson (1977) has stressed that annual rainfall distributions are markedly 'skew' in semi-arid areas and that the assumption of a normal frequency distribution for such areas is inappropriate. Brooks and Carruthers (1953) stated that annual rainfall is slightly skew and that monthly rainfall is positively skew. For annual rainfall series which exhibit slight skewness, Brooks and Carruthers (1953) suggest the use of lognormal transformations. These comments apply equally well to tropical rainfall where annual totals exceed certain amounts. For example, Gregory (1969) suggests that normality is a reasonable assumption where the annual rainfall is more than 750 mm. Kenworthy and Glover (1958) suggest that in Kenya normality can be assumed only for wet season rainfall. Gommes and Houssaiu (1982) state that rainfall distribution is markedly skew in most Tanzanian stations. Mooley and Rao (1971) have shown that annual and monthly rainfall over different parts of India can be described by a gamma distribution.

Exceedance probability and return period of annual rainfall
Annual 20%, 50%, and 80% exceedance rainfall was calculated from the respective rainfall distribution of each station. The 20% exceedance rainfall is expected on average to be exceeded in 1 out of 5 years, the 50% exceedance rainfall in 1 out of 2 years, and the 80% exceedance rainfall in 4 out of 5 years. The annual rainfall expected at different exceedance probability levels and the corresponding return periods are presented in Table 5. A return period is the average interval at which a given total annual rainfall is expected to be equalled or exceeded. It can be calculated as:

$$T = \frac{1}{P}$$

where:
T is the return period (years)
P is the exceedance probability (i.e. the probability that a given annual rainfall is equalled or exceeded).
For example, the 20% exceedance rainfall (P = 0.20) has a return period of 5 years.
This statistical information helps in planning irrigation projects under different scenarios. In some areas such as Gode and Assaita, even if one is optimistic and assumes the 20% exceedance annual rainfall, it is not possible to grow crops without supplemental irrigation.

Probability of a wet day
The frequency of rain-days is an important determinant of annual rainfall (Hofmeyr and Gouws, 1964). Knowledge of wet day probability is important in soil and water conservation planning and in predicting the incidence of crop diseases. It should, however, be noted that a higher number of rainy days in arid and semi-arid areas does not necessarily imply higher daily rainfall, since in arid areas rainy days with small amounts are more frequent. A daily point rainfall frequency analysis was carried out and the result is presented in Fig. 7. The plotted points on the graph show the frequency of occurrence of daily rainfall of 1 mm or more for each calendar day in the record period.
The probability plot clearly shows the rainfall pattern in a year. It can be seen that over a long time period, there is a well-defined daily rainfall probability pattern within the season. The shape of the curve varies from station to station. From the figure, the probability of daily rainfall occurrence on a specific day in a year may be inferred. The maximum probability of a wet day is 0.75 (on 9 August) for Mekele, 0.50 (on 10 August) for Dire Dawa, 0.52 (on 14 October) for Negele Borena, 0.48 (on 4 September) for Jijiga and 0.55 (on 30 August) for Zeway.
The rainfall pattern in Mekele is strongly unimodal: 80% of Mekele's annual rainfall occurs from June to September. The rainfall intensity is high during this period, and runoff and erosion would be very high unless soil and water conservation structures are implemented. The Mekele area is almost dry for the rest of the year (October to May).
The rainfall pattern in the Dire Dawa area is bimodal with two rainfall peaks: one occurring in the period from March to April and the other from July to September. The first peak occurs in the last week of March while the second peak occurs at the beginning of August. About 45% of the annual rainfall occurs in the second peak period. The highest probability is 0.38 during the first peak period and 0.50 during the second.
Negele Borena also exhibits a strongly bimodal rainfall pattern. The first peak is from mid-March to mid-May and the second from the beginning of October to mid-November. Unlike at the other stations, July and August are dry periods.
The Jijiga area exhibited only slight bimodality, with peaks in April-May and in September and with the highest annual number of rainy days occurring in September. Generally, rainfall is distributed from March to September.
At Zeway, the number of rainy days starts increasing in March, peaks in August and decreases thereafter. At a probability level of 0.20 of a day being wet, only the period from June to mid-September can be identified, which is the main growing season.

Figure 7
Probability of a wet day during a year at different stations

Probability of dry periods of different durations
Knowledge of the occurrence of dry periods of different durations is important in agricultural planning (irrigation scheduling) and hydrological studies. The probability of occurrence of continuous dry periods ranging from 2 d to one month in duration was determined and is presented in Fig. 8. The probability of occurrence of dry periods of different lengths differed among the stations.
At Mekele the probability of occurrence of a dry period of even 2 d is very low in July and August. However, the probability of occurrence of a dry period of one week is about 90% from October to February. Therefore, if not supplemented by irrigation, crop production is very risky during this latter period.
In the Dire Dawa area the probability of occurrence of dry periods of one week is less than 50% for the months from March to September. During July, August and September, the probability of a dry period of two consecutive days is less than 40%.
At Zeway, the probability of a dry period of one week is less than 60% for the months from February to October. The probability of occurrence of a dry period of 3 d in July and August is the same as the probability of a dry period of one week in June.
In the Negele Borena area, the probability of a dry period of one month is less than 60% throughout the year, while the probability of a dry period of 2 d to one week is more than 60% for the months of June, July and August.
In the Jijiga area, the probability of a dry period of even 2 d is less than 60% throughout the year. The probability of a dry period of one week is less than 20% in July, August and September, and the probability of a dry period of two weeks is less than 20% for the months from April to September.

Cumulative frequency and cumulative depth of daily rainfall
The annual distribution of daily rainfall is summarised in Fig. 9, which shows both the frequency distribution of rainfalls producing various rainfall amounts and estimates of the percentage of annual rainfall falling in daily rainfall within each class. It can be observed that the lightest rainfall events are the most frequent. The distribution of daily rainfall depths is highly skewed, with a comparatively small proportion of the rain-days supplying a high proportion of the rainfall. In South Africa, Harrison (1983) observed that only 13% of all rain-days in the Eastern Orange Free State are responsible for 50% of the rainfall and only 27% contribute 75% of the total rainfall, whereas the lowest 50% of all rain-days produce as little as 7% of the rainfall. Similar observations were made elsewhere in the world, including Argentina (Olascoaga, 1950), Florida (Riehl, 1949), the Philippines (Riehl, 1950), and the Sudan (Hammer, 1968).
In this study the following observations were made. In the Mekele area about 98% of the daily rainfall events have values of less than 20 mm but account for only 60% of the total rainfall. From Fig. 9 it can be seen that 3% of the daily rainfall events and 53% of the total rainfall amount equal or exceed the 15 mm required for rain-water harvesting (Roberts, 1985). Although the heavier rainfall events are relatively infrequent, they make up a significant percentage of the total rainfall.
At Dire Dawa, about 98% of the storms produce less than 20 mm but account for only 53% of the total rainfall. Only 1% of the storms produce 40 mm of rainfall or more, yet they account for 18% of the annual rainfall total.
In the Negele Borena area, about 97% of the storms produce less than 20 mm but account for only 45% of the total rainfall. Five percent of the storms and 67% of the total rainfall amount equal or exceed 15 mm. Only 1% of the storms produce 40 mm of rainfall or more, yet they account for 24% of the annual rainfall total.
At Jijiga, about 97% of the storms produce less than 20 mm but account for only 55% of the total rainfall. From the figure it can be seen that 4% of the storms and 57% of the total rainfall amount equal or exceed 15 mm. One percent of the storms produce 40 mm of rainfall or more, yet they account for 17% of the annual rainfall total.

Figure 9
Cumulative frequency and amount and depth of daily rainfall

Conclusions
The annual and monthly rainfall distribution in most of the arid and semi-arid parts of Ethiopia is skewed and cannot be described by the normal distribution. Other distributions such as the Gumbel and lognormal fit the data better. Although heavier rainfall events are infrequent, they make up a significant percentage of the total rainfall.
The maximum probability of a wet day is 0.75 (on 9 August) for Mekele, 0.50 (on 10 August) for Dire Dawa, 0.52 (on 14 October) for Negele Borena, 0.48 (on 4 September) for Jijiga and 0.55 (on 30 August) for Zeway. The probability plot of the number of wet days follows a pattern similar to that of the rainfall amount distribution within a year.
It is becoming increasingly important to understand the nature of the variability of rainfall so as to be able to optimally utilise the low rainfall areas.