Estimating Soil Bulk Density and Total Nitrogen from Catchment Attributes in Northern Ethiopia

Although data on soil bulk density (BD) and total nitrogen (TN) are essential for planning modern farming techniques, such data are scarce for many applications in the developing world. This study was designed to estimate BD and TN from soil properties, land-use systems, soil types and landforms in the Mai-Negus catchment, northern Ethiopia, using stepwise multiple regression (SMR) and generalized linear model (GLM) analyses. Soil properties and other catchment attributes were collected following a standard procedure. The SMR analysis gave overall model coefficients of determination (R²) of 0.91 and 0.89, with significant F-statistics, for the relationships of BD and TN, respectively, with soil properties, land-use and landforms. The GLM analysis gave an overall R² of 0.92 with a significant F-statistic for BD, and an R² of 0.94 with a significant F-statistic for TN. In both analyses and for both dependent variables, the model coefficients were higher for organic carbon (OC) than for the other variables, with the higher values obtained from the GLM. This study thus indicates that practices which improve OC can strongly influence the variation of both dependent variables. It suggests that, in similar conditions, BD and TN can be estimated from the relationships identified by these techniques in order to improve data availability; however, the GLM is preferable as it considers the effect of the interaction terms.


INTRODUCTION
The need to achieve sustainable use of soil resources has been an increasing concern for decision and policy makers. This is particularly a concern in many developing countries such as Ethiopia, because soil degradation, including soil nutrient depletion and physical degradation, has increased alarmingly and become a serious threat to agricultural productivity (Fassil Kebede and Yamoah, 2009). Previous studies have shown that low soil nutrient levels are among the main factors limiting crop production in Ethiopia (e.g., Kamara and Haque, 1988; Chikowo et al., 2010). The positive feedback dynamics between growing population, land cover and climate change have led to a rapid loss in the capacity of soils to deliver essential nutrients such as nitrogen (Gebreyesus Brhane Tesfahunegn et al., 2011). Other studies have also reported that most crop and soil management practices, together with the topography of tropical ecosystems (e.g., Ethiopia), can cause significant modifications in soil properties such as nitrogen, and that changes in biological and chemical properties are more rapid than those in soil physical properties (Schipper and Sparling, 2000; Birang et al., 2003; Agoumé and Birang, 2009).
Strong relationships among plant communities, soils and landscapes are documented by various studies (e.g., Wondzell et al., 1996). Such studies suggest that the patterns of plant communities vary as a result of the influence of soil properties on soil water and nutrient availability across landscapes. However, site-specific understanding of the relationships of soil nutrients (e.g., total nitrogen, TN) and soil physical parameters (e.g., bulk density, BD) with catchment attributes (soil properties, landforms, land-use systems, and soil types) demands research attention in fragile ecosystems in order to highlight the main influencing parameters. To this end, it is often desirable to develop empirical relationships among soil physical and chemical properties and between these and other catchment attributes (Rashidi and Seilsepour, 2009; Seybold et al., 2009). This is because BD and TN are often determined using laborious and time-consuming methods, whereas these soil parameters can be estimated from easily available parameters (Périé and Ouimet, 2008). In addition, dense sampling is required to adequately characterize the spatial variability of soil BD and TN, which may not be economically affordable (Rashidi and Seilsepour, 2009). It may thus be more suitable and economically feasible to develop a method that uses easily available soil data and other catchment attributes that influence soil conditions in a catchment as a proxy for estimating soil parameters such as BD and TN (Heuscher et al., 2005; Rashidi and Seilsepour, 2009; Périé and Ouimet, 2008).
The literature shows that regression models have been developed to predict soil parameters such as BD and TN from soil physical and chemical datasets (Akgül and Özdemir, 1996; Heuscher et al., 2005). Rashidi and Seilsepour (2009) obtained a linear regression model for predicting soil TN from soil organic carbon (OC), and Prévost (2004) and Périé and Ouimet (2008) obtained logarithmic regressions for BD estimation. According to Heuscher et al. (2005), stepwise multiple regression indicated that OC was the strongest contributor to BD prediction. However, such estimates of TN and BD were a function of soil OC only or of a few soil properties, indicating that past studies were limited by their focus on specific or limited soil datasets. Consequently, the existing literature lacks published results that demonstrate the relationship of BD and TN with other soil properties and catchment attributes such as soil types, land-use types, and landforms (Wagner et al., 1994; Akgül and Özdemir, 1996; Heuscher et al., 2005; Rashidi and Seilsepour, 2009). Rashidi and Seilsepour (2009) have reported that nitrogen (N) is among the soil macronutrients that often limit plant growth. Despite this fact, there is presently no generally adopted, completely reliable and easy method for predicting nitrogen in soil.
Knowledge of soil BD is also essential for soil management, as BD indicates soil compaction and structural degradation. In addition, soil BD is often used in models, to characterize field conditions, to estimate soil porosity, and to convert volumetric measurements (Reinsch and Grossman, 1995; Arshad et al., 1996). Other studies showed that soil BD is a basic soil property which influences other soil physical and chemical properties and catchment characteristics (e.g., Arshad et al., 1996; Périé and Ouimet, 2008). Site-specific knowledge of soil BD and TN is essential for planning modern farming techniques and thus for sustainable management of soil resources (Akgül and Özdemir, 1996).
Despite the above facts, soil BD and TN are dynamic soil properties that vary with the structural condition of the soil, which can be altered by cultivation, trampling by animals, land-use types, erosion-deposition processes, and weather conditions such as raindrop impact (Reinsch and Grossman, 1995; Akgül and Özdemir, 1996; Arshad et al., 1996). This makes measuring and/or estimating BD and TN at catchment scale difficult, time-consuming and expensive. As a result, many studies use a single BD value when estimating for other locations. Similarly, direct laboratory measurement of soil TN is generally impractical as a source of data for most applications. It is therefore essential to explore methods for predicting soil BD and TN from more easily and routinely measured soil properties and other catchment attributes.
Though a few studies have illustrated the development of BD and TN prediction equations using a range of soil data (e.g., Arshad et al., 1996), little is documented on the relationships of soil BD and TN with soil properties and other catchment attributes such as land-use systems, soil types, and landforms (small units of the landscape that possess similar slope, flow and deposition). However, scientific information on the combined and separate effects of these attributes on BD and TN is required to guide decision makers in the choice of appropriate cropping systems and suitable land-use and soil management practices in a catchment (Aruleba and Ajayi, 2011).
The objective of this study is to examine the relationships of soil BD and TN with selected soil properties, land-use systems, soil types, and landforms in Mai-Negus catchment, northern Ethiopia, using stepwise multiple regression (SMR) and generalized linear model (GLM).
The target of this study is to identify the main and interaction terms of soil parameters and other catchment attributes, and to assess the relevance of the data groups (soil properties, soil type, land-use systems and landform) in predicting BD and TN using the two statistical models. Such prediction models improve the availability of soil BD and TN data for researchers and development workers, and their interpretations can be applied in decision-making processes such as soil-plant management planning in similar conditions.

Study Site
This study was conducted in the Mai-Negus catchment of Tigray region (

Generating Catchment Landforms, Elevation, Soil Types, and Land-Use Types
In order to develop the different catchment attributes, field reconnaissance surveys and informal group discussions were conducted with a team consisting of the author, two development agents and six farmers who are knowledgeable about the study catchment. Data were collected from June to December 2009 using the research framework given in figure 2. The detailed procedures for collecting and generating the data for each catchment attribute are given below. The landforms in the catchment (Fig 3A) were classified using field data and the topographic map of the area. Considering elevation, slope, and geomorphologic character (surface and subsurface flows, alluvial and colluvial deposition), the catchment was classified into six main landforms in ArcGIS (Fig 3A). About 55% of the land area in the catchment was classified as arable land, 21% as grazing land, and 14% as exclosure. Dense bush and woodland with mixed forest accounted for about 2% of the catchment. The rest of the land (8%) was miscellaneous, such as settlement, marginal area and reservoir. After harvest, farmland is used as grazing land, where livestock graze the grasses in the field and around the borders for a short time (not more than an hour). Grasses in exclosures (landscape under rehabilitation) were used for livestock through a cut-and-carry system. Parts of the areas covered by bush and woodland were also open for grazing, but these are dominated by unpalatable species, and grasses rarely grow there, in many cases because of shading. The data on cropping systems and soil management practices were collected through informal group discussions with local farmers and development agents in the study catchment.

Soil Sampling and Analysis
During soil sample collection, the locations of the soil sampling points took into account the spatial distribution of the different landforms, land-use and land-cover types and soil types in the catchment, as presented in table 1. A total of 117 soil samples were collected and analysed from sampling points (plots) with areas ranging from 150 to 300 m² in the catchment. The number of soil samples varied with the size of the sampling point in the landforms, land-use and soil types. Each sampling plot was located considering its homogeneity in hydrological condition on the basis of the researcher's field observations. From each soil sampling point, composite samples made up of 5 to 8 samples were collected at 0-20 cm soil depth, since this depth is where most changes are expected to occur due to natural and anthropogenic activities. The number of soil samples was higher at the sampling points with larger plots. Disturbed soil samples from each sampling point were air-dried, pooled, homogenized and sieved to pass through a 2-mm sieve. The samples were analysed for soil texture using the Bouyoucos hydrometer method (Gee and Bauder, 1986), soil bulk density (BD) by the core method (Blake and Hartge, 1986), electrical conductivity (EC) with an EC meter in a 1:2.5 soil-to-water suspension (Rhoades, 1982a), and soil pH in a 1:2.5 soil-to-water suspension using a pH meter with a combined glass electrode (Thomas, 1996). Soil organic carbon (OC) was determined by the Walkley-Black method (Bremner and Mulvaney, 1982), available phosphorus (Pav) by the Olsen method (Olsen and Sommers, 1982), total phosphorus (TP) colorimetrically after HClO4 digestion (Jackson, 1964), and total nitrogen (TN) by the Kjeldahl digestion method (Anderson and Ingram, 1993). Cation exchange capacity (CEC) was determined by ammonium acetate extraction buffered at pH 7 (Rhoades, 1982b).
Exchangeable bases (Ca, Mg, K, Na) were analyzed after extraction using 1M ammonium acetate at pH 7.0. Ca and Mg in the extracts were determined using an atomic absorption spectrophotometer, while Na and K were determined by flame photometry (Black et al., 1965). Exchangeable sodium percentage (ESP) was calculated by dividing exchangeable Na+ by CEC. Base saturation percentage (BSP) was calculated by dividing the sum of base-forming cations by CEC (Coyne and Thompson, 2006). Iron and zinc were determined by the method described in Baruah and Barthakur (1999) using 0.005M diethylene triamine pentaacetic acid extraction.
ISSN: 2220-184X
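The two ratios described above can be illustrated with a minimal Python sketch. It assumes that both ratios are multiplied by 100 to express them as percentages and that all inputs share the same units (e.g., cmol(+)/kg). The function names and the example values are hypothetical and are not taken from the study.

```python
def exchangeable_sodium_percentage(na, cec):
    """ESP (%) = exchangeable Na / CEC * 100 (inputs in the same units)."""
    if cec <= 0:
        raise ValueError("CEC must be positive")
    return 100.0 * na / cec


def base_saturation_percentage(ca, mg, k, na, cec):
    """BSP (%) = (Ca + Mg + K + Na) / CEC * 100 (inputs in the same units)."""
    if cec <= 0:
        raise ValueError("CEC must be positive")
    return 100.0 * (ca + mg + k + na) / cec


# Hypothetical values for one sampling point, for illustration only.
esp = exchangeable_sodium_percentage(na=1.0, cec=40.0)
bsp = base_saturation_percentage(ca=20.0, mg=8.0, k=1.0, na=1.0, cec=40.0)
print(esp, bsp)
```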

Data Analysis
Soil properties, land-use systems, soil types, and landforms were used as predictors to assess their relationships with soil BD and TN using SPSS 18.0 software (SPSS, 2011). To do so, all soil types, land-use systems and landform types (Table 1) were coded as categorical (dichotomous) variables, using 1 = presence and 0 = absence at each sampling point. For example, at sampling point one the soil type was Vertisols, the land-use system was teff-pulse rotation and the landform was rolling hills, so these variables were coded as 1, whereas the other soil types, land-use systems and landforms were absent at sampling point one and were consequently coded as 0. The same procedure was applied to the other sampling points in the catchment. The continuous values of the soil properties determined at each sampling point were entered into the corresponding independent soil variables. The stepwise multiple regression (SMR) was then used to test the strength of the relationships of BD and TN with all the independent variables. The generalized linear model (GLM) was also applied to examine the relationships of soil BD and TN with the main and interaction terms of the different parameters. These two methods were selected because of their particular qualities, namely the inclusion of input variables with good explanatory power in SMR and the ability of GLM to utilize interaction terms (Agyare, 2004). One common advantage of these techniques is the ability to use both categorical and continuous parameters as input variables.
Stepwise multiple regression is a sequential approach to variable selection, and was used because it includes the input variables that better explain the response, leaving out parameters that are statistically non-significant or of low explanatory power given the other parameters already included (Hair et al., 1998). The GLM differs from the well-known multiple regression in many respects. For instance, the distribution of the dependent or response variable does not necessarily have to be continuous for GLM. The GLM also allows categorical or nominal variables as input variables by recoding them into a number of dichotomous variables (Agyare, 2004). The parameters and their data groups used for the SMR and GLM analyses are presented in table 1.
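The presence/absence coding and the construction of a 2-way interaction term can be sketched as follows. This is a minimal illustration in plain Python, not the SPSS procedure used in the study; the helper names, the OC values, and the placeholder label "soil type B" are hypothetical.

```python
def dummy_code(records, categorical_fields):
    """Recode each categorical attribute into 1/0 presence-absence columns."""
    # Collect the distinct levels observed for each categorical attribute.
    levels = {f: sorted({r[f] for r in records}) for f in categorical_fields}
    coded = []
    for r in records:
        # Continuous soil properties are carried over unchanged.
        row = {k: v for k, v in r.items() if k not in categorical_fields}
        for f in categorical_fields:
            for level in levels[f]:
                row[f + "=" + level] = 1 if r[f] == level else 0
        coded.append(row)
    return coded


def add_interaction(rows, first, second):
    """Add the product of two columns as a 2-way interaction term."""
    name = first + " x " + second
    for row in rows:
        row[name] = row[first] * row[second]
    return name


# Two hypothetical sampling points (values are illustrative only).
points = [
    {"OC": 1.2, "soil_type": "Vertisols",
     "land_use": "teff-pulse rotation", "landform": "rolling hills"},
    {"OC": 2.5, "soil_type": "soil type B",
     "land_use": "forest land", "landform": "reservoir"},
]
coded = dummy_code(points, ["soil_type", "land_use", "landform"])
term = add_interaction(coded, "OC", "land_use=forest land")
print(coded[0]["soil_type=Vertisols"], coded[1]["soil_type=Vertisols"])
print(coded[1][term])
```

When every level of an attribute receives its own 1/0 column, the columns of that attribute sum to 1 at each point, so a regression that also contains an intercept normally needs one level left out as the reference category.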
When soil BD and TN were used as the dependent variables in the analysis, the importance of each independent parameter was evaluated based on the size of the model coefficient, the significance level, and the coefficient of determination (R²) of each statistical model. In addition, the effect size measure (Eta) was used in the case of GLM as a measure of the association between each main or interaction term and the dependent variable. Only data elements that contributed significantly (P ≤ 0.05) to predicting soil BD or TN are presented in this study. The R² change gives the percentage of variance in the dependent variables (BD and TN) explained by the independent variables. For each statistical model, the R² and F-statistic are used as measures of model performance. The stepwise regression analysis was done using the data lists with the stepwise variable selection method, i.e., entry and removal of parameters at P ≤ 0.05 and P ≥ 0.1, respectively, using SPSS 18.0 (SPSS, 2011). For the GLM analysis, interactions were limited to 2-way terms because higher-order terms could increase multi-collinearity.
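The performance measures named above can be written out as a short Python sketch using their standard textbook definitions. The function names are hypothetical, and the use of partial Eta square (effect sum of squares divided by effect plus error sums of squares) is an assumption, since the text does not state which form of Eta was reported.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean = sum(observed) / len(observed)
    ss_total = sum((y - mean) ** 2 for y in observed)
    ss_resid = sum((y - p) ** 2 for y, p in zip(observed, predicted))
    return 1.0 - ss_resid / ss_total


def overall_f_statistic(r2, n, k):
    """Overall F for a model with k predictors fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need more observations than parameters")
    return (r2 / k) / ((1.0 - r2) / (n - k - 1))


def partial_eta_squared(ss_effect, ss_error):
    """Assumed effect-size form: SS_effect / (SS_effect + SS_error)."""
    return ss_effect / (ss_effect + ss_error)


# Hypothetical numbers, for illustration only.
print(r_squared([1.0, 2.0, 3.0], [1.1, 1.9, 3.2]))
print(overall_f_statistic(r2=0.5, n=12, k=1))
print(partial_eta_squared(ss_effect=3.0, ss_error=1.0))
```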

Estimation of Soil Bulk Density Using SMR Analysis
The result of the stepwise multiple regression (SMR) analysis for soil bulk density (BD) as the dependent variable is shown in table 2. An overall model coefficient of determination (R²) of 0.91 with a significantly high F-statistic of 176.78 was obtained for the relationship of soil BD with the soil properties, land-use systems and landforms in the catchment. This indicates that about 91% of the variance in BD can be explained by the independent variables. However, there was no significant (P > 0.05) relationship between BD and the soil type nominal data group or some soil properties (e.g., Ca, Mg, Na) (data not shown).
The non-significant variables were not included in the SMR analysis result because of their low relative contributions. The most important parameters influencing the estimation of BD were OC, followed by sand content, the reservoir landform, and the plantation protected land and forest land-use systems, as these variables showed higher SMR coefficients than the others.
In line with this, previous reports indicated that sand content and OC were among the most important soil properties affecting BD, with BD decreasing with increasing OC and decreasing sand content (e.g., Bauer and Black, 1992; Wagner et al., 1994). Such findings thus suggest that soil BD can be estimated using soil texture parameters along with OC values. However, Heuscher et al. (2005) reported, using stepwise multiple regression, that OC was the strongest contributor to BD prediction compared to other soil properties.
Soil parameters such as CEC and Pav showed weaker relationships with soil BD than the others, even though their relationships were significant (Table 2). The implication is that, when estimating BD, the parameters with larger regression coefficients are preferable to those with lower values. The variables with negative regression coefficients (OC, silt, CEC, Pav, the reservoir landform, and the forest land and plantation protected land-use systems) showed an inverse relationship with BD. On the other hand, sand content, the marginal land-use system, and the central-ridge and mountainous landforms showed positive coefficients for their relationships with BD. In line with this, other studies reported that a positive coefficient indicates that the dependent variable increases as the corresponding independent variable increases, and vice versa for a negative coefficient (e.g., Rawls, 1983; Federer et al., 1993; Manrique et al., 1993; Neupane et al., 2002).
The highest share of variance (R²) in estimating soil BD was explained by organic carbon alone (71%), and the lowest by Pav (0.4%) followed by CEC (0.6%). According to Hamilton (1990), strong and weak values of R² are defined as ± (0.64-1.0) and ± (0.04-0.25), respectively. The R² obtained using the SMR analysis to estimate soil BD with the different data source groups (soil properties, landform and land-use systems) as independent variables is illustrated in figure 4. Soil properties as a data source group explained 80% of the variance in BD. This was followed by the landform data source group (6%) and land-use systems (5%). These values indicate that BD variability in the study catchment can be explained mainly by the soil properties data group. In light of the above model results, other studies report that stepwise multiple regression (SMR) is more efficient than full model regression for determining predictive equations for yield and yield components (e.g., Naser and Leilah, 1993; Mohamed, 1999).

Estimation of Soil Total Nitrogen Using SMR Analysis
The result of the stepwise multiple regression (SMR) analysis between the dependent variable (TN) and the independent variables (catchment attributes) is presented in table 3. The parameters in this table were significant predictors of TN, with an overall R² of 0.89 and a significant F-statistic of 162.57. However, the coefficient and R² of each variable showed that soil OC, followed by the plantation protected land-use system and the reservoir landform, contributed more to the regression model than the other variables (Table 3). The better estimation of TN by OC could be associated with the organic source of nitrogen in the study catchment, and this is consistent with the findings reported using linear regression in Rashidi and Seilsepour (2009) and Prévost (2004). Based on its SMR model coefficient and R², the forest land-use system also predicted TN significantly better than the remaining variables. The lowest model coefficient (-0.009) was found for the relationships of TN with BD and the mountainous landform. The total variance explained for TN by the different data source groups is shown in figure 5.
This figure indicates that soil properties, followed by land-use systems and landform, can better predict TN than the other landscape attributes. The soil properties (OC, silt, clay, BD, Pav, CEC) could predict 74% of the variation in soil TN, with OC alone accounting for the largest part (about 69%). Such R² results are rated as a strong relationship according to the ratings described by Hamilton (1990).

Estimation of Soil Bulk Density Using Generalized Linear Model (GLM)
The GLM analysis result for the relationship of soil BD with the independent variables is presented in table 4. The result shows the coefficient, standard error, significance level and measure of effect size (Eta) for the constant or intercept and for the main and interaction effects of the independent variables. The Eta gives the measure of the association between the main or interaction term and the dependent variable, in this case soil BD. The GLM analysis resulted in an overall R² of 0.93 with an F-statistic of 187.45. Among the main term effects in the GLM analysis, the highest model coefficient and Eta square were found for OC, followed by the reservoir landform. However, the coefficients and Eta square of the interaction terms of OC with the other parameters were higher than those of the main terms in predicting BD (Table 4). In contrast to this study, Agyare (2004) reported that OC was the least important parameter in estimating hydraulic conductivity, even though OC directly influences soil pore size and distribution.
The least important main term effect in the prediction of soil BD using the GLM analysis was explained by the soil type.
The dominant parameters that influenced soil BD prediction were associated with the interaction terms, which explained about 68% of the variance, more than the other data groups (Fig 6). The remaining variation in BD was mainly explained by the soil properties (15.2%) and landforms (8.2%), followed by the land-use systems (3.2%) and soil type (0.60%) data groups. This result indicates that the interaction terms in the GLM analysis are better at estimating soil BD under the study catchment conditions.
Figure 6. Comparison of the measure of effect size (Eta) for soil BD estimation in terms of the different data groups using GLM analysis for the Mai-Negus catchment, northern Ethiopia.

Estimation of Soil Total Nitrogen Based on GLM
The coefficient, standard error, significance level and measure of effect size (Eta) for the constant or intercept and for the main and interaction effects of the independent variables on the dependent variable TN using the GLM analysis are presented in table 5. Soil TN was critically influenced by OC, land-use system and landform in the catchment. This result is consistent with the reports of Aruleba and Ajayi (2011) and Rashidi and Seilsepour (2009). The remaining main and interaction terms also significantly explained the estimation of TN in the study catchment (Table 5). The interaction terms accounted for an overall Eta square of 0.74, which means that they explained 74% of the variation in soil TN. These were followed by the soil properties, land-use and landform data groups, in descending order, as shown in figure 7.
Figure 7. Comparison of the measure of effect size (Eta) for soil TN estimation in terms of the different data groups using GLM analysis in the Mai-Negus catchment, northern Ethiopia.

Synthesis of SMR and GLM Analysis Results
Currently, the literature shows that there is no comprehensive model for estimating soil parameters that are not easily available from basic soil properties obtained in soil surveys and catchment attributes collected directly in the field (Seybold et al., 2009). To help close this information gap, this study examined the variables that best explain the variation in soil BD and TN using different analysis methods. In this study, the overall R² obtained from the SMR and GLM analyses employed to predict soil BD and TN based on key landscape attributes is greater than 0.60. This is in agreement with the report by Seybold et al. (2009). In addition, the soil type data group had no significant relationship with BD and TN in the SMR analysis, whereas the reverse was observed in the GLM analysis. The overall R² due to the independent parameters on TN (0.94) for the GLM analysis is greater than that of soil BD. The improvement in R² may be associated with the inclusion of parameters more suitable for the dependent variable of soil BD than TN in the SMR analysis.
The Eta, which measures the association between the two dependent variables and the different independent variables, is considerably lower for the interaction terms in soil BD (68% of the variance) than in soil TN (74%) (Figs 6 & 7). The same holds for the Eta of the soil properties data group, with a variance of 16% for TN and 15% for BD. Of the independent variables, OC contributed most to predicting the variability of both BD and TN. The R² for the relationship of BD with OC using SMR analysis was 0.71, whereas it was 0.69 for TN.
However, the variances contributed by OC as a main term effect to the variability of BD and TN in the GLM analysis were 0.114 and 0.118, respectively, which are lower than those of the SMR. This shows that a linear regression model may be better suited for predicting soil TN from soil OC, as suggested by Rashidi and Seilsepour (2009). Despite this, OC contributed more than the other independent variables in both methods. The model coefficients obtained from both analysis methods for both dependent variables were higher for OC than for the other variables. This study indicates that a unit increase in organic matter can cause a relatively larger decrease in soil bulk density and a larger increase in TN, which is consistent with the findings reported in Federer et al. (1993) and Akgül and Özdemir (1996). Thus, management and land-use practices that improve OC should be considered in determining representative fields, as OC is useful in estimating soil BD and TN for soil survey when no measured data are available.

CONCLUSION
In this study, stepwise multiple regression (SMR) and generalized linear models (GLM) were used to predict the variation of soil bulk density (BD) and total nitrogen (TN) based on selected environmental data groups (soil properties, land-use, landform, soil type) at catchment scale in northern Ethiopia. The study also attempted to identify the most useful environmental variables for predicting BD and TN. The results of the SMR and GLM analyses showed strong relationships (R² ≥ 0.89) between the two soil properties (soil BD and TN) and the catchment and soil attributes. Both methods indicated that soil properties are best suited to predict BD and TN. Among the different soil properties, organic carbon (OC) accounted for the largest share of the variation in BD and TN. In the SMR analysis, the second landscape data group that best predicted TN was land-use type, whereas for BD it was landform. However, the GLM analysis, with Eta as a measure of association, showed higher model coefficients and Eta square for the interaction terms OC × plantation protected land system, OC × forest land system and OC × reservoir with BD and TN. Soil TN (74%) is better explained by the interaction terms than BD (68%), even though both are significantly related to the independent variables. Generally, the higher overall R² of the GLM for both dependent variables (BD and TN) compared to that of the SMR indicates that inclusion of the interaction terms in the GLM analysis improved the explained variance. However, the R² of OC in the SMR analysis was 0.71 for BD estimation and 0.69 for TN. These values are higher than the Eta square of OC in the GLM analysis, indicating that SMR is preferable for estimating BD and TN variations using the main terms. This study suggests that soil BD and TN should be estimated from soil properties such as organic carbon and other catchment attributes using appropriate statistical techniques such as the GLM, which is better suited to considering the effects of interaction terms under the study catchment conditions, in order to obtain data for most applications.

ACKNOWLEDGEMENTS
The author gratefully acknowledges the financial support of DAAD/GTZ (Germany) through the Center for Development Research (ZEF), University of Bonn (Germany), and the support of Aksum University (Ethiopia) during the field work. The author deeply appreciates the assistance offered by the local farmers and extension agents during the field study. The author is also grateful to the anonymous reviewers for their comments, suggestions and corrections of this manuscript.