Factors affecting child malnutrition in Ethiopia

Background One of the public health problems in developing countries is child malnutrition. Good nutrition is an important factor in children's well-being, so the nutritional status of children under the age of five is an important outcome measure for children's health. This study uses the proportional odds model to identify risk factors associated with child malnutrition in Ethiopia using the 2016 Ethiopian Demographic and Health Survey data. Methods This study uses the 2016 Ethiopian Demographic and Health Survey results. Based on the weight-for-height anthropometric index (Z-score), child nutritional status is categorized into four levels: underweight, normal, overweight and obese. Since this leads to an ordinal variable for nutritional status, an ordinal logistic regression (OLR) proportional odds model (POM) is an obvious choice for analysis. Results The findings and a comparison of results using the cumulative logit model with and without complex survey design are presented. The results show that, for data obtained from a complex survey design, fitting the model with the survey sampling design taken into account gives more appropriate estimates and standard errors. For children under the age of five, weight of the child at birth, mother's age, mother's Body Mass Index (BMI), marital status of the mother and region (Affar, Dire Dawa, Gambela, Harari and Somali) were significantly associated with nutritional status in Ethiopia. Conclusion Child's age, sex, weight at birth, mother's BMI and region of residence were significant determinants of malnutrition of children under five years in Ethiopia. The effect of these determinants can be used to develop strategies for reducing child malnutrition in Ethiopia.
Moreover, these findings show that the OLR proportional odds model is appropriate for assessing the determinants of malnutrition for the ordinal nutritional status of under-five children in Ethiopia.

The highest levels of child malnutrition are found in sub-Saharan African countries. Ethiopia is among the countries with the highest rate of stunting in sub-Saharan Africa. The proportion of underweight children is highest in the age range of 2 to 3 years (34%) and lowest among those under six months of age (10%). In general, 29% of children under the age of five in Ethiopia are underweight, and 9% are severely underweight. Worldwide, an estimated 159 million children under five years of age, or 23.8%, were stunted in 2016, a 15.8% decrease from an estimated 255 million in 1990 54. Even though the occurrence of stunting and underweight among children under five years of age worldwide has decreased since 1990, overall improvement is unsatisfactory and millions of children remain at risk 2,3. Malnutrition is the cause of substantial health problems in children that need due consideration. For that reason, reducing child malnutrition is equivalent to improving the health status of children. This is necessary in order to improve the health of the future core segment of society, and it is crucial for the economic growth and development of the society under consideration 4.
To measure nutrition, Body Mass Index (BMI) is used; it is defined as the ratio of weight (kg) to squared height (m^2). BMI is not, however, a direct measure of body fatness. For children, BMI depends on age and gender and is referred to as BMI-for-age 5,6. Using this variable, the percentiles or quantiles of BMI for the specified ages of interest can be defined. These give a reference for individuals of that age with respect to the population. Determining children's weight status from BMI has been of interest to many researchers. Children under the age of five with BMI at or above the 95th percentile, between the 85th and 95th percentiles, and between the 5th and 85th percentiles were classified as obese, overweight and normal (healthy weight), respectively 7. These intervals and cutoffs were the result of expert knowledge. The World Health Organization Expert Committee on Physical Status suggested a cutoff for underweight corresponding to BMI below the 5th percentile 8.
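The definition and cutoffs above can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the BMI-for-age percentile is assumed to have been looked up already in a reference growth table (not reproduced here), and assigning a value that falls exactly on a boundary to the higher category is an assumption, since the text does not say how ties at the 5th and 85th percentiles are handled.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body Mass Index: weight in kilograms divided by squared height in metres."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


def weight_status(bmi_percentile: float) -> str:
    """Map a BMI-for-age percentile (0-100) to the four categories in the text.

    Boundary handling (exactly 5, 85 or 95 goes to the higher category)
    is an assumption made for this sketch.
    """
    if not 0 <= bmi_percentile <= 100:
        raise ValueError("percentile must lie between 0 and 100")
    if bmi_percentile < 5:
        return "underweight"
    if bmi_percentile < 85:
        return "normal"
    if bmi_percentile < 95:
        return "overweight"
    return "obese"


# Hypothetical child: 12 kg and 0.90 m tall, at the 40th BMI-for-age percentile.
print(round(bmi(12.0, 0.90), 2), weight_status(40.0))
```

The percentile lookup is deliberately left out because it depends on the age- and sex-specific reference population, which the classification rule takes as given.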
A wide range of nutrient-related deficiencies and disorders are included in malnutrition 9. Different studies have been conducted in Ethiopia, both locally and regionally. These studies have shown an increase in malnutrition with increasing age of the child 10-13. A study conducted by Teller and Yimer (2000) in the Southern Nations, Nationalities and Peoples Region (SNNPR) of Ethiopia showed that women belonging to households of low economic status were affected by malnutrition 13. Another study, conducted in India, showed that 60% of deaths of children under the age of five are related to malnutrition. Malnutrition in children is strongly correlated with the mother's poor nutritional status 4,12,14,15. Education is also one of the most significant factors that enables or empowers women to provide suitable care for their children, which is an important determinant of children's growth and development 16-18.
The main objective of this study is to use the ordinal proportional odds model to identify the determinants of under-five children's nutritional status as a function of age and other relevant factors. This study will help policy makers identify the areas they need to focus on in order to improve the planning and assessment of health policies, to avoid child mortality associated with malnutrition, and to enhance children's health, diet and growth.

Methods and materials
Ethiopia is a sub-Saharan African country located in the Eastern Africa region. In 2000, Ethiopia conducted the first Ethiopian Demographic and Health Survey (EDHS). Subsequent EDHSs were conducted in 2005, 2011 and 2016. These surveys are periodic cross-sectional surveys administered at the household level. The 2016 Ethiopian Demographic and Health Survey results were used for this study. The Central Statistical Agency of Ethiopia was the organization responsible for the survey 19. The 2016 EDHS sample was designed to provide estimates of the health and demographic variables of interest for Ethiopia as a whole; this comprised both urban and rural areas of Ethiopia and 11 geographical areas. For the survey, 17,817 households were included in data collection. The 2007 Population and Housing Census results were used as the sampling frame 20-22.

Study variable
The response variable for this study is the nutritional status of under-five children in Ethiopia, which is an ordinal categorical variable. The explanatory variables used in this study are: child's age, sex of the child, weight of the child at birth, mother's current age, mother's BMI, educational attainment of the mother, mother's work status, religion, region, wealth index, place of residence (rural or urban), and current marital status. The socio-economic and demographic factors used in this study were suggested by several researchers. These factors were referred to as intermediate variables for the determinants of children's nutritional status 53.

Statistical methods
An outcome with more than two categories is known as a polytomous outcome. Let J denote the number of categories for such an outcome. Out of N observations, Y_1,Y_2,…,Y_J are the frequencies in categories 1,2,…,J with corresponding probabilities π_1,π_2,…,π_J, respectively. The distribution is the multinomial distribution and can be expressed as follows:

$$P(Y_1=y_1,\ldots,Y_J=y_J)=\frac{N!}{y_1!\,y_2!\cdots y_J!}\,\pi_1^{y_1}\pi_2^{y_2}\cdots\pi_J^{y_J},\qquad \sum_{j=1}^{J}y_j=N,\quad \sum_{j=1}^{J}\pi_j=1.$$

This distribution leads to multinomial (polytomous) logistic regression, which is an extension of binary logistic regression. The link function is the multinomial logit because the probability distribution for the outcome variable is assumed to be multinomial rather than binomial. For a polytomous response, it is further important to note whether the response is nominal (consisting of unordered categories) or ordinal (consisting of ordered categories). An outcome variable that has two or more nominal categories can be modeled using multinomial logistic regression. It estimates the odds of being in any category compared with being in the baseline category (comparison or reference category). The model can be treated as a combination of a series of binary logistic regression models. Suppose Y can take on values coded as 1,2,…,J, and pick one of the outcome levels, say J, as the reference level. If we assume we have p covariates, then the model is formulated as:

$$\log\left(\frac{\pi_j(x)}{\pi_J(x)}\right)=\beta_{j0}+\beta_{j1}x_1+\beta_{j2}x_2+\cdots+\beta_{jp}x_p,$$

where j=1,2,…,J-1; J is the base category, which can be any category but is generally the highest one; β_j0 are the intercepts, and β_j1,β_j2,…,β_jp are the regression coefficients. Since the model includes J-1 comparisons, it estimates J-1 logit functions, each with its own coefficient for every predictor 23.
The maximum likelihood procedure is commonly used to estimate the parameters of the multinomial logistic regression model, as is the case with binary logistic regression. For nominal categories, one of the categories is designated as the reference or base category and each of the remaining categories is compared with the reference category 24.
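To make the baseline-category formulation concrete, the following minimal Python sketch converts J-1 linear predictors (the log odds of each non-reference category against the reference category J) into the J category probabilities. The function name and the numeric values are hypothetical and serve only as illustration.

```python
import math


def baseline_category_probs(linear_predictors):
    """Category probabilities from baseline-category logits.

    linear_predictors holds eta_j = log(pi_j / pi_J) for j = 1..J-1,
    with the last category J acting as the reference.
    Returns [pi_1, ..., pi_J].
    """
    exps = [math.exp(eta) for eta in linear_predictors]
    denominator = 1.0 + sum(exps)  # the reference category contributes exp(0) = 1
    return [e / denominator for e in exps] + [1.0 / denominator]


# Hypothetical linear predictors for a four-category outcome (J = 4).
probs = baseline_category_probs([0.2, 1.5, -0.4])
print([round(p, 4) for p in probs], round(sum(probs), 4))
```

Each of the J-1 logits has its own set of coefficients here, which is the main contrast with the proportional odds model used later in the paper.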
Ordinal logistic regression (OLR) considers any inherent ordering of the levels in the outcome variable and makes full use of the ordinal information 25,26. Incorporating the ordering can result in models that have simpler interpretations. Although ordinal outcomes can be simple and meaningful, their optimal statistical treatment remains challenging to many applied researchers 27-29. Moreover, these models have greater power than the multinomial logit models 30-32. However, a variable that is ordered when considered for one purpose could be unordered, or ordered differently, when used for another purpose. Miller and Volker (1985) show how different assumptions about the ordering of occupations result in different conclusions 33. Therefore, we need to think carefully before concluding that the outcome is ordinal 34. Although the categories of an ordinal variable can be ordered, the distances between the categories are unknown. Multinomial logistic regression for ordinal responses is normally called ordinal logistic regression. An ordinal logistic regression model is a generalization of the binary logistic regression model to an outcome variable with more than two ordinal levels. In Stata, the ordinal logistic regression model assumes that the outcome variable is a latent variable, and the model is expressed in logit form as follows:

$$\operatorname{logit}[P(Y\le j\mid x_1,\ldots,x_p)]=\ln\left(\frac{P(Y\le j\mid x_1,\ldots,x_p)}{P(Y>j\mid x_1,\ldots,x_p)}\right)=\beta_{j0}+(-\beta_1x_1-\beta_2x_2-\cdots-\beta_px_p),$$

where P(Y≤j|x_1,x_2,…,x_p) is the probability of being at or below category j given a set of predictors x_v, v=1,2,…,p; β_j0 are the cutoff points (thresholds), and β_1,β_2,…,β_p are the logit coefficients 23. According to Agresti (2002), one way to use category ordering is to form logits of cumulative probabilities 24,

$$P(Y\le j\mid x)=\pi_1(x)+\cdots+\pi_j(x),\qquad j=1,\ldots,J.$$

The cumulative logits (logits of cumulative probabilities) are then defined as

$$\operatorname{logit}[P(Y\le j\mid x)]=\log\frac{P(Y\le j\mid x)}{1-P(Y\le j\mid x)}=\log\frac{\pi_1(x)+\cdots+\pi_j(x)}{\pi_{j+1}(x)+\cdots+\pi_J(x)},\qquad j=1,\ldots,J-1.$$

Each cumulative logit uses all J response categories.
In Stata, the logit form of the ordinal logistic regression model that simultaneously uses all cumulative logits can be expressed as follows:

$$\operatorname{logit}[P(Y\le j\mid x)]=\beta_{j0}-\beta'x,\qquad j=1,\ldots,J-1,\qquad (4)$$

where P(Y≤j|x) is the cumulative probability of the event (Y≤j|x), β_j0 are the unknown intercept parameters, and β=(β_1,β_2,…,β_p)' is a vector of unknown regression coefficients corresponding to x. The intercepts β_j0 increase in j because P(Y≤j|x) increases in j for fixed x and the logit is an increasing function of this probability. For two values x_1 and x_2 of the covariate vector x, the cumulative logit model (4) satisfies

$$\operatorname{logit}[P(Y\le j\mid x_1)]-\operatorname{logit}[P(Y\le j\mid x_2)]=\log\frac{P(Y\le j\mid x_1)/P(Y>j\mid x_1)}{P(Y\le j\mid x_2)/P(Y>j\mid x_2)}=-\beta'(x_1-x_2).$$

An odds ratio of cumulative probabilities is called a cumulative odds ratio. The odds of the event Y≤j at x=x_1 are exp[-β'(x_1-x_2)] times the odds of the same event at x=x_2. The log cumulative odds ratio is proportional to the distance between x_1 and x_2, and the same proportionality constant applies to each logit. Because of this property, the cumulative logit model is called the proportional odds model 24,35,36.
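The proportional odds property can be checked numerically. The sketch below assumes the Stata-style parameterization logit[P(Y≤j|x)] = β_j0 - βx (the form used when coefficients are substituted in the Results section) with a single covariate; the cut points, slope and covariate values are hypothetical and chosen only for illustration.

```python
import math


def cumulative_probs(cut_points, eta):
    """P(Y <= j | x) for j = 1..J-1 when logit[P(Y <= j | x)] = cut_point_j - eta."""
    return [1.0 / (1.0 + math.exp(-(c - eta))) for c in cut_points]


def odds(p):
    """Odds corresponding to a probability p."""
    return p / (1.0 - p)


cut_points = [-2.0, 0.5, 2.0]  # hypothetical thresholds beta_j0 (increasing in j)
beta = 0.4                     # hypothetical slope for a single covariate
x1, x2 = 3.0, 1.0              # two hypothetical covariate values

probs_x1 = cumulative_probs(cut_points, beta * x1)
probs_x2 = cumulative_probs(cut_points, beta * x2)

# Cumulative odds ratio at each of the J-1 cut points.
ratios = [odds(p1) / odds(p2) for p1, p2 in zip(probs_x1, probs_x2)]
expected = math.exp(-beta * (x1 - x2))

print([round(r, 6) for r in ratios], round(expected, 6))
```

Every cumulative odds ratio equals the same constant, exp[-β(x_1 - x_2)], regardless of the cut point. That common constant is what allows a single coefficient per predictor to summarize its effect across all categories.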
Ordinal variables are often coded as consecutive integers from 1 to the number of categories. Because of this coding, it is tempting to analyze ordinal outcomes with the linear regression model. However, an ordinal response variable violates the assumptions of the linear regression model, which can lead to incorrect conclusions 37,38. With an ordinal response, it is much better to use models that avoid the assumption that the distances between categories are equal. Although many models have been designed for ordinal outcomes, the logit and probit are the link functions most commonly used in ordinal regression models 39. Most multinomial regression models for ordinal outcome variables are based on the logit function.
The difference between the two link functions is typically seen only in small samples, because the probit link assumes a normal distribution for the probability of the event, whereas the logit link assumes a logistic distribution. Details about models for ordinal outcomes can be found in the literature 31,32,36,40-42. We label the four levels of under-five children's nutritional status as 1, 2, 3 and 4, corresponding to underweight, normal weight, overweight and obese, and compare them simultaneously.
Since this leads to an ordinal variable for nutritional status, ordinal logistic regression (OLR) is an obvious choice for analysis. There are many ways of generalizing the logit model to handle ordered categories, such as the partial proportional odds, continuation-ratio, adjacent-category logit, cumulative logit, and stereotype logistic models. Despite this diversity and the vast variety of studies on the subject, their use in the public health area is still rare 34,43,44,45. This may be attributed not only to their complexity, but especially to the difficulty encountered in validating their assumptions 46. When the dependent variable has only two categories, the usual binary logistic model is appropriate.
The usual proportional odds model assumes that data are collected using simple random sampling, in which each sampling unit has an equal probability of being selected from the population. When the data come from a complex survey design with different strata, clustered sampling, and unequal selection probabilities, it is inappropriate to fit the proportional odds model to the ordinal response variable without taking the survey sampling design into account. Ignoring these features in data analysis may lead to biased parameter estimates, incorrect variance estimates and misleading results. The parameters and their variances may be either overestimated or underestimated 47. In such cases, a specialized technique should be used to produce appropriate estimates and standard errors for the ordinal outcome variable. This method takes into account the weights in the survey sampling design.
Features of complex surveys, such as sampling weights, strata, and clusters, have been illustrated in the literature 47,50.
In Stata, the svy prefix command for survey data is used to fit the proportional odds model while taking all the survey design features into account. It is necessary to specify the strata, cluster and weight variables (with svyset) before fitting the model. For more details on how to use this command, see help svyset in Stata.

Results
The proposed model, namely the proportional odds model, was applied to the 2016 Ethiopian DHS data, and the results of the application are discussed here. In addition to the main effects of the explanatory variables, we also assessed two-way interaction effects; however, we did not find any significant interaction effect. The Stata ologit command was used for model fitting. Table 1 shows the results for the proportional odds model under the simple random sampling assumption. Ordinal logistic regression, like binary and multinomial logistic regression, uses maximum likelihood estimation, which is an iterative procedure, and the log likelihood is reported at each iteration. Iteration 0 is the log likelihood of the "null" or "empty" model, that is, a model with no predictors. At the next iteration, the predictors are included in the model. At each iteration the log likelihood increases because the goal is to maximize the log likelihood. When the difference between successive iterations is very small, the model is said to have "converged", and the iteration stops 49.
The log likelihood of the fitted model is -6009.1723. It is used in the likelihood ratio chi-square test of whether all predictors' regression coefficients in the model are simultaneously zero, and in tests of nested models. The likelihood ratio chi-square (LR χ2) test assesses whether at least one of the predictors' regression coefficients is not equal to zero. The number in parentheses indicates the degrees of freedom of the chi-square distribution used to test the LR χ2 statistic and is defined by the number of predictors in the model. The result of this test is reported in Table 1. Table 1 shows the effect of socio-economic, demographic and geographic factors in the fitted proportional odds model for the ordinal nutritional status of under-five children. The estimated logit regression coefficient of current age of child is β=-0.3233 (P-value=0.000). This is the ordered log-odds estimate for a one-unit increase in the age of a child on the expected nutritional status level, given that the other variables are held constant in the model. The estimated coefficient for female child is β=-0.2681 (P-value=0.000). The estimated coefficients of weight of child at birth are: large, β=0.5547 (P-value=0.000), and average, β=0.3776 (P-value=0.000). For mother's BMI, β=0.0604 (P-value=0.000), which is the ordered log-odds estimate for a one-unit increase in mother's BMI, keeping other variables constant. The estimated coefficients for regions are: Affar, β=-0.3951 (P-value=0.001); Dire Dawa, β=-0.6004 (P-value=0.000); Gambela, β=-0.4453 (P-value=0.002); Harari, β=-0.3353 (P-value=0.016); SNNP, β=0.2134 (P-value=0.047); and Somali, β=-0.8988 (P-value=0.000). These were found to be significant determinants of under-five children's nutritional status.
Substituting the estimated logit coefficients into equation (4) gives logit[P(Y≤j|x)] = β_j0 + (-βx). Exponentiating the negative logit coefficients, e^(-β), gives the odds of being at or below a particular ordinal nutritional status category versus being above it (for example, overweight, normal or underweight versus obese). For the first predictor, current age of child, the logit form of the proportional odds model is logit[P(Y≤j|x)] = β_j0 - (-0.3233)(age), so OR = e^(0.3233) = 1.3817: the odds of being at or below a particular category of under-five ordinal nutritional status (based on weight) increased by 38.17% with a one-unit increase in the current age of a child, holding other variables constant. For a female child, logit[P(Y≤j|x)] = β_j0 - (-0.2681)(female), so OR = e^(0.2681) = 1.3075, suggesting that the odds of a female child being at or below a particular category of under-five ordinal nutritional status (based on weight) increased by 30.75%. For a child who had a large weight at birth, logit[P(Y≤j|x)] = β_j0 - (0.5547)(large), so OR = e^(-0.5547) = 0.5743: the odds of being at or below a particular category of under-five ordinal nutritional status (based on weight) decreased by (1-0.5743)×100% = 42.57% compared with a child of small weight at birth, controlling for all other independent variables in the model.
For a child who had an average weight at birth, logit[P(Y≤j|x)] = β_j0 - (0.3776)(average), so OR = e^(-0.3776) = 0.6856: the odds of being at or below a particular category of under-five ordinal nutritional status (based on weight) decreased by (1-0.6856)×100% = 31.44% compared with a child of small weight at birth, controlling for all other independent variables in the model. The odds of being at or below a particular category for the other significant effects were computed in the same way. For a one-unit increase in mother's BMI, holding other variables constant, the odds of being at or below a particular category of under-five ordinal nutritional status decreased by (1-0.9414)×100% = 5.86% (OR=0.9414). The odds of being at or below a particular category for children from Affar region were 1.4845 (P-value=0.001) times the odds for children from Oromia region. The odds of being at or below a particular category for children from Dire Dawa region were 1.8228 (P-value=0.000) times the odds for children from Oromia region. The odds of being at or below a particular category for children from Gambela, Harari and Somali were respectively 1.5609 (P-value=0.002), 1.3984 (P-value=0.016), and 2.4567 (P-value=0.000) times the odds for children from Oromia region. However, the odds of being at or below a particular category for children from SNNP were 0.8078 (P-value=0.047) times the odds for children from Oromia region (see Table 1).
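The arithmetic behind these odds ratios can be reproduced with a short Python sketch. The coefficients are those reported in the text for the model under simple random sampling; the dictionary and function names are hypothetical. Because the reported coefficients are rounded, the last digit of a computed odds ratio may differ slightly from the published value.

```python
import math

# Logit coefficients as reported in the text (simple random sampling model).
coefficients = {
    "current age of child": -0.3233,
    "female child": -0.2681,
    "large weight at birth": 0.5547,
    "average weight at birth": 0.3776,
    "mother's BMI": 0.0604,
}


def odds_ratio_at_or_below(beta):
    """Odds ratio for being at or below a category: exp(-beta)."""
    return math.exp(-beta)


def percent_change(odds_ratio):
    """Percentage change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


for name, beta in coefficients.items():
    below = odds_ratio_at_or_below(beta)
    beyond = 1.0 / below  # odds ratio for being beyond a category: exp(beta)
    print(f"{name}: OR(at or below) = {below:.4f} "
          f"({percent_change(below):+.2f}%), OR(beyond) = {beyond:.4f}")
```

The reciprocal printed in the last column is the odds ratio for being beyond a category, which is the quantity reported directly in the odds ratio column of the Stata output.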
The odds of being beyond a particular category of ordinal nutritional status are the inverse of those of being at or below that category 48, so equation (4) can be transformed to

$$\operatorname{logit}[P(Y>j\mid x)]=\ln\left(\frac{P(Y>j\mid x)}{P(Y\le j\mid x)}\right)=-\beta_{j0}+\beta'x.$$

The odds ratios in Table 1 can therefore be used directly for this analysis. In terms of odds ratios (Table 1), the odds of being beyond a particular category of ordinal nutritional status decreased by (1-0.7237)×100% = 27.63% (P-value=0.000) with a one-year increase in the current age of the child, holding other variables constant. Similarly, the odds of being beyond a particular category of under-five ordinal nutritional status for a female child were 0.7647 times the odds for a male child. The odds of being beyond a particular category of ordinal nutritional status for children who had a large weight at birth were 1.7414 times the odds for children who had a small weight at birth. The odds of being beyond a particular category of ordinal nutritional status for the other significant effects can be interpreted in the same way.

Application of complex survey design for ordinal logistic regression
In this section, the same variables as in the previous section are used to analyze the Ethiopian DHS (2016) data. Here we investigate the association between the response variable and the explanatory variables using the proportional odds (PO) model with complex survey design, fitted with Stata's svy: ologit command. Stata's svy prefix command for survey data is used to fit the PO model while taking all the survey design features, such as the strata, cluster, and weight variables, into account 48. The result of svy: ologit is shown in Table 1 below. For the PO model that considers the sampling design, svy: ologit reports the adjusted Wald test for all parameters rather than the likelihood ratio chi-square test reported for the ordinal PO model 47. F(28, 588) = 13.01, Prob > F = 0.0000 indicates that the full model with all parameters was significant in fitting the PO model with complex survey design. The logit coefficients and odds ratios in the PO model with complex survey design can be interpreted in the same way as those in the standard PO model.
The three cut points, used when estimating the odds of being at or below a particular ordinal nutritional status category (based on weight), differentiate the adjacent categories of the response variable (ordinal nutritional status). α_1=-3.2418 is the first cut point, for the cumulative logit model for Y≤1, that is, level 1 versus levels 2-4; α_2=1.7371 is the cut point for the cumulative logit model for Y≤2, that is, levels 1 and 2 versus levels 3 and 4; and α_3=2.9871 is the cut point for the cumulative logit model for Y≤3, that is, levels 1-3 versus level 4. The results (Table 2) revealed that the estimated logit coefficients of current age of child, female child, large and average weight of a child at birth, mother's current age, mother's BMI, mothers who are not married, and the Affar, Dire Dawa, Gambela, Harari and Somali regions were significant. For the predictor current age of child (β=-0.3186, OR=0.7271), the odds of being at or beyond a particular ordinal nutritional status category decreased by (1-0.7271)×100% = 27.29% with a one-year increase in the current age of the child, holding other variables constant; for female child (β=-0.2417, OR=0.7852), the odds of being at or beyond a particular category of under-five ordinal nutritional status (based on weight) decreased by (1-0.7852)×100% = 21.48%.
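The role of the cut points can be illustrated by converting them into category probabilities. The sketch below uses the three reported cut points and assumes the parameterization logit[P(Y≤j|x)] = α_j - η, where η stands for the linear predictor β'x. The values of η used here are hypothetical: η = 0 corresponds to all covariates being zero or at their reference level, which does not describe a realistic child, so the printed probabilities are an arithmetic illustration only and not results from the study.

```python
import math

# Cut points as reported in the text for the survey-design model.
CUT_POINTS = [-3.2418, 1.7371, 2.9871]
# Levels 1-4 of the response, in the order given in the Methods section.
CATEGORIES = ["underweight", "normal", "overweight", "obese"]


def category_probabilities(eta, cut_points=CUT_POINTS):
    """Category probabilities when logit[P(Y <= j | x)] = alpha_j - eta."""
    cumulative = [1.0 / (1.0 + math.exp(-(a - eta))) for a in cut_points]
    cumulative.append(1.0)  # P(Y <= J) is always 1
    probs = []
    previous = 0.0
    for c in cumulative:
        probs.append(c - previous)  # P(Y = j) = P(Y <= j) - P(Y <= j-1)
        previous = c
    return probs


# Hypothetical linear predictor values, chosen only to show the mechanics.
for eta in (0.0, 0.5):
    probs = category_probabilities(eta)
    print(eta, {c: round(p, 4) for c, p in zip(CATEGORIES, probs)})
```

Raising η lowers every cumulative probability P(Y≤j), which shifts probability mass towards the higher categories. This is why a positive coefficient is read as higher odds of being beyond a given category.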
The odds of being at or beyond a particular ordinal nutritional status category for weight of child at birth, large (β=0.5481) and average (β=0.3134), were respectively 1.7301 and 1.3680 times the odds for a child of small weight at birth. For the predictor mother's age (β=-0.0203, OR=0.9798), the odds of being at or beyond a particular ordinal nutritional status category decreased by (1-0.9798)×100% = 2.02% with a one-year increase in mother's age. For the predictor mother's BMI (β=0.0479, OR=1.0491), a one-unit increase in mother's BMI, holding other variables constant, increased the odds of being at or beyond a particular category of under-five ordinal nutritional status by 4.91%. The odds of being at or beyond a particular ordinal nutritional status category for children born to an unmarried mother were 0.7101 (β=-0.3423) times the odds for children born to a married mother. The odds of being at or above a particular ordinal nutritional status category for children from Affar, Dire Dawa, Gambela, Harari and Somali regions were respectively OR = 0.6411 (β=-0.4445), OR = 0.5554 (β=-0.5879), OR = 0.5422 (β=-0.6120), OR = 0.7243 (β=-0.3224) and OR = 0.4007 (β=-0.9143) times the odds for children from Oromia region (see Table 2). To estimate the odds of being at or below a particular ordinal nutritional status category compared with being above that category, we need to reverse the signs of the cut points and the logit coefficients; substituting into equation (4) gives logit[P(Y≤j|x)] = β_j0 + (-βx). The odds ratios in Table 2 can be used directly to analyze the odds of being beyond a particular ordinal nutritional status category for the significant effects. Table 2 provides the results of the two models: the fitted classical PO model and the PO model with complex sampling design.
After the complex sampling design was applied to the PO model, the estimated logit coefficients and their standard errors differed from those in the PO model under the simple random sampling assumption. The logit coefficients of the predictors current age of child, female child, and Dire Dawa and Harari regions increased, and those of the other significant predictors (large and average weight of child at birth, mother's age, mother's BMI, unmarried mothers, and Affar, Gambela and Somali regions) decreased.

Comparison of results
Compared with the PO model without complex survey design, the estimated logit coefficient for current age of child in the PO model with complex survey design increased by 1.48%, and its standard error increased by 37.7%; the logit coefficient for female child increased by 10.92%, and its standard error increased by 55.31%; the logit coefficients for Dire Dawa and Harari regions increased by 2.12% and 4%, respectively, with their standard errors increasing by 0.99% and 11.53%; the logit coefficients for large and average weight of child at birth decreased by 1.2% and 20.48%, respectively, with their standard errors increasing by 65.2% and 46.35%; the logit coefficients for mother's age, mother's BMI and unmarried mothers decreased by 40.9%, 79.3% and 41.01%, respectively, with their standard errors increasing by 42.2%, 65.8% and 35.6%; and the logit coefficients for Affar, Gambela and Somali regions decreased by 11.1%, 27.2% and 1.7%, respectively, with their standard errors increasing by 21.3%, 29.2% and 47.27%.
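The text does not state the formula behind these percentages. As a hedged check, the sketch below assumes the change is the absolute difference between the two coefficients divided by the absolute value of the survey-design coefficient. This is an inference: it reproduces the reported figures for the coefficients listed here, but it is not a definition given by the authors. The coefficient values are those quoted in the Results; the variable and function names are hypothetical.

```python
# Coefficients quoted in the text: simple random sampling (srs) versus
# complex survey design (svy). Only predictors with both values quoted are used.
srs = {"age": -0.3233, "female": -0.2681, "large": 0.5547, "average": 0.3776,
       "Affar": -0.3951, "Dire Dawa": -0.6004, "Gambela": -0.4453,
       "Harari": -0.3353, "Somali": -0.8988}
svy = {"age": -0.3186, "female": -0.2417, "large": 0.5481, "average": 0.3134,
       "Affar": -0.4445, "Dire Dawa": -0.5879, "Gambela": -0.6120,
       "Harari": -0.3224, "Somali": -0.9143}


def relative_change(srs_coef, svy_coef):
    """Absolute change as a percentage of the survey-design coefficient.

    The choice of denominator is an assumption inferred from the
    reported percentages, not a formula given in the text.
    """
    return abs(svy_coef - srs_coef) / abs(svy_coef) * 100.0


for name in srs:
    print(f"{name}: {relative_change(srs[name], svy[name]):.2f}%")
```

The function reports magnitudes only. Whether a coefficient is described as having increased or decreased depends on the sign convention applied to negative coefficients, which the text does not spell out.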
Further, the standard errors of the significant coefficients in the PO model with complex sampling design were higher than the corresponding standard errors in the conventional PO model, indicating that standard errors were underestimated under the conventional PO model 48,3. This is an important distinguishing feature between the models. Analyses that ignore the complex sampling design give a false impression of increased precision and should be avoided.

Conclusion
Therefore, policymakers need to focus on the influence of these significant factors to develop strategies that enhance the normal or healthy weight status of under-five children in Ethiopia. This study also suggests that improving the nutritional status of mothers will consequently improve the nutritional status of their children. Improving the work status of mothers will enhance their economic status and consequently improve provision for the basic needs of their children. Addressing weight-related disorders in children requires environmental and social interventions that promote and support weight-related change in mothers. The government of Ethiopia needs to urgently implement programs targeted at the regions of Affar, Dire Dawa, Gambela, Harari and Somali in order to enhance the nutritional status of under-five children in Ethiopia.

Future direction
It must be borne in mind that this study was conducted based on certain socio-economic and environmental factors. Further research is hence needed to unravel the specific socio-economic and environmental factors, determine whether they influence the malnutrition status of under-five children, and enhance the findings of this study. In a further study, we will extend this work by considering non-parametric and semi-parametric approaches to ordinal logistic regression, spatio-temporal analysis, and other advanced statistical models. In addition, we will try to identify trends in the malnutrition status of under-five children using the available EDHS survey results.

Ethics approval and consent to participate
Ethical clearance for the survey was provided by the EHNRI Review Board, the National Research Ethics Review Committee (NRERC) at the Ministry of Science and Technology, the Institutional Review Board of ICF International, and the CDC.