Application of longitudinal multilevel zero inflated Poisson regression in modeling of infectious diseases among infants in Ethiopia

Yirdaw, Bezalem Eshetu; Debusho, Legesse Kassa; Samuel, Aregash

doi:10.1186/s12879-024-09820-0

Research
Open access
Published: 07 September 2024

Volume 24, article number 927, (2024)

BMC Infectious Diseases

Bezalem Eshetu Yirdaw¹,
Legesse Kassa Debusho¹ &
Aregash Samuel²

Abstract

Background

In sub-Saharan African countries, preventable and manageable diseases such as diarrhea and acute respiratory infections still claim the lives of children. Hence, this study aims to estimate the rate of change in the log expected number of days a child suffers from diarrhea (NOD) and from flu/common cold (NOF) among children aged 6 to 11 months at the baseline of the study.

Methodology

This study used secondary data which exhibit a longitudinal and multilevel structure. Based on the results of exploratory analysis, a multilevel zero-inflated Poisson regression model with a rate of change in the log expected NOD and NOF described by a quadratic trend was proposed to efficiently analyze both outcomes accounting for correlation between observations and individuals through random effects. Furthermore, residual plots were used to assess the goodness of fit of the model.

Results

Considering subject- and cluster-specific random effects, the results revealed a quadratic trend in the rate of change of the log expected NOD. Initially, low dose iron Micronutrient Powder (MNP) users exhibited a higher rate of change compared to non-users, but this trend reversed over time. Similarly, the log expected NOF decreased for children who used MNP and were exclusively breastfed for six months, in comparison to their counterparts. In addition, the odds of not having flu decreased with each two-week increment for MNP users, as compared to non-MNP users. Furthermore, an increase in NOD resulted in an increase in the log expected NOF. Region and exclusive breastfeeding also have significant relationships with both NOD and NOF.

Conclusion

The findings of this study underscore the importance of commencing analysis of data generated from a study with exploratory analysis. The study highlights the critical role of promoting EBF for the first six months and supporting children with additional food after six months to reduce the burden of infectious diseases.

Introduction

Promoting well-being and ensuring healthy lives for children lies at the heart of the United Nations Sustainable Development Goals (SDGs). While considerable progress has been made in several nations towards meeting these SDGs by the year 2020, it is important to note that Ethiopia continues to rank among the top five countries responsible for nearly half of the world’s under-five mortality [1]. A recent report from the World Health Organization (WHO) illustrates that infectious diseases such as diarrhea and acute respiratory infections (ARIs) are among the leading causes of under-five mortality [1]. Approximately 9% of all deaths among children under the age of five are attributed to diarrhea. Notably, under-five deaths due to diarrhea are particularly common in sub-Saharan African countries [2]. Furthermore, as reported by the WHO in 2019, ARI was one of the predominant childhood diseases, ranking among the top seven common causes of death in children and bearing a notably elevated morbidity rate [3]. The incidences of ARI and diarrheal diseases are high during the first two years of a child’s life, casting a shadow over their physical growth. This early-life challenge can potentially lead to adverse health outcomes in adulthood [4, 5]. A study from 2021 shows that the prevalence of diarrhea in Ethiopia was 17%, indicating that this issue still requires attention and intervention [6].

Researchers worldwide have conducted extensive studies on the risk factors associated with diarrhea and ARI. A study from Ethiopia showed that child age, drinking water source, family size and the exclusive breastfeeding (EBF) status of a child are directly associated with childhood diarrhea [6]. In addition, respiratory infections in children under the age of five years are associated with a range of factors, including demographics (education and employment status of the mother), socioeconomic conditions, nutrition status, health-related aspects, and environmental influences [7]. Studies conducted across different regions have shown that childhood diarrhea is associated with various factors, including the socio-economic status of a household [8, 9], EBF [10, 11], education status of a caregiver [9, 10, 12], sanitation [13], age of a mother [12], sex of a child, age of a child [9], water source [14], family size [13] and nutrition status of a child [10]. On the other hand, a study from Pakistan reveals that diarrhea, EBF and gender are significantly associated with ARI [15]. A previous study shows the simultaneous occurrence of childhood diarrhea and ARI [16]. Furthermore, a study from Nepal revealed a causal relationship between diarrhea and ARI [17].

Although the Ethiopian Ministry of Health and regional health offices have prioritized the enhancement of child health, a considerable number of children continue to lose their lives in Ethiopia due to preventable and manageable diseases such as diarrhea and ARI. Identifying the underlying factors behind diarrhea and ARI is therefore of paramount importance in achieving the SDGs and advancing a nation’s development, because healthy children can subsequently play dynamic roles in their communities, contributing in various ways. Hence, the primary objective of this study is to estimate the rate of change in the log expected number of days a child suffered from diarrhea (NOD) and number of days a child suffered from flu/common cold (NOF), and to identify the associated risk factors among young children in Ethiopia.

A previous study investigated the longitudinal prevalence of diarrhea and flu/common cold and the impact of the intervention, low dose iron Micronutrient Powder (MNP), on the prevalence rates [18]. The study used a generalized Poisson linear mixed effects model and reached the conclusion that MNP usage increased the longitudinal prevalence of both diarrhea and flu [18]. On the other hand, a previous study shows that MNP usage did not elevate the risk of infectious diseases like diarrhea and ARI [19]. Another study also suggested that supplementing children with micronutrients, including vitamin A and zinc, could reduce the severity of infectious diseases [20]. These conflicting results prompted our decision to begin the analysis with exploratory analysis. Therefore, our study aims to investigate the impact of using exploratory analysis for model specification.

Various studies have highlighted the positive impacts of EBF in reducing the risk of infectious diseases [10, 11, 15, 21, 22]. Another study provided evidence of a reduced risk of infectious diseases among infants who were breastfed exclusively for six months [23]. Furthermore, a study indicated that the use of MNP, which contains vitamin A and zinc, could mitigate the severity of infectious diseases [20]. According to the findings of [20, 23], studying the interaction $\text {EBF month} \times \text {MNP usage}$ on NOD and NOF can help policy makers in formulating guidelines and recommendations for infant feeding practices. Therefore, this study explores the effect of the interaction $\text {EBF month} \times \text {MNP usage}$ on NOD and NOF.

Methodology

Study data

In this study, we used secondary data originally collected to evaluate the effectiveness of MNP in improving child morbidity and growth among young children in Ethiopia [18]. The data were collected from the Oromia and Southern Nations, Nationalities, and Peoples’ Region (SNNPR) of Ethiopia, spanning from March 2015 to May 2016. The dataset displays a hierarchical structure, with observations taken from each individual at two-week intervals throughout the data collection period. Data were collected from 2356 children from 35 villages (clusters) at baseline, and after excluding children with only one measurement, we analyzed data from 2283 children. The dataset comprises three main components: child morbidity, anthropometric measurements, and the iron status of the children. For this study, our focus is mainly on child morbidity.

The outcome variables, diarrhea and flu, were assessed every two weeks and a total of 18 measurements for each individual were used for this study. Thus, the number of days a child suffered from diarrhea (NOD) and the number of days a child suffered from flu/common cold (NOF) within the two-week period were counted for each of the 18 observation times. Furthermore, the covariates included were MNP usage, EBF months, gender, baseline age of a child, region, wealth index, age of a mother (AOM) and educational status of a mother (ESM). According to WHO, a child should be exclusively breastfed for the first six months of life and should continue breastfeeding along with nutritionally adequate and safe complementary (solid/liquid) foods after six months [24]. Thus, EBF months were classified as "0" if a child’s EBF months were less than or greater than six months and "1" if a child’s EBF months were equal to six months. Further, the wealth index was calculated by principal component analysis in Stata version 14, considering family assets, number of domestic animals, family size, type of toilet and water source. Further insights into the sampling procedure can be found in Samuel et al.’s work [18]. In addition, it is important to note that the study specifically considered children aged 6 to 11 months at the baseline assessment.

Methods

A count outcome can be modeled using the generalised Poisson (GP), negative binomial (NB) or zero inflated Poisson (ZIP) regression models. The GP model is mainly used under the assumption that the mean equals the variance, that is, when there is neither overdispersion nor underdispersion. The NB and the ZIP regression models can be used when the data exhibit overdispersion and excessive zeros that go beyond what is expected from the Poisson distribution, respectively. However, it is always advisable to start from the simple GP model. The probability mass function for the Poisson random variable $Y_t$ with parameter $\lambda _t$ is given by

$$\begin{aligned} p(Y_t = y_t) = \frac{\lambda _{t}^{y_{t}}e^{-\lambda _{t}}}{y_{t}!},\quad y_t = 0, 1, 2, \ldots \end{aligned}$$

(1)

Thus, the GP model with p covariates is given by

$$\begin{aligned} log(\lambda _{t}) = \eta _{t} = \beta _0 + \beta _{1}x_{t1} + \ldots + \beta _{p}x_{tp}, \end{aligned}$$

(2)

where $\beta _0, \beta _{1}, \ldots , \beta _{p}$ are regression coefficients. However, the data in the current study exhibit a hierarchical structure, with individuals nested within clusters and measured multiple times. This structure may violate the assumption of independence in a GP model, as measurements within the same individual or cluster are likely to be correlated. The multilevel generalised Poisson (MGP) regression model accounts for this correlation. Thus, a subject-level random effect was introduced first to capture the subject-specific correlation. Subsequently, a cluster-level random effect was added to account for the cluster-specific correlation. Let the count outcome $Y_{tli}\; (i=1,2,\ldots ,m; \, l= 1,2,\ldots ,n_i; \,t = 1,2,\ldots ,n_{ij})$ represent the NOD or NOF of a child at the $t^{th}$ observation of the $l^{th}$ individual in the $i^{th}$ cluster. The probability mass function for the random variable $Y_{tli}$ with parameter $\lambda _{tli}$ is given by

$$\begin{aligned} p(Y_{tli}=y_{tli} | u_{li}, v_{i}) = \frac{\lambda _{tli}^{y_{tli}}e^{-\lambda _{tli}}}{y_{tli}!}, \qquad \quad y_{tli} =0, 1,2,\ldots , \end{aligned}$$

(3)

where $u_{li}$ and $v_i$ represent the subject- and cluster-specific random effects. Thus, the MGP model for the outcome $Y_{tli}$ with parameter $\lambda _{tli}$ for p covariates is given by

$$\begin{aligned} log(\lambda _{tli}) = \eta _{tli} = \beta _0 + \beta _{1} time_{tli} + x^{\prime }_{tli}\beta + z^{\prime }_{tli}u_{li} + z^{\prime }_{tli}v_{i}, \end{aligned}$$

(4)

where $time_{tli}$ is the time point at which an individual was measured and $z^{\prime }_{tli} = [1, time_{tli}]$. In addition, $u_{li}\sim \mathcal {N}(0, {\Sigma (\theta _u)})$ and $v_{i}\sim \mathcal {N}(0, {\Sigma (\theta _v)})$, where $\Sigma (\theta _u)$ and $\Sigma (\theta _v)$ represent symmetric and positive semi-definite variance-covariance matrices of the random effects $u_{li}$ and $v_{i}$, parameterised by the vectors of variance-covariance components $\theta _u$ and $\theta _v$, respectively. Furthermore, the outcomes NOD and NOF in the current study exhibit excess zeros (see Fig. 1). The negative binomial (NB) model is often a good starting point for modeling overdispersed count data, including datasets with more zeros than would be expected under a Poisson model. The probability mass function for the random variable $Y_{t}$ from the NB distribution is given by

$$\begin{aligned} p(Y_{t} = y_{t}) = \frac{\Gamma (y_{t} + \phi _{t})}{\Gamma (\phi _{t}) \Gamma (y_{t}+1)} \left( \frac{\phi _{t}}{\phi _{t} + \lambda _{t}}\right) ^{\phi _{t}} \left( \frac{\lambda _{t}}{\phi _{t} + \lambda _{t}}\right) ^{y_{t}}, \end{aligned}$$

(5)

where $\phi _{t}$ and $\lambda _{t}$ are the dispersion parameter and the average count of the outcome $Y_t$, respectively. However, similar to the GP model in Expression 2, the NB model may also fail to account for the dependency within subjects at the subject level and within clusters at the cluster level. Let $Y_{tli}|u_{li}, v_{i} \sim NB(\lambda _{tli}, \phi _{tli})$; then the probability mass function of the random variable $Y_{tli}$ is given by

$$\begin{aligned} p(Y_{tli} = y_{tli}|u_{li}, v_{i}) = \frac{\Gamma (y_{tli} + \phi _{tli})}{\Gamma (\phi _{tli}) \Gamma (y_{tli}+1)} \left( \frac{\phi _{tli}}{\phi _{tli} + \lambda _{tli}}\right) ^{\phi _{tli}} \left( \frac{\lambda _{tli}}{\phi _{tli} + \lambda _{tli}}\right) ^{y_{tli}} \end{aligned}$$

(6)

where $\phi _{tli}$ and $\lambda _{tli}$ are the dispersion parameter and the average count of the outcome, respectively. Hence, the multilevel negative binomial model for the outcome $Y_{tli}$ is given by

$$\begin{aligned} log(\lambda _{tli}) = \eta _{tli} = \beta _0 + \beta _{1} time_{tli} + x^{\prime }_{tli}\beta + z^{\prime }_{tli}u_{li} + z^{\prime }_{tli}v_{i}, \end{aligned}$$

(7)

where the distributional assumptions for the random effects $u_{li}$ and $v_{i}$ are the same as those of the MGP model in Expression 4. The NB regression model offers a better fit when there is overdispersion; however, it may fail to adequately address the substantial number of zero counts observed (see Fig. 1). The zero-inflated Poisson (ZIP) regression model is specifically designed to account for an overabundance of zeros in a count outcome. It is built under the assumption that the data originate from two distinct processes. One process generates only zero counts, while the other generates both zero and non-zero counts. Consequently, the ZIP model combines logistic regression and Poisson regression to model the zero part, based on the presence or absence of the outcome, and the count part, respectively. However, the ZIP model may still fail to account for the correlation between measurements within the same individual or cluster. The multilevel zero inflated Poisson (MZIP) regression model is an extension of the ZIP model that can be used in the case of correlated observations. According to [25], the probability mass function of an outcome $Y_{tli}$ from a ZIP distribution with average count $\lambda _{tli}$ and probability of extra zeros $\phi _{tli}$ is given by

$$\begin{aligned} p(Y_{tli}=y_{tli} | r_{li}, s_{i}, u_{li}, v_{i}) = \left\{ \begin{array}{lll} \phi _{tli} + (1-\phi _{tli})e^{-{\lambda _{tli}}}, & \quad \text {if} & \quad y_{tli}=0 \\ (1-\phi _{tli})\frac{\lambda _{tli}^{y_{tli}}e^{-{\lambda _{tli}}}}{y_{tli}!}, & \quad \text {if} & \quad y_{tli} = 1,2,\ldots \end{array}\right. \end{aligned}$$

(8)

In the MZIP models, the fixed effect covariates for the zero and count parts of the model are not necessarily the same [26]. For this study, we have assumed a correlation between the random intercept and random slope in the Poisson component of the model. However, based on the principle of parsimony, we assumed no correlation between the random effects in the Poisson and logistic components of the model. Let the count outcome $Y_{tli}$ represent the NOD or NOF of a child at the $t^{th}$ observation of the $l^{th}$ individual in the $i^{th}$ cluster, and let $m, n_i$ and $n_{ij}$ represent the total number of clusters, the total number of individuals in the $i^{th}$ cluster and the total number of observations for the $l^{th}$ individual in the $i^{th}$ cluster, respectively. Hence, the linear predictors for the zero and non-zero parts of the model, which are linked through the logit and log link functions, are given by

$$\begin{aligned} \log \left( \frac{\phi _{tli}}{1-\phi _{tli}} \right) & = \xi _{tli} = \alpha _0 + \alpha _1 time_{tli} + a^{\prime }_{tli}\alpha + r_{li} + s_{i} \quad \text {and} \\ \log (\lambda _{tli}) & = \eta _{tli} = \beta _0 + \beta _{1} time_{tli} + x^{\prime }_{tli}\beta + z^{\prime }_{tli}u_{li} + z^{\prime }_{tli}v_{i}, \end{aligned}$$

(9)

respectively, where $\alpha_0$ and $\beta_0$ are the baseline log odds and the baseline log expected count of the outcome in the logistic and Poisson components of the model, respectively, holding the other covariates in the model constant. In addition, $a_{tli}$ and $x_{tli}$ are vectors of fixed effect covariates present in the logistic and Poisson components of the model, accompanied by their respective regression coefficients $\alpha$ and $\beta$. Furthermore, $r_{li}$ and $s_{i}$ are the subject- and cluster-specific random effects in the logistic component of the model. Further, $r_{li} \sim \mathcal {N}(0, {\sigma _{r_0}})$, $s_{i} \sim \mathcal {N}(0, {\sigma _{s_0}})$, $u_{li} \sim \mathcal {N}(0, {\Sigma (\theta _u)})$, and $v_{i}\sim \mathcal {N}(0, {\Sigma (\theta _v)})$, where $\sigma _{r_0}$ and $\sigma _{s_0}$ are the variances of the subject-specific and cluster-specific random intercepts in the logistic part of the model, while $\Sigma (\theta _u)$ and $\Sigma (\theta _v)$ are symmetric and positive semi-definite variance-covariance matrices of the random effects in the Poisson part of the model, parameterised by the vectors of variance-covariance components $\theta _u$ and $\theta _v$, respectively. For this study, the collection of parameters $\Theta = (\varvec{\alpha }^{t}, \varvec{\beta }^{t}, \sigma _{r_0},\sigma _{s_0},\theta _u, \theta _v)$ was estimated using the maximum likelihood estimation method, where $\varvec{\alpha }^{t}$ and $\varvec{\beta }^{t}$ are the vectors of regression coefficients, including the regression coefficient of $time_{tli}$, in the logistic and Poisson components of the model, respectively.
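To make Expressions 8 and 9 concrete, the following minimal Python sketch evaluates the ZIP probability mass function for a single observation. The values of the linear predictors `xi` and `eta` are hypothetical and chosen only for illustration; they are not estimates from this study, and the random effects are taken as already absorbed into those two numbers.

```python
import math

def inv_logit(xi):
    """Map the logistic-part linear predictor (xi) to the probability of an extra zero."""
    return 1.0 / (1.0 + math.exp(-xi))

def zip_pmf(y, phi, lam):
    """ZIP probability mass function as written in Expression 8."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return phi + (1.0 - phi) * poisson
    return (1.0 - phi) * poisson

# Hypothetical linear predictors for one observation (assumed values, not study estimates).
xi = 0.5   # logit scale, zero part
eta = 0.7  # log scale, count part
phi = inv_logit(xi)
lam = math.exp(eta)

# The probabilities sum to one, and the ZIP mean is (1 - phi) * lambda.
total = sum(zip_pmf(y, phi, lam) for y in range(100))
zip_mean = (1.0 - phi) * lam

print("P(Y = 0) under ZIP:    ", round(zip_pmf(0, phi, lam), 4))
print("P(Y = 0) under Poisson:", round(math.exp(-lam), 4))
print("ZIP mean:", round(zip_mean, 4))
```

The comparison of the two zero probabilities shows why a plain Poisson model with the same $\lambda$ underpredicts zeros whenever $\phi > 0$.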

Model selection

Let Y represent a vector of discrete count outcomes. When modeling count data, one suitable method is Poisson regression. However, count data sometimes exhibit overdispersion and excessive zeros that go beyond what is expected from the Poisson distribution. For instance, the response variables NOD and NOF in this study had many zeros (Fig. 1). In cases like this, a standard Poisson model may not be the most appropriate choice for further analysis. Therefore, it is advisable to begin with a generalized Poisson (GP) regression model and assess whether this model can effectively handle the observed excess zeros. Following the approach outlined by [27, 28], we utilized quantile residuals rather than ordinary residuals to select the correct family distribution. This was accomplished via simulation, similar to the principles of Bayesian p-values and parametric bootstrapping, which converts residuals to scaled residuals using the DHARMa package in R [28]. Thus, if the fitted model is correct, the scaled/quantile residuals simulated from the fitted model should conform to a uniform [0,1] distribution. In addition, it is advisable to perform a dispersion test to check for overdispersion or underdispersion in the model [28]. Hence, we initially employed the GP model, transitioned to the negative binomial model, and ultimately arrived at a zero-inflated Poisson regression model (see the Exploratory analysis section for further detail). Furthermore, the Akaike Information Criterion (AIC) and Bayesian Information Criterion (BIC) were considered to further validate the model selection. These provided additional evidence supporting the selected model by balancing model fit against complexity.
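The idea behind simulation-based scaled residuals can be sketched in a few lines of Python. This is an illustration of the general principle only, not the DHARMa implementation: the fitted mean `lam`, the 60% share of structural zeros, and the sample sizes are all assumed values. For each observation, the residual is the (randomised) rank of the observed count among counts simulated from the fitted Poisson model, so a correct model yields residuals that are roughly uniform on [0, 1].

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def scaled_residual(y, lam, rng, n_sim=250):
    """Randomised quantile of y among n_sim draws simulated from Poisson(lam)."""
    sims = [rpois(lam, rng) for _ in range(n_sim)]
    below = sum(s < y for s in sims)
    ties = sum(s == y for s in sims)
    return (below + rng.random() * ties) / n_sim

rng = random.Random(1)
lam = 1.5  # hypothetical fitted mean, assumed for illustration

# Case 1: the data really are Poisson(lam), so the fitted model is correct.
good = [scaled_residual(rpois(lam, rng), lam, rng) for _ in range(400)]

# Case 2: 60% structural zeros, but residuals are still computed against Poisson(lam).
inflated = [
    scaled_residual(0 if rng.random() < 0.6 else rpois(lam, rng), lam, rng)
    for _ in range(400)
]

mean_good = sum(good) / len(good)
mean_inflated = sum(inflated) / len(inflated)
print("mean residual, correct model:      ", round(mean_good, 3))
print("mean residual, zero-inflated data: ", round(mean_inflated, 3))
```

Under the correct model the mean residual sits near 0.5, whereas unmodelled excess zeros pull the residuals toward 0, which is the kind of departure a QQ plot against the uniform distribution makes visible.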

With the zero-inflated Poisson regression model in mind, it is essential to recognize that the data in this study exhibit a longitudinal and hierarchical structure. Furthermore, the individual and cluster profile plots in Fig. 4 show variations in both intercepts and slopes. We employed a likelihood ratio test to examine zero variance components. This test is based on the hypothesis $H_0: \sigma ^2 = 0$ versus $H_1: \sigma ^2 > 0$, and the test statistic follows an asymptotic mixture distribution of $0.5\chi ^{2}_0 + 0.5\chi ^{2}_1$, where $\sigma ^2$ is the variance of a subject-specific random intercept or slope, the covariance between the random intercept and random slope at the individual level, the variance of a cluster-specific random intercept or slope, or the covariance between the random intercept and random slope at the cluster level [29].
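As a minimal sketch of how a p-value follows from the $0.5\chi ^{2}_0 + 0.5\chi ^{2}_1$ mixture stated above, the Python snippet below uses only the standard library. Since $\chi ^{2}_0$ is a point mass at zero, the mixture p-value for a positive statistic is half the upper-tail probability of $\chi ^{2}_1$. The function names are illustrative, and the statistic 3.84 is a hypothetical value.

```python
import math

def chi2_1_sf(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))

def mixture_lrt_pvalue(stat):
    """P-value under 0.5*chi2_0 + 0.5*chi2_1, where chi2_0 is a point mass at zero."""
    if stat <= 0:
        return 1.0
    return 0.5 * chi2_1_sf(stat)

# Hypothetical likelihood ratio statistic (difference in -2 log-likelihood).
stat = 3.84
print("naive chi2_1 p-value:", round(chi2_1_sf(stat), 4))
print("mixture p-value:     ", round(mixture_lrt_pvalue(stat), 4))
```

The mixture halves the naive $\chi ^{2}_1$ p-value, which reflects that the variance is tested on the boundary of its parameter space.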

Results

Exploratory analysis

This study considered two outcomes, namely NOD and NOF. According to Walker et al. (2013), joint modelling of diarrhea and ARI would appear appropriate. However, upon reviewing the data, we found that the correlation between the two outcomes is relatively low ($<0.2$). In addition, the number of times that a child was suffering from flu and diarrhea at the same time was limited. This sparse nature of the data makes fitting a joint model of the two outcomes difficult. Given the hierarchical nature of the data, the estimation of random effects at the individual and cluster levels in a joint model becomes problematic due to the insufficient number of observations in these cases. Furthermore, given the low correlation between the two outcomes, a joint model may not capture meaningful associations and could lead to unstable estimates. On the other hand, Newman et al. (2020) revealed a causal relationship between diarrhea and ARI among infants. Hence, including this relationship in a joint model could complicate the analysis without providing additional insight. Therefore, this study considered a separate analysis of NOD and NOF.

The outcomes in this study are counts, and as demonstrated by the histograms in Fig. 1, both outcomes, namely NOD and NOF, exhibit numerous zeros. Various studies have indicated that a generalized Poisson linear mixed effects model may not be suitable for such data types [26, 30]. However, the presence of many zeros does not necessarily indicate zero-inflation; it could be explained by the inclusion of explanatory variables. Therefore, it is advisable to begin with a generalized Poisson (GP) regression model that incorporates all fixed-effect covariates and investigate whether the model can account for the observed excess zeros in the data. Furthermore, the plots in Fig. 2 show that the rate of change for both outcomes follows a half-inverted U shape, emphasizing the importance of considering a quadratic time effect. In addition, the plot for NOD shows that the two lines intersected at the end of the study (Fig. 2a). Hence, it is crucial to account for the interactions $\text {time} \times \text {MNP usage}$ and $\text {time}^2 \times \text {MNP usage}$ to assess the rate of change in the log expected NOD and NOF for children who used MNP compared to those who did not.

Thus, the GP model for NOD ($M_{D_1}$) is given by

$$\begin{aligned} {\eta _{d}} & = \beta _0 +\beta _1\,time + \beta _2\, {time}^2 + \beta _3\,MNP\,usage \nonumber \\ & +\beta _4 MNP\,usage \times time + \beta _5 MNP\,usage \times {time}^2 \nonumber \\ & + \beta _6 EBF\,months + \beta _7 EBF\,months \times MNP\,usage \nonumber \\ & + \beta _8 \,Age + \beta _9 \,gender + \beta _{10}\, region + \beta _{11}\, AOM \nonumber \\ & + \beta _{12}\, ESM + \beta _{13}\,wealth\,index \nonumber \\ & = \varvec{X\beta }, \end{aligned}$$

(10)

where $\eta _{d}$ represents the linear predictor, time is the time point at which the observations were recorded, $MNP \,usage$ represents whether or not a child used MNP, $EBF\,months$ is the exclusive breastfeeding months of a child, Age is the age of a child, AOM is the age of a mother and ESM is the educational status of a mother. However, the model validation tests for the GP model for NOD do not indicate a good fit (see Table 1 and Fig. 3a). Specifically, the zero-inflation test was significant, suggesting that the model failed to account for the observed excess zeros in the data ($p\text {-value} < 0.001$). Furthermore, the overdispersion parameter is far greater than one and highly significant. Moreover, the QQ plot in Fig. 3a reveals substantial deviations from the expected distribution.
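The rate of change implied by the mean structure in Expression 10 can be written out explicitly. The following short derivation is a sketch that treats time as continuous, which is an assumption made here for illustration. Differentiating $\eta _{d}$ with respect to time gives

$$\begin{aligned} \frac{\partial \eta _{d}}{\partial \, time} & = \beta _1 + 2\beta _2\, time + \left( \beta _4 + 2\beta _5\, time\right) MNP\,usage, \end{aligned}$$

so the rate of change is $\beta _1 + 2\beta _2\, time$ for non-users and $(\beta _1 + \beta _4) + 2(\beta _2 + \beta _5)\, time$ for MNP users. The difference between the two groups, $\beta _4 + 2\beta _5\, time$, changes sign at $time = -\beta _4 / (2\beta _5)$ whenever $\beta _4$ and $\beta _5$ have opposite signs, which is how a quadratic interaction can represent a group difference that reverses over the follow-up period.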

Table 1 Model validation tests for the outcomes NOD and NOF

On the other hand, the negative binomial (NB) regression model is well suited to count data that exhibit overdispersion relative to what would be expected from the GP model. Therefore, we considered an NB regression model $(M_{D_2})$ with the same mean structure as in Expression 10. While this model effectively handles the excessive zeros in the observed data, a dispersion test indicates a highly significant underdispersion problem (see Table 1). In addition, the QQ plot in Fig. 3b confirms deviations from the expected distribution. Another viable option that can effectively capture the characteristics of the data is a zero-inflated Poisson (ZIP) regression model $(M_{D_3})$. Even though the dispersion test remains significant, the dispersion parameter is very close to one. Furthermore, the QQ plot in Fig. 3c shows no deviations from the expected distribution, indicating a good fit compared to $M_{D_1}$ and $M_{D_2}$. Moreover, the values of AIC and BIC for $M_{D_3}$ are considerably smaller than those for $M_{D_1}$ and $M_{D_2}$.
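For readers less familiar with the two information criteria, the sketch below shows how they are computed from a maximised log-likelihood. All log-likelihoods, parameter counts and the sample size are hypothetical placeholders, not values from Table 1, and using the number of observations as $n$ in the BIC is an assumption, since the appropriate $n$ for multilevel data is debatable.

```python
import math

def aic(loglik, k):
    """Akaike Information Criterion: 2k - 2 log-likelihood."""
    return 2 * k - 2 * loglik

def bic(loglik, k, n):
    """Bayesian Information Criterion: k ln(n) - 2 log-likelihood."""
    return k * math.log(n) - 2 * loglik

# Hypothetical fits: (log-likelihood, number of parameters). Placeholder values only.
fits = {
    "Poisson": (-5200.0, 14),
    "NB": (-4900.0, 15),
    "ZIP": (-4700.0, 28),
}
n_obs = 3000  # assumed number of observations

scores = {name: (aic(ll, k), bic(ll, k, n_obs)) for name, (ll, k) in fits.items()}
best_by_aic = min(scores, key=lambda name: scores[name][0])
best_by_bic = min(scores, key=lambda name: scores[name][1])

for name, (a, b) in scores.items():
    print(f"{name:8s} AIC = {a:9.1f}  BIC = {b:9.1f}")
print("smallest AIC:", best_by_aic, "| smallest BIC:", best_by_bic)
```

Both criteria penalise the extra parameters of the ZIP model, so it is preferred only when the gain in log-likelihood outweighs that penalty.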

Similarly, for the outcome NOF we started from the GP model ($M_{F_1}$) (see Expression 11) and followed the same procedure as we did for NOD. However, the zero-inflation and overdispersion tests were significant for the GP model (Table 1). In addition, the QQ plot in Fig. 3d shows deviation from the expected distribution. Furthermore, the NB regression model $M_{F_2}$ did not fit the data very well (see Table 1 & Fig. 3e). Conversely, the ZIP model ($M_{F_3}$) demonstrates a good fit as compared to $M_{F_1}$ and $M_{F_2}$ (Table 1 & Fig. 3f).

$$\begin{aligned} {\eta _f} & = \beta _0 +\beta _1\,time + \beta _2\, {time}^2 + \beta _3\,MNP\,usage\nonumber \\ & +\beta _4 MNP\,usage \times time + \beta _5 MNP\,usage \times {time}^2\nonumber \\ & + \beta _6 EBF\,months + \beta _7 EBF\,months \times MNP\,usage\nonumber \\ & + \beta _8 \,Age + \beta _9 \,gender + \beta _{10}\, region + \beta _{11}\, AOM\nonumber \\ & + \beta _{12}\, ESM + \beta _{13}\,wealth\,index + \beta _{14}\,NOD\nonumber \\ & = \varvec{X\beta }, \end{aligned}$$

(11)

Moreover, the significant overdispersion problem observed in both NOD and NOF may be attributed to the variation due to subject specific and cluster specific random effects. For instance, the individual and cluster profile plots for NOD in Fig. 4a and b show the existence of substantial variations in both intercept and rate of change in NOD across time among the observed individuals and clusters, respectively. Similarly, the individual and cluster profile plots for NOF in Fig. 4c and d also demonstrate variations in both intercept and slope across time among the observed individuals and clusters, respectively. To adequately address these variations, it is important to consider random intercepts and slopes at both the individual and cluster levels.
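The link between unmodelled random effects and apparent overdispersion can be illustrated with a small simulation. This is a sketch under assumed values: the intercept `beta0`, the random-intercept standard deviation `sigma_u`, and the numbers of subjects and visits are hypothetical and unrelated to the study data. Each subject receives a normally distributed random intercept on the log scale, and counts are conditionally Poisson.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson variate with Knuth's multiplication method."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

rng = random.Random(7)
beta0 = 0.0     # assumed fixed intercept on the log scale
sigma_u = 0.8   # assumed standard deviation of the subject-specific random intercept

counts = []
for _ in range(2000):                  # hypothetical subjects
    u = rng.gauss(0.0, sigma_u)        # subject-specific random intercept
    lam = math.exp(beta0 + u)          # conditional Poisson mean
    counts.extend(rpois(lam, rng) for _ in range(5))  # five visits per subject

mean = sum(counts) / len(counts)
var = sum((c - mean) ** 2 for c in counts) / (len(counts) - 1)
dispersion = var / mean
print("marginal mean:", round(mean, 3))
print("variance-to-mean ratio:", round(dispersion, 3))
```

Although every count is Poisson given its subject, the pooled counts have a variance well above their mean, so a model that ignores the random effect would appear overdispersed.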

Selection of variance-covariance structure for random effects

In this section we conducted a mixture chi-square test to select a variance-covariance structure at both subject and cluster levels (Tables 2 & 3). Starting from a ZIP regression model with no random effects, eight models were fitted to test the variances of random effects and the covariance between the random intercept and random slope at each level of the hierarchy.

For the outcome NOD, the test for the subject-specific random intercept ($H_0:\, \sigma _{u_0} = 0$ against $H_1:\, \sigma _{u_0} > 0$) yielded a significant result ($-2 \log (Lik) = 44.62$, $p\text {-value} < 0.001$), suggesting that a subject-specific random intercept should be included in the Poisson part of the model. Furthermore, the test also verified the significance of the variance of the cluster-specific random intercept ($\sigma _{v_0}$) in the Poisson part of the model ($-2 \log (Lik) = 16.27$, $p\text {-value} < 0.001$). However, including the subject-specific random slope was unnecessary, as the test ($H_0:\, \sigma _{u_1} = 0$ against $H_1:\, \sigma _{u_1} > 0$) was not significant (Table 2). To test the cluster-specific random slope, two models were fitted: one with subject-specific and cluster-specific random intercepts ($M_3$), and the other with a subject-specific random intercept, a cluster-specific random intercept and a cluster-specific random slope ($M_5$). The test ($H_0:\, \sigma _{v_1} = 0$ against $H_1:\, \sigma _{v_1} > 0$) indicated that the inclusion of the cluster-specific random slope was warranted ($-2 \log (Lik) = 13.37$, $p\text {-value} < 0.001$). However, the non-rejection of the null hypothesis $H_0: \sigma _{v_{01}} = 0$ with $-2\log (Lik) = 2.09$ and $p\text {-value} = 0.250$ suggests that including the covariance between the random intercept and slope at the cluster level of the model ($M_6$) was not necessary. Furthermore, the results from testing the hypotheses $H_0: \sigma _{r_0} = 0$ against $H_1: \sigma _{r_0} > 0$ and $H_0: \sigma _{s_0} = 0$ against $H_1: \sigma _{s_0} > 0$ in models $M_7$ and $M_8$, respectively, confirmed that the inclusion of subject- and cluster-specific random intercepts in the logistic part of the model was important. Moreover, the model with the smallest AIC and BIC was $M_8$, which supports the LRT results (Table 2).

Table 2 Longitudinal multilevel zero inflated Poisson regression models for selection of variance-covariance structure for NOD


Table 3 Longitudinal multilevel zero inflated Poisson regression models for selection of variance-covariance structure for NOF


Similar tests were conducted to evaluate the random effects for the outcome NOF. The test of $H_0: \sigma _{u_0} = 0$ against $H_1: \sigma _{u_0} > 0$ in $M_1$ yielded a test statistic of 26.63 with a $p-value < 0.001$. This provides compelling evidence against the null hypothesis $H_0$, indicating significant subject-specific variability in the Poisson part of the model. Furthermore, the test of $H_0: \sigma _{v_0} = 0$ against $H_1: \sigma _{v_0} > 0$ also confirmed a statistically significant difference among clusters $(-2 log(Lik) = 45.77,$ $p-value < 0.001)$. However, the subject-specific random slope was not statistically significant. To test the cluster-specific random slope, we considered models $M_3$ and $M_5$ with the hypothesis $H_0: \sigma _{v_1} = 0$ against $H_1: \sigma _{v_1} > 0$, and the test was highly significant $(-2 log(Lik) = 27.33, p-value < 0.001)$, supporting the inclusion of a cluster-specific random slope. The test $(H_0: \sigma _{v_{01}} = 0$ against $H_1: \sigma _{v_{01}} > 0)$ with $-2 log(Lik) = 30.66$ and $p-value < 0.001$ indicates a substantial improvement in the fit of $M_6$ compared to $M_5$, favoring the model with an unstructured covariance structure at the cluster level over the model with a diagonal covariance structure. Similarly, the LRT and the AIC and BIC values supported the inclusion of both subject-specific and cluster-specific random intercepts in the logistic part of the model (Table 3). The analysis in the next section is therefore based on model $M_8$.

Multilevel zero inflated Poisson regression analysis

In this section, the results from the multivariable MZIP regression model are presented for both NOD and NOF (Tables 4 & 5). For the outcome NOD, accounting for subject-specific and cluster-specific variation, the covariates significant at the 5% level were retained for further analysis. These were region, EBF month, time, MNP usage, $\text {time}^2$, $\text {time} \times \text {MNP usage}$ and $\text {time}^2 \times \text {MNP usage}$ in the Poisson part of the model, and gender, region, MNP usage, time and $\text {time} \times \text {MNP usage}$ in the logistic part of the model. Keeping the effects of other covariates constant, the odds of not having diarrhea were higher for girls than for boys ($\alpha = 0.084$, $s.e. = 0.042$, $p-value = 0.004$). In addition, the odds of not having diarrhea were lower for children in the SNNPR region than for those in the Oromia region ($\alpha = -0.512$, $s.e. = 0.177$, $p-value = 0.004$). Furthermore, the log of expected NOD was 0.138 units lower for children who were exclusively breastfed for six months compared to those with fewer or more than six EBF months ($\beta = -0.138$, $s.e. = 0.029$, $p-value < 0.001$).
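The coefficients above are on the log-odds scale (logistic part) and the log-count scale (Poisson part), so exponentiating them gives odds ratios and rate ratios that are often easier to read. The sketch below does this for the three estimates quoted in the paragraph. The Wald-type 95% limits are an assumption added here for illustration and are not reported in the paper, and the helper name `ratio_with_ci` is hypothetical. All ratios are conditional on the random effects.

```python
import math

def ratio_with_ci(estimate: float, se: float, z: float = 1.96):
    """Exponentiate a log-scale coefficient and its Wald-type 95% limits."""
    return (
        math.exp(estimate),
        math.exp(estimate - z * se),
        math.exp(estimate + z * se),
    )

# (estimate, standard error) pairs quoted in the text for the NOD model.
reported = {
    "odds of no diarrhea, girls vs boys": (0.084, 0.042),
    "odds of no diarrhea, SNNPR vs Oromia": (-0.512, 0.177),
    "expected NOD, six EBF months vs otherwise": (-0.138, 0.029),
}

ratios = {name: ratio_with_ci(b, s) for name, (b, s) in reported.items()}

for name, (ratio, lower, upper) in ratios.items():
    print(f"{name}: {ratio:.3f} (Wald 95% limits {lower:.3f} to {upper:.3f})")
```

Read this way, the EBF coefficient of -0.138 corresponds to a rate ratio of roughly 0.87, that is, about 13% fewer expected days of diarrhea among children not in the excess-zero class.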

Table 4 Parameter estimates (standard errors, s.e) of multilevel zero inflated Poisson regression model for NOD


Table 5 Parameter estimates (standard errors, s.e) of multilevel zero inflated Poisson regression model for NOF


The interaction terms, $\text {Time} \times \text {MNP usage}$ and $\text {Time}^2 \times \text {MNP usage}$, reveal non-monotonic changes in the log expected NOD among children who used MNP as time progresses. The positive coefficient for $\text {Time} \times \text {MNP usage}$ ($\beta = 0.035$, $s.e. = 0.014$, $p-value = 0.013$) and the negative coefficient for $\text {Time}^2 \times \text {MNP usage}$ suggest an inverted U-shaped relationship between NOD and time, consistent with the pattern seen in Fig. 2a in the exploratory analysis. This indicates that, early in the study, the log of expected NOD increases by 0.035 for every two-week increment in time among MNP users compared to non-users, which could reflect an initial adverse reaction to the MNP. However, as time progresses, this increase slows and eventually reverses among MNP users compared to non-users. The interaction term $\text {Time} \times \text {MNP usage}$ in the logistic part of the model revealed that the odds of not having diarrhea for children who used MNP increased with each two-week increment in time ($\alpha = 0.020$, $s.e. = 0.007$, $p-value = 0.012$) (Table 4). These findings suggest that, even if the difference is small owing to various nuisance factors, the use of MNP contributed to the well-being of the children.
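The inverted U-shape can be made concrete by writing the MNP contrast in the Poisson part as a quadratic in time and locating its peak at $t^* = -\beta _1 / (2\beta _2)$, where $\beta _1$ and $\beta _2$ denote the linear and quadratic interaction coefficients. The sketch below illustrates the calculation. Only the linear coefficient 0.035 comes from the text. The quadratic coefficient and the MNP main effect are hypothetical placeholders, since their estimates appear only in Table 4, so the resulting turning point is illustrative and not a result of the study. Time is assumed to be counted in two-week steps.

```python
def mnp_contrast(t: float, b_mnp: float, b_t_mnp: float, b_t2_mnp: float) -> float:
    """Log expected NOD of MNP users minus non-users at time t (Poisson part)."""
    return b_mnp + b_t_mnp * t + b_t2_mnp * t ** 2

def turning_point(b_t_mnp: float, b_t2_mnp: float) -> float:
    """Time at which the quadratic contrast peaks; needs a negative quadratic term."""
    if b_t2_mnp >= 0:
        raise ValueError("an inverted U requires a negative quadratic coefficient")
    return -b_t_mnp / (2.0 * b_t2_mnp)

B_T_MNP = 0.035     # reported Time x MNP usage coefficient
B_T2_MNP = -0.002   # HYPOTHETICAL placeholder, not the Table 4 estimate
B_MNP = 0.0         # HYPOTHETICAL placeholder for the MNP main effect

t_star = turning_point(B_T_MNP, B_T2_MNP)
peak = mnp_contrast(t_star, B_MNP, B_T_MNP, B_T2_MNP)
print(f"illustrative turning point: {t_star:.2f} two-week steps, contrast {peak:.3f}")
```

With the actual Table 4 estimates substituted for the placeholders, the same two functions give the time after which MNP users' log expected NOD starts to fall relative to non-users.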

The results from the MZIP regression model for the outcome NOF are presented in Table 5. The covariates significant at the 5% level were region, NOD, EBF month, MNP usage, time and $\text {EBF months} \times \text {MNP usage}$ in the Poisson part of the model, and region, EBF month, MNP usage, time and $\text {time} \times \text {MNP usage}$ in the logistic part of the model. After adjusting for the effects of other covariates and for subject-specific and cluster-specific random effects, we observed that the log of expected NOF was 0.116 units higher for children living in the SNNPR region compared to children in the Oromia region ($\beta = 0.116$, $s.e. = 0.026$, $p-value < 0.001$). Furthermore, the logistic part of the model showed that the odds of not having the flu were lower for children living in the SNNPR region than for children from the Oromia region ($\alpha = -1.193$, $s.e. = 0.215$, $p-value < 0.001$).

The log expected NOF increased by 0.017 for a unit increase in NOD ($\beta = 0.017$, $s.e. = 0.003$, $p-value < 0.001$). Notably, the interaction term $\text {EBF month} \times \text {MNP usage}$ indicated that the log expected NOF was 0.106 units lower for children who were exclusively breastfed for six months and used MNP ($\beta = -0.106$, $s.e. = 0.045$, $p-value = 0.019$) compared to children who did not use MNP and whose EBF months were fewer or more than six. In addition, the odds of not having flu/common cold were higher for children who were exclusively breastfed for six months than for their counterparts ($\alpha = 0.141$, $s.e. = 0.062$, $p-value = 0.009$). Moreover, the logistic part of the model in Table 5 reveals that the odds of not having the flu/common cold increased with each two-week increment in time for children who used MNP compared to those who did not ($\alpha = 0.023$, $s.e. = 0.007$, $p-value < 0.001$). These interpretations hold after adjusting for subject-specific and cluster-specific random effects.
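The two parts of the model describe different things: the logistic part governs the probability of an excess (structural) zero, and the Poisson part governs the count among the remaining observations. The sketch below writes out the zero-inflated Poisson probability function in the form given by Lambert [25] to show how the two parts combine. It assumes, based on the paper's wording "odds of not having" the illness, that the logistic part models the log odds of a structural zero. The linear predictor values are hypothetical and random effects are omitted.

```python
import math

def inv_logit(eta: float) -> float:
    """Map a log-odds value to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def zip_pmf(y: int, lam: float, pi_zero: float) -> float:
    """P(Y = y) for a zero-inflated Poisson with structural-zero probability pi_zero."""
    poisson = math.exp(-lam) * lam ** y / math.factorial(y)
    if y == 0:
        return pi_zero + (1.0 - pi_zero) * poisson
    return (1.0 - pi_zero) * poisson

def zip_mean(lam: float, pi_zero: float) -> float:
    """Marginal expected count: structural zeros pull the Poisson mean down."""
    return (1.0 - pi_zero) * lam

# HYPOTHETICAL linear predictors for one child at one visit.
eta_zero = -0.5    # logistic part: log odds of a structural zero
log_lam = 0.7      # Poisson part: log expected count

pi_zero = inv_logit(eta_zero)
lam = math.exp(log_lam)

total_probability = sum(zip_pmf(y, lam, pi_zero) for y in range(60))
print(f"P(Y = 0) = {zip_pmf(0, lam, pi_zero):.3f}, E[Y] = {zip_mean(lam, pi_zero):.3f}")
```

A positive coefficient in the logistic part therefore raises the chance of a zero count, while a negative coefficient in the Poisson part lowers the expected number of days among children who are not structural zeros.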

Model diagnostic

To assess the goodness of fit of the model for both NOD and NOF, we employed scaled (quantile) residuals generated from the fitted models. As depicted in Fig. 5a and b, the QQ plots for NOD and NOF, respectively, show that the residuals align with the straight line. This is consistent with the residuals for both models following a uniform distribution over the range [0, 1]. In addition, the dispersion and Kolmogorov-Smirnov (KS) tests were non-significant, indicating a good fit.
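The idea behind this diagnostic is that, for a correctly specified count model, randomized quantile residuals [27] are uniform on [0, 1], so departures from the diagonal in a QQ plot or a large KS distance signal misfit. The sketch below illustrates the principle on simulated data. It is a simplification and not the authors' procedure: it uses a plain Poisson model with a known mean in place of the fitted MZIP model, and the analytic Dunn and Smyth form in place of the simulation-based residuals of the DHARMa package [28]. The seed, sample size and mean are arbitrary.

```python
import math
import random

def poisson_cdf(k: int, lam: float) -> float:
    """P(Y <= k) for a Poisson(lam) variable; zero when k is negative."""
    if k < 0:
        return 0.0
    term = math.exp(-lam)
    total = term
    for i in range(1, k + 1):
        term *= lam / i
        total += term
    return min(total, 1.0)

def sample_poisson(lam: float, rng: random.Random) -> int:
    """Knuth's multiplication method; adequate for small means."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k

def quantile_residual(y: int, lam: float, rng: random.Random) -> float:
    """Uniform draw between F(y - 1) and F(y) under the assumed model."""
    lower, upper = poisson_cdf(y - 1, lam), poisson_cdf(y, lam)
    return lower + rng.random() * (upper - lower)

def ks_uniform(values) -> float:
    """One-sample Kolmogorov-Smirnov distance from the Uniform(0, 1) CDF."""
    xs = sorted(values)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))

rng = random.Random(2024)
lam_true = 1.5
counts = [sample_poisson(lam_true, rng) for _ in range(2000)]

residuals = [quantile_residual(y, lam_true, rng) for y in counts]
ks_correct = ks_uniform(residuals)

# Deliberately wrong mean: residuals should drift away from uniformity.
ks_wrong = ks_uniform([quantile_residual(y, 3.0, rng) for y in counts])
print(f"KS distance, correct model: {ks_correct:.3f}; wrong mean: {ks_wrong:.3f}")
```

The randomization step is what makes the residuals continuous despite the discrete outcome, which is why the QQ plot is compared against a uniform rather than a normal reference.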

Discussion

This study primarily focused on the longitudinal analysis of NOD and NOF among young children drawn from 35 clusters within the SNNPR and Oromia regions of Ethiopia. The study employed exploratory analysis, that is, a data-driven approach, to identify a suitable distribution family for modeling the two outcomes, and emphasized the importance of such analysis in achieving appropriate model specification.

The results obtained from the longitudinal MZIP model reveal that, after adjusting for subject-specific and cluster-specific random effects, several covariates are associated with both the log of the expected value of NOD and the log odds of not having diarrhea. The covariates region, EBF months, and gender were found to have significant associations with the log of the expected value of NOD. In addition, significant associations were observed for the interactions between linear time and MNP usage, as well as between quadratic time and MNP usage, in relation to the log of the expected value of NOD. Furthermore, gender, region, and the interaction of time by MNP usage showed significant relationships with the log odds of not having diarrhea. The study shows that region has a significant association with the log of the expected value of NOD and the odds of not having diarrhea. This observation is consistent with the findings of a systematic review that highlighted regional disparities in the prevalence of diarrhea among children under five in Ethiopia [31]. Furthermore, our study reveals that EBF months are significantly associated with the log expected NOD, aligning with the results of several previous studies [10, 11, 21, 22, 32, 33]. It was observed that the log odds of not having diarrhea were higher for female children compared to their male counterparts. This finding is consistent with prior research conducted in Ethiopia that reported a significant association between a child’s sex and the odds of experiencing diarrhea [34].

One of the interesting findings of this study underscores the importance of starting with exploratory analysis and the impact of model specification on study outcomes. A prior study employing the same dataset as the current study concluded that longitudinal diarrhea prevalence was higher among children who used MNP [18]. However, the results of the current study contradict that conclusion by revealing an inverted U-shaped trend in the log expected NOD as time increases for children who used MNP. This indicates that the log expected NOD for children who used MNP increases in comparison to those who did not as time increases until a turning point is reached. After this turning point, the negative coefficient of the quadratic interaction term implies that the log expected NOD decreases as time increases for children who used MNP, compared to those who did not. In addition, the odds of not having diarrhea were initially very low for children who used MNP at the start of the study. The exploratory plot also reveals a high prevalence of diarrhea among children who used MNP at the beginning of the study, which may explain the inverted U-shaped rate of change in the log expected NOD.

The variances of the individual-level and cluster-level random effects for the outcome NOD indicate that there is more variability between individuals than between clusters. This may be due to individual-level factors such as poor hygiene, poor sanitation, or poor nutrition practices [14]. Since these variables were not included in the model, the variation due to these factors is absorbed by the subject-specific random effects, which may increase their variance. In addition, individual reactions to the low-dose iron micronutrient powder may differ at the beginning of the study. A previous study shows that micronutrient supplements containing iron can increase the incidence of diarrhea [35]. Therefore, some children may be sensitive to the micronutrient powder, which may contribute to high variability between individuals within the same cluster.

Moreover, the study demonstrates that the odds of not having diarrhea for children who used MNP increase as time progresses compared to those who did not use MNP. Thus, we can assert that the use of MNP does not increase the longitudinal prevalence of diarrhea. Our findings align with another study that demonstrated that providing children with MNP did not increase the risk of childhood infectious diseases such as diarrhea and lower respiratory infections [19]. Furthermore, a different study indicated that supplementing children with micronutrients, including vitamin A and zinc, can reduce the severity of diarrhea [20]. Based on these findings, we can say that selecting a data-driven model plays a pivotal role in drawing reliable conclusions.

The findings of this study for the outcome NOF reveal that, after adjusting for the subject-specific and cluster-specific random effects, region, NOD and the interaction $\text {MNP usage} \times \text {EBF months}$ had a significant association with the log expected NOF. In addition, region, EBF months and the interaction $\text {MNP usage} \times \text {time}$ had a significant association with the odds of having flu/common cold. A significant relationship between NOF and region was identified in both parts of the model. This finding is consistent with the results reported in a prior investigation [7]. Furthermore, the current study supported the causal relationship between diarrhea and ARI, which aligns with earlier findings [17]. It was observed that the log expected NOF increases with NOD. This result is supported by previous findings that show diarrhea to be a risk factor for ARI [7, 15]. Various studies also show the simultaneous occurrence of ARI and diarrhea [16, 17].

Moreover, this study indicates a significant association between the interaction of EBF month and MNP usage and the log expected NOF. While there are no studies specifically addressing the effects of the interaction between EBF month and MNP usage, various studies have highlighted the benefits of EBF for the first six months of a child’s life [15, 21, 23]. In addition, a study has demonstrated that providing children with MNP support can reduce the risk of infectious diseases [20]. Furthermore, this study indicates that the odds of not having flu/common cold increased with each two-week increment in time for children who used MNP, compared to children who did not. This finding aligns with earlier studies [19, 20]. However, it contradicts a previous study that demonstrated a higher longitudinal prevalence of flu/common cold among children who used MNP compared to those who did not [18]. The variances of the individual-level and cluster-level random effects for the outcome NOF indicate that most of the variability in NOF occurs at the cluster level rather than the individual level. This suggests that the occurrence of flu/common cold is mostly influenced by cluster-level factors, such as the environmental conditions of the villages, as it is an easily transmittable disease. In addition, flu/common cold is associated with seasonal changes that can affect the entire cluster [36]. These factors may increase the variance of the cluster-specific random effects. The current study could not control for seasonal effects because of the limitations of the data. Thus, we recommend that future studies control for seasonal effects, as this may provide significant additional insights.

Conclusion

In this paper, we conducted a multilevel zero-inflated Poisson regression analysis on longitudinal data focusing on infectious diseases such as diarrhea and flu among children aged 6 to 11 months. Our study underscores the vital role of commencing with exploratory analysis to select the appropriate statistical model for the data. After adjusting for random effects, we observed that children who used MNP exhibited an initially higher rate of change in the log expected NOD compared to those who did not use MNP. However, over time, this rate declined. In addition, region, EBF month, and gender demonstrated significant associations with NOD. Similarly, children who used MNP and were exclusively breastfed for six months showed a decrease in the log expected NOF compared to their counterparts. Furthermore, the odds of not having flu/common cold increased with each two-week increment in time for children who used MNP. Region and NOD were also found to have significant associations with NOF. In light of these findings, we emphasize the importance of starting from exploratory analysis as a fundamental step in statistical modeling. In addition, we recommend raising awareness about the critical importance of EBF for the first six months to mitigate the impact of infectious diseases. Policymakers and health practitioners should encourage MNP usage, with regular monitoring and adaptation of programs over time for better outcomes. Further, developing comprehensive strategies that consider the joint influence of MNP usage and exclusive breastfeeding, along with interventions for childhood diarrhea, may assist in reducing the morbidity and mortality associated with comorbidity of diarrhea and flu. Future studies should consider examining the dependency between infectious diseases and malnutrition indicators over time, as malnourished children are more vulnerable to these diseases.

Availability of data and materials

The data that support the findings of this study are available from the authors upon reasonable request and with the permission of the Ethiopian Public Health Institute.

Abbreviations

AIC: Akaike information criterion
AOM: Age of a mother
ARIs: Acute respiratory infections
BIC: Bayesian information criterion
EBF: Exclusive breastfeeding
EPHI: Ethiopian Public Health Institute
ESM: Educational status of a mother
GP: Generalized Poisson
LRT: Likelihood ratio test
MNP: Micronutrient Powder
MZIP: Multilevel Zero-Inflated Poisson
NB: Negative Binomial
NOD: Number of days a child suffered from diarrhea
NOF: Number of days a child suffered from flu/common cold
SDGs: Sustainable Development Goals
SNNPR: Southern Nations, Nationalities, and Peoples’ Region
WHO: World Health Organization
ZIP: Zero-Inflated Poisson

References

1. World Health Organization (WHO). Child mortality (under 5 years). 2020. https://www.who.int/news-room/fact-sheets/detail/levels-and-trends-in-child-under-5-mortality-in-2020. Accessed 23 Sep 2023.
2. UNICEF. Diarrhea. 2022. https://data.unicef.org/topic/child-health/diarrhoeal-disease/. Accessed 23 Sep 2023.
3. World Health Organization (WHO). The top 10 causes of death. 2020. https://www.who.int/news-room/fact-sheets/detail/the-top-10-causes-of-death. Accessed 23 Sep 2023.
4. Dewey KG, Mayers DR. Early child growth: how do nutrition and infection interact? Matern Child Nutr. 2011;7:129–42.
5. Kundu S, Kundu S, Al Banna MH, Ahinkorah BO, Seidu AA, Okyere J. Prevalence of and factors associated with childhood diarrhoeal disease and acute respiratory infection in Bangladesh: an analysis of a nationwide cross-sectional survey. BMJ Open. 2022;12(4):e051744.
6. Feleke Y, Legesse A, Abebe M. Prevalence of diarrhea, feeding practice, and associated factors among children under five years in Bereh District, Oromia, Ethiopia. Infect Dis Obstet Gynecol. 2022. p. 1-13. https://doi.org/10.1155/2022/4139648.
7. Merera AM. Determinants of acute respiratory infection among under-five children in rural Ethiopia. BMC Infect Dis. 2021;21(1):1–12.
8. Tareke AA, Enyew EB, Takele BA. Pooled prevalence and associated factors of diarrhea among under-five years children in East Africa: A multilevel logistic regression analysis. PLoS ONE. 2022;17(4):e0264559.
9. Apanga PA, Kumbeni MT. Factors associated with diarrhoea and acute respiratory infection in children under-5 years old in Ghana: an analysis of a national cross-sectional survey. BMC Pediatr. 2021;21(1):1–8.
10. Bbaale E. Determinants of diarrhoea and acute respiratory infection among under-fives in Uganda. Australas Med J. 2011;4(7):400.
11. Saeed OB, Haile ZT, Chertok IA. Association between exclusive breastfeeding and infant health outcomes in Pakistan. J Pediatr Nurs. 2020;50:e62–8.
12. Demissie GD, Yeshaw Y, Aleminew W, Akalu Y. Diarrhea and associated factors among under five children in sub-Saharan Africa: Evidence from demographic and health surveys of 34 sub-Saharan countries. PLoS ONE. 2021;16(9):e0257522.
13. Tesfaye TS, Magarsa AU, Zeleke TM. Moderate to severe diarrhea and associated factors among under-five children in Wonago District, South Ethiopia: a cross-sectional study. Pediatr Health Med Ther. 2020;11:437–43. https://doi.org/10.2147/PHMT.S266828.
14. Hailu B, Ji-Guo W, Hailu T. Water, sanitation, and hygiene risk factors on the prevalence of diarrhea among under-five children in the rural community of Dangila district, northwest Ethiopia. J Trop Med. 2021;2021.
15. Mir F, Ariff S, Bhura M, Chanar S, Nathwani AA, Jawwad M, et al. Risk factors for acute respiratory infections in children between 0 and 23 months of age in a peri-urban district in Pakistan: A matched case-control study. Front Pediatr. 2022;9:704545.
16. Walker CLF, Perin J, Katz J, Tielsch JM, Black RE. Diarrhea as a risk factor for acute lower respiratory tract infections among young children in low income settings. J Glob Health. 2013;3(1).
17. Newman KL, Gustafson K, Englund JA, Khatry SK, LeClerq SC, Tielsch JM, et al. Risk of respiratory infection following diarrhea among adult women and infants in Nepal. Am J Trop Med Hyg. 2020;102(1):28.
18. Samuel A, Brouwer ID, Feskens EJ, Adish A, Kebede A, De-Regil LM, et al. Effectiveness of a program intervention with reduced-iron multiple micronutrient powders on iron status, morbidity and growth in young children in Ethiopia. Nutrients. 2018;10(10):1508.
19. Lemaire M, Islam QS, Shen H, Khan MA, Parveen M, Abedin F, et al. Iron-containing micronutrient powder provided to children with moderate-to-severe malnutrition increases hemoglobin concentrations but not the risk of infectious morbidity: a randomized, double-blind, placebo-controlled, noninferiority safety trial. Am J Clin Nutr. 2011;94(2):585–93.
20. Fischer Walker CL, Black RE. Micronutrients and diarrheal disease. Clin Infect Dis. 2007;45(Supplement_1):S73–S77.
21. Mulatu T, Yimer NB, Alemnew B, Linger M, Liben ML. Exclusive breastfeeding lowers the odds of childhood diarrhea and other medical conditions: evidence from the 2016 Ethiopian demographic and health survey. Ital J Pediatr. 2021;47(1):1–6.
22. Hossain S, Mihrshahi S. Exclusive breastfeeding and childhood morbidity: A narrative review. Int J Environ Res Public Health. 2022;19(22):14804.
23. Duijts L, Jaddoe VW, Hofman A, Moll HA. Prolonged and exclusive breastfeeding reduces the risk of infectious diseases in infancy. Pediatrics. 2010;126(1):e18–25.
24. World Health Organization (WHO). Infant and young child feeding. 2021. https://www.who.int/data/nutrition/nlis/info/infant-and-youngchild-feeding. Accessed 28 Sept 2023.
25. Lambert D. Zero-inflated Poisson regression, with an application to defects in manufacturing. Technometrics. 1992;34(1):1–14.
26. Lee AH, Wang K, Scott JA, Yau KK, McLachlan GJ. Multi-level zero-inflated Poisson regression modelling of correlated count data with excess zeros. Stat Methods Med Res. 2006;15(1):47–61.
27. Dunn PK, Smyth GK. Randomized quantile residuals. J Comput Graph Stat. 1996;5(3):236–44.
28. Hartig F. DHARMa: Residual Diagnostics for Hierarchical (Multi-Level/Mixed) Regression Models. R package version 0.4.6. 2022. http://florianhartig.github.io/DHARMa/. Accessed 26 Sept 2023.
29. Stram DO, Lee JW. Variance components testing in the longitudinal mixed effects model. Biometrics. 1994;1171–7.
30. Moghimbeigi A, Eshraghian MR, Mohammad K, Mcardle B. Multilevel zero-inflated negative binomial regression modeling for over-dispersed count data with extra zeros. J Appl Stat. 2008;35(10):1193–202.
31. Alebel A, Tesema C, Temesgen B, Gebrie A, Petrucka P, Kibret GD. Prevalence and determinants of diarrhea among under-five children in Ethiopia: a systematic review and meta-analysis. PLoS ONE. 2018;13(6):e0199684.
32. Mihrshahi S, Ichikawa N, Shuaib M, Oddy W, Ampon R, Dibley MJ, et al. Prevalence of exclusive breastfeeding in Bangladesh and its association with diarrhoea and acute respiratory infection: results of the multiple indicator cluster survey 2003. J Health Popul Nutr. 2007;25(2):195.
33. Hamer DH, Solomon H, Das G, Knabe T, Beard J, Simon J, et al. Importance of breastfeeding and complementary feeding for management and prevention of childhood diarrhoea in low-and middle-income countries. J Glob Health. 2022;12.
34. Anteneh ZA, Andargie K, Tarekegn M. Prevalence and determinants of acute diarrhea among children younger than five years old in Jabithennan District, Northwest Ethiopia, 2014. BMC Public Health. 2017;17(1):1–8.
35. Soofi S, Cousens S, Iqbal SP, Akhund T, Khan J, Ahmed I, et al. Effect of provision of daily zinc and iron with several micronutrients on growth and morbidity among young children in Pakistan: a cluster-randomised trial. Lancet. 2013;382(9886):29–40.
36. Vidal K, Sultana S, Patron AP, Salvi I, Shevlyakova M, Foata F, et al. Changing epidemiology of acute respiratory infections in under-two children in Dhaka, Bangladesh. Front Pediatr. 2022;9:728382.


Acknowledgements

We thank the Ethiopian Public Health Institute for giving access to the data. Mrs. Bezalem Eshetu Yirdaw would like to thank the University of South Africa for providing a good working environment. She would also like to thank the National Research Foundation (NRF) of South Africa for the partial scholarship, the Schlumberger Foundation Faculty for the Future program for the financial support provided during her study, and the L’Oreal-UNESCO For Women in Science foundation for the endowment and award towards her PhD project in 2022.

Funding

The authors declare that there was no funding to conduct this research.

Author information

Authors and Affiliations

Department of Statistics, University of South Africa, c/o Christiaan de Wet Road & Pioneer Avenue, Johannesburg, 1709, Florida, South Africa
Bezalem Eshetu Yirdaw & Legesse Kassa Debusho
Ethiopian Public Health Institute, Gulele Sub City, Addis Ababa, Ethiopia
Aregash Samuel


Contributions

BEY reviewed the literature, performed the conceptualization, statistical methods, statistical analyses and interpretations, and compiled the manuscript. LKD conceptualized the research problems, suggested the statistical methods applied in the paper, and supervised and reviewed the findings of the data analyses and the compilation of the manuscript. AS reviewed the manuscript.

Corresponding author

Correspondence to Bezalem Eshetu Yirdaw.

Ethics declarations

Ethics approval and consent to participate

Ethical approval for this study was obtained from the University of South Africa School of Science Ethics Review Committee with reference number 2023/CSET/SOS/014. The data for this study are secondary data obtained from the Ethiopian Public Health Institute, and ethical approval for the original study was obtained from the Ethiopian National Research Ethics Review Committee (NRERC). Signed consent was obtained from caregivers of the study children before participation in the study. The study was registered at http://www.clinicaltrials.gov/ with the clinical trials identifier NCT02479815. All methods were carried out in accordance with relevant guidelines and regulations and with the Declaration of Helsinki.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.


About this article

Cite this article

Yirdaw, B.E., Debusho, L.K. & Samuel, A. Application of longitudinal multilevel zero inflated Poisson regression in modeling of infectious diseases among infants in Ethiopia. BMC Infect Dis 24, 927 (2024). https://doi.org/10.1186/s12879-024-09820-0


Received: 01 April 2024
Accepted: 27 August 2024
Published: 07 September 2024
DOI: https://doi.org/10.1186/s12879-024-09820-0
