Measuring and correcting bias in indirect estimates of under-5 mortality in populations affected by HIV/AIDS: a simulation study
In populations that lack vital registration systems, under-5 mortality (U5M) is commonly estimated using survey-based approaches, including indirect methods. One assumption of indirect methods is that a mother’s survival and her children’s survival are not correlated, but in populations affected by HIV/AIDS this assumption is violated, and thus indirect estimates are biased. Our goal was to estimate the magnitude of the bias, and to create a predictive model to correct it.
We used an individual-level, discrete time-step simulation model to measure how the bias in indirect estimates of U5M changes under various fertility rates, mortality rates, HIV/AIDS rates, and levels of antiretroviral therapy. We simulated 4480 populations in total and measured the amount of bias in U5M due to HIV/AIDS. We also developed a generalized linear model via penalized maximum likelihood to correct this bias.
We found that indirect methods can underestimate U5M by 0–41% in populations with HIV prevalence of 0–40%. Applying our model to 2010 survey data from Malawi and Tanzania, we show that indirect methods would underestimate U5M by up to 7.7% in those countries at that time. Our best fitting model to correct bias in U5M had a root median square error of 0.0012.
Indirect estimates of U5M can be significantly biased in populations affected by HIV/AIDS. Our predictive model allows scholars and practitioners to correct that bias using commonly measured population characteristics. Policies and programs based on indirect estimates of U5M in populations with generalized HIV epidemics may need to be reevaluated after accounting for estimation bias.
Keywords: Under-5 mortality, Indirect methods of estimation, Bias, HIV/AIDS
Abbreviations
AIDS: Acquired immune deficiency syndrome
ASFR: Age-specific fertility rates
CD4: Cluster of differentiation 4
CEB: Children ever born
DHS: Demographic and health survey
HIV: Human immunodeficiency virus
IGME: UN Inter-agency Group for Child Mortality Estimation
MDGs: Millennium Development Goals
MMR: Maternal mortality ratio
PMTCT: Prevention of mother-to-child transmission
SDGs: Sustainable Development Goals
TFR: Total fertility rate
UNAIDS: Joint United Nations Programme on HIV/AIDS
WDI: World Development Indicators
Under-5 mortality (U5M) is an important indicator of population health, and relationships between U5M and fertility, population growth, economic growth, and democratization are actively researched [1, 2, 3, 4, 5, 6]. Several national and international goals, most notably the Millennium Development Goals (MDGs) and the Sustainable Development Goals (SDGs), have included U5M as a target indicator. MDG4 called for a two-thirds reduction from 1990 U5M levels by 2015, and SDG3 calls for a reduction of U5M to at least as low as 25 per 1000 live births by 2030. Yet accurate measurement of U5M in many countries is still hampered by the quality and/or availability of data [7, 8, 9, 10].
Most child deaths occur in countries that lack or have incomplete vital registration systems. In such populations, survey- and census-based methods for mortality rate estimation are commonly used. Survey-based methods include direct and indirect estimation. The former requires the collection of a full birth history, that is, date of birth and age at death, if appropriate, for every live birth a woman has had. With that information U5M rates can be calculated for any time period before the survey. However, because of small sample sizes, rates are typically calculated for 5-year periods (1–5, 6–10 and 11–15 years before the survey). Indirect methods, by contrast, require only the collection of a summary birth history. Mothers are asked about the number of live-born children they have ever given birth to and the number that are still alive. No information about dates of birth or dates of death is collected. Models of fertility and age-specific mortality are used to estimate the probability of dying between birth and age 5 (U5M) based on the ratio of children dead (CD) to children ever born (CEB). The resulting estimates correspond to periods that precede the survey date by a length of time determined largely by age patterns of fertility, approximated by parity ratios across age groups. Although full birth histories have come to dominate the measurement of U5M at the country level, summary birth histories remain valuable. They are often included in population censuses, and offer greater potential for spatial or socioeconomic disaggregation.
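The arithmetic behind an indirect estimate can be sketched as follows. This is a minimal illustration that assumes the Trussell regression form of the Brass method, in which the proportion of children dead in an age group of mothers is scaled by a multiplier built from parity ratios; the text above does not say which variant was used. The function names, the coefficients a, b, c and all input values are hypothetical placeholders, not published coefficients or survey data.

```python
def proportion_dead(children_ever_born, children_dead):
    """Proportion of children dead, D(i), for one age group of mothers."""
    if children_ever_born <= 0:
        raise ValueError("children_ever_born must be positive")
    return children_dead / children_ever_born


def trussell_multiplier(a, b, c, p1, p2, p3):
    """Multiplier k(i) = a + b*(P1/P2) + c*(P2/P3).

    p1, p2, p3 are the mean parities of women aged 15-19, 20-24 and 25-29.
    a, b, c are regression coefficients that depend on the age group and
    on the model life table family (supplied by the caller).
    """
    return a + b * (p1 / p2) + c * (p2 / p3)


def indirect_q(children_ever_born, children_dead, a, b, c, p1, p2, p3):
    """Probability of dying by an exact age, implied by one age group."""
    k = trussell_multiplier(a, b, c, p1, p2, p3)
    return k * proportion_dead(children_ever_born, children_dead)


# Hypothetical inputs, for illustration only.
q = indirect_q(children_ever_born=3000, children_dead=450,
               a=1.0, b=0.1, c=-0.2, p1=0.2, p2=1.0, p3=2.0)
print(round(q, 4))
```

In practice the estimate for each age group is then converted to a common index such as 5q0 with a model life table, which this sketch leaves out.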
In populations affected by HIV/AIDS, three key assumptions of indirect methods for U5M estimation are likely to be violated. First, the methods assume that the survival of a mother and the survival of her children are not correlated. HIV/AIDS has a substantial impact on the mortality risks of children born to HIV-positive mothers due to vertical transmission of the virus and to other harmful consequences of maternal death. Empirical studies demonstrate that the survival of a mother and that of her children are highly correlated in populations affected by HIV/AIDS. Note that this also leads to bias in direct estimates of U5M that rely on surveys, because women who have died are under-represented in the survey sample.
The second assumption is that the mortality experience of the children of mothers in each age group at the time of the survey is representative of the mortality experience of the children of all mothers for some time period in the past; in other words, time trends in U5M need to have been gradual and unidirectional. If the incidence of HIV/AIDS has changed over time (or access to antiretroviral therapy (ART) has changed) then this assumption would be violated.
The third assumption is that age-patterns of under-5 mortality are accurately captured in the mortality model (i.e., life table) that is used. To the extent that populations impacted by HIV have age-patterns of mortality that differ from those available in any model life tables, the indirect estimates would be biased. Recently developed model life tables based on demographic surveillance systems in rural Africa are among the first to account for the impact of HIV.
Underestimation of U5M may have a range of undesirable consequences. First, it can lead to overestimates of intervention effectiveness and to false declarations of success in campaigns to meet objectives such as the MDGs or the SDGs. If the bias is large enough, it may appear that U5M is decreasing when it is in fact increasing. Second, it may also result in resources previously dedicated to lowering U5M being reallocated to other targets when there is still scope for these resources to produce significant benefits in reducing the burden of U5M. Finally, underestimates of U5M may make epidemics, such as HIV, appear less harmful than they are in reality. To address these concerns, we offer an alternative to correct the bias due to HIV in indirect estimates of U5M, which requires only estimates of HIV prevalence in the year of the survey and 10 years prior to the survey, and an estimate of ART prevalence in the year prior to the survey. Given the centrality of U5M estimates to many policy and planning efforts in global health, we intend that this tool will facilitate more reliable U5M estimation for countries impacted by HIV and produce corresponding benefits for priority-setting and other decision-making in these settings.
Previous studies of the bias in estimates of U5M due to HIV/AIDS include [16, 17, 18]. Only Ward and Zaba assessed indirect estimates, using a stable population model, and assuming that HIV incidence was stable over time. They found that the degree of negative bias in indirect mortality estimates increased from 1.2 to 44.3% as the adult prevalence of HIV increased from 2.5 to 45%, with greater bias in estimates from older women, particularly those aged 45–49.
Hallett et al. calculated bias in direct estimates of U5M based on a prospective, population-based cohort in rural Zimbabwe that used verbal autopsies to identify AIDS deaths. They also built a mathematical model calibrated to the empirical data to estimate and correct the bias in U5M. Bias was calculated by comparing a demographic and health survey (DHS) continuous time series, consisting of smoothed direct estimates of U5M, to a DHS corrected time series. Reports from surviving mothers underestimated U5M by 9.8% compared to reports from all mothers, in a population in which HIV prevalence fell from 22% in 1998 to 18% in 2005.
Most recently, Walker et al. used a cohort component projection model where the key inputs were derived from the latest projections available from the Joint United Nations Programme on HIV/AIDS (UNAIDS) Spectrum package. Spectrum outputs include: annual number of births (typically from 1970 onwards), number of women each year in need of prevention of mother-to-child transmission (PMTCT, considered as a proxy for the number of births to HIV-positive women), and number of HIV-positive infants. The Spectrum model takes into account the fertility-reducing effects of HIV, the estimated transmission of HIV from mother to child, breastfeeding patterns, and the impact of interventions to reduce mother-to-child transmission. For HIV-negative births, the risks of dying in each year from birth to age 5 years were obtained from a model life table in the Coale and Demeny “West” family, using a level of U5M that was a best guess of the U5M in the HIV-negative population. Thus, the model assumed that mortality of HIV-negative children born to HIV-positive mothers was the same as that for children born to HIV-negative mothers. The model did not take into account the age when a woman is infected with HIV when estimating mortality due to AIDS. It estimated bias by comparing the ratio of under-five deaths to births for all mothers and for surviving mothers across the three 5-year intervals preceding the year of the survey.
This paper builds on the literature examining bias in U5M estimates, focusing on indirect methods and using a simulation model to incorporate a more comprehensive set of population characteristics than in previous studies. Using the model to simulate a variety of trajectories in HIV incidence, levels of ART coverage, mortality rates and fertility rates, we calculated the magnitude of bias in indirect estimates of U5M under different combinations of these variables. Based on the results of the simulations, we developed a parsimonious predictive model of bias as a function of a subset of these variables, and we used the predictive model to adjust estimates based on empirical data from Malawi and Tanzania. This analysis was the first since Ward and Zaba to assess indirect estimates. Unlike Ward and Zaba, the evolution of the AIDS epidemic was incorporated into the simulation model, and unlike Walker et al. the dynamics of ART take-up were included. In addition, the simulation used more recent data than Ward and Zaba and Hallett et al., and, unlike the latter, it was not calibrated to empirical cohort data, which means that this study relies more on parameters estimated in previous studies.
We created a discrete-time, stochastic, individual-based model to simulate fertility, HIV infection, ART initiation, and mortality for women and their children living during the period 1946–2010. In each yearly time step, each woman in the model faces some probability of giving birth, being infected with HIV, initiating ART (if HIV-positive), and dying. Children born to HIV-positive mothers face some probability of infection at birth, all children face some probability of dying each year, and female children, should they survive to age 15, begin to face the same probabilities listed above. In other words, children born during the simulation can become adults in the simulation. Parameters of the model were derived from published and unpublished sources, as detailed below. Some of the parameters (the “inputs”) were varied across simulations in order to generate populations with a wide range of fertility, mortality, HIV incidence, and ART initiation trajectories. Other parameters remained fixed across populations, particularly those that define biological relationships (e.g. survival time among HIV-positive women who do not initiate ART).
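The yearly update described above can be sketched as a small individual-based loop. This is an illustrative reduction, not the authors' R code: every probability in `P` is a hypothetical constant (the actual model uses age-, year- and CD4-specific values), daughters do not enter the adult population at age 15, and the class and function names are invented for the example.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Child:
    age: int = 0
    alive: bool = True
    hiv_positive: bool = False


@dataclass
class Woman:
    age: int
    alive: bool = True
    hiv_positive: bool = False
    on_art: bool = False
    children: list = field(default_factory=list)


# Hypothetical annual probabilities, for illustration only.
P = {"birth": 0.20, "infection": 0.02, "art": 0.10, "mtct": 0.30,
     "death_neg": 0.005, "death_pos": 0.08, "death_art": 0.03,
     "child_death_neg": 0.02, "child_death_pos": 0.25}


def step_child(child, rng):
    """One year for a child: die with some probability, otherwise age."""
    if not child.alive:
        return
    p = P["child_death_pos"] if child.hiv_positive else P["child_death_neg"]
    if rng.random() < p:
        child.alive = False
    else:
        child.age += 1


def step_woman(woman, rng):
    """One year for a woman: birth, infection, ART initiation, death."""
    for child in woman.children:  # children are followed after a mother dies
        step_child(child, rng)
    if not woman.alive:
        return
    if 15 <= woman.age <= 49 and rng.random() < P["birth"]:
        infected = woman.hiv_positive and rng.random() < P["mtct"]
        woman.children.append(Child(hiv_positive=infected))
    if not woman.hiv_positive:
        if rng.random() < P["infection"]:
            woman.hiv_positive = True
    elif not woman.on_art and rng.random() < P["art"]:
        woman.on_art = True
    if woman.on_art:
        p_death = P["death_art"]
    elif woman.hiv_positive:
        p_death = P["death_pos"]
    else:
        p_death = P["death_neg"]
    if rng.random() < p_death:
        woman.alive = False
    else:
        woman.age += 1


def simulate(n_women=1000, years=35, seed=0):
    rng = random.Random(seed)
    women = [Woman(age=15) for _ in range(n_women)]
    for _ in range(years):
        for woman in women:
            step_woman(woman, rng)
    return women


women = simulate()
survivors = [w for w in women if w.alive]
ceb = sum(len(w.children) for w in survivors)
cd = sum(not c.alive for w in survivors for c in w.children)
print(len(survivors), ceb, cd)
```

Because dead women keep their birth histories in this structure, the same tabulation can be repeated with women who died included, which is the comparison the bias calculation relies on.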
The goal of the simulation was to create a wide variety of population histories, resembling the experiences of different actual populations, to assess how bias will vary in relation to other population characteristics that may be measured independently (e.g., HIV prevalence). In order to characterize these general relationships rather than their expression in a small number of particular populations, the parameters in the simulation model vary over a range of values that each selected population characteristic may take, rather than precisely matching the fertility, mortality, HIV incidence, and ART initiation rates experienced in specific settings. All simulations were run in R, and the data and code are freely available at https://github.com/jquattro/hiv-childmort-bias. A user-friendly web application to correct indirect estimates is available at johnquattrochi.com/bias.
Size and date of initial population
We initiated the simulation with 22,500 women who were aged 15 years in 1906, and ran the simulation through 2010. This was the smallest initial population and shortest simulation duration (104 years) that produced stable estimates. Larger initial populations and longer durations were too computationally costly.
Annual probability of birth, HIV negative women
Annual probability of birth, HIV positive women not on ART
Using DHS data, Chen and Walker found that among women aged 15–19 years, those who were HIV-positive experienced higher ASFRs compared to HIV-negative women, with the ratio dependent on the percent of 15–19 year old women who were sexually active; also, among those aged over 19, HIV-positive women experienced lower fertility rates relative to HIV-negative women. We used the ratios estimated by Chen and Walker as fixed parameters in the simulation model (although the percent of females aged 15–19 who are sexually active was an input that varied across simulations).
Annual probability of birth, HIV positive women on ART
Several studies have found that incidence of pregnancy increases following initiation of ART [24, 25, 26], while at least one has found that incidence does not increase. The effect of ART on fertility likely depends on age, cluster of differentiation 4 (CD4) count at initiation, educational attainment, contraceptive use, and partner’s HIV status. For the simulation model, we assumed that, among women over age 19, ART erases half of the fertility decrease caused by HIV/AIDS. In other words, for women on ART, the ASFR ratios in Chen and Walker increase by half the difference from one (one indicating equal ASFRs between HIV-positive and HIV-negative women). We assumed that the ASFR for 15–19 year olds is not affected by ART. This simplifying assumption has minimal effect as few women in the simulation will be infected with HIV/AIDS and initiate ART by age 19.
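The ART adjustment to the fertility ratio amounts to moving the ratio halfway towards one. A minimal sketch, in which the function name and the example ratios are hypothetical and only the halving rule and the age cut-off come from the text:

```python
def asfr_ratio_on_art(ratio_off_art, age):
    """ASFR ratio (HIV-positive / HIV-negative) for a woman on ART.

    ratio_off_art is the ratio for HIV-positive women not on ART.
    For women over age 19 the ratio moves halfway towards 1;
    for women aged 15-19 it is left unchanged.
    """
    if age <= 19:
        return ratio_off_art
    return ratio_off_art + 0.5 * (1.0 - ratio_off_art)


# Hypothetical ratio of 0.8 for a 30-year-old: ART raises it to 0.9.
print(asfr_ratio_on_art(0.8, age=30))
# Hypothetical ratio of 1.2 for a 17-year-old: unchanged.
print(asfr_ratio_on_art(1.2, age=17))
```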
Maternal mortality: probability of mother’s death at each birth
Annual probability of HIV infection
The annual probability of HIV infection was selected among the HIV incidence curves estimated by Hogan and Salomon for 31 African countries. We selected five curves that included early-starting and late-starting epidemics, with either high or low peak incidence. The age pattern of incidence was determined using age-specific HIV incidence ratios from Heuveline.
CD4 count at infection and annual progression of CD4 count
Parameters governing CD4 count were derived from Hallett et al. Specifically, when a woman was infected with HIV, the square root of her initial CD4 count was a random draw from a normal distribution with a mean of 25.9 and a standard deviation of 0.61. CD4 was assumed to decline linearly over time. For each woman under age 35 the absolute yearly decline was defined by a random draw from a normal distribution with a mean of 1.32, and a standard deviation of 1. For women 35 years or older the draw came from a normal distribution with a mean of 2.0 and a standard deviation of 1.
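The CD4 draws can be sketched as below. Two points are assumptions made for this illustration rather than statements from the text: the yearly decline is applied on the square-root scale (the scale on which the initial value is drawn), and the age that selects the decline distribution is the age at infection. The floor at zero and the function names are also added here.

```python
import random


def initial_sqrt_cd4(rng):
    """Square root of CD4 count at infection: Normal(25.9, 0.61)."""
    return rng.gauss(25.9, 0.61)


def yearly_decline(age_at_infection, rng):
    """Yearly decline: Normal(1.32, 1) under age 35, Normal(2.0, 1) otherwise.

    The draw is not truncated here, so a small share of women would get a
    negative decline (a rising CD4 count); the text does not say how such
    draws were handled.
    """
    mean = 1.32 if age_at_infection < 35 else 2.0
    return rng.gauss(mean, 1.0)


def cd4_trajectory(age_at_infection, years, rng):
    """CD4 count at 0, 1, ..., years after infection.

    Assumption: the decline is linear on the square-root scale, and the
    square root is floored at zero before squaring.
    """
    root = initial_sqrt_cd4(rng)
    decline = yearly_decline(age_at_infection, rng)
    return [max(root - decline * t, 0.0) ** 2 for t in range(years + 1)]


rng = random.Random(1)
trajectory = cd4_trajectory(age_at_infection=30, years=10, rng=rng)
print([round(v) for v in trajectory])
```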
Annual probability of ART initiation, given that CD4 < threshold
For duration, we assumed that the median survival time on ART is 13 years. Thus we ended up with a series of annual probabilities for initiating ART given that a woman’s CD4 was below threshold, for 2004 to 2010.
Annual probability of death, HIV negative individuals
Time series for 5q0 and 1q0 estimates from the UN Inter-agency Group for Child Mortality Estimation (IGME) for selected countries were used as inputs. To estimate one-year, age-specific probabilities of death, the ratios of 1q2 to 1q3 to 1q4 from the UN Model Life Table, General Pattern for both sexes, were used to interpolate from the IGME estimates.
Time series for the probability of dying between ages 15 and 60 (45q15) were taken from the Institute for Health Metrics and Evaluation (2010) for selected countries. To obtain age-specific annual probabilities of death from ages five and up, the 45q15 for an input “model country” and year in the simulation were matched to the UN model life table with the closest 45q15.
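Matching to the closest model life table is a nearest-value lookup. A minimal sketch, in which the table labels and their 45q15 values are hypothetical and the distance measure (absolute difference) is an assumption:

```python
def closest_life_table(target_45q15, tables):
    """Return the label of the life table whose 45q15 is nearest the target.

    tables maps a life-table label to that table's 45q15.
    """
    return min(tables, key=lambda label: abs(tables[label] - target_45q15))


# Hypothetical model life tables, for illustration only.
tables = {"level_A": 0.20, "level_B": 0.30, "level_C": 0.40}
match = closest_life_table(0.33, tables)
print(match)
```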
Annual probability of death, HIV positive individuals not on ART
The annual probability of death for HIV-positive women who were not on ART was based on cumulative mortality reported in Walker, Hill, and Zhao, who drew on cohort studies by Schneider, Zwahlen, and Egger, Todd et al., and Stover et al.
Annual probability of death, HIV positive women on ART
HIV-positive women on ART faced an annual probability of death that was a function of CD4 count at ART initiation, presence or absence of symptoms at baseline, and time since initiation. The function was taken from the “medium” scenario published by Hallett et al. Women were assigned to “symptomatic” or “non-symptomatic” with probability 0.5, based on Braitstein et al. The median survival after initiation of ART ranged from roughly 13 to 19 years.
Mother-to-child transmission of HIV
Probability of mother-to-child transmission of HIV was taken from Stover et al. Transmission depends on breastfeeding duration and ART, including the assumption that all ART is single-dose nevirapine, which is less effective at preventing transmission than dual- or triple-treatment ART.
Range of inputs used in the simulation
The primary goal was to measure bias in indirect estimates across a set of populations that have experienced different rates of fertility, mortality, HIV infection, and ART initiation. To generate such a set of populations, we varied ten inputs: fertility, adult mortality, U5M, percent of 15–19 year olds who are sexually active, maternal mortality in 1990, percent annual decline in the maternal mortality rate, HIV incidence, duration of breastfeeding, and ART coverage. We simulated one population for each combination of inputs, for a total of 4480 populations.
For adult mortality we considered IHME estimates of 45q15 for 195 countries, 1970–2010. We selected Madagascar and Sudan to represent high-and-decreasing and low-and-steady adult mortality (Fig. 1b).
For U5M we considered UN IGME estimates for 195 countries. We chose estimates for Mali and Morocco to represent high-and-decreasing and low-and-decreasing U5M, in populations with low prevalence of HIV/AIDS (Fig. 1c). Note that, in the simulation, these are background mortality rates that capture causes of death other than HIV/AIDS.
For HIV incidence, we considered 31 curves estimated for urban or rural parts of selected African countries. We chose curves for urban Botswana, rural Cameroon, rural Malawi, rural Lesotho, and rural Uganda to vary the timing of epidemic onset and the level of epidemic peak (Fig. 1d).
Indirect estimation of under-5 mortality and calculation of bias
For each simulated population, we tabulated CEB and children surviving (CS) as of 2010 for two overlapping groups of women: (1) all surviving women aged 15–49, and (2) all surviving women and all women who died from HIV/AIDS aged 15–49. We used all women in each category rather than drawing a sample to simulate a survey in order to avoid sampling variability and focus on bias due to HIV/AIDS. The second population approximates a counterfactual in which no bias due to HIV/AIDS occurs. Inherent in our tabulations is the assumption that ‘dead’ women provide equally valid responses as women who survived. For each of the two groups of women, we used indirect methods to estimate under-5 mortality for each of the seven 5-year age groups of mothers aged 15–49 years. We used a UN General Standard model life table to estimate nq0 and to convert nq0 into 5q0.
Predictive model to correct for bias from HIV mortality
Our aim was to develop a predictive model, based on a large number of simulations, which related the bias due to HIV/AIDS in indirect measures of U5M to a small number of predictor variables that are available for most countries. The dependent variable was the absolute bias as defined above; the unit of analysis was the simulated population of a particular age group.
We employed a variety of modeling strategies, drawing on recent developments in predictive modeling. We randomly selected 80% of our data for model fitting, and used the other 20% for out-of-sample predictions. We gauged model performance using four metrics of out-of-sample prediction accuracy: root mean squared error, root median squared error, mean relative error, and median relative error.
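The split and the four accuracy metrics can be sketched with the standard library. The definition of relative error used here, the absolute prediction error divided by the absolute observed value, is an assumption, since the text does not define it; the function names and the example values are hypothetical.

```python
import random
import statistics


def split_80_20(rows, rng):
    """Randomly assign 80% of rows to fitting and 20% to testing."""
    rows = list(rows)
    rng.shuffle(rows)
    cut = int(0.8 * len(rows))
    return rows[:cut], rows[cut:]


def prediction_errors(observed, predicted):
    """Four out-of-sample accuracy metrics for paired values."""
    squared = [(p - o) ** 2 for o, p in zip(observed, predicted)]
    # Assumed definition of relative error; zero observations are skipped.
    relative = [abs(p - o) / abs(o)
                for o, p in zip(observed, predicted) if o != 0]
    return {
        "root_mean_sq": statistics.mean(squared) ** 0.5,
        "root_median_sq": statistics.median(squared) ** 0.5,
        "mean_rel": statistics.mean(relative),
        "median_rel": statistics.median(relative),
    }


train, test = split_80_20(range(100), random.Random(0))

# Hypothetical observed and predicted values, for illustration only.
errors = prediction_errors([0.10, 0.20, 0.40], [0.11, 0.18, 0.40])
print(len(train), len(test), errors)
```

The median-based metrics are less sensitive than the mean-based ones to a few populations with very large prediction errors.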
The full model included 53 variables: unadjusted U5M; five-year age group dummies; HIV prevalence 5, 10, and 20 years before the survey; ART prevalence 1, 3, and 5 years before the survey; TFR in the year of the survey and 10 years earlier; interactions between HIV prevalence and age group; interactions between ART prevalence and age group; and an intercept term. Note that while 2010 is used as the year of the survey throughout this paper, the predictive equation can be used for other years.
Our modeling strategies included forward and backward selection, principal components regression, partial least squares regression, and generalized linear models with penalized maximum likelihood. For forward and backward selection, we used Akaike’s Information Criterion and the Bayesian Information Criterion. We fit principal components regressions with 20, 30, and 35 components, and we fit partial least squares regressions with 16 and 32 components. We also fit a generalized linear model via penalized maximum likelihood with three elastic-net penalties: 0 (commonly referred to as ridge regression), 1 (lasso), and 0.5 (an intermediate value). With the penalty at zero, the coefficients of correlated predictors shrink towards zero and each other. With the penalty at one, a single coefficient will be retained from a group of correlated predictors. We used 10-fold cross-validation to select the elastic-net tuning parameter, and we generated prediction intervals from the generalized linear models via bootstrapping.
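For reference, the elastic-net objective for a Gaussian response, as documented for the glmnet package, can be written as below. The notation is introduced here and is not taken from the text: N is the number of observations, y_i the bias for observation i, x_i its vector of predictors, \alpha the elastic-net mixing value (0, 0.5 or 1 above), and \lambda the tuning parameter chosen by cross-validation.

```latex
\min_{\beta_0,\,\beta}\;
  \frac{1}{2N}\sum_{i=1}^{N}\left(y_i - \beta_0 - x_i^{\top}\beta\right)^{2}
  + \lambda\left[\frac{1-\alpha}{2}\,\lVert\beta\rVert_2^{2}
  + \alpha\,\lVert\beta\rVert_1\right]
```

Setting \alpha = 0 leaves only the squared (ridge) penalty, and \alpha = 1 leaves only the absolute-value (lasso) penalty, which is what allows coefficients to be set exactly to zero.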
Application to empirical data from Malawi and Tanzania
Table: Child survival, HIV, ART, and TFR for Malawi and Tanzania. Rows: children ever born; HIV prevalence (1990, 2000, 2010); ART prevalence (2005, 2007, 2009); total fertility rate (2000, 2010).
We estimated past ART coverage by assuming a constant proportional increase from no coverage in 2004 to the levels reported by UNAIDS in 2009–2012. We generated a point prediction and prediction interval for U5M for each country-age group-year observation, using standard statistical techniques. We also compared our adjustments to adjustments generated by the predictive model in Ward and Zaba. Because Ward and Zaba used a stable population model, it is not clear which year’s HIV prevalence is most appropriate for prediction. We used that of 10 years prior to the survey. This will likely overestimate the adjustment for women over 40 years old, but it should be reasonable for women aged 25–39 years.
Table: Outcomes for simulated populations, summary statistics. Rows: HIV prevalence (1990, 2000, 2010); ART coverage (2004, 2008, 2010); ART prevalence (2005, 2007, 2009); total fertility rate (2000, 2010).
For each of the 4480 simulated populations, we generated fourteen estimates of U5M, seven using surviving women (one estimate for each five-year age group from 15–19 to 45–49), and seven using surviving women and women who died from HIV/AIDS. Using those two sets of U5M estimates, we calculated 31,360 (7 × 4480) estimates of bias based on the difference between the unadjusted estimate (using reports from surviving women only) and the adjusted estimate (using reports from surviving women plus women who died from HIV/AIDS).
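The bias calculation itself is a difference of two estimates. In the sketch below, the sign convention (unadjusted minus adjusted, so that underestimation is negative) follows the negative values reported for relative bias; dividing by the adjusted estimate to obtain relative bias is an assumption, and the input values are hypothetical.

```python
def absolute_bias(u5m_surviving, u5m_all):
    """Unadjusted estimate (surviving women only) minus adjusted estimate
    (surviving women plus women who died from HIV/AIDS)."""
    return u5m_surviving - u5m_all


def relative_bias(u5m_surviving, u5m_all):
    """Absolute bias as a share of the adjusted estimate (assumed denominator)."""
    return absolute_bias(u5m_surviving, u5m_all) / u5m_all


# Hypothetical estimates for one age group in one simulated population.
abs_b = absolute_bias(0.090, 0.100)
rel_b = relative_bias(0.090, 0.100)
print(round(abs_b, 3), round(rel_b, 3))

# Seven age groups in each of 4480 populations give 31,360 bias estimates.
n_estimates = 7 * 4480
```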
Table: Bias in indirect estimates in 4480 simulated populations. Rows: years before survey that estimates pertain to; women who died from HIV/AIDS; children ever born (surviving women; surviving women + HIV deaths); dead children (surviving women; surviving women + HIV deaths); ratio of HIV deaths to surviving women.
The mean relative bias was highest for estimates from women aged 35–39 (−9.7%), followed by estimates from women 40–44 (−8.8%) and women 30–34 (−7.7%). Mean relative bias was also substantial for estimates from 45–49 year olds (−5.6%) and 25–29 year olds (−4.4%). For the two youngest age groups the mean relative bias was −1.5% (20–24 year olds) and −0.6% (15–19 year olds). The largest recorded relative biases were −40.5% for estimates from 35–39 year olds, −36.8% for estimates from 30–34 year olds and −31.6% for estimates from 40–44 year olds, which appeared in simulated populations with the highest HIV incidence curves, yielding HIV prevalence of up to 40% in 2000. These populations also had relatively low U5M (120–130 deaths per 1000 live births).
The mean of the ratio of HIV deaths to the number of surviving women was highest for those aged 40–44 (0.59) followed by 45–49 and 35–39 (0.51), 30–34 (0.27), 25–29 (0.09), 20–24 (0.03) and 15–19 year olds (0.02). Comparing surviving women to surviving women and HIV deaths, the mean number of children ever born begins to diverge at age 25–29, and the mean number of dead children begins to diverge at age 30–34. On average, women who died from HIV had fewer births and more dead children.
Table: Prediction errors from models to correct bias in indirect estimates of U5M. Columns: root mean square error; root median square error; mean relative error; median relative error. Rows: full linear model; forward selection (BIC, AIC); backward selection (BIC, AIC); glmnet (alpha = 0, 0.5, 1); PCR (ncomp = 20, 30, 35); PLS (ncomp = 16, 32).
Selection bias occurs in indirect estimates of U5M based on CEB and CS when the survival of children born to mothers who are not included in the survey differs from the survival of children whose mothers are included. In populations with high rates of HIV/AIDS, this selection bias can be significant, because a relatively large proportion of mothers die during their reproductive ages and their children die more frequently than other children due to the vertical transmission of HIV and the adverse effects of not having a living mother.
In this paper we presented an individual-based discrete time simulation model to measure and correct the bias in indirect estimates of U5M due to HIV/AIDS. The simulated populations were based on data and estimates from sub-Saharan Africa. We estimated bias by comparing indirect estimates from simulated reports of surviving women to estimates from simulated reports of surviving women and women who died from HIV/AIDS. We calculated bias in 4480 simulated populations, covering a range of peak HIV prevalence (0–40%), time between epidemic initiation and survey (25–35 years), ART coverage (0–79%), background U5M (50–290 deaths per 1000 live births), and TFR (2.4–6.9).
Our results showed negligible bias in estimates from 15–19 and 20–24 year olds. Unfortunately, this finding is of little practical value, since estimates based on reports of women at these ages are biased upwards for other reasons. However, reports from surviving women aged 25 and older underestimated U5M by over two percentage points (over 20 deaths per 1000 live births), or, in relative terms, 24%. Bias was greatest in reports from 30–34, 35–39 and 40–44 year olds, reaching 69 deaths per 1000 births, a relative bias of 41%. The magnitude of the bias calculated by our model is somewhat difficult to compare to that found by Ward and Zaba because of their use of a stable population model. They estimated that relative bias increased from −1.2% to −44.3% as the adult prevalence of HIV increased from 2.5 to 45%. That is generally consistent with the results of the present study, in which adult prevalence of HIV ranged from 0 to 40% and the relative bias ranged from 0% to −41%. Also consistent with our results, Ward and Zaba found that estimates from women aged over 30 were more biased than estimates from women under 25. We found, however, that bias in estimates from women aged 45–49 was lower than in estimates from those aged 30–44. This was due to two related factors. First, as Ward and Zaba noted, stable population models assume that the level of age-specific incidence risks is constant over time. For any given level of prevalence, a stable population model will overestimate the exposure of older cohorts, because no actual population has been subject to constant incidence for such a long period. Second, HIV incidence in our simulated populations peaked between 1988 and 1998, 12 to 22 years before the simulated surveys. Women who were 45–49 in 2010 would have given birth to many of their children prior to peak HIV incidence.
Our analysis has several advantages over previous work. Unlike the only other study of bias in indirect estimates, we did not use a stable population model, but allowed HIV, mortality and fertility rates to follow the trajectories of selected countries, and we also included ART. Thus we used a larger variety of inputs and more recent empirical data than Ward and Zaba and Hallett et al. In our simulations, the range of HIV prevalence was similar to that of Ward and Zaba, who used peak prevalence from 0 to 45%. We modeled background adult mortality using estimated 45q15 from country-time periods corresponding to life expectancies from 47 to 64 years; Ward and Zaba allowed adult mortality to vary from a life expectancy of 41 to 67 years. It is difficult to compare our fertility rates to their fertility model as they reported only the range they used for the location (−0.5 to 0.5) and spread (0.8 to 1.2) parameters of the relational system based on the Gompertz transformation of the Brass-Booth standard.
Our model also has several limitations. First, although the range of population characteristics was wider than in previous studies, the trajectories of HIV incidence, ART coverage, mortality rates and fertility rates considered here were a small fraction of all possible trajectories. The results of the predictive model should be applied with caution to population trajectories outside of the bounds explored in this study. Second, empirical data on the inputs required by the predictive model may not be available for some populations. In those cases, estimated inputs can be used. We encourage users to generate a range of bias estimates using a range of plausible estimated inputs (i.e. sensitivity analysis). Third, as in all models, our simulation included a number of simplifying assumptions, such as: use of a 1-year time step rather than continuous time; independence between the probability of giving birth and the probability of contracting HIV in a given time-step (although the probability of giving birth changes in time-steps following infection); use of only one set of age-specific HIV incidence ratios; independence of the probability of giving birth and CD4 count (although the former is influenced by HIV and ART status); independence of the effect of HIV infection on fertility and the duration of infection (this relationship is difficult to quantify); independence of child survival and maternal survival, other than through vertical transmission of HIV; use of a single model life table to convert nq0 into 5q0, which does not incorporate the effect of HIV on the age pattern of mortality [15, 52]; all vertical transmission occurs at birth; absence of variation in the effectiveness of ART in preventing vertical transmission; no drop-out once ART is initiated; and all women on ART take up PMTCT (and no women not on ART take up PMTCT).
In most cases, we adopted these simplifying assumptions because they were expected to have a relatively minimal effect on the main quantity of interest in this study, which was the HIV-related bias in indirect U5M rates; moreover, independent measurements of mortality, fertility and HIV rates showed that those rates were within acceptable ranges for our simulated populations (Table 2). Fourth, our study did not assess bias in indirect estimates due to factors other than HIV/AIDS. It is well established that indirect methods applied to reports from women aged 15–19 (and in some cases women aged 20–24) tend to overestimate U5M, due to the higher risk of first births and the correlation between lower socioeconomic status and younger childbearing (Hill 1991).
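The sensitivity analysis recommended above for populations with uncertain inputs can be sketched as follows. This is only an illustrative outline: the function names, the input names, and in particular `toy_predictor` with its arbitrary coefficients are hypothetical placeholders and are not the predictive model fitted in this study. A user would substitute the fitted model and its actual inputs.

```python
from itertools import product


def toy_predictor(hiv_prev_survey, hiv_prev_10y, art_coverage):
    """Hypothetical stand-in for a fitted bias model.

    The linear form and coefficients are arbitrary placeholders chosen only
    so the example runs; they are NOT estimates from the study.
    """
    return 0.5 * hiv_prev_survey + 0.3 * hiv_prev_10y - 0.1 * art_coverage


def sensitivity_range(predict, input_grid):
    """Evaluate `predict` on every combination of plausible input values.

    `input_grid` maps each input name to a list of plausible values.
    Returns the smallest and largest predicted bias, plus every
    (inputs, prediction) pair so the full spread can be inspected.
    """
    names = list(input_grid)
    results = []
    for values in product(*(input_grid[name] for name in names)):
        inputs = dict(zip(names, values))
        results.append((inputs, predict(**inputs)))
    biases = [bias for _, bias in results]
    return min(biases), max(biases), results


# Plausible ranges for inputs that are estimated rather than measured
# (values are illustrative assumptions).
grid = {
    "hiv_prev_survey": [0.10, 0.12],
    "hiv_prev_10y": [0.12, 0.15],
    "art_coverage": [0.2, 0.4],
}
lowest, highest, all_results = sensitivity_range(toy_predictor, grid)
print(f"Predicted relative bias ranges from {lowest:.3f} to {highest:.3f}")
```

Reporting the full range, rather than a single point prediction, makes the dependence of the correction on uncertain inputs explicit.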
HIV can also cause bias in direct estimation of U5M. Walker, Hill, and Zhao found relative biases ranging from 1.1 to 26.5% across six African countries and time periods ranging from 1–5 to 11–15 years before the survey. They found that the largest biases were in estimates from 6 to 10 years before the survey (corresponding to indirect estimates from 30- to 44-year-olds), and that biases in estimates from 11 to 15 years before the survey (corresponding to indirect estimates from 45- to 49-year-olds) were slightly lower, which is consistent with our results. Hallett et al., applying direct methods to prospective cohort data from rural Zimbabwe, measured a relative underestimate of 9.8% in U5M for the period 0–7 years before the survey, a period during which HIV prevalence fell from 23 to 18% among the study population, with minimal ART coverage, in a population with relatively low U5M (0.0671). Taking as inputs 18% HIV prevalence in the year of the survey, 20.5% 10 years earlier, and 23% 20 years earlier, with a baseline U5M of 0.0671, our model predicts a relative underestimate of 15.4% for 4 years prior to the survey (estimates from 25- to 29-year-olds). This is reasonably close to the Hallett et al. estimate, given the probable overestimate of prevalence used for 20 years prior to the survey, and the sensitivity of relative bias measures at low levels of U5M.
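The arithmetic behind this comparison can be made concrete with a short sketch. It assumes that relative bias is defined as (estimated − true) / true, so that an underestimate is negative, and that a known relative underestimate can be undone by dividing the estimate by one minus that proportion. These definitions and the function names are assumptions made for illustration. The 0.005 absolute error is an arbitrary value used only to show why relative bias is sensitive when U5M is low.

```python
def relative_bias(estimated, true):
    """Relative bias of an estimate; negative values are underestimates."""
    return (estimated - true) / true


def correct_underestimate(estimated, relative_underestimate):
    """Recover the unbiased value from an estimate that is known to be too
    low by the given proportion (e.g. 0.154 for a 15.4% underestimate)."""
    return estimated / (1.0 - relative_underestimate)


# Prevalence 10 years before the survey, taken as the midpoint between the
# values at the survey (18%) and 20 years earlier (23%): about 0.205.
midpoint_prevalence = (0.18 + 0.23) / 2

# A 15.4% relative underestimate applied to a baseline U5M of 0.0671,
# and the correction that recovers the baseline.
baseline_u5m = 0.0671
biased_u5m = baseline_u5m * (1.0 - 0.154)
recovered_u5m = correct_underestimate(biased_u5m, 0.154)

# The same absolute error produces a larger relative bias at low U5M.
absolute_error = 0.005  # arbitrary illustrative value
bias_low_u5m = relative_bias(baseline_u5m - absolute_error, baseline_u5m)
bias_high_u5m = relative_bias(0.2 - absolute_error, 0.2)

print(f"midpoint prevalence: {midpoint_prevalence:.3f}")
print(f"biased U5M: {biased_u5m:.4f}, recovered U5M: {recovered_u5m:.4f}")
print(f"relative bias at U5M 0.0671: {bias_low_u5m:.1%}")
print(f"relative bias at U5M 0.2: {bias_high_u5m:.1%}")
```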
In populations affected by HIV/AIDS, indirect estimates of U5M can be significantly biased. Our predictive model allows scholars and practitioners to correct that bias using commonly measured population characteristics. Policies and programs based on indirect estimates of U5M in populations with generalized HIV epidemics may need to be reevaluated after accounting for bias in indirect estimates.
We thank Juan Luis Herrera Cortijo for assistance with implementing the simulation, Simo Goshev, Ista Zahn, Alex Storer, and Kareem Carr from the Research Consulting Support at the Harvard-MIT Data Center for help with simulation modeling and cloud computing, Daniel Hogan for sharing HIV incidence estimates, Patrick Heuveline and Jason Thomas for sharing code for population projections, and Basia Zaba for detailed comments on an earlier draft.
JQ’s role included conceptualization, data curation, formal analysis, software, and writing the original draft. JS, KH, and MC’s roles included conceptualization and editing and review of the manuscript. All authors read and approved the final manuscript.
JQ was supported by a Harvard University Presidential Scholarship and an NIH Infectious Disease and Biodefense traineeship (T32 AI007535, PI: George Seage). The funders had no role in the design of the study; collection, analysis, and interpretation of data; or writing the manuscript.
Ethics approval and consent to participate
Consent for publication
The authors declare that they have no competing interests.
- 8. Hill K, You D, Inoue M, Oestergaard MZ. Technical Advisory Group of the United Nations Inter-agency Group for Child Mortality Estimation. Child Mortality Estimation: Accelerated Progress in Reducing Global Child Mortality, 1990–2010. Byass P, editor. PLoS Med. 2012;9(8):e1001303.
- 11. Brass W. Methods for estimating fertility and mortality from limited and defective data. Methods Estim Fertil Mortal Ltd Defective Data. 1975 [cited 2019 Apr 16]. Available from: https://www.cabdirect.org/cabdirect/abstract/19762901082
- 12. United Nations. Manual X: indirect techniques for demographic estimation. 1983.
- 15. INDEPTH Network, Ghana. INDEPTH model life tables for sub-Saharan Africa. Burlington: Ashgate Publishing, Ltd.; 2004. p. 166.
- 16. Ward P, Zaba B. The effect of HIV on the estimation of child mortality using the children surviving/children ever born technique. South Afr J Demogr. 2008;11(1):39–73.
- 20. R Development Core Team. R: A language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2013.
- 21. United Nations Population Division. World Fertility Data 2012. 2012 [cited 2019 Apr 16]. Available from: https://www.un.org/en/development/desa/population/publications/dataset/fertility/wfd2012/MainFrame.html
- 25. Makumbi FE, Nakigozi G, Reynolds SJ, Ndyanabo A, Lutalo T, Serwada D, et al. Associations between HIV Antiretroviral Therapy and the Prevalence and Incidence of Pregnancy in Rakai, Uganda. AIDS Research and Treatment. 2011 [cited 2019 Apr 18]. Available from: https://www.hindawi.com/journals/art/2011/519492/abs/
- 27. Maier M, Andia I, Emenyonu N, Guzman D, Kaida A, Pepper L, et al. Antiretroviral therapy is associated with increased fertility desire, but not pregnancy or live birth, among HIV+ women in an early HIV treatment program in rural Uganda. AIDS Behav. 2009;13(1):28–37.
- 30. Zaba B, Calvert C, Marston M, Isingo R, Nakiyingi-Miiro J, Lutalo T, et al. Effect of HIV infection on pregnancy-related mortality in sub-Saharan Africa: secondary analyses of pooled community-based data from the network for Analysing longitudinal population-based HIV/AIDS data on Africa (ALPHA). Lancet. 2013;381(9879):1763–71.
- 34. World Bank. World Development Indicators (WDI) | Data Catalog. 2012 [cited 2019 Apr 18]. Available from: https://datacatalog.worldbank.org/dataset/world-development-indicators
- 36. Schneider M, Zwahlen M, Egger M. Natural history and mortality in HIV-positive individuals living in resource-poor settings: A literature review. UNAIDS Oblig HQ03463871; 2005.
- 40. Institute for Health Metrics and Evaluation. Adult Mortality Estimates by Country 1970-2010 | GHDx. 2011 [cited 2019 Apr 18]. Available from: http://ghdx.healthdata.org/record/ihme-data/adult-mortality-estimates-country-1970-2010
- 42. McQuarrie AD, Tsai C-L. Regression and time series model selection. Vol. 43. World Scientific; 1998 [cited 2014 Jan 17]. Available from: http://www.worldscientific.com/doi/pdf/10.1142/9789812385451_0001
- 43. NBS/Tanzania NB of S-, Macro ICF. Tanzania Demographic and Health Survey 2010. 2011 [cited 2019 Apr 19]. Available from: https://dhsprogram.com/publications/publication-fr243-dhs-final-reports.cfm
- 44. NSO/Malawi NSO-, Macro ICF. Malawi Demographic and Health Survey 2010. 2011 [cited 2019 Apr 19]. Available from: https://dhsprogram.com/publications/publication-FR247-DHS-Final-Reports.cfm
- 45. UNAIDS. Report on the Global AIDS Epidemic. 2010.
- 46. WHO | World Health Statistics 2012. WHO. [cited 2013 May 20]. Available from: http://www.who.int/gho/publications/world_health_statistics/2012/en/
- 47. Dept of Nutrition, HIV and AIDS, Govt of Malawi. Malawi HIV and AIDS Monitoring and Evaluation Report 2005. Lilongwe: Govt of Malawi. Available from: http://data.unaids.org/pub/report/2006/2006_country_progress_report_malawi_en.pdf
- 48. Tanzania AIDS Commission. UNAIDS Country Progress Reporting. 2012. Available from: http://www.unaids.org/en/dataanalysis/datatools/aidsinfo/
- 49. Montgomery DC, Peck EA, Vining GG. Introduction to linear regression analysis, 5th edition. Hoboken: Wiley; 2012.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.