pp 1–35

Heavy-tailed longitudinal regression models for censored data: a robust parametric approach

  • Larissa A. Matos
  • Víctor H. Lachos
  • Tsung-I Lin
  • Luis M. Castro
Original Paper


Longitudinal HIV-1 RNA viral load measures are often subject to censoring due to upper and lower detection limits that depend on the quantification assay. A complication arises when these continuous measures exhibit heavy-tailed behavior, because inference can be seriously affected by misspecification of their parametric distribution. For such data structures, we propose a robust nonlinear censored regression model based on scale mixtures of normal distributions. To account for the autocorrelation among irregularly observed measures, a damped exponential correlation structure is considered. A stochastic approximation of the EM algorithm is developed to obtain the maximum likelihood estimates of the model parameters. The main advantage of this new procedure is that it allows the parameters of interest to be estimated and the log-likelihood function to be evaluated easily and quickly. Furthermore, the standard errors of the fixed effects and predictions of unobservable values of the response can be obtained as a byproduct. The practical utility of the proposed method is exemplified using both simulated and real data.
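
As a rough illustration of the damped exponential correlation structure mentioned above, the sketch below assumes the usual two-parameter form, in which the correlation between two measures taken at times t_j and t_k is phi1 raised to the power |t_j - t_k|^phi2. The function name, argument names and example times are hypothetical and are not taken from the paper; the model in the paper may parameterize the structure differently.

```python
import numpy as np


def dec_correlation(times, phi1, phi2):
    """Damped exponential correlation matrix for irregular time points.

    Assumed form (illustrative, not the paper's code):
        corr(y_j, y_k) = phi1 ** (|t_j - t_k| ** phi2)
    with 0 < phi1 < 1 and phi2 >= 0.
    """
    t = np.asarray(times, dtype=float)
    # Pairwise absolute time lags via broadcasting
    lags = np.abs(t[:, None] - t[None, :])
    corr = phi1 ** (lags ** phi2)
    # Force a unit diagonal (needed when phi2 == 0, where lags ** 0 == 1)
    np.fill_diagonal(corr, 1.0)
    return corr


# Hypothetical, irregularly spaced visit times
R = dec_correlation([0.0, 1.0, 3.0], phi1=0.5, phi2=1.0)
```

Under this assumed form, phi2 = 1 gives a continuous-time AR(1) pattern and phi2 = 0 gives compound symmetry, so a single family covers several common correlation structures for irregularly spaced measures.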


HIV viral load · Longitudinal data · Nonlinear models · SAEM algorithm · Outliers

Mathematics Subject Classification

62F10 62J05 



We are grateful to two anonymous referees and the associate editor for very useful comments and suggestions, which greatly improved this paper. We also acknowledge the support from FAPESP-Brazil (Grants 2011/22063-9, 2015/05385-3, 2014/02938-9 and 2018/05013-7), CNPq-Brazil (Grant 305054/2011-2), Grant FONDECYT 1170258 from the Chilean government and the Ministry of Science and Technology of Taiwan (Grant MOST105-2118-M-005-003-MY2).



Copyright information

© Sociedad de Estadística e Investigación Operativa 2018

Authors and Affiliations

  • Larissa A. Matos (1)
  • Víctor H. Lachos (2), corresponding author
  • Tsung-I Lin (3, 4)
  • Luis M. Castro (5)
  1. Department of Statistics, Universidade Estadual de Campinas, Campinas, Brazil
  2. Department of Statistics, University of Connecticut, Storrs, USA
  3. Institute of Statistics, National Chung Hsing University, Taichung, Taiwan
  4. Department of Public Health, China Medical University, Taichung, Taiwan
  5. Department of Statistics, Pontificia Universidad Católica de Chile, Santiago, Chile
