Semiparametric generalized exponential frailty model for clustered survival data

  • Wagner Barreto-Souza
  • Vinícius Diniz Mayrink


In this paper, we propose a novel and mathematically tractable frailty model for clustered survival data by assuming a generalized exponential (GE) distribution for the latent frailty effect. Both parametric and semiparametric versions of the GE frailty model are studied, with the main focus on the semiparametric case, for which an EM-algorithm is proposed. Our EM-based estimation for the GE frailty model is simpler, faster and immune to the flat likelihood issue that affects, for example, the semiparametric gamma model, as illustrated in this paper through simulated and real data. We also show that the GE model is at least competitive with the gamma frailty model under misspecification. A broad analysis is developed, with simulation results explored via Monte Carlo replications, to evaluate and compare the models. A real application to clustered kidney catheter data demonstrates the practical potential of the GE frailty model.
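As a rough illustration of the frailty distribution named above, the following sketch draws GE variates by inverse-transform sampling. It assumes the standard Gupta and Kundu form of the GE distribution, with CDF F(z) = (1 - exp(-lam*z))^alpha for z > 0; the parametrization and any identifiability constraint actually used in the paper may differ, and the function names here (`ge_cdf`, `ge_quantile`, `sample_ge`) are hypothetical.

```python
import math
import random


def ge_cdf(z, alpha, lam):
    """CDF of the generalized exponential distribution (assumed form)."""
    if z <= 0:
        return 0.0
    return (1.0 - math.exp(-lam * z)) ** alpha


def ge_quantile(u, alpha, lam):
    """Inverse CDF: solve (1 - exp(-lam * z)) ** alpha = u for z, 0 <= u < 1."""
    return -math.log(1.0 - u ** (1.0 / alpha)) / lam


def sample_ge(n, alpha, lam, seed=0):
    """Draw n GE variates by pushing uniform draws through the quantile function."""
    rng = random.Random(seed)
    return [ge_quantile(rng.random(), alpha, lam) for _ in range(n)]


# Illustrative values only: for integer alpha the GE mean is
# (1 + 1/2 + ... + 1/alpha) / lam, so alpha = 2 and lam = 1.5 give mean 1.
frailties = sample_ge(20000, alpha=2, lam=1.5)
sample_mean = sum(frailties) / len(frailties)
```

With alpha = 1 the distribution reduces to the ordinary exponential. The values alpha = 2 and lam = 1.5 are chosen only because they give a unit-mean frailty in this parametrization, a common normalization in frailty models.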


Censored data; EM-algorithm; Flat likelihood; Gamma frailty model; Partial likelihood; Proportional hazards



The authors would like to thank the Associate Editor and two anonymous referees for their constructive comments and suggestions. The authors also acknowledge the financial support from Fundação de Amparo à Pesquisa de Minas Gerais (FAPEMIG/Brazil) and Conselho Nacional de Desenvolvimento Científico e Tecnológico (CNPq/Brazil).



Copyright information

© The Institute of Statistical Mathematics, Tokyo 2018

Authors and Affiliations

  • Wagner Barreto-Souza (1)
  • Vinícius Diniz Mayrink (1)
  1. Departamento de Estatística, Universidade Federal de Minas Gerais, Belo Horizonte, Brazil
