Tree-based modeling of time-varying coefficients in discrete time-to-event models

  • Marie-Therese Puth
  • Gerhard Tutz
  • Nils Heim
  • Eva Münster
  • Matthias Schmid
  • Moritz Berger


Hazard models are popular tools for modeling discrete time-to-event data. Two approaches to specifying the effects of explanatory variables are in common use. The more traditional one assumes a linear predictor in which the effects of the explanatory variables are constant over time. The more flexible one uses the class of semiparametric models, which allow the effects of the explanatory variables to vary smoothly over time. The approach considered here lies between these two modeling strategies: it assumes that the effects of the explanatory variables are piecewise constant. In particular, it makes it possible to identify the time points at which the strength of an effect changes, and it can approximate quite complex patterns of effect change in a simple way. A tree-based method, embedded in the framework of varying-coefficient models, is proposed for modeling the piecewise constant time-varying coefficients. An important feature of the approach is that it automatically selects the relevant explanatory variables, so no separate variable selection procedure is needed. The properties of the method are investigated in several simulation studies, and its usefulness is demonstrated in two real-world applications.
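To make the idea of a piecewise constant effect concrete, the following minimal sketch evaluates a discrete hazard whose coefficient changes value at a single split point. It is an illustration under stated assumptions, not the method proposed in the paper: the logistic link, the function names, the baseline values, and the split point are all hypothetical, and the tree-based search for split points is not implemented here.

```python
import math


def piecewise_coefficient(t, split_points, values):
    """Return the coefficient in effect at discrete time t.

    split_points is a sorted list of time points; values holds one more
    entry than split_points. values[0] applies for t <= split_points[0],
    values[1] for the next interval, and values[-1] after the last split.
    """
    for cut, value in zip(split_points, values):
        if t <= cut:
            return value
    return values[-1]


def hazard(t, x, baseline, split_points, values):
    """Discrete hazard P(T = t | T >= t, x), assuming a logistic link."""
    eta = baseline[t - 1] + x * piecewise_coefficient(t, split_points, values)
    return 1.0 / (1.0 + math.exp(-eta))


def survival(t, x, baseline, split_points, values):
    """P(T > t | x), the product of (1 - hazard) over time points 1..t."""
    prob = 1.0
    for s in range(1, t + 1):
        prob *= 1.0 - hazard(s, x, baseline, split_points, values)
    return prob


# Hypothetical setting: six time points, a flat baseline, and an effect
# of 1.0 up to t = 3 that drops to 0.0 afterwards.
baseline = [-2.0] * 6
split_points = [3]
values = [1.0, 0.0]

hazards = [hazard(t, 1.0, baseline, split_points, values) for t in range(1, 7)]
survivals = [survival(t, 1.0, baseline, split_points, values) for t in range(0, 7)]
```

In this sketch the hazard for a subject with x = 1 is constant within each interval and jumps at the split point. This is the kind of change in effect strength that the piecewise constant specification is meant to expose.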


Discrete time-to-event data; Time-varying coefficients; Recursive partitioning; Semiparametric regression; Survival analysis



This paper uses data from the German Family Panel pairfam, coordinated by Josef Brüderl, Karsten Hank, Johannes Huinink, Bernhard Nauck, Franz Neyer, and Sabine Walper. Pairfam is funded as a long-term project by the German Research Foundation (DFG).


The work was supported by the German Research Foundation (DFG), Grant SCHM 2966/2-1.


Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Department of Medical Biometry, Informatics and Epidemiology, Faculty of Medicine, University of Bonn, Bonn, Germany
  2. Institute of General Practice and Family Medicine, Faculty of Medicine, University of Bonn, Bonn, Germany
  3. Department of Statistics, Ludwig-Maximilians-University Munich, Munich, Germany
  4. Department of Oral and Cranio-Maxillo and Facial Plastic Surgery, University Hospital Bonn, Bonn, Germany