Quality of Life Research, Volume 28, Issue 5, pp 1315–1325

Modeling strategies to improve parameter estimates in prognostic factors analyses with patient-reported outcomes in oncology

  • Francesco Cottone
  • Nina Deliu
  • Gary S. Collins
  • Amelie Anota
  • Franck Bonnetain
  • Kristel Van Steen
  • David Cella
  • Fabio Efficace



The inclusion of patient-reported outcome (PRO) questionnaires in prognostic factor analyses in oncology has substantially increased in recent years. We performed a simulation study to compare the performance of four modeling strategies in estimating the prognostic impact of multiple collinear scales from PRO questionnaires.


We generated multiple scenarios describing survival data with different sample sizes, event rates and degrees of multicollinearity among five PRO scales. We used the Cox proportional hazards (PH) model to estimate the hazard ratios (HRs) with automatic selection procedures based on either the likelihood-ratio test (Cox-PV) or the Akaike Information Criterion (Cox-AIC). We also fitted Cox PH models that included all variables, either penalized using ridge regression (Cox-R) or estimated without penalization (Cox-Full). For each scenario, we simulated 1000 independent datasets and compared the average outcomes of all methods.
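As an illustration of the kind of data generation described above, the following is a minimal sketch, not the authors' procedure. It assumes standard-normal covariates made equicorrelated through a shared factor, an exponential baseline hazard with inverse-transform sampling of event times, and independent exponential censoring. The function name `simulate_dataset` and all parameter values (rates, coefficients, seed) are hypothetical choices made for the example.

```python
import math
import random


def simulate_dataset(n, rho, betas, baseline_rate=0.1, censor_rate=0.05, rng=None):
    """Simulate one survival dataset with equicorrelated covariates.

    Returns a list of (time, event, covariates) tuples, where event is 1
    for an observed event and 0 for a censored observation.
    All defaults are illustrative assumptions, not values from the study.
    """
    rng = rng or random.Random()
    rows = []
    for _ in range(n):
        # Shared-factor construction: every pair of covariates has
        # correlation rho, and each covariate has unit variance.
        shared = rng.gauss(0.0, 1.0)
        x = [
            math.sqrt(rho) * shared + math.sqrt(1.0 - rho) * rng.gauss(0.0, 1.0)
            for _ in betas
        ]
        # Linear predictor of the Cox model (log hazard ratio scale).
        linpred = sum(b * xi for b, xi in zip(betas, x))
        # Inverse-transform sampling under an exponential baseline hazard.
        u = 1.0 - rng.random()  # lies in (0, 1], so log(u) is defined
        event_time = -math.log(u) / (baseline_rate * math.exp(linpred))
        # Independent exponential censoring time.
        censor_time = rng.expovariate(censor_rate)
        time = min(event_time, censor_time)
        event = 1 if event_time <= censor_time else 0
        rows.append((time, event, x))
    return rows


# Hypothetical scenario: n = 100, rho = 0.8, two prognostic scales
# (HR = 1.5 per unit) and three scales with no effect.
rng = random.Random(2019)
betas = [math.log(1.5), math.log(1.5), 0.0, 0.0, 0.0]
data = simulate_dataset(n=100, rho=0.8, betas=betas, rng=rng)
event_rate = sum(event for _, event, _ in data) / len(data)
print(f"observed event rate: {event_rate:.2f}")
```

Repeating the call 1000 times per scenario, with a fresh dataset each time, would mirror the replication scheme described in the paragraph above. The censoring rate is the lever that controls the event rate in this sketch.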


Cox-R performed similarly to or better than the other methods, particularly in scenarios with medium–high multicollinearity (ρ = 0.4 to ρ = 0.8) and small sample sizes (n = 100). Overall, Cox-PV and Cox-AIC performed worse; for example, in some scenarios they failed to select one or more prognostic collinear PRO scales. Compared with Cox-Full, Cox-R provided HR estimates with similar bias patterns but smaller root-mean-squared errors, particularly in higher multicollinearity scenarios.
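The bias and root-mean-squared error (RMSE) used to compare methods can be computed from the replicate estimates as in the sketch below. The helper name `bias_and_rmse` is hypothetical, and the input numbers are made up purely to show the calculation; they are not results from the study.

```python
import math


def bias_and_rmse(estimates, true_value):
    """Return (bias, RMSE) of replicate estimates around a known true value.

    bias = mean(estimates) - true_value
    RMSE = sqrt(mean((estimate - true_value) ** 2))
    """
    n = len(estimates)
    bias = sum(estimates) / n - true_value
    rmse = math.sqrt(sum((e - true_value) ** 2 for e in estimates) / n)
    return bias, rmse


# Made-up replicate HR estimates for one scale whose true HR is 1.5.
true_hr = 1.5
unpenalized = [1.1, 1.9, 1.5, 2.0, 1.0]   # unbiased on average, widely spread
penalized = [1.35, 1.55, 1.45, 1.6, 1.3]  # shrunk slightly, tightly spread

for label, estimates in [("unpenalized", unpenalized), ("penalized", penalized)]:
    bias, rmse = bias_and_rmse(estimates, true_hr)
    print(f"{label}: bias = {bias:+.3f}, RMSE = {rmse:.3f}")
```

RMSE combines bias and variability, which is why a penalized estimator can carry a small bias and still achieve a lower RMSE than an unpenalized one.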


Our findings suggest that the Cox-R is the best approach when performing prognostic factor analyses with multiple and collinear PRO scales, particularly in situations of high multicollinearity, small sample sizes and low event rates.


Health-related quality of life; Multicollinearity; Patient-reported outcomes; Prognostic factor analysis; Ridge regression


Author contributions

FC, FE: conception and design. FC, ND, FE: statistical analyses. All authors: interpretation of results and manuscript writing.

Compliance with ethical standards

Conflict of interest

No potential conflict of interest for this paper was reported by the authors.

Supplementary material

11136_2018_2097_MOESM1_ESM.pdf (1.5 MB)
Supplementary material 1 (PDF 1541 KB)



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Data Center and Health Outcomes Research Unit, Italian Group for Adult Hematologic Diseases (GIMEMA), Rome, Italy
  2. Centre for Statistics in Medicine, NDORMS, University of Oxford, Oxford, UK
  3. Methodology and Quality of Life in Oncology Unit (INSERM UMR 1098), University Hospital of Besançon, Besançon, France
  4. French National Platform Quality of Life and Cancer, Besançon, France
  5. GIGA-R Medical Genomics Unit, University of Liège, Liège, Belgium
  6. Department of Human Genetics – Systems Medicine, University of Leuven, Leuven, Belgium
  7. Department of Medical Social Sciences, Feinberg School of Medicine, Northwestern University, Chicago, USA
