Journal of Statistical Theory and Practice, Volume 11, Issue 3, pp 418–435

On the Fleming–Harrington test for late effects in prevention randomized controlled trials

  • Valérie Garès
  • Sandrine Andrieu
  • Jean-François Dupuy
  • Nicolas Savy


Weighted logrank tests are the usual tool for detecting late effects in clinical trials. The weights determine the alternative hypotheses against which the tests are optimal, so choosing a specific weight is a crucial issue in practice. One common weight was introduced in 1982 by Harrington and Fleming, and the corresponding test is implemented in standard statistical software packages. However, using this test in randomized controlled clinical trials raises two major and still unsolved difficulties. First, the weight depends on a parameter q that has to be set before collecting the data. Second, the necessary sample size depends on this q. This article addresses both difficulties. We provide the explicit form of the alternative hypothesis under which the Fleming–Harrington test for late effects is optimal in terms of Pitman’s asymptotic relative efficiency. Using simulations, we investigate various aspects of the Fleming–Harrington test for late effects, such as its power properties and its sensitivity to the value of q. We also investigate the relation between q and the necessary sample size for the Fleming–Harrington test. Based on these results, we propose q = 3 as a general choice for testing late effects. We illustrate our methodology on a data set arising from a prevention trial in the field of dementia.
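
To make the role of q concrete, the following is a minimal sketch of a two-sample weighted logrank statistic. It assumes the usual two-parameter form of the Fleming–Harrington weight, w(t) = S(t−)^p (1 − S(t−))^q, where S is the pooled Kaplan–Meier estimate and p = 0 gives the late-effects version. The function name `fh_weighted_logrank` and the toy data are hypothetical illustrations and are not taken from the article or its supplementary material.

```python
import numpy as np
from math import erfc, sqrt


def fh_weighted_logrank(time, event, group, p=0.0, q=3.0):
    """Two-sample weighted logrank Z statistic (illustrative sketch).

    Assumed weight: w(t) = S(t-)^p * (1 - S(t-))^q, with S the pooled
    Kaplan-Meier estimate. p = q = 0 gives the ordinary logrank test.
    `group` must be coded 0/1; `event` is 1 for an event, 0 for censoring.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    group = np.asarray(group, dtype=int)

    surv_left = 1.0  # pooled Kaplan-Meier estimate just before the current time
    num, var = 0.0, 0.0
    for t in np.unique(time[event == 1]):
        at_risk = time >= t
        n = at_risk.sum()                      # at risk, both groups
        n1 = (at_risk & (group == 1)).sum()    # at risk, group 1
        died = (time == t) & (event == 1)
        d = died.sum()                         # events at t, both groups
        d1 = (died & (group == 1)).sum()       # events at t, group 1

        w = surv_left ** p * (1.0 - surv_left) ** q
        num += w * (d1 - d * n1 / n)           # observed minus expected
        if n > 1:                              # hypergeometric variance term
            var += w ** 2 * d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)

        surv_left *= 1.0 - d / n               # update S for the next event time

    z = num / sqrt(var) if var > 0 else 0.0
    p_value = erfc(abs(z) / sqrt(2.0))         # two-sided normal approximation
    return z, p_value


# Hypothetical toy data: 0 = control, 1 = treated.
toy_time = [2, 3, 5, 6, 8, 9, 11, 12]
toy_event = [1, 1, 1, 0, 1, 1, 1, 0]
toy_group = [0, 1, 0, 1, 0, 0, 1, 1]
z_late, p_late = fh_weighted_logrank(toy_time, toy_event, toy_group, p=0, q=3)
z_logrank, p_logrank = fh_weighted_logrank(toy_time, toy_event, toy_group, p=0, q=0)
```

With p = 0 and q > 0 the weight is zero at the first event time and grows as the pooled survival estimate falls, so late differences between the arms dominate the statistic. Larger q pushes the emphasis further toward the end of follow-up.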


Hypothesis test; survival data analysis; weighted logrank tests; asymptotic relative efficiency; sample size calculation; prevention trial

AMS Subject Classification

62N03 62P10 


Supplementary material

42519_2017_1295889_MOESM1_ESM.pdf (75 kb)
On Fleming-Harrington’s test for late effects in prevention randomized controlled trials


  1. Andrieu, S., N. Coley, S. Lovestone, P. S. Aisen, and B. Vellas. 2015. Prevention of sporadic Alzheimer’s disease: Lessons learned from clinical trials and future directions. Lancet Neurology 14 (9):926–44.
  2. Andrieu, S., S. Gillette, K. Amouyal, F. Nourhashemi, E. Reynish, P. J. Ousset, J. L. Albarede, B. Vellas, and H. Grandjean. 2003. Association of Alzheimer’s disease onset with ginkgo biloba and other symptomatic cognitive treatments in a population of women aged 75 years and older from the EPIDOS study. Journals of Gerontology Series A, Biological Sciences and Medical Sciences 58 (4):372–77.
  3. Andrieu, S., P. J. Ousset, N. Coley, M. Ouzid, H. Mathiex-Fortunet, and B. Vellas. 2008. GuidAge study: A 5-year double blind, randomised trial of EGb 761 for the prevention of Alzheimer’s disease in elderly subjects with memory complaints. I. Rationale, design and baseline data. Current Alzheimer Research 5 (4):406–15.
  4. Billingsley, P. 1999. Convergence of probability measures, 2nd ed. Wiley Series in Probability and Statistics: Probability and Statistics. New York, NY: John Wiley & Sons.
  5. Breslow, N. E., L. Edler, and J. Berger. 1984. A two-sample censored-data rank test for acceleration. Biometrics 40 (4):1049–62.
  6. Brookmeyer, R. 2007. Forecasting the global burden of Alzheimer’s disease. Alzheimer’s and Dementia 3 (3):186–91.
  7. Brookmeyer, R., S. Gray, and C. Kawas. 1998. Projections of Alzheimer’s disease in the United States and the public health impact of delaying disease onset. American Journal of Public Health 88 (9):1337–42.
  8. Buyske, S., R. Fagerstrom, and Z. Ying. 2000. A class of weighted log-rank tests for survival data when the event is rare. Journal of the American Statistical Association 95 (449):249–58.
  9. Cox, D. R. 1972. Regression models and life-tables. Journal of the Royal Statistical Society Series B 34:187–220.
  10. DeKosky, S. T. 2008. Ginkgo biloba for prevention of dementia: A randomized controlled trial. Journal of the American Medical Association 300 (19):2253–62.
  11. Eng, K. H., and M. R. Kosorok. 2005. A sample size formula for the supremum log-rank statistic. Biometrics 61 (1):86–91.
  12. Fleming, T. R., and D. P. Harrington. 1991. Counting processes and survival analysis. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. New York, NY: John Wiley & Sons.
  13. Fleming, T. R., D. P. Harrington, and M. O’Sullivan. 1987. Supremum versions of the log-rank and generalized Wilcoxon statistics. Journal of the American Statistical Association 82 (397):312–20.
  14. Garès, V. 2014. Améliorer la performance des analyses de survie dans le cadre des essais de prévention et application à la maladie d’Alzheimer. PhD thesis, Université de Toulouse, Toulouse, France.
  15. Garès, V., S. Andrieu, J.-F. Dupuy, and N. Savy. 2013. Comparison of constant piecewise weighted test and Fleming Harrington’s test: Application in clinical trials. Electronic Journal of Statistics 8 (1):841–860.
  16. Garès, V., S. Andrieu, J.-F. Dupuy, and N. Savy. 2015. An omnibus test for several hazard alternatives in prevention randomized controlled clinical trials. Statistics in Medicine 34 (4):541–57.
  17. Gastwirth, J. L. 1985. The use of maximin efficiency robust tests in combining contingency tables and survival analysis. Journal of the American Statistical Association 80 (390):381–84.
  18. Gehan, E. A. 1965. A generalized Wilcoxon test for comparing arbitrarily singly-censored samples. Biometrika 52:203–23.
  19. Gill, R. 1980. Censoring and stochastic integrals. Mathematical Centre Tracts 124. Amsterdam, The Netherlands: Mathematisch Centrum.
  20. Halperin, M., E. Rogot, J. Gurian, and F. Ederer. 1967. Sample sizes for medical trials with special reference to long-term therapy. Biometrics 21:13–24.
  21. Harrington, D. P., and T. R. Fleming. 1982. A class of rank test procedures for censored survival data. Biometrika 69 (3):553–66.
  22. Jung, S. H. 2008. Sample size calculation for the weighted rank statistics with paired survival data. Statistics in Medicine 27 (17):3350–65.
  23. Kosorok, M. R. 2008. Introduction to empirical processes and semiparametric inference. Springer Series in Statistics. New York, NY: Springer.
  24. Kosorok, M. R., and C. Y. Lin. 1999. The versatility of function-indexed weighted log-rank statistics. Journal of the American Statistical Association 94 (445):320–32.
  25. Lai, T. L., and Z. Ying. 1991. Rank regression methods for left-truncated and right-censored data. Annals of Statistics 19 (2):531–56.
  26. Lakatos, E., and K. G. Lan. 1992. A comparison of sample size methods for the logrank statistic. Statistics in Medicine 11:179–91.
  27. Lee, J. W. 1996. Some versatile tests based on the simultaneous use of weighted log-rank statistics. Biometrics 52 (2):721–25.
  28. Lyketsos, C. G. 2007. Naproxen and celecoxib do not prevent Alzheimer’s disease in early results from a randomized controlled trial. Neurology 68 (21):1800–1808.
  29. Machin, D., M. J. Campbell, T. S. Beng, and T. S. Huey. 2009. Sample size tables for clinical studies. New York, NY: John Wiley & Sons.
  30. Mantel, N., and W. Haenszel. 1959. Statistical aspects of the analysis of data from retrospective studies of disease. Journal of the National Cancer Institute 22:719–48.
  31. Martinussen, T., and T. H. Scheike. 2006. Dynamic regression models for survival data. Statistics for Biology and Health. New York, NY: Springer.
  32. Pecková, M., and T. R. Fleming. 2003. Adaptive test for testing the difference in survival distributions. Lifetime Data Analysis 9 (3):223–38.
  33. Peto, R., and J. Peto. 1972. Asymptotically efficient rank invariant test procedures. Journal of the Royal Statistical Society Series A 135:185–206.
  34. Prentice, R. L. 1978. Linear rank tests with right censored data. Biometrika 65 (1):167–79.
  35. Scherrer, B., S. Andrieu, P. J. Ousset, G. Berrut, J. F. Dartigues, B. Dubois, F. Pasquier, F. Piette, P. Robert, J. Touchon, P. Garnier, H. Mathiex-Fortunet, B. Vellas, and the GuidAge Study Group. 2015. Analysing time to event data in dementia prevention trials: The example of the GuidAge study of EGb761. Journal of Nutrition Health and Aging 19 (10):1009–11.
  36. Schork, M. A., and R. D. Remington. 1967. The determination of sample size in treatment-control comparisons for chronic disease studies in which noncompliance or nonadherence is a problem. Journal of Chronic Diseases 20:233–39.
  37. Self, S. G. 1991. An adaptive weighted log-rank test with application to cancer prevention and screening trials. Biometrics 47 (3):975–86.
  38. Shumaker, S. A. 2003. Estrogen plus progestin and the incidence of dementia and mild cognitive impairment in postmenopausal women: The Women’s Health Initiative Memory Study: A randomized controlled trial. Journal of the American Medical Association 289 (20):2651–62.
  39. Shumaker, S. A. 2004. Conjugated equine estrogens and incidence of probable dementia and mild cognitive impairment in postmenopausal women: Women’s Health Initiative Memory Study. Journal of the American Medical Association 291 (24):2947–58.
  40. Tarone, R. E., and J. Ware. 1977. On distribution-free tests for equality of survival distributions. Biometrika 64 (1):156–60.
  41. Van der Vaart, A. W. 1998. Asymptotic statistics, Vol. 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge, UK: Cambridge University Press.
  42. Vellas, B., N. Coley, P. J. Ousset, G. Berrut, J. F. Dartigues, B. Dubois, H. Grandjean, F. Pasquier, F. Piette, P. Robert, J. Touchon, P. Garnier, H. Mathiex-Fortunet, and S. Andrieu for the GuidAge Study Group. 2012. Long-term use of standardised ginkgo biloba extract for the prevention of Alzheimer’s disease (GuidAge): A randomised placebo-controlled trial. Lancet Neurology 11:851–59.
  43. Wallenstein, S., and A. Berger. 1997. Weighted logrank tests to detect a transient improvement in survivorship. Biometrics 53 (2):736–44.
  44. Wimo, A., and M. Prince. 2010. World Alzheimer report 2010: The global economic impact of dementia. London, UK: Alzheimer’s Disease International.
  45. Wu, L., and P. B. Gilbert. 2002. Flexible weighted log-rank tests optimal for detecting early and/or late survival differences. Biometrics 58 (4):997–1004.
  46. Yang, S., and R. Prentice. 2010. Improved logrank-type tests for survival data using adaptive weights. Biometrics 66 (1):30–38.
  47. Zucker, D. M. 1992. The efficiency of a weighted log-rank test under a percent error misspecification model for the log hazard ratio. Biometrics 48 (3):893–899.
  48. Zucker, D. M., and E. Lakatos. 1990. Weighted log rank type statistics for comparing survival curves when there is a time lag in the effectiveness of treatment. Biometrika 77 (4):853–64.

Copyright information

© Grace Scientific Publishing, 20 Middlefield Ct, Greensboro, NC 27455 2017

Authors and Affiliations

  • Valérie Garès (1, corresponding author)
  • Sandrine Andrieu (1, 2)
  • Jean-François Dupuy (3)
  • Nicolas Savy (4)

  1. UMR 1027, Inserm, Paul Sabatier University, Toulouse, France
  2. Department of Public Health, CHU of Toulouse, Toulouse, France
  3. IRMAR UMR 6625, CNRS and INSA of Rennes, Rennes, France
  4. Institute of Mathematics of Toulouse, Paul Sabatier University, Toulouse, France
