Efficient Study Designs and Semiparametric Inference Methods for Developing Genomic Biomarkers in Cancer Clinical Research

Noma, Hisashi

doi:10.1007/978-981-10-0126-0_23

Abstract

In the development of genomic biomarkers and molecular diagnostics, clinical studies that use high-throughput assays such as DNA microarrays are generally costly and labor-intensive. Several efficient study designs that reduce the cost of such expensive measurements have been developed, mainly in epidemiology. Under these designs, the expensive measurements are collected only on subsamples selected by an appropriate response-selective sampling scheme, which reduces the total measurement cost. In this chapter, we discuss the application of these efficient designs to genomic analyses in cancer clinical studies and present relevant statistical methods, such as gene selection (e.g., multiple testing based on the false discovery rate). Efficient semiparametric inference methods that use auxiliary clinical information are also discussed.
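As a rough illustration of the gene selection step mentioned above, the following minimal sketch implements the standard Benjamini-Hochberg step-up procedure for controlling the false discovery rate. It is not taken from the chapter: the function name `benjamini_hochberg`, the level `q`, and the p-values are hypothetical, and the sketch assumes that valid per-gene p-values have already been obtained (under a response-selective design, this would require a test that accounts for the sampling scheme).

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return sorted indices of hypotheses rejected by the
    Benjamini-Hochberg step-up procedure at FDR level q."""
    m = len(p_values)
    if m == 0:
        return []
    # Indices of the p-values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k with p_(k) <= q * k / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= q * rank / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])


# Hypothetical per-gene p-values (illustrative only, not real data).
p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
selected = benjamini_hochberg(p, q=0.05)
print(selected)
```

Because the procedure is step-up, a gene whose p-value exceeds its own threshold can still be selected when a larger p-value further down the ranking meets its threshold. For example, `benjamini_hochberg([0.03, 0.04, 0.045, 0.049])` rejects all four hypotheses.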

Acknowledgements

This research was supported by the GSK Japan Research Grant, JST-CREST (JPMJCR1412), and a Grant-in-Aid for Scientific Research (15K15954) from the Ministry of Education, Culture, Sports, Science and Technology of Japan.

Author information

Authors and Affiliations

Department of Data Science, The Institute of Statistical Mathematics, 10-3 Midori-cho, Tachikawa, Tokyo, 190-8562, Japan
Hisashi Noma

Corresponding author

Correspondence to Hisashi Noma.

Editor information

Editors and Affiliations

Graduate School of Medicine, Nagoya University Graduate School of Medicine, Nagoya, Aichi, Japan
Shigeyuki Matsui
Cancer Research and Biostatistics, Seattle, Washington, USA
John Crowley

About this chapter

Cite this chapter

Noma, H. (2017). Efficient Study Designs and Semiparametric Inference Methods for Developing Genomic Biomarkers in Cancer Clinical Research. In: Matsui, S., Crowley, J. (eds) Frontiers of Biostatistical Methods and Applications in Clinical Oncology. Springer, Singapore. https://doi.org/10.1007/978-981-10-0126-0_23

DOI: https://doi.org/10.1007/978-981-10-0126-0_23
Published: 04 October 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-0124-6
Online ISBN: 978-981-10-0126-0
eBook Packages: Mathematics and Statistics, Mathematics and Statistics (R0)