A Statistical Change-Point Analysis Approach for Modeling the Ratio of Next Generation Sequencing Reads

  • Conference paper
Advances in the Mathematical Sciences

Part of the book series: Association for Women in Mathematics Series (AWMS, volume 6)

Abstract

A central task in statistical change-point analysis is estimating the unknown change-point locations under a statistical model imposed on the sample data. The analysis can be carried out through hypothesis testing, model selection, or a Bayesian approach, among other methods. Change-point analysis has a wide range of applications in fields such as statistical quality control, finance and economics, climate study, medicine, and genetics. In this paper, we present a change-point analysis motivated by the modeling of genomic data. High-throughput next generation sequencing (NGS) technology is now frequently used to profile tumor and control samples in the study of DNA copy number variants (CNVs). In particular, the ratio of the read count of the tumor sample to that of the control sample is widely used for identifying CNV regions. Identifying CNV regions is equivalent to finding the change-points that potentially exist in the NGS reads ratio data. We present a change-point model and a Bayesian solution for estimating the change-point locations in NGS reads ratio data. Simulation studies indicate that the proposed method is effective in identifying change-point locations. We also present applications of the proposed change-point model to identifying the boundaries of CNV regions in NGS data from breast cancer/tumor cell lines and a lung cancer cell line.
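To make the idea of a Bayesian change-point estimate on reads ratio data concrete, here is a minimal Python sketch. It is not the model or algorithm of this paper, whose details are not given in the abstract. It assumes a deliberately simple setting: a sequence of log2 tumor/control read ratios that is Gaussian with a known variance, a single change in the mean, flat priors on the two segment means, and a uniform prior on the change-point location. The function names `changepoint_posterior` and `map_changepoint`, and all simulation settings, are hypothetical and chosen only for illustration.

```python
import math
import random


def changepoint_posterior(x, sigma2):
    """Posterior over one mean change-point location k, for k = 1..n-1.

    Illustrative assumptions (not taken from the paper):
    x[:k] ~ N(mu1, sigma2), x[k:] ~ N(mu2, sigma2), sigma2 known,
    flat priors on mu1 and mu2, uniform prior on k.
    Returns a list p where p[k-1] is the posterior probability of k.
    """
    n = len(x)
    if n < 2:
        raise ValueError("need at least two observations")

    # Prefix sums of x and x^2 so each segment statistic costs O(1).
    s, q = [0.0], [0.0]
    for v in x:
        s.append(s[-1] + v)
        q.append(q[-1] + v * v)

    def within_ss(a, b):
        """Sum of squared deviations from the mean for x[a:b]."""
        total = s[b] - s[a]
        return (q[b] - q[a]) - total * total / (b - a)

    # Integrating out each segment mean under a flat prior leaves, up to a
    # constant shared by all k: -0.5*log(k) - 0.5*log(n-k) - RSS/(2*sigma2).
    log_post = []
    for k in range(1, n):
        rss = within_ss(0, k) + within_ss(k, n)
        log_post.append(
            -0.5 * math.log(k) - 0.5 * math.log(n - k) - rss / (2.0 * sigma2)
        )

    # Normalize in a numerically stable way.
    top = max(log_post)
    weights = [math.exp(v - top) for v in log_post]
    z = sum(weights)
    return [w / z for w in weights]


def map_changepoint(post):
    """Most probable change-point location (number of points before it)."""
    return post.index(max(post)) + 1


if __name__ == "__main__":
    # Simulated log2 ratios: 60 bins at ratio 1, then 40 bins at ratio 1.5
    # (a single-copy gain in a diploid region), with Gaussian noise.
    random.seed(0)
    sd = 0.2
    data = [random.gauss(0.0, sd) for _ in range(60)]
    data += [random.gauss(math.log2(1.5), sd) for _ in range(40)]
    post = changepoint_posterior(data, sd * sd)
    print("MAP change-point:", map_changepoint(post))
```

Real NGS ratio data typically contain several change-points and an unknown variance, so a practical method would need to extend this sketch, for example by searching for multiple change-points or by placing a prior on the variance.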


Author information

Corresponding author

Correspondence to Jie Chen.

Copyright information

© 2016 Springer International Publishing Switzerland

About this paper

Cite this paper

Chen, J., Li, H. (2016). A Statistical Change-Point Analysis Approach for Modeling the Ratio of Next Generation Sequencing Reads. In: Letzter, G., et al. Advances in the Mathematical Sciences. Association for Women in Mathematics Series, vol 6. Springer, Cham. https://doi.org/10.1007/978-3-319-34139-2_13
