Accurately Estimating Tumor Purity of Samples with High Degree of Heterogeneity from Cancer Sequencing Data

  • Conference paper
Intelligent Computing Theories and Application (ICIC 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10362)

Abstract

Tumor purity is the proportion of tumor cells in a sampled admixture. Estimating tumor purity is a key step both in understanding the tumor micro-environment and in reducing false positives and false negatives in genomic analysis. However, existing approaches often lose accuracy on samples with a high degree of heterogeneity, because the patterns of clonal architecture in the sequencing data interfere with the signals that purity estimation algorithms expect. In this article, we propose a computational method, EMPurity, which accurately infers the tumor purity of samples with a high degree of heterogeneity. EMPurity captures the patterns of both tumor purity and clonal structure in a probabilistic model. The model parameters are calculated directly from aligned reads, which prevents errors from propagating from variant calling results. We test EMPurity on a series of datasets against three popular approaches, and EMPurity outperforms them across different simulation configurations.
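
To illustrate why clonal architecture can interfere with purity estimation, the following is a minimal sketch, not the EMPurity model. It assumes heterozygous somatic point mutations in copy-neutral diploid regions, where the expected variant allele frequency (VAF) is purity × cellular fraction / 2. The function names and the naive estimator (twice the mean VAF, treating every mutation as clonal) are hypothetical and chosen only for illustration.

```python
def expected_vaf(purity, cellular_fraction=1.0):
    """Expected VAF of a heterozygous somatic mutation in a diploid,
    copy-neutral region (illustrative assumption, not EMPurity's model).

    cellular_fraction is the share of tumor cells carrying the mutation:
    1.0 for a clonal mutation, less than 1.0 for a subclonal one.
    """
    return purity * cellular_fraction / 2.0


def naive_purity(vafs):
    """Naive estimate that treats every mutation as clonal: 2 * mean VAF."""
    return 2.0 * sum(vafs) / len(vafs)


true_purity = 0.8

# Homogeneous sample: all ten mutations are clonal.
clonal_only = [expected_vaf(true_purity) for _ in range(10)]

# Heterogeneous sample: five clonal mutations plus five mutations
# from a subclone present in half of the tumor cells.
mixed = [expected_vaf(true_purity) for _ in range(5)] + [
    expected_vaf(true_purity, cellular_fraction=0.5) for _ in range(5)
]

estimate_clonal = naive_purity(clonal_only)
estimate_mixed = naive_purity(mixed)

print("naive estimate, clonal only:", round(estimate_clonal, 3))
print("naive estimate, with subclone:", round(estimate_mixed, 3))
```

Under these assumptions the naive estimate recovers the true purity for the homogeneous sample but falls noticeably below it once subclonal mutations are mixed in, which is the kind of bias that a joint model of purity and clonal structure is meant to avoid.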

Acknowledgement

This work is supported by the National Science Foundation of China (Grant No: 81400632), Shaanxi Science Plan Project (Grant No: 2014JM8350) and the Fundamental Research Funds for the Central Universities (XJTU).

Author information

Corresponding author

Correspondence to Jiayin Wang.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Geng, Y. et al. (2017). Accurately Estimating Tumor Purity of Samples with High Degree of Heterogeneity from Cancer Sequencing Data. In: Huang, DS., Jo, KH., Figueroa-García, J. (eds) Intelligent Computing Theories and Application. ICIC 2017. Lecture Notes in Computer Science, vol 10362. Springer, Cham. https://doi.org/10.1007/978-3-319-63312-1_25

  • DOI: https://doi.org/10.1007/978-3-319-63312-1_25

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63311-4

  • Online ISBN: 978-3-319-63312-1

  • eBook Packages: Computer Science (R0)
