Computational Analysis of Structural Variation in Cancer Genomes

  • Matthew Hayes
Protocol
Part of the Methods in Molecular Biology book series (MIMB, volume 1878)

Abstract

Cancer onset and progression are often triggered by the accumulation of structural abnormalities in the genome. Somatically acquired large structural variants (SVs) are one class of abnormality that can lead to cancer onset by, for example, deactivating tumor suppressor genes or upregulating oncogenes. Detecting and classifying these variants can lead to improved therapies and diagnostics for cancer patients.

This chapter provides an overview of computational SV detection in data from next-generation sequencing (NGS) platforms and briefly surveys typical approaches to the problem. It also discusses the general protocol to follow when analyzing NGS data from a cancer genome for SVs.

Key words

Cancer, Structural variation, Sequencing, Next-generation sequencing

Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2019

Authors and Affiliations

  1. Computer Science, Xavier University of Louisiana, New Orleans, USA
