Cancer Bioinformatics pp 65-83
Computational Analysis of Structural Variation in Cancer Genomes
Abstract
Cancer onset and progression are often triggered by the accumulation of structural abnormalities in the genome. Somatically acquired large structural variants (SVs) are one class of abnormality that can lead to cancer onset, for example by deactivating tumor suppressor genes or upregulating oncogenes. Detecting and classifying these variants can lead to improved therapies and diagnostics for cancer patients.
This chapter gives an overview of the problem of computationally detecting genomic SVs in data from next-generation sequencing (NGS) platforms, along with a brief survey of typical approaches to the problem. It also describes the general protocol to follow when analyzing NGS data from a cancer genome for SVs.
Key words
Cancer, Structural variation, Sequencing, Next-generation sequencing