Abstract
In this chapter, we briefly touch on the historical discoveries of large abnormalities in the structure of the human genome. It is now clear that more subtle structural variants are in fact ubiquitous and key to understanding the spectrum of risk for many human diseases. While many of these changes are individually rare, their aggregate burden in the population is significant. With this in mind, we give an overview of the technologies developed to assay these variants in a high-throughput manner at ever-increasing granularity, including array-based platforms and next-generation sequencing. We then focus on whole-exome sequencing, since many disease studies to date have adopted this approach. Throughout, we review some of the computer software and algorithms available for extracting structural variant information from experimental data. We conclude with a comparison of the strengths and weaknesses of the various current technologies and provide a small sampling of emerging methods for investigating the range of structural variation in more detail.
© 2015 Springer Science+Business Media New York
About this chapter
Cite this chapter
Fromer, M., Purcell, S. (2015). Rare Structural Variants. In: Zeggini, E., Morris, A. (eds) Assessing Rare Variation in Complex Traits. Springer, New York, NY. https://doi.org/10.1007/978-1-4939-2824-8_4
DOI: https://doi.org/10.1007/978-1-4939-2824-8_4
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4939-2823-1
Online ISBN: 978-1-4939-2824-8
eBook Packages: Biomedical and Life Sciences (R0)