Characterization and evolutionary dynamics of complex regions in eukaryotic genomes

  • Review
  • Science China Life Sciences

Abstract

Complex regions in eukaryotic genomes are typically characterized by duplications of chromosomal stretches that often include one or more genes repeated in a tandem array or in relatively close proximity. However, the repetitive nature of these regions, together with the often high sequence identity among repeats, has made complex regions particularly recalcitrant to proper molecular characterization, and they are often misassembled or completely absent in genome assemblies. This limitation has prevented accurate functional and evolutionary analyses of these regions. The problem is becoming increasingly relevant as evidence continues to support a central role for complex genomic regions in explaining human disease, developmental innovations, and ecological adaptations across phyla. With the advent of long-read sequencing technologies and suitable assemblers, the development of algorithms that can accommodate sample heterozygosity, and the adoption of a pangenomic-like view of these regions, accurate reconstructions of complex regions are now within reach. These reconstructions will finally allow for accurate functional and evolutionary studies of complex genomic regions, underpinning the generation of genotype-phenotype maps of unprecedented resolution.

References

  • Abel, H.J., and Duncavage, E.J. (2013). Detection of structural DNA variation from next generation sequencing data: a review of informatic approaches. Cancer Genets 206, 432–440.

    Article  CAS  Google Scholar 

  • Absalan, F., and Ronaghi, M. (2007). Molecular inversion probe assay. Methods Mol Biol 396, 315–330.

    Article  CAS  PubMed  Google Scholar 

  • Abu Bakar, S., Hollox, E.J., and Armour, J.A.L. (2009). Allelic recombination between distinct genomic locations generates copy number diversity in human β–defensins. Proc Natl Acad Sci USA 106, 853–858.

    Article  PubMed  Google Scholar 

  • Abyzov, A., Urban, A.E., Snyder, M., and Gerstein, M. (2011). CNVnator: An approach to discover, genotype, and characterize typical and atypical CNVs from family and population genome sequencing. Genome Res 21, 974–984.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Adams, M.D., Celniker, S.E., Holt, R.A., Evans, C.A, Gocayne, J.D., Amanatides, P.G., Scherer, S.E., Li, P.W., Hoskins, R.A., Galle, R.F., et al. (2000). The genome sequence of Drosophila melanogaster. Science 287, 2185–2195.

    Article  PubMed  Google Scholar 

  • Alberts, B. (2008). Molecular Biology of the Cell, 5th edn (New York: Garland Science).

    Google Scholar 

  • Alkan, C., Coe, B.P., and Eichler, E.E. (2011a). Genome structural variation discovery and genotyping. Nat Rev Genet 12, 363–376.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Alkan, C., Sajjadian, S., and Eichler, E.E. (2011b). Limitations of nextgeneration genome sequence assembly. Nat Methods 8, 61–65.

    Article  CAS  PubMed  Google Scholar 

  • Ananiev, E.V., Chamberlin, M.A., Klaiber, J., and Svitashev, S. (2005). Microsatellite megatracts in the maize (Zea mays L.) genome. Genome 48, 1061–1069.

    Article  CAS  PubMed  Google Scholar 

  • Andersson, D.I., Jerlström–Hultqvist, J., and Näsvall, J. (2015). Evolution of new functions de novo and from preexisting genes. Cold Spring Harb Perspect Biol 7, a017996.

    Book  Google Scholar 

  • Anhuf, D., Eggermann, T., Rudnik–Schöneborn, S., and Zerres, K. (2003). Determination of SMN1 and SMN2 copy number using TaqMan™ technology. Hum Mutat 22, 74–78.

    Article  CAS  PubMed  Google Scholar 

  • Arabidopsis Genome Initiative. (2000). Analysis of the genome sequence of the flowering plant Arabidopsis thaliana, Nature 408, 796–815.

    Article  Google Scholar 

  • Arguello, J.R., Chen, Y., Yang, S., Wang, W., and Long, M. (2006). Origination of an X–linked testes chimeric gene by illegitimate recombination in Drosophila. PLoS Genet 2, e77.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Arguello, J.R., and Connallon, T. (2011). Gene duplication and ectopic gene conversion in Drosophila. Genes 2, 131–151.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Assogba, B.S., Milesi, P., Djogbénou, L.S., Berthomieu, A., Makoundou, P., Baba–Moussa, L.S., Fiston–Lavier, A.S., Belkhir, K., Labbé, P., and Weill, M. (2016). The ace–1 locus is amplified in all resistant anopheles gambiae mosquitoes: fitness consequences of homogeneous and heterogeneous duplications. PLoS Biol 14, e2000618.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Baltimore, D. (1981). Gene conversion: some implications for immunoglobulin genes. Cell 24, 592–594.

    Article  CAS  PubMed  Google Scholar 

  • Bankevich, A., Nurk, S., Antipov, D., Gurevich, A.A., Dvorkin, M., Kulikov, A.S., Lesin, V.M., Nikolenko, S.I., Pham, S., Prjibelski, A.D., et al. (2012). SPAdes: a new genome assembly algorithm and its applications to single–cell sequencing. J Comput Biol 19, 455–477.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bass, C., and Field, L.M. (2011). Gene amplification and insecticide resistance. Pest Manag Sci 67, 886–890.

    Article  CAS  PubMed  Google Scholar 

  • Bellos, E., Johnson, M.R., and Coin, L.J.M. (2012). cnvHiTSeq: integrative models for high–resolution copy number variation detection and genotyping using population sequencing data. Genome Biol 13, R120.

    Google Scholar 

  • Bennett–Baker, P.E., and Mueller, J.L. (2017). CRISPR–mediated isolation of specific megabase segments of genomic DNA. Nucleic Acids Res 45, e165.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bentley, D.R., Balasubramanian, S., Swerdlow, H.P., Smith, G.P., Milton, J., Brown, C.G., Hall, K.P., Evers, D.J., Barnes, C.L., Bignell, H.R., et al. (2008). Accurate whole human genome sequencing using reversible terminator chemistry. Nature 456, 53–59.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bergthorsson, U., Andersson, D.I., and Roth, J.R. (2007). Ohno’s dilemma: Evolution of new genes under continuous selection. Proc Natl Acad Sci USA 104, 17004–17009.

    Article  PubMed  PubMed Central  Google Scholar 

  • Berlin, K., Koren, S., Chin, C.S., Drake, J.P., Landolin, J.M., and Phillippy, A.M. (2015). Assembling large genomes with single–molecule sequencing and locality–sensitive hashing. Nat Biotechnol 33, 623–630.

    Article  CAS  PubMed  Google Scholar 

  • Béziat, V., Traherne, J.A., Liu, L.L., Jayaraman, J., Enqvist, M., Larsson, S., Trowsdale, J., and Malmberg, K.J. (2013). Influence of KIR gene copy number on natural killer cell education. Blood 121, 4703–4707.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Bleidorn, C. (2016). Third generation sequencing: technology and its potential impact on evolutionary biodiversity research. Systatics Biodiversity 14, 1–8.

    Article  Google Scholar 

  • Bresler, M., Sheehan, S., Chan, A.H., and Song, Y.S. (2012). Telescoper: de novo assembly of highly repetitive regions. Bioinformatics 28, i311–i317.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Buermans, H.P.J., Vossen, R.H.A.M., Anvar, S.Y., Allard, W.G., Guchelaar, H.J., White, S.J., den Dunnen, J.T., Swen, J.J., and van der Straaten, T. (2017). Flexible and scalable full–length CYP2D6 long amplicon PacBio sequencing. Human Mutat 38, 310–316.

    Article  CAS  Google Scholar 

  • Campbell, P.J., Stephens, P.J., Pleasance, E.D., O’Meara, S., Li, H., Santarius, T., Stebbings, L.A., Leroy, C., Edkins, S., Hardy, C., et al. (2008). Identification of somatically acquired rearrangements in cancer using genome–wide massively parallel paired–end sequencing. Nat Genet 40, 722–729.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Cardoso–Moreira, M., Arguello, J.R., Gottipati, S., Harshman, L.G., Grenier, J.K., and Clark, A.G. (2016). Evidence for the fixation of gene duplications by positive selection in Drosophila. Genome Res 26, 787–798.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Carpenter, D., Dhar, S., Mitchell, L.M., Fu, B., Tyson, J., Shwan, N.A.A., Yang, F., Thomas, M.G., and Armour, J.A.L. (2015). Obesity, starch digestion and amylase: association between copy number variants at human salivary (AMY1) and pancreatic (AMY2) amylase genes. Human Mol Genets 24, 3472–3480.

    Article  CAS  Google Scholar 

  • Carvalho, C.M.B., and Lupski, J.R. (2016). Mechanisms underlying structural variant formation in genomic disorders. Nat Rev Genet 17, 224–238.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Casola, C., Ganote, C.L., and Hahn, M.W. (2010). Nonallelic gene conversion in the genus Drosophila. Genetics 185, 95–103.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chaisson, M.J.P., Huddleston, J., Dennis, M.Y., Sudmant, P.H., Malig, M., Hormozdiari, F., Antonacci, F., Surti, U., Sandstrom, R., Boitano, M., et al. (2015a). Resolving the complexity of the human genome using single–molecule sequencing. Nature 517, 608–611.

    Article  CAS  PubMed  Google Scholar 

  • Chaisson, M.J.P., Wilson, R.K., and Eichler, E.E. (2015b). Genetic variation and the de novo assembly of human genomes. Nat Rev Genet 16, 627–640.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chakraborty, M., Baldwin–Brown, J.G., Long, A.D., and Emerson, J.J. (2016). Contiguous and accurate de novo assembly of metazoan genomes with modest long read coverage. Nucleic Acids Res 15, gkw654.

    Article  CAS  Google Scholar 

  • Chakraborty, M., VanKuren, N.W., Zhao, R., Zhang, X., Kalsow, S., and Emerson, J.J. (2018). Hidden genetic variation shapes the structure of functional elements in Drosophila. Nat Genet 50, 20–25.

    Article  CAS  PubMed  Google Scholar 

  • Charrier, C., Joshi, K., Coutinho–Budd, J., Kim, J.E., Lambert, N., de Marchena, J., Jin, W.L., Vanderhaeghen, P., Ghosh, A., Sassa, T., et al. (2012). Inhibition of SRGAP2 function by its human–specific paralogs induces neoteny during spine maturation. Cell 149, 923–935.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chen, K., Wallis, J.W., McLellan, M.D., Larson, D.E., Kalicki, J.M., Pohl, C.S., McGrath, S.D., Wendl, M.C., Zhang, Q., Locke, D.P., et al. (2009). BreakDancer: an algorithm for high–resolution mapping of genomic structural variation. Nat Methods 6, 677–681.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chen, S., Krinsky, B.H., and Long, M. (2013). New genes as drivers of phenotypic evolution. Nat Rev Genet 14, 645–660.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chin, C.S., Alexander, D.H., Marks, P., Klammer, A.A., Drake, J., Heiner, C., Clum, A., Copeland, A., Huddleston, J., Eichler, E.E., et al. (2013). Nonhybrid, finished microbial genome assemblies from long–read SMRT sequencing data. Nat Methods 10, 563–569.

    Article  CAS  PubMed  Google Scholar 

  • Chin, C.S., Peluso, P., Sedlazeck, F.J., Nattestad, M., Concepcion, G.T., Clum, A., Dunn, C., O’Malley, R., Figueroa–Balderas, R., Morales–Cruz, A., et al. (2016). Phased diploid genome assembly with singlemolecule real–time sequencing. Nat Methods 13, 1050–1054.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Chung, H., Bogwitz, M.R., McCart, C., Andrianopoulos, A., Ffrench–Constant, R.H., Batterham, P., and Daborn, P.J. (2007). Cis–regulatory elements in the Accord retrotransposon result in tissue–specific expression of the Drosophila melanogaster insecticide resistance gene Cyp6g1. Genetics 175, 1071–1077.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Church, D.M., Goodstadt, L., Hillier, L.W., Zody, M.C., Goldstein, S., She, X., Bult, C.J., Agarwala, R., Cherry, J.L., DiCuccio, M., et al. (2009). Lineage–specific biology revealed by a finished genome assembly of the mouse. PLoS Biol 7, e1000112.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Clarke, J., Wu, H.C., Jayasinghe, L., Patel, A., Reid, S., and Bayley, H. (2009). Continuous base identification for single–molecule nanopore DNA sequencing. Nat Nanotech 4, 265–270.

    Article  CAS  Google Scholar 

  • Clifton, B.D., Librado, P., Yeh, S.D., Solares, E.S., Real, D.A., Jayasekera, S.U., Zhang, W., Shi, M., Park, R.V., Magie, R.D., et al. (2017). Rapid functional and sequence differentiation of a tandemly repeated speciesspecific multigene family in Drosophila. Mol Biol Evol 34, 51–65.

    Article  CAS  PubMed  Google Scholar 

  • Conrad, D.F., and Hurles, M.E. (2007). The population genetics of structural variation. Nat Genet 39, S30–S36.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Conrad, D.F., Pinto, D., Redon, R., Feuk, L., Gokcumen, O., Zhang, Y., Aerts, J., Andrews, T.D., Barnes, C., Campbell, P., et al. (2010). Origins and functional impact of copy number variation in the human genome. Nature 464, 704–712.

    Article  CAS  PubMed  Google Scholar 

  • C. elegans Sequencing Consortium. (1998). Genome sequence of the nematode C. elegans: a platform for investigating biology. Science 282, 2012–2018.

    Article  Google Scholar 

  • Deng, C., Cheng, C.H.C., Ye, H., He, X., and Chen, L. (2010). Evolution of an antifreeze protein by neofunctionalization under escape from adaptive conflict. Proc Natl Acad Sci USA 107, 21593–21598.

    Article  PubMed  PubMed Central  Google Scholar 

  • Dennis, M.Y., Harshman, L., Nelson, B.J., Penn, O., Cantsilieris, S., Huddleston, J., Antonacci, F., Penewit, K., Denman, L., Raja, A., et al. (2017). The evolution and population diversity of human–specific segmental duplications. Nat ecol evol 1, 0069.

    Article  PubMed  PubMed Central  Google Scholar 

  • Dennis, M.Y., Nuttle, X., Sudmant, P.H., Antonacci, F., Graves, T.A., Nefedov, M., Rosenfeld, J.A., Sajjadian, S., Malig, M., Kotkiewicz, H., et al. (2012). Evolution of human–specific neural SRGAP2 genes by incomplete segmental duplication. Cell 149, 912–922.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Des Marais, D.L., and Rausher, M.D. (2008). Escape from adaptive conflict after duplication in an anthocyanin pathway gene. Nature 454, 762–765.

    Article  CAS  PubMed  Google Scholar 

  • Dopman, E.B., and Hartl, D.L. (2007). A portrait of copy–number polymorphism in Drosophila melanogaster. Proc Natl Acad Sci USA 104, 19920–19925.

    Article  PubMed  PubMed Central  Google Scholar 

  • Dujon, B. (2010). Yeast evolutionary genomics. Nat Rev Genet 11, 512–524.

    Article  CAS  PubMed  Google Scholar 

  • Dunn, B., Richter, C., Kvitek, D.J., Pugh, T., and Sherlock, G. (2012). Analysis of the Saccharomyces cerevisiae pan–genome reveals a pool of copy number variants distributed in diverse yeast strains from differing industrial environments. Genome Res 22, 908–924.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Earl, D., Bradnam, K., St. John, J., Darling, A., Lin, D., Fass, J., Yu, H.O. K., Buffalo, V., Zerbino, D.R., Diekhans, M., et al. (2011). Assemblathon 1: a competitive assessment of de novo short read assembly methods. Genome Res 21, 2224–2241.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Eid, J., Fehr, A., Gray, J., Luong, K., Lyle, J., Otto, G., Peluso, P., Rank, D., Baybayan, P., Bettman, B., et al. (2009). Real–time DNA sequencing from single polymerase molecules. Science 323, 133–138.

    Article  CAS  PubMed  Google Scholar 

  • Eirin–Lopez, J.M., Rebordinos, L., Rooney, A.P., and Rozas, J. (2012). The birth–and–death evolution of multigene families revisited. Genome Dynam 7, 170–196.

    Article  Google Scholar 

  • Emerson, J.J., Cardoso–Moreira, M., Borevitz, J.O., and Long, M. (2008). Natural selection shapes genome–wide patterns of copy–number polymorphism in Drosophila melanogaster. Science 320, 1629–1631.

    Article  CAS  PubMed  Google Scholar 

  • Ersfeld, K. (2004). Fiber–FISH: fluorescence in situ hybridization on stretched DNA. Methods Mol Biol 270, 395–402.

    CAS  PubMed  Google Scholar 

  • Faucon, F., Dusfour, I., Gaude, T., Navratil, V., Boyer, F., Chandre, F., Sirisopa, P., Thanispong, K., Juntarajumnong, W., Poupardin, R., et al. (2015). Identifying genomic changes associated with insecticide resistance in the dengue mosquito Aedes aegypti by deep targeted sequencing. Genome Res 25, 1347–1359.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Fawcett, J.A., and Innan, H. (2015). Spreading good news. eLife 4, e07108.

    Google Scholar 

  • Feyereisen, R., Dermauw, W., and Van Leeuwen, T. (2015). Genotype to phenotype, the molecular and physiological dimensions of resistance in arthropods. Pesticide Biochem Physiol 121, 61–77.

    Article  CAS  Google Scholar 

  • Fiddes, I.T., Lodewijk, G.A., Mooring, M., Bosworth, C.M., Ewing, A.D., Mantalas, G.L., Novak, A.M., van den Bout, A., Bishara, A., Rosenkrantz, J.L., et al. (2018). Human–specific NOTCH2NL genes affect Notch signaling and cortical neurogenesis. Cell 173, 1356–1369. e22.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Florio, M., Albert, M., Taverna, E., Namba, T., Brandl, H., Lewitus, E., Haffner, C., Sykes, A., Wong, F.K., Peters, J., et al. (2015). Humanspecific gene ARHGAP11B promotes basal progenitor amplification and neocortex expansion. Science 347, 1465–1470.

    Article  CAS  PubMed  Google Scholar 

  • Force, A., Lynch, M., Pickett, F.B., Amores, A., Yan, Y.L., and Postlethwait, J. (1999). Preservation of duplicate genes by complementary, degenerative mutations. Genetics 151, 1531–1545.

    CAS  PubMed  PubMed Central  Google Scholar 

  • Francino, M.P. (2005). An adaptive radiation model for the origin of new gene functions. Nat Genet 37, 573–578.

    Article  CAS  PubMed  Google Scholar 

  • Gabrieli, T., Sharim, H., Fridman, D., Arbib, N., Michaeli, Y., and Ebenstein, Y. (2018). Selective nanopore sequencing of human BRCA1 by Cas9–assisted targeting of chromosome segments (CATCH). Nucleic Acids Res 46, e87.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Gao, L.Z., and Innan, H. (2004). Very low gene duplication rate in the yeast genome. Science 306, 1367–1370.

    Article  CAS  PubMed  Google Scholar 

  • Gnerre, S., Maccallum, I., Przybylski, D., Ribeiro, F.J., Burton, J.N., Walker, B.J., Sharpe, T., Hall, G., Shea, T.P., Sykes, S., et al. (2011). High–quality draft assemblies of mammalian genomes from massively parallel sequence data. Proc Natl Acad Sci USA 108, 1513–1518.

    Article  CAS  PubMed  Google Scholar 

  • Golicz, A.A., Batley, J., and Edwards, D. (2016). Towards plant pangenomics. Plant Biotechnol J 14, 1099–1105.

    Article  PubMed  Google Scholar 

  • Green, P. (1997). Against a whole–genome shotgun. Genome Res 7, 410–417.

    Article  CAS  PubMed  Google Scholar 

  • Gu, W., Zhang, F., and Lupski, J.R. (2008). Mechanisms for human genomic rearrangements. PathoGenetics 1, 4.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Gu, Z., Steinmetz, L.M., Gu, X., Scharfe, C., Davis, R.W., and Li, W.H. (2003). Role of duplicate genes in genetic robustness against null mutations. Nature 421, 63–66.

    Article  CAS  PubMed  Google Scholar 

  • Guillemaud, T., Lenormand, T., Bourguet, D., Chevillon, C., Pateur, N., and Raymond, M. (1998). Evolution of resistance in Culex pipiens: allele replacemente and changing environment. Evolution 52, 443–453.

    PubMed  Google Scholar 

  • Gurevich, A., Saveliev, V., Vyahhi, N., and Tesler, G. (2013). QUAST: quality assessment tool for genome assemblies. Bioinformatics 29, 1072–1075.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hahn, M.W. (2009). Distinguishing among evolutionary models for the maintenance of gene duplicates. J Hered 100, 605–617.

    Article  CAS  PubMed  Google Scholar 

  • Hahn, M.W., Han, M.V., and Han, S.G. (2007). Gene family evolution across 12 Drosophila genomes. PLoS Genet 3, e197.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Harewood, L., Chaignat, E., and Alexandre., R. (2012). Structural variation and its effects on expresson. Methods Mol Biol 838, 173–186.

    Article  CAS  PubMed  Google Scholar 

  • Hastings, P.J., Lupski, J.R., Rosenberg, S.M., and Ira, G. (2009). Mechanisms of change in gene copy number. Nat Rev Genet 10, 551–564.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hemingway, J., Hawkes, N.J., McCarroll, L., and Ranson, H. (2004). The molecular basis of insecticide resistance in mosquitoes. Insect Biochem Mol Biol 34, 653–665.

    Article  CAS  PubMed  Google Scholar 

  • Hendrickson, H., Slechta, E.S., Bergthorsson, U., Andersson, D.I., and Roth, J.R. (2002). Amplification–mutagenesis: Evidence that “directed” adaptive mutation and general hypermutability result from growth with a selected gene amplification. Proc Natl Acad Sci USA 99, 2164–2169.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hindson, B.J., Ness, K.D., Masquelier, D.A., Belgrader, P., Heredia, N.J., Makarewicz, A.J., Bright, I.J., Lucero, M.Y., Hiddessen, A.L., Legler, T.C., et al. (2011). High–throughput droplet digital PCR system for absolute quantitation of DNA copy number. Anal Chem 83, 8604–8610.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hollox, E.J. (2008). Copy number variation of beta–defensins and relevance to disease. Cytogenet Genome Res 123, 148–155.

    Article  CAS  PubMed  Google Scholar 

  • Hollox, E.J. (2012). The challenges of studying complex and dynamic regions of the human genome. Methods Mol Biol 838, 187–207.

    Article  CAS  PubMed  Google Scholar 

  • Hollox, E.J., and Abujaber, R. (2017). Evolution and diversity of defensins in vertebrates. In Evolutionary Biology: Self/Nonself Evolution, Species and Complex Traits Evolution, Methods and Concepts, P. Pontarotti, ed. (Cham, Switzerland: Springer), pp. 27–50.

    Google Scholar 

  • Hollox, E.J., Barber, J.C.K., Brookes, A.J., and Armour, J.A.L. (2008a). Defensins and the dynamic genome: What we can learn from structural variation at human chromosome band 8p23.1. Genome Res 18, 1686–1697.

    Article  CAS  PubMed  Google Scholar 

  • Hollox, E.J., Huffmeier, U., Zeeuwen, P.L.J.M., Palla, R., Lascorz, J., Rodijk–Olthuis, D., van de Kerkhof, P.C.M., Traupe, H., de Jongh, G., den Heijer, M., et al. (2008b). Psoriasis is associated with increased β–defensin genomic copy number. Nat Genet 40, 23–25.

    Article  CAS  PubMed  Google Scholar 

  • Hormozdiari, F., Alkan, C., Eichler, E.E., and Sahinalp, S.C. (2009). Combinatorial algorithms for structural variation detection in highthroughput sequenced genomes. Genome Res 19, 1270–1278.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hoskins, R.A., Carlson, J.W., Kennedy, C., Acevedo, D., Evans–Holm, M., Frise, E., Wan, K.H., Park, S., Mendez–Lago, M., Rossi, F., et al. (2007). Sequence finishing and mapping of Drosophila melanogaster heterochromatin. Science 316, 1625–1628.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Huddleston, J., and Eichler, E.E. (2016). An incomplete understanding of human genetic variation. Genetics 202, 1251–1254.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Huddleston, J., Ranade, S., Malig, M., Antonacci, F., Chaisson, M., Hon, L., Sudmant, P.H., Graves, T.A., Alkan, C., Dennis, M.Y., et al. (2014). Reconstructing complex regions of genomes using long–read sequencing technology. Genome Res 24, 688–696.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Hughes, A.L. (1994). The evolution of functionally novel proteins after gene duplication. Proc R Soc Lond B 256, 119–124.

    Article  CAS  Google Scholar 

  • Innan, H. (2003). A two–locus gene conversion model with selection and its application to the human RHCE and RHD genes. Proc Natl Acad Sci USA 100, 8793–8798.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Innan, H. (2009). Population genetic models of duplicated genes. Genetica 137, 19–37.

    Article  CAS  PubMed  Google Scholar 

  • Innan, H., and Kondrashov, F. (2010). The evolution of gene duplications: classifying and distinguishing between models. Nat Rev Genet 11, 97–108.

    Article  CAS  PubMed  Google Scholar 

  • Iqbal, Z., Caccamo, M., Turner, I., Flicek, P., and McVean, G. (2012). De novo assembly and genotyping of variants using colored de Bruijn graphs. Nat Genet 44, 226–232.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Istrail, S., Sutton, G.G., Florea, L., Halpern, A.L., Mobarry, C.M., Lippert, R., Walenz, B., Shatkay, H., Dew, I., Miller, J.R., et al. (2004). Wholegenome shotgun assembly and comparison of human genome assemblies. Proc Natl Acad Sci USA 101, 1916–1921.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • James, C.P., Bajaj–Elliott, M., Abujaber, R., Forya, F., Klein, N., David, A. L., Hollox, E.J., and Peebles, D.M. (2018). Human beta defensin (HBD) gene copy number affects HBD2 protein levels: impact on cervical bactericidal immunity in pregnancy. Eur J Hum Genet 26, 434–439.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Jayaswal, V., Jimenez, J., Magie, R., Nguyen, K., Clifton, B., Yeh, S., and Ranz, J.M. (2018). A species–specific multigene family mediates differential sperm displacement in Drosophila melanogaster. Evolution 72, 399–403.

    Article  PubMed  Google Scholar 

  • Jiang, W., Johnson, C., Jayaraman, J., Simecek, N., Noble, J., Moffatt, M. F., Cookson, W.O., Trowsdale, J., and Traherne, J.A. (2012a). Copy number variation leads to considerable diversity for B but not A haplotypes of the human KIR genes encoding NK cell receptors. Genome Res 22, 1845–1854.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Jiang, W., Johnson, C., Simecek, N., López–Álvarez, M.R., Di, D., Trowsdale, J., and Traherne, J.A. (2016). qKAT: a high–throughput qPCR method for KIR gene copy number and haplotype determination. Genome Med 8, 99.

    CAS  Google Scholar 

  • Jiang, W., Zhao, X., Gabrieli, T., Lou, C., Ebenstein, Y., and Zhu, T.F. (2015). Cas9–assisted targeting of chromosome segments CATCH enables one–step targeted cloning of large gene clusters. Nat Commun 6, 8101.

    Article  PubMed  Google Scholar 

  • Jiang, Y., Wang, Y., and Brudno, M. (2012b). PRISM: pair–read informed split–read mapping for base–pair level detection of insertion, deletion and structural variants. Bioinformatics 28, 2576–2583.

    Article  CAS  PubMed  Google Scholar 

  • Jugulam, M., Niehues, K., Godar, A.S., Koo, D.H., Danilova, T., Friebe, B., Sehgal, S., Varanasi, V.K., Wiersma, A., Westra, P., et al. (2014). Tandem amplification of a chromosomal segment harboring 5–enolpyruvylshikimate–3–phosphate synthase locus confers glyphosate resistance in Kochia scoparia. Plant Physiol 166, 1200–1207.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kaessmann, H. (2010). Origins, evolution, and phenotypic impact of new genes. Genome Res 20, 1313–1326.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kajitani, R., Toshimoto, K., Noguchi, H., Toyoda, A., Ogura, Y., Okuno, M., Yabana, M., Harada, M., Nagayasu, E., Maruyama, H., et al. (2014). Efficient de novo assembly of highly heterozygous genomes from whole–genome shotgun short reads. Genome Res 24, 1384–1395.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Katju, V. (2012). In with the old, in with the new: the promiscuity of the duplication process engenders diverse pathways for novel gene creation. Int J Evol Biol 2012(2), 1–24.

    Article  Google Scholar 

  • Katju, V., and Bergthorsson, U. (2013). Copy–number changes in evolution: rates, fitness effects and adaptive significance. Front Genet 4, 273.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kondrashov, F.A. (2010). Gene dosage and duplication. In Evolution after Gene Duplication, K. Dittmar, and D. Liberles, ed. (Wiley–Blackwell), pp. 57–76.

    Google Scholar 

  • Kondrashov, F.A. (2012). Gene duplication as a mechanism of genomic adaptation to a changing environment. Proc R Soc B–Biol Sci 279, 5048–5057.

    Article  Google Scholar 

  • Korbel, J.O., Abyzov, A., Mu, X.J., Carriero, N., Cayting, P., Zhang, Z., Snyder, M., and Gerstein, M.B. (2009). PEMer: a computational framework with simulation–based error models for inferring genomic structural variants from massive paired–end sequencing data. Genome Biol 10, R23.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Korbel, J.O., Urban, A.E., Affourtit, J.P., Godwin, B., Grubert, F., Simons, J.F., Kim, P.M., Palejev, D., Carriero, N.J., Du, L., et al. (2007). Pairedend mapping reveals extensive structural variation in the human genome. Science 318, 420–426.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Koren, S., Schatz, M.C., Walenz, B.P., Martin, J., Howard, J.T., Ganapathy, G., Wang, Z., Rasko, D.A., McCombie, W.R., Jarvis, E.D., et al. (2012). Hybrid error correction and de novo assembly of single–molecule sequencing reads. Nat Biotechnol 30, 693–700.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Koren, S., Walenz, B.P., Berlin, K., Miller, J.R., Bergman, N.H., and Phillippy, A.M. (2017). Canu: scalable and accurate long–read assembly via adaptive k–mer weighting and repeat separation. Genome Res 27, 722–736.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Krsticevic, F.J., Schrago, C.G., and Carvalho, A.B. (2015). Long–read single molecule sequencing to resolve tandem gene copies: The Mst77Y region on the Drosophila melanogaster Y chromosome. G3 5, 1145–1150.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Kulathinal, R.J., Sawyer, S.A., Bustamante, C.D., Nurminsky, D., Ponce, R., Ranz, J.M., and Hartl, D.L. (2004). Selective sweep in the evolution of a new sperm–specific gene in Drosophila. In Selective Sweep, D. Nurminsky, ed. (Austin, Texas: Kluwer Academic/Plenum Publishers), pp. 1–12.

    Google Scholar 

  • Kurtz, S., Phillippy, A., Delcher, A.L., Smoot, M., Shumway, M., Antonescu, C., and Salzberg, S.L. (2004). Versatile and open software for comparing large genomes.. Genome Biol 5, R12.

    Google Scholar 

  • Labbé, P., Berthomieu, A., Berticat, C., Alout, H., Raymond, M., Lenormand, T., and Weill, M. (2007). Independent duplications of the acetylcholinesterase gene conferring insecticide resistance in the mosquito Culex pipiens. Mol Biol Evol 24, 1056–1067.

    Article  CAS  PubMed  Google Scholar 

  • Lander, E.S., Linton, L.M., Birren, B., Nusbaum, C., Zody, M.C., Baldwin, J., Devon, K., Dewar, K., Doyle, M., FitzHugh, W., et al. (2001). Initial sequencing and analysis of the human genome. Nature 409, 860–921.

    Article  CAS  PubMed  Google Scholar 

  • Layer, R.M., Chiang, C., Quinlan, A.R., and Hall, I.M. (2014). LUMPY: a probabilistic framework for structural variant discovery. Genome Biol 15, R84.

    Google Scholar 

  • Li, H. (2016). Minimap and miniasm: fast mapping and de novo assembly for noisy long sequences. Bioinformatics 32, 2103–2110.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Lin, Y., Yuan, J., Kolmogorov, M., Shen, M.W., Chaisson, M., and Pevzner, P.A. (2016). Assembly of long error–prone reads using de Bruijn graphs. Proc Natl Acad Sci USA 113, e8396–E8405.

    Article  CAS  PubMed  PubMed Central  Google Scholar 

  • Livak, K.J., and Schmittgen, T.D. (2001). Analysis of relative gene expression data using real–time quantitative PCR and the 2−ΔΔCT method. Methods 25, 402–408.

  • Long, M., VanKuren, N.W., Chen, S., and Vibranovski, M.D. (2013). New gene evolution: little did we know. Annu Rev Genet 47, 307–333.

  • Luo, R., Liu, B., Xie, Y., Li, Z., Huang, W., Yuan, J., He, G., Chen, Y., Pan, Q., Liu, Y., et al. (2012). SOAPdenovo2: an empirically improved memory–efficient short–read de novo assembler. GigaScience 1, 18.

  • Lupski, J.R. (1998). Genomic disorders: structural features of the genome can lead to DNA rearrangements and human disease traits. Trends Genet 14, 417–422.

  • Lupski, J.R., and Stankiewicz, P. (2005). Genomic disorders: Molecular mechanisms for rearrangements and conveyed phenotypes. PLoS Genet 1, e49.

  • Mardis, E.R. (2013). Next–generation sequencing platforms. Annu Rev Anal Chem 6, 287–303.

  • Margulies, M., Egholm, M., Altman, W.E., Attiya, S., Bader, J.S., Bemben, L.A., Berka, J., Braverman, M.S., Chen, Y.J., Chen, Z., et al. (2005). Genome sequencing in microfabricated high–density picolitre reactors. Nature 437, 376–380.

  • Marques–Bonet, T., Girirajan, S., and Eichler, E.E. (2009). The origins and impact of primate segmental duplications. Trends Genet 25, 443–454.

  • Martin, M.P., Bashirova, A., Traherne, J., Trowsdale, J., and Carrington, M. (2003). Cutting edge: Expansion of the KIR locus by unequal crossing over. J Immunol 171, 2192–2195.

  • Martins, W.F.S., Subramaniam, K., Steen, K., Mawejje, H., Liloglou, T., Donnelly, M.J., and Wilding, C.S. (2017). Detection and quantitation of copy number variation in the voltage–gated sodium channel gene of the mosquito Culex quinquefasciatus. Sci Rep 7, 5821.

  • McCoy, R.C., Taylor, R.W., Blauwkamp, T.A., Kelley, J.L., Kertesz, M., Pushkarev, D., Petrov, D.A., and Fiston–Lavier, A.S. (2014). Illumina TruSeq synthetic long–reads empower de novo assembly and resolve complex, highly–repetitive transposable elements. PLoS ONE 9, e106689.

  • McKernan, K.J., Peckham, H.E., Costa, G.L., McLaughlin, S.F., Fu, Y., Tsung, E.F., Clouser, C.R., Duncan, C., Ichikawa, J.K., Lee, C.C., et al. (2009). Sequence and structural variation in a human genome uncovered by short–read, massively parallel ligation sequencing using two–base encoding. Genome Res 19, 1527–1541.

  • Medvedev, P., Stanciu, M., and Brudno, M. (2009). Computational methods for discovering structural variation with next–generation sequencing. Nat Methods 6, S13–S20.

  • Miller, J.R., Zhou, P., Mudge, J., Gurtowski, J., Lee, H., Ramaraj, T., Walenz, B.P., Liu, J., Stupar, R.M., Denny, R., et al. (2017). Hybrid assembly with long and short reads improves discovery of gene family expansions. BMC Genomics 18, 541.

  • Mohajeri, K., Cantsilieris, S., Huddleston, J., Nelson, B.J., Coe, B.P., Campbell, C.D., Baker, C., Harshman, L., Munson, K.M., Kronenberg, Z.N., et al. (2016). Interchromosomal core duplicons drive both evolutionary instability and disease susceptibility of the Chromosome 8p23.1 region. Genome Res 26, 1453–1467.

  • Mouches, C., Pasteur, N., Berge, J.B., Hyrien, O., Raymond, M., de Saint Vincent, B.R., de Silvestri, M., and Georghiou, G.P. (1986). Amplification of an esterase gene is responsible for insecticide resistance in a California Culex mosquito. Science 233, 778–780.

  • Mouse Genome Sequencing Consortium, Waterston, R.H., Lindblad–Toh, K., Birney, E., Rogers, J., Abril, J.F., Agarwal, P., Agarwala, R., Ainscough, R., Alexandersson, M., et al. (2002). Initial sequencing and comparative analysis of the mouse genome. Nature 420, 520–562.

  • Myers, E.W., Sutton, G.G., Delcher, A.L., Dew, I.M., Fasulo, D.P., Flanigan, M.J., Kravitz, S.A., Mobarry, C.M., Reinert, K.H.J., Remington, K.A., et al. (2000). A whole–genome assembly of Drosophila. Science 287, 2196–2204.

  • Nagylaki, T., and Petes, T.D. (1982). Intrachromosomal gene conversion and the maintenance of sequence homogeneity among repeated genes. Genetics 100, 315–337.

  • Näsvall, J., Sun, L., Roth, J.R., and Andersson, D.I. (2012). Real–time evolution of new genes by innovation, amplification, and divergence. Science 338, 384–387.

  • Nei, M., and Rooney, A.P. (2005). Concerted and birth–and–death evolution of multigene families. Annu Rev Genet 39, 121–152.

  • Nguyen, D.Q., Webber, C., Hehir–Kwa, J., Pfundt, R., Veltman, J., and Ponting, C.P. (2008). Reduced purifying selection prevails over positive selection in human copy number variant evolution. Genome Res 18, 1711–1723.

  • Nguyen, H.T., Boocock, J., Merriman, T.R., and Black, M.A. (2016). SRBreak: A read–depth and split–read framework to identify breakpoints of different events inside simple copy–number variable regions. Front Genet 7, 160.

  • Nijkamp, J.F., van den Broek, M.A., Geertman, J.M.A., Reinders, M.J.T., Daran, J.M.G., and de Ridder, D. (2012). De novo detection of copy number variation by co–assembly. Bioinformatics 28, 3195–3202.

  • Nurminsky, D., De Aguiar, D., Bustamante, C.D., and Hartl, D.L. (2001). Chromosomal effects of rapid gene evolution in Drosophila melanogaster. Science 291, 128–130.

  • Nurminsky, D.I., Nurminskaya, M.V., De Aguiar, D., and Hartl, D.L. (1998). Selective sweep of a newly evolved sperm–specific gene in Drosophila. Nature 396, 572–575.

  • Nuttle, X., Giannuzzi, G., Duyzend, M.H., Schraiber, J.G., Narvaiza, I., Sudmant, P.H., Penn, O., Chiatante, G., Malig, M., Huddleston, J., et al. (2016). Emergence of a Homo sapiens–specific gene family and chromosome 16p11.2 CNV susceptibility. Nature 536, 205–209.

  • Nuttle, X., Huddleston, J., O’Roak, B.J., Antonacci, F., Fichera, M., Romano, C., Shendure, J., and Eichler, E.E. (2013). Rapid and accurate large–scale genotyping of duplicated genes and discovery of interlocus gene conversions. Nat Methods 10, 903–909.

  • O’Roak, B.J., Vives, L., Fu, W., Egertson, J.D., Stanaway, I.B., Phelps, I.G., Carvill, G., Kumar, A., Lee, C., Ankenman, K., et al. (2012). Multiplex targeted sequencing identifies recurrently mutated genes in autism spectrum disorders. Science 338, 1619–1622.

  • Obbard, D.J., Maclennan, J., Kim, K.W., Rambaut, A., O’Grady, P.M., and Jiggins, F.M. (2012). Estimating divergence dates and substitution rates in the Drosophila phylogeny. Mol Biol Evol 29, 3459–3473.

  • Ohno, S. (1970). Evolution by Gene Duplication (New York: Springer–Verlag).

  • Ohta, T. (1982). Allelic and nonallelic homology of a supergene family. Proc Natl Acad Sci USA 79, 3251–3254.

  • Osada, N., and Innan, H. (2008). Duplication and gene conversion in the Drosophila melanogaster genome. PLoS Genet 4, e1000305.

  • Owen, R.P., Sangkuhl, K., Klein, T.E., and Altman, R.B. (2009). Cytochrome P450 2D6. Pharmacogenet Genomics 19, 559–562.

  • Parham, P. (2005). Influence of KIR diversity on human immunity. Adv Exp Med Biol 560, 47–50.

  • Parham, P., Norman, P.J., Abi–Rached, L., and Guethlein, L.A. (2012). Human–specific evolution of killer cell immunoglobulin–like receptor recognition of major histocompatibility complex class I molecules. Philos Trans R Soc B–Biol Sci 367, 800–811.

  • Parra, G., Bradnam, K., and Korf, I. (2007). CEGMA: a pipeline to accurately annotate core genes in eukaryotic genomes. Bioinformatics 23, 1061–1067.

  • Perry, G.H., Dominy, N.J., Claw, K.G., Lee, A.S., Fiegler, H., Redon, R., Werner, J., Villanea, F.A., Mountain, J.L., Misra, R., et al. (2007). Diet and the evolution of human amylase gene copy number variation. Nat Genet 39, 1256–1260.

  • Pillai, S., Gopalan, V., and Lam, A.K.Y. (2017). Review of sequencing platforms and their applications in phaeochromocytoma and paragangliomas. Crit Rev Oncol/Hematol 116, 58–67.

  • Pinheiro, L.B., Coleman, V.A., Hindson, C.M., Herrmann, J., Hindson, B.J., Bhat, S., and Emslie, K.R. (2012). Evaluation of a droplet digital polymerase chain reaction format for DNA copy number quantification. Anal Chem 84, 1003–1011.

  • Pirooznia, M., Goes, F.S., and Zandi, P.P. (2015). Whole–genome CNV analysis: advances in computational approaches. Front Genet 6, 138.

  • Ponce, R., and Hartl, D.L. (2006). The evolution of the novel Sdic gene cluster in Drosophila melanogaster. Gene 376, 174–183.

  • Ponchel, F., Toomes, C., Bransfield, K., Leong, F.T., Douglas, S.H., Field, S.L., Bell, S.M., Combaret, V., Puisieux, A., Mighell, A.J., et al. (2003). Real–time PCR based on SYBR–Green I fluorescence: an alternative to the TaqMan assay for a relative quantification of gene rearrangements, gene amplifications and micro gene deletions. BMC Biotechnol 3, 18.

  • Pyo, C.W., Wang, R., Vu, Q., Cereb, N., Yang, S.Y., Duh, F.M., Wolinsky, S., Martin, M.P., Carrington, M., and Geraghty, D.E. (2013). Recombinant structures expand and contract inter and intragenic diversification at the KIR locus. BMC Genomics 14, 89.

  • Ranz, J.M., and Parsch, J. (2012). Newly evolved genes: moving from comparative genomics to functional studies in model systems. Bioessays 34, 477–483.

  • Ranz, J.M., Ponce, A.R., Hartl, D.L., and Nurminsky, D. (2003). Origin and evolution of a new gene expressed in the Drosophila sperm axoneme. Genetica 118, 233–244.

  • Rausch, T., Zichner, T., Schlattl, A., Stütz, A.M., Benes, V., and Korbel, J.O. (2012). DELLY: structural variant discovery by integrated paired–end and split–read analysis. Bioinformatics 28, i333–i339.

  • Raymond, M., Poulin, E., Boiroux, V., Dupont, E., and Pasteur, N. (1993). Stability of insecticide resistance due to amplification of esterase genes in Culex pipiens. Heredity 70, 301–307.

  • Redon, R., Ishikawa, S., Fitch, K.R., Feuk, L., Perry, G.H., Andrews, T.D., Fiegler, H., Shapero, M.H., Carson, A.R., Chen, W., et al. (2006). Global variation in copy number in the human genome. Nature 444, 444–454.

  • Reisner, W., Larsen, N.B., Silahtaroglu, A., Kristensen, A., Tommerup, N., Tegenfeldt, J.O., and Flyvbjerg, H. (2010). Single–molecule denaturation mapping of DNA in nanofluidic channels. Proc Natl Acad Sci USA 107, 13294–13299.

  • Remnant, E.J., Good, R.T., Schmidt, J.M., Lumb, C., Robin, C., Daborn, P.J., and Batterham, P. (2013). Gene duplication in the major insecticide target site, Rdl, in Drosophila melanogaster. Proc Natl Acad Sci USA 110, 14705–14710.

  • Ritz, A., Bashir, A., Sindi, S., Hsu, D., Hajirasouliha, I., and Raphael, B.J. (2014). Characterization of structural variants with single molecule and hybrid sequencing approaches. Bioinformatics 30, 3458–3466.

  • Rodrigo, G., and Fares, M.A. (2018). Intrinsic adaptive value and early fate of gene duplication revealed by a bottom–up approach. eLife 7, e29739.

  • Rogers, R.L., Bedford, T., and Hartl, D.L. (2009). Formation and longevity of chimeric and duplicate genes in Drosophila melanogaster. Genetics 181, 313–322.

  • Rogers, R.L., Cridland, J.M., Shao, L., Hu, T.T., Andolfatto, P., and Thornton, K.R. (2014). Landscape of standing variation for tandem duplications in Drosophila yakuba and Drosophila simulans. Mol Biol Evol 31, 1750–1766.

  • Salzberg, S.L., Phillippy, A.M., Zimin, A., Puiu, D., Magoc, T., Koren, S., Treangen, T.J., Schatz, M.C., Delcher, A.L., Roberts, M., et al. (2012). GAGE: A critical evaluation of genome assemblies and assembly algorithms. Genome Res 22, 557–567.

  • Sedlazeck, F.J., Rescheneder, P., Smolka, M., Fang, H., Nattestad, M., von Haeseler, A., and Schatz, M.C. (2018). Accurate detection of complex structural variations using single–molecule sequencing. Nat Methods 15, 461–468.

  • She, X., Liu, G., Ventura, M., Zhao, S., Misceo, D., Roberto, R., Cardone, M.F., Rocchi, M., Green, E.D., et al. (2006). A preliminary comparative analysis of primate segmental duplications shows elevated substitution rates and a great–ape expansion of intrachromosomal duplications. Genome Res 16, 576–583.

  • Simão, F.A., Waterhouse, R.M., Ioannidis, P., Kriventseva, E.V., and Zdobnov, E.M. (2015). BUSCO: assessing genome assembly and annotation completeness with single–copy orthologs. Bioinformatics 31, 3210–3212.

  • Sindi, S.S., Onal, S., Peng, L.C., Wu, H.T., and Raphael, B.J. (2012). An integrative probabilistic model for identification of structural variation in sequencing data. Genome Biol 13, R22.

  • Spofford, J.B. (1969). Heterosis and the evolution of duplications. Am Nat 103, 407–432.

  • Stancu, M.C., van Roosmalen, M.J., Renkens, I., Nieboer, M.M., Middelkamp, S., de Ligt, J., Pregno, G., Giachino, D., Mandrile, G., Espejo Valle–Inclan, J., et al. (2017). Mapping and phasing of structural variation in patient genomes using nanopore sequencing. Nat Commun 8, 1326.

  • Staňková, H., Hastie, A.R., Chan, S., Vrána, J., Tulpová, Z., Kubaláková, M., Visendi, P., Hayashi, S., Luo, M., Batley, J., et al. (2016). BioNano genome mapping of individual chromosomes supports physical mapping and sequence assembly in complex plant genomes. Plant Biotechnol J 14, 1523–1531.

  • Stranger, B.E., Forrest, M.S., Dunning, M., Ingle, C.E., Beazley, C., Thorne, N., Redon, R., Bird, C.P., de Grassi, A., Lee, C., et al. (2007). Relative impact of nucleotide and copy number variation on gene expression phenotypes. Science 315, 848–853.

  • Sudmant, P.H., Rausch, T., Gardner, E.J., Handsaker, R.E., Abyzov, A., Huddleston, J., Zhang, Y., Ye, K., Jun, G., Hsi–Yang Fritz, M., et al. (2015). An integrated map of structural variation in 2,504 human genomes. Nature 526, 75–81.

  • Tettelin, H., Masignani, V., Cieslewicz, M.J., Donati, C., Medini, D., Ward, N.L., Angiuoli, S.V., Crabtree, J., Jones, A.L., Durkin, A.S., et al. (2005). Genome analysis of multiple pathogenic isolates of Streptococcus agalactiae: Implications for the microbial “pangenome”. Proc Natl Acad Sci USA 102, 13950–13955.

  • Traherne, J.A., Martin, M., Ward, R., Ohashi, M., Pellett, F., Gladman, D., Middleton, D., Carrington, M., and Trowsdale, J. (2010). Mechanisms of copy number variation and hybrid gene formation in the KIR immune gene complex. Hum Mol Genet 19, 737–751.

  • Trappe, K., Emde, A.K., Ehrlich, H.C., and Reinert, K. (2014). Gustaf: Detecting and correctly classifying SVs in the NGS twilight zone. Bioinformatics 30, 3484–3490.

  • Traut, W., Rahn, I.M., Winking, H., Kunze, B., and Weichenhan, D. (2001). Evolution of a 6–200 Mb long–range repeat cluster in the genus Mus. Chromosoma 110, 247–252.

  • VanKuren, N.W., and Long, M. (2018). Gene duplicates resolving sexual conflict rapidly evolved essential gametogenesis functions. Nat Ecol Evol 2, 705–712.

  • Veitia, R.A. (2002). Exploring the etiology of haploinsufficiency. Bioessays 24, 175–184.

  • Venter, J.C., Adams, M.D., Myers, E.W., Li, P.W., Mural, R.J., Sutton, G.G., Smith, H.O., Yandell, M., Evans, C.A., Holt, R.A., et al. (2001). The sequence of the human genome. Science 291, 1304–1351.

  • Voskoboynik, A., Neff, N.F., Sahoo, D., Newman, A.M., Pushkarev, D., Koh, W., Passarelli, B., Fan, H.C., Mantalas, G.L., Palmeri, K.J., et al. (2013). The genome sequence of the colonial chordate, Botryllus schlosseri. eLife 2, e00569.

  • Walker, B.J., Abeel, T., Shea, T., Priest, M., Abouelliel, A., Sakthikumar, S., Cuomo, C.A., Zeng, Q., Wortman, J., Young, S.K., et al. (2014). Pilon: an integrated tool for comprehensive microbial variant detection and genome assembly improvement. PLoS ONE 9, e112963.

  • Walsh, J.B. (1987). Sequence–dependent gene conversion: can duplicated genes diverge fast enough to escape conversion? Genetics 117, 543–557.

  • Weber, J.L., and Myers, E.W. (1997). Human whole–genome shotgun sequencing. Genome Res 7, 401–409.

  • Weichenhan, D., Kunze, B., Winking, H., van Geel, M., Osoegawa, K., de Jong, P.J., and Traut, W. (2001). Source and component genes of a 6–200 Mb gene cluster in the house mouse. Mamm Genome 12, 590–594.

  • Wondji, C.S., Irving, H., Morgan, J., Lobo, N.F., Collins, F.H., Hunt, R.H., Coetzee, M., Hemingway, J., and Ranson, H. (2009). Two duplicated P450 genes are associated with pyrethroid resistance in Anopheles funestus, a major malaria vector. Genome Res 19, 452–459.

  • Xi, R., Hadjipanayis, A.G., Luquette, L.J., Kim, T.M., Lee, E., Zhang, J., Johnson, M.D., Muzny, D.M., Wheeler, D.A., Gibbs, R.A., et al. (2011). Copy number variation detection in whole–genome sequencing data using the Bayesian information criterion. Proc Natl Acad Sci USA 108, E1128–E1136.

  • Xie, C., and Tammi, M.T. (2009). CNV–seq, a new method to detect copy number variation using high–throughput sequencing. BMC BioInf 10, 80.

  • Yao, R., Zhang, C., Yu, T., Li, N., Hu, X., Wang, X., Wang, J., and Shen, Y. (2017). Evaluation of three read–depth based CNV detection tools using whole–exome sequencing data. Mol Cytogenet 10, 30.

  • Ye, C., Hill, C.M., Wu, S., Ruan, J., and Ma, Z.S. (2016). DBG2OLC: Efficient assembly of large genomes using long erroneous reads of the third generation sequencing technologies. Sci Rep 6, 31900.

  • Ye, K., Schulz, M.H., Long, Q., Apweiler, R., and Ning, Z. (2009). Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired–end short reads. Bioinformatics 25, 2865–2871.

  • Yeh, S.D., Do, T., Abbassi, M., and Ranz, J.M. (2012a). Functional relevance of the newly evolved sperm dynein intermediate chain multigene family in Drosophila melanogaster males. Commun Integrat Biol 5, 462–465.

  • Yeh, S.D., Do, T., Chan, C., Cordova, A., Carranza, F., Yamamoto, E.A., Abbassi, M., Gandasetiawan, K.A., Librado, P., Damia, E., et al. (2012b). Functional evidence that a recently evolved Drosophila sperm-specific gene boosts sperm competition. Proc Natl Acad Sci USA 109, 2043–2048.

  • Yoon, S., Xuan, Z., Makarov, V., Ye, K., and Sebat, J. (2009). Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Res 19, 1586–1592.

  • Zhang, B., Sambono, J.L., Morgan, J.A.T., Venus, B., Rolls, P., and Lew–Tabor, A.E. (2016). An evaluation of quantitative PCR assays (TaqMan® and SYBR Green) for the detection of Babesia bigemina and Babesia bovis, and a novel fluorescent–ITS1–PCR capillary electrophoresis method for genotyping B. bovis isolates. Vet Sci 3, 23.

  • Zhang, F., Carvalho, C.M.B., and Lupski, J.R. (2009). Complex human chromosomal and genomic rearrangements. Trends Genet 25, 298–307.

  • Zhang, J., Wang, J., and Wu, Y. (2012). An improved approach for accurate and efficient calling of structural variations with low–coverage sequence data. BMC BioInf 13, S6.

  • Zhang, Z.D., Du, J., Lam, H., Abyzov, A., Urban, A.E., Snyder, M., and Gerstein, M. (2011). Identification of genomic indels and structural variations using split reads. BMC Genomics 12, 375.

  • Zhao, M., Wang, Q., Wang, Q., Jia, P., and Zhao, Z. (2013). Computational tools for copy number variation (CNV) detection using next–generation sequencing data: features and perspectives. BMC BioInf 14, S1.

  • Zhao, Q., Feng, Q., Lu, H., Li, Y., Wang, A., Tian, Q., Zhan, Q., Lu, Y., Zhang, L., Huang, T., et al. (2018). Pan–genome analysis highlights the extent of genomic variation in cultivated and wild rice. Nat Genet 50, 278–284.

  • Zhou, J., Lemos, B., Dopman, E.B., and Hartl, D.L. (2011). Copy–number variation: the balance between gene dosage and expression in Drosophila melanogaster. Genome Biol Evol 3, 1014–1024.

  • Zimin, A.V., Marçais, G., Puiu, D., Roberts, M., Salzberg, S.L., and Yorke, J.A. (2013). The MaSuRCA genome assembler. Bioinformatics 29, 2669–2677.

  • Zimin, A.V., Puiu, D., Luo, M.C., Zhu, T., Koren, S., Marçais, G., Yorke, J.A., Dvořák, J., and Salzberg, S.L. (2017). Hybrid assembly of the large and highly repetitive genome of Aegilops tauschii, a progenitor of bread wheat, with the MaSuRCA mega–reads algorithm. Genome Res 27, 787–792.

Acknowledgements

This work was supported by a National Science Foundation Grant (MCB-1157876) to J.M.R.

Author information

Corresponding author

Correspondence to José Ranz.

About this article

Cite this article

Ranz, J., Clifton, B. Characterization and evolutionary dynamics of complex regions in eukaryotic genomes. Sci. China Life Sci. 62, 467–488 (2019). https://doi.org/10.1007/s11427-018-9458-0

  • DOI: https://doi.org/10.1007/s11427-018-9458-0
