Abstract
Since the pioneering studies of Thomas Hunt Morgan and coworkers at the dawn of the twentieth century, Drosophila melanogaster and its sister species have contributed tremendously to unveiling the rules underlying animal genetics, development, behavior, and evolution, as well as human disease. Recent advances in DNA sequencing technologies launched Drosophila into the post-genomic era and paved the way for unprecedented comparative genomics investigations. The complete sequencing and systematic comparison of the genomes of 12 Drosophila species represent a milestone achievement in modern biology, enabling a wide range of studies, from the annotation of known and novel genomic features to the evolution of chromosomes and, ultimately, of entire genomes. Despite the efforts of countless laboratories worldwide, the vast amount of data produced over the past 15 years is far from fully explored.
In this chapter, we review some of the bioinformatic approaches that were developed to interrogate the genomes of the 12 Drosophila species. Starting from alignments of the entire genomic sequences, the degree of conservation can be evaluated separately for every region of the genome, providing first hints about elements that are under purifying selection and therefore likely to be functional. Furthermore, the careful analysis of repeated sequences sheds light on the evolutionary dynamics of transposons, an enigmatic and fascinating class of mobile elements housed in the genomes of animals and plants. Comparative genomics also aids the computational identification of the transcriptionally active part of the genome, first and foremost protein-coding loci, but also transcribed yet apparently noncoding regions, which were once considered “junk” DNA. Finally, the synergy between functional and comparative genomics facilitates in silico and in vivo studies of cis-acting regulatory elements, such as transcription factor binding sites, which, owing to their high degree of sequence variability, usually pose greater challenges for bioinformatic approaches.
Copyright information
© 2018 Springer Science+Business Media LLC
About this protocol
Cite this protocol
Oti, M., Pane, A., Sammeth, M. (2018). Comparative Genomics in Drosophila. In: Setubal, J., Stoye, J., Stadler, P. (eds) Comparative Genomics. Methods in Molecular Biology, vol 1704. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-7463-4_17
DOI: https://doi.org/10.1007/978-1-4939-7463-4_17
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-7461-0
Online ISBN: 978-1-4939-7463-4
eBook Packages: Springer Protocols