Abstract
This chapter presents a generalized protocol for phylogenetic analysis of large-scale molecular datasets, specifically transcriptome data generated on the Illumina sequencing platform. The laboratory bench protocol consists of RNA extraction, cDNA synthesis, and Illumina sequencing. Once sequences have been obtained, bioinformatic methods are used to assemble raw reads, identify coding regions, and sort sequences from different species into groups of orthologous genes (OGs). A custom shell script selects the OGs to be used for phylogenetic inference, and the selected OGs are then concatenated into a supermatrix. Finally, generalized methods for phylogenomic inference using maximum likelihood and Bayesian inference software are presented.
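The concatenation step can be illustrated with a minimal sketch. This is not the chapter's script or the tool it uses; the function name `concatenate_supermatrix`, the toy taxon names, and the choice of `-` as the missing-data character are all assumptions made for illustration. Each OG alignment is represented as a mapping from taxon name to aligned sequence, and any taxon absent from an OG is padded with missing-data characters so that every row of the supermatrix has the same length.

```python
from typing import Dict, List


def concatenate_supermatrix(
    alignments: List[Dict[str, str]], missing: str = "-"
) -> Dict[str, str]:
    """Concatenate per-OG alignments (taxon -> aligned sequence) into a supermatrix.

    Hypothetical helper for illustration only; `missing` is an assumed
    placeholder character for taxa that lack a given OG.
    """
    # Collect every taxon that appears in at least one orthologous group.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts: Dict[str, List[str]] = {taxon: [] for taxon in taxa}

    for aln in alignments:
        # All sequences in one alignment must share the same width.
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within an alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # Pad taxa that are absent from this OG with missing-data characters.
            parts[taxon].append(aln.get(taxon, missing * width))

    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Toy input: two OGs with partially overlapping taxon sampling.
og1 = {"taxonA": "MKV-", "taxonB": "MRVL"}
og2 = {"taxonA": "GG", "taxonC": "GA"}
matrix = concatenate_supermatrix([og1, og2])
for taxon, sequence in matrix.items():
    print(taxon, sequence)
```

In practice, dedicated concatenation software also records the partition boundaries of each OG, which this sketch omits.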
Copyright information
© 2016 Springer Science+Business Media New York
About this protocol
Cite this protocol
Cannon, J.T., Kocot, K.M. (2016). Phylogenomics Using Transcriptome Data. In: Bourlat, S. (eds) Marine Genomics. Methods in Molecular Biology, vol 1452. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3774-5_4
DOI: https://doi.org/10.1007/978-1-4939-3774-5_4
Publisher Name: Humana Press, New York, NY
Print ISBN: 978-1-4939-3772-1
Online ISBN: 978-1-4939-3774-5
eBook Packages: Springer Protocols