Phylogenomics Using Transcriptome Data

  • Protocol
Marine Genomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 1452)

Abstract

This chapter presents a generalized protocol for phylogenetic analysis of large-scale molecular datasets, specifically transcriptome data generated on the Illumina sequencing platform. The bench protocol consists of RNA extraction, cDNA synthesis, and Illumina sequencing. After sequencing, bioinformatics methods are used to assemble raw reads, identify coding regions, and sort sequences from different species into groups of orthologous genes (OGs). The OGs used for phylogenetic inference are selected with a custom shell script, and the selected OGs are then concatenated into a supermatrix. Generalized methods for phylogenomic inference using maximum likelihood and Bayesian inference software are also presented.
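The concatenation step can be illustrated with a minimal sketch. This is not the chapter's own script or any specific tool's implementation. It only assumes that each OG is already aligned and held as a mapping from taxon name to aligned sequence, and that taxa absent from an OG are padded with a missing-data character. The function name `concatenate`, the `missing` parameter, and the toy sequences are hypothetical.

```python
from typing import Dict, List, Tuple

Alignment = Dict[str, str]  # taxon name -> aligned sequence (hypothetical layout)


def concatenate(
    alignments: List[Alignment], missing: str = "-"
) -> Tuple[Dict[str, str], List[Tuple[int, int]]]:
    """Join per-OG alignments into one supermatrix.

    Returns the supermatrix plus, for each OG, its 1-based inclusive
    (start, end) column range, which is useful for partitioned analyses.
    """
    # Every taxon seen in any OG gets a row in the supermatrix.
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    rows: Dict[str, List[str]] = {taxon: [] for taxon in taxa}
    partitions: List[Tuple[int, int]] = []
    start = 1
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        length = lengths.pop()
        for taxon in taxa:
            # Pad taxa that lack this OG so all rows stay the same length.
            rows[taxon].append(aln.get(taxon, missing * length))
        partitions.append((start, start + length - 1))
        start += length
    return {taxon: "".join(parts) for taxon, parts in rows.items()}, partitions


if __name__ == "__main__":
    # Toy data: two OGs, three taxa, each OG missing one taxon.
    og1 = {"taxonA": "MKT", "taxonB": "MRT"}
    og2 = {"taxonA": "LL-V", "taxonC": "LIAV"}
    supermatrix, partitions = concatenate([og1, og2])
    for taxon, seq in supermatrix.items():
        print(taxon, seq)
    print(partitions)
```

With the toy input, `taxonB` is padded across the second OG and `taxonC` across the first, so every row has seven columns and the partition ranges are columns 1 to 3 and 4 to 7. Whether gaps, `?`, or `X` should encode missing data depends on the downstream inference software.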

Author information

Corresponding author

Correspondence to Johanna Taylor Cannon.

Copyright information

© 2016 Springer Science+Business Media New York

About this protocol

Cite this protocol

Cannon, J.T., Kocot, K.M. (2016). Phylogenomics Using Transcriptome Data. In: Bourlat, S. (eds) Marine Genomics. Methods in Molecular Biology, vol 1452. Humana Press, New York, NY. https://doi.org/10.1007/978-1-4939-3774-5_4

  • DOI: https://doi.org/10.1007/978-1-4939-3774-5_4

  • Publisher Name: Humana Press, New York, NY

  • Print ISBN: 978-1-4939-3772-1

  • Online ISBN: 978-1-4939-3774-5

  • eBook Packages: Springer Protocols
