Improved Core Genes Prediction for Constructing Well-Supported Phylogenetic Trees in Large Sets of Plant Species

  • Conference paper
Bioinformatics and Biomedical Engineering (IWBBIO 2015)

Abstract

Inferring well-supported phylogenetic trees that precisely reflect the evolutionary process is a challenging task that depends entirely on how the related core genes have been found. In previous computational biology studies, many similarity-based algorithms, mainly relying on the computation of sequence alignment matrices, have been proposed to find them. In such approaches, a significantly high similarity score between two coding sequences extracted from a given annotation tool is taken to mean that they are the same gene. In a previous article, we presented a quality test approach (QTA) that improves the quality of the core genes by combining two annotation tools (namely NCBI, a partially human-curated database, and DOGMA, an efficient annotation algorithm for chloroplasts). This method takes advantage of both sequence similarity and gene features to guarantee that the core genome contains correct and well-clustered coding sequences (i.e., genes). In this article, we show how useful such well-defined core genes are for biomolecular phylogenetic reconstructions, by investigating various subsets of core genes at various family or genus levels, leading to subtrees with strong bootstrap support that are finally merged into a well-supported supertree.
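As a rough illustration of the idea of a core genome built from two annotation sources, the following minimal sketch keeps, for each genome, only the gene names on which both annotations agree, and then intersects these sets across genomes. It is not the QTA itself: the sequence-similarity test is omitted, and the function names, the dictionary layout, and the toy gene lists are hypothetical choices made for this example.

```python
def agreed_genes(ncbi_genes, dogma_genes):
    """Gene names reported by both annotation sources (case-insensitive).

    Simplified stand-in for a quality test: agreement on the gene name only.
    """
    return {g.lower() for g in ncbi_genes} & {g.lower() for g in dogma_genes}


def core_genes(genomes):
    """Genes present, after the agreement filter, in every genome."""
    per_genome = [agreed_genes(a["ncbi"], a["dogma"]) for a in genomes.values()]
    if not per_genome:
        return set()
    return set.intersection(*per_genome)


# Toy, hypothetical annotations for three chloroplast genomes.
genomes = {
    "genome_A": {"ncbi": ["rbcL", "psbA", "atpB", "ycf1"],
                 "dogma": ["rbcL", "psbA", "atpB"]},
    "genome_B": {"ncbi": ["rbcL", "psbA", "matK"],
                 "dogma": ["rbcl", "psbA", "matK", "atpB"]},
    "genome_C": {"ncbi": ["rbcL", "psbA", "atpB"],
                 "dogma": ["rbcL", "psbA", "atpB"]},
}

core = core_genes(genomes)
print(sorted(core))
```

Here atpB is dropped from the core because the two annotations of genome_B disagree on it, which shows how a stricter per-genome filter shrinks the core genome while making each retained gene more trustworthy.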

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

AlKindy, B., Al-Nayyef, H., Guyeux, C., Couchot, JF., Salomon, M., Bahi, J.M. (2015). Improved Core Genes Prediction for Constructing Well-Supported Phylogenetic Trees in Large Sets of Plant Species. In: Ortuño, F., Rojas, I. (eds) Bioinformatics and Biomedical Engineering. IWBBIO 2015. Lecture Notes in Computer Science, vol 9043. Springer, Cham. https://doi.org/10.1007/978-3-319-16483-0_38

  • DOI: https://doi.org/10.1007/978-3-319-16483-0_38

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16482-3

  • Online ISBN: 978-3-319-16483-0

  • eBook Packages: Computer Science, Computer Science (R0)
