Abstract
Most biological discoveries can only be made in light of evolution. In particular, the functional annotation of genes is usually deduced from the orthology, paralogy, or xenology relations between genes, which are inferred by comparing a gene tree with a species tree. As sequence-only gene tree reconstruction methods often cannot confidently discriminate between candidate trees, recent “integrative methods” include information from the species tree. The idea is to consider, in addition to a value measuring the fit of a tree to a sequence alignment, a measure reflecting the evolution of a whole gene family through gene gain and loss. One such measure is the “reconciliation” cost, i.e., the cost of a gain and loss scenario explaining the incongruence between the gene tree and the species tree. This chapter begins with a review of deterministic algorithms for computing reconciliation distances under various models of gene family evolution. We then review integrative methods for correcting a gene tree, based on various strategies for exploring its neighborhood. The algorithms considered are those based on polytomy resolution, tree amalgamation, and supertree reconstruction. The goal is to provide a comprehensive overview of existing methods, with algorithms presented in concise form. The reader is referred to the original papers for more details and proofs of complexity.
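To make the notion of a reconciliation cost concrete, the following minimal sketch counts duplications and losses for a binary gene tree using the classical lowest-common-ancestor (LCA) mapping under a duplication-loss model. It is an illustration under stated assumptions, not necessarily the formulation used in the chapter: trees are encoded as nested Python tuples, gene tree leaves are labeled directly by species name, and the function name `reconcile` is hypothetical.

```python
def reconcile(gene_tree, species_tree):
    """Return (duplications, losses) for a binary gene tree under LCA mapping.

    Assumed encoding (illustrative only): a leaf is a species name (str),
    an internal node is a 2-tuple of subtrees. Species names are unique
    in the species tree.
    """
    parent, depth = {}, {}

    def index(node, par, d):
        # Record the parent and depth of every species tree node.
        parent[node] = par
        depth[node] = d
        if isinstance(node, tuple):
            for child in node:
                index(child, node, d + 1)

    index(species_tree, None, 0)

    def lca(x, y):
        # Climb from the deeper node until both pointers meet.
        while x != y:
            if depth[x] >= depth[y]:
                x = parent[x]
            else:
                y = parent[y]
        return x

    dups = 0
    losses = 0

    def visit(g):
        nonlocal dups, losses
        if not isinstance(g, tuple):
            return g  # a gene leaf maps to its species
        left, right = (visit(child) for child in g)
        m = lca(left, right)
        # A node is a duplication if it maps to the same species node
        # as one of its children; otherwise it is a speciation.
        is_dup = m in (left, right)
        if is_dup:
            dups += 1
        for c in (left, right):
            # Species edges skipped between m and c are counted as losses;
            # a speciation accounts for one edge on each side.
            losses += depth[c] - depth[m] - (0 if is_dup else 1)
        return m

    visit(gene_tree)
    return dups, losses


species = (("A", "B"), "C")
genes = (("A", "C"), ("B", "C"))
dups, losses = reconcile(genes, species)
# Weighted reconciliation cost with hypothetical unit costs.
cost = 1 * dups + 1 * losses
```

In this example the root of the gene tree maps to the root of the species tree together with both of its children, so it is inferred to be a duplication, and each copy then loses one of the species A or B. The weights applied to duplications and losses are free parameters of the model.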
Notes
1. See Quest for Orthologs links at http://questfororthologs.org/.
Acknowledgements
The authors acknowledge the support of the Fonds de Recherche du Québec Nature et Technologie (FRQNT) and of the Natural Sciences and Engineering Research Council (NSERC) (Discovery Grant RGPIN-249834).
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
El-Mabrouk, N., Noutahi, E. (2019). Gene Family Evolution—An Algorithmic Framework. In: Warnow, T. (eds) Bioinformatics and Phylogenetics. Computational Biology, vol 29. Springer, Cham. https://doi.org/10.1007/978-3-030-10837-3_5
DOI: https://doi.org/10.1007/978-3-030-10837-3_5
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-10836-6
Online ISBN: 978-3-030-10837-3
eBook Packages: Computer Science, Computer Science (R0)