The Gene-Duplication Problem: Near-Linear Time Algorithms for NNI Based Local Searches

  • Mukul S. Bansal
  • Oliver Eulenstein
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4983)


The gene-duplication problem is to infer a species supertree from a collection of gene trees that are confounded by complex histories of gene duplication events. This problem is NP-complete and thus requires efficient and effective heuristics. Existing heuristics perform a stepwise search of the tree space, where each step is guided by an exact solution to an instance of a local search problem. A classical local search problem is the \(\mathop{\rm NNI}\) search problem, which is based on the nearest neighbor interchange operation. In this work we (i) provide a novel near-linear time algorithm for the \(\mathop{\rm NNI}\) search problem, (ii) introduce extensions that significantly enlarge the search space of the \(\mathop{\rm NNI}\) search problem, and (iii) present algorithms for these extended versions that are asymptotically just as efficient as our algorithm for the \(\mathop{\rm NNI}\) search problem. The substantially extended \(\mathop{\rm NNI}\) search problem, along with the exceptional speed-up achieved, makes the gene-duplication problem more tractable for large-scale phylogenetic analyses.
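To make one step of the local search concrete, the following is a minimal, deliberately naive sketch. It is not the near-linear time algorithm of this work. It assumes rooted binary trees written as nested tuples, gene-tree leaves labeled directly by species names, the usual LCA-mapping definition of a duplication, and one common rooted formulation of the nearest neighbor interchange (swapping a child of an internal node with that node's sibling). All function names are hypothetical.

```python
def leaves(t):
    """Return the leaf set of a nested-tuple tree (a leaf is a string)."""
    if isinstance(t, str):
        return frozenset([t])
    return leaves(t[0]) | leaves(t[1])

def lca(species, labels):
    """Leaf set of the lowest species-tree node whose cluster contains `labels`."""
    node = species
    while not isinstance(node, str):
        left, right = node
        if labels <= leaves(left):
            node = left
        elif labels <= leaves(right):
            node = right
        else:
            break
    return leaves(node)

def duplications(gene, species):
    """Count gene-tree nodes that map to the same species node as one of their children."""
    if isinstance(gene, str):
        return 0
    here = lca(species, leaves(gene))
    count = sum(duplications(child, species) for child in gene)
    if any(lca(species, leaves(child)) == here for child in gene):
        count += 1
    return count

def nni_neighbors(t):
    """Yield every tree one rooted NNI away from t (assumed formulation)."""
    if isinstance(t, str):
        return
    x, y = t
    if not isinstance(x, str):
        a, b = x
        yield ((a, y), b)  # swap b with its parent's sibling y
        yield ((y, b), a)  # swap a with its parent's sibling y
    if not isinstance(y, str):
        a, b = y
        yield (a, (x, b))  # swap a with its parent's sibling x
        yield (b, (a, x))  # swap b with its parent's sibling x
    for x2 in nni_neighbors(x):
        yield (x2, y)
    for y2 in nni_neighbors(y):
        yield (x, y2)

def best_nni_neighbor(species, gene_trees):
    """Naive NNI search: score every neighbor from scratch and return the best."""
    def cost(s):
        return sum(duplications(g, s) for g in gene_trees)
    best = min(nni_neighbors(species), key=cost)
    return best, cost(best)

genes = [(("A", "B"), "C")]
start = (("A", "C"), "B")
print(duplications(genes[0], start))
print(best_nni_neighbor(start, genes))
```

Rescoring every neighbor from scratch, as done here, repeats most of the work from one neighbor to the next; avoiding that kind of redundancy is the sort of cost that faster algorithms for the \(\mathop{\rm NNI}\) search problem aim to remove.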


Keywords: Species Tree; Local Search; Gene Duplication; Gene Tree; Search Problem





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Mukul S. Bansal (1)
  • Oliver Eulenstein (1)
  1. Department of Computer Science, Iowa State University, USA
