Topological Rearrangements and Local Search Method for Tandem Duplication Trees

  • Denis Bertrand
  • Olivier Gascuel
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 3240)


The problem of reconstructing the duplication history of a set of tandemly repeated sequences was first introduced by Fitch (1977). Many recent works address this problem: they show the validity of the unequal recombination model proposed by Fitch, describe numerous inference algorithms, and explore the combinatorial properties of the new mathematical objects involved, duplication trees (DTs). In this paper, we deal with the topological rearrangement of these trees. The classical rearrangements used in phylogeny (NNI, SPR, TBR, ...) cannot be applied directly to DTs. We demonstrate that restricting the neighborhood defined by the SPR (Subtree Pruning and Re-grafting) rearrangement to valid duplication trees allows the whole space of DTs to be explored. We use these restricted rearrangements in a local search method that improves an initial tree via successive rearrangements and optimizes the parsimony criterion. We show through simulations that this method improves on all existing programs, both for reconstructing the initial tree and for recovering its duplication events.
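
As an illustration of the local search idea described above, the following is a minimal sketch rather than the authors' implementation. It scores a rooted binary tree with the small-parsimony count of Fitch (1971) and hill-climbs over a caller-supplied neighborhood. The names `fitch_site`, `parsimony`, `local_search`, and the `neighbors` callback are hypothetical; in the setting of the paper, `neighbors` would have to enumerate only the SPR rearrangements that yield valid duplication trees, which is not implemented here.

```python
from typing import Any, Callable, Iterable, Sequence, Set, Tuple, Union

# A tree is either a leaf (index into the list of sequences)
# or a pair (left_subtree, right_subtree). Assumed encoding, for illustration only.
Tree = Union[int, Tuple[Any, Any]]


def fitch_site(tree: Tree, column: Sequence[str]) -> Tuple[Set[str], int]:
    """Return (candidate state set, minimum number of changes) for one site."""
    if isinstance(tree, int):
        return {column[tree]}, 0
    left_set, left_cost = fitch_site(tree[0], column)
    right_set, right_cost = fitch_site(tree[1], column)
    common = left_set & right_set
    if common:
        # Children agree on at least one state: no extra change needed.
        return common, left_cost + right_cost
    # Children disagree: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


def parsimony(tree: Tree, seqs: Sequence[str]) -> int:
    """Sum the per-site Fitch counts over all aligned columns."""
    return sum(fitch_site(tree, column)[1] for column in zip(*seqs))


def local_search(
    start: Tree,
    seqs: Sequence[str],
    neighbors: Callable[[Tree], Iterable[Tree]],
) -> Tuple[Tree, int]:
    """Hill-climb: move to the first strictly better neighbor until none exists."""
    current, best = start, parsimony(start, seqs)
    improved = True
    while improved:
        improved = False
        for candidate in neighbors(current):
            score = parsimony(candidate, seqs)
            if score < best:
                current, best, improved = candidate, score, True
                break
    return current, best
```

Calling `local_search(start_tree, seqs, neighbors)` returns a locally optimal tree together with its parsimony score. Accepting the first improving neighbor is one possible design choice; selecting the best neighbor at each step is an equally plausible variant.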


Keywords (added by machine, not by the authors): Local Search · Duplication Event · Internal Node · Local Search Method · Clock Tree




References

  1. Alberts, B., Bray, D., Lewis, J., Raff, M., Roberts, K., Watson, J.D.: Molecular biology of the cell, 3rd edn. Garland Publishing Inc., New York (1995)
  2. Barthélemy, J.P., Guénoche, A.: Trees and proximity representations. Wiley and Sons, Chichester (1991)
  3. Benson, G., Dong, L.: Reconstructing the duplication history of a tandem repeat. In: Proceedings of Intelligent Systems in Molecular Biology (ISMB 1999), pp. 44–53. AAAI, Menlo Park (1999)
  4. Elemento, O., Gascuel, O.: A fast and accurate distance-based algorithm to reconstruct tandem duplication trees. Bioinformatics 18, 92–99 (2002), Proceedings of European Conference on Computational Biology (ECCB 2002)
  5. Elemento, O., Gascuel, O.: An exact and polynomial distance-based algorithm to reconstruct single copy tandem duplication trees. In: Proceedings of Combinatorial Pattern Matching. LNCS, pp. 96–108. Springer, Heidelberg (2003)
  6. Elemento, O., Gascuel, O., Lefranc, M.-P.: Reconstructing the duplication history of tandemly repeated genes. Molecular Biology and Evolution 19, 278–288 (2002)
  7. Felsenstein, J.: PHYLIP - PHYLogeny Inference Package. Cladistics 5, 164–166 (1989)
  8. Felsenstein, J., Churchill, G.A.: A hidden Markov model approach to variation among sites in rate of evolution. Molecular Biology and Evolution 13, 93–104 (1996)
  9. Fitch, W.M.: Toward defining the course of evolution: minimum change for a specified tree topology. Systematic Zoology 20, 406–416 (1971)
  10. Fitch, W.M.: Phylogenies constrained by cross-over process as illustrated by human hemoglobins in a thirteen-cycle, eleven amino-acid repeat in human apolipoprotein A-I. Genetics 86, 623–644 (1977)
  11. Ganapathy, G., Ramachandran, V., Warnow, T.: Better hill-climbing searches for parsimony. In: Proceedings of the 3rd International Workshop on Algorithms in Bioinformatics. LNCS, pp. 245–258. Springer, Heidelberg (2003)
  12. Gascuel, O., Bertrand, D., Elemento, O.: Reconstructing the duplication history of tandemly repeated sequences. In: Gascuel, O. (ed.) Mathematics of Evolution and Phylogeny, Oxford University Press, Oxford (2004) (in press)
  13. Gascuel, O., Hendy, M., Jean-Marie, A., McLachlan, S.: The combinatorics of tandem duplication trees. Systematic Biology 52, 110–118 (2003)
  14. Gladstein, D.S.: Efficient incremental character optimization. Cladistics 13, 21–26 (1997)
  15. Goloboff, P.A.: Methods for faster parsimony analysis. Cladistics 12, 199–220 (1996)
  16. Hallett, M., Lagergren, J., Tofigh, A.: Simultaneous Identification of Duplications and Lateral Transfers. In: RECOMB (2004) (in press)
  17. Hartigan, J.A.: Minimum mutation fits to a given tree. Biometrics 29, 53–65 (1973)
  18. Jaitly, D., Kearney, P., Lin, G., Ma, B.: Methods for reconstructing the history of tandem repeats and their application to the human genome. J. of Computer and System Sciences 65, 494–507 (2002)
  19. Jeffreys, A.J., Harris, S.: Processes of gene duplication. Nature 296, 9–10 (1981)
  20. Kuhner, M.K., Felsenstein, J.: A simulation comparison of phylogeny algorithms under equal and unequal evolutionary rates. Molecular Biology and Evolution 11, 459–468 (1994)
  21. Ohno, S.: Evolution by gene duplication. Springer, New York (1970)
  22. Page, R.D.M., Charleston, M.A.: From gene to organismal phylogeny: Reconciled trees and the gene tree/species tree problem. Molecular Phylogenetics and Evolution 7, 231–240 (1997)
  23. Rambaut, A., Grassly, N.C.: Seq-Gen: An application for the Monte Carlo simulation of DNA sequence evolution along phylogenetic trees. Computer Applications in the Biosciences 13, 235–238 (1997)
  24. Saitou, N., Nei, M.: The neighbor-joining method: a new method for reconstructing phylogenetic trees. Molecular Biology and Evolution 4, 406–425 (1987)
  25. Sattath, S., Tversky, A.: Additive similarity trees. Psychometrika 42, 319–345 (1977)
  26. Smith, G.P.: Evolution of repeated DNA sequences by unequal crossover. Science 191, 528–535 (1976)
  27. Sneath, P., Sokal, R.: Numerical Taxonomy, pp. 230–234. W.H. Freeman and Company, New York (1973)
  28. Swofford, D.L.: PAUP*. Phylogenetic Analysis Using Parsimony (*and Other Methods). Version 4. Sinauer Associates, Sunderland, Massachusetts (1999)
  29. Swofford, D.L., Olsen, P.J., Waddell, P.J., Hillis, D.M.: Phylogenetic Inference. In: Swofford, D.L., Olsen, P.J., Waddell, P.J., Hillis, D.M. (eds.) Molecular Systematics, Sinauer Associates, Sunderland, Massachusetts, pp. 407–514 (1996)
  30. Tang, M., Waterman, M.S., Yooseph, S.: Zinc finger gene clusters and tandem gene duplication. Journal of Computational Biology 9, 429–446 (2002)
  31. Yang, Y., Zhang, L.: On counting tandem duplication trees. Molecular Biology and Evolution (2004) (in press)
  32. Zhang, L., Ma, B., Wang, L., Xu, Y.: Greedy method for inferring tandem duplication history. Bioinformatics 19, 1497–1504 (2003)

Copyright information

© Springer-Verlag Berlin Heidelberg 2004

Authors and Affiliations

  • Denis Bertrand (1)
  • Olivier Gascuel (1)
  1. Equipe Méthodes et Algorithmes pour la Bioinformatique, LIRMM-CNRS, Montpellier Cedex 5, France
