Ancestral Maximum Likelihood of Evolutionary Trees Is Hard
Maximum likelihood (ML) (Felsenstein, 1981) is an increasingly popular optimality criterion for selecting evolutionary trees. Finding optimal ML trees appears to be a very hard computational task – in particular, algorithms and heuristics for ML take longer to run than algorithms and heuristics for maximum parsimony (MP). However, while MP has been known to be NP-complete for over 20 years, no such hardness result has been obtained so far for ML.
In this work we make a first step in this direction by proving that ancestral maximum likelihood (AML) is NP-complete. The input to this problem is a set of aligned sequences of equal length and the goal is to find a tree and an assignment of ancestral sequences for all of that tree’s internal vertices such that the likelihood of generating both the ancestral and contemporary sequences is maximized. Our NP-hardness proof follows that for MP given in (Day, Johnson and Sankoff, 1986) in that we use the same reduction from Vertex Cover; however, the proof of correctness for this reduction relative to AML is different and substantially more involved.
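To make the objective concrete, the following is a minimal sketch of how the quantity being maximized could be scored for one candidate tree with fully assigned ancestral sequences. The abstract does not state the substitution model, so the sketch assumes a symmetric two-state model on binary sequences: each edge has its own change probability p in [0, 1/2], and the root sequence is drawn uniformly at random. The function names and the input encoding (an edge list plus a name-to-sequence map) are illustrative assumptions, not taken from the paper.

```python
import math

def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))

def edge_log_likelihood(d, k):
    """Best log-likelihood of one edge with d changes over k sites.

    ASSUMPTION: symmetric two-state model with change probability p
    restricted to [0, 1/2]. The maximizing p is d/k, capped at 1/2.
    """
    p = min(d / k, 0.5)
    if p == 0.0:
        return 0.0  # no changes: probability 1 when p = 0
    return d * math.log(p) + (k - d) * math.log(1.0 - p)

def ancestral_log_likelihood(edges, seqs):
    """Log-likelihood of a tree whose vertices (leaves and internal)
    all carry sequences, with each edge probability chosen optimally.

    edges: iterable of (u, v) vertex-name pairs forming the tree
    seqs:  dict mapping every vertex name to its sequence
    """
    k = len(next(iter(seqs.values())))
    if k == 0:
        raise ValueError("sequences must be non-empty")
    total = k * math.log(0.5)  # ASSUMPTION: uniform root distribution
    for u, v in edges:
        total += edge_log_likelihood(hamming(seqs[u], seqs[v]), k)
    return total

# Hypothetical toy instance: one internal vertex "r" and two leaves.
seqs = {"r": "0000", "a": "0000", "b": "0011"}
edges = [("r", "a"), ("r", "b")]
score = ancestral_log_likelihood(edges, seqs)
```

Under these assumptions, the optimization problem amounts to searching over all tree topologies and all internal-vertex labelings for the pair with the highest such score; the sketch only evaluates a single candidate and says nothing about how to search.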
Keywords: Maximum Parsimony, Evolutionary Tree, Vertex Cover, Internal Vertex, Ancestral Sequence