Estimating Phylogenies with Invariant Functions of Data

Day, William H. E.

doi:10.1007/978-3-642-76307-6_33

Conference paper

Part of the book series: Studies in Classification, Data Analysis, and Knowledge Organization

Abstract

Estimating phylogenies, or evolutionary trees, is a complex task even under the best of circumstances, and it encounters particular difficulties when using molecular data to investigate distantly related species. In recent years researchers have studied how methods to infer phylogenetic relations, such as those based on parsimony, behave for simple models of nucleic acid evolution. The results are not entirely encouraging: HENDY AND PENNY (1989), for example, illustrated simple cases under which parsimony will converge to an incorrect phylogenetic tree, even for equal rates of evolution. What is encouraging, however, is that researchers are beginning to develop methods of estimating phylogenies which may be robust under conditions where parsimony is not. A strategy shared by some of these methods (CAVENDER AND FELSENSTEIN (1987), LAKE (1987a)) is to use invariant functions of the data to identify the correct topology of the corresponding phylogeny. But which invariants, and how? What assumptions underlie these approaches? I discuss these issues and indicate the direction this research seems to be taking.
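
To make the idea of an invariant function concrete, the following is a minimal sketch and not a method taken from this paper. It assumes a symmetric two-state model of character change on an unrooted four-species tree with topology ab|cd; the model, the function names (`flip`, `pattern_probs`, `theta`, `invariants`) and the edge parameters are illustrative assumptions. Under that model the quantity 1 - 2·P(two species differ) multiplies along the path joining the species, so one quadratic combination of these quantities is zero for the true topology whatever the edge probabilities are, while the combinations associated with the other two topologies are generally nonzero.

```python
from itertools import product


def flip(p, x, y):
    """Probability that an edge with change probability p turns state x into y."""
    return p if x != y else 1.0 - p


def pattern_probs(pa, pb, pc, pd, pm):
    """Exact probabilities of the 16 site patterns (a, b, c, d) on the tree ab|cd.

    pa..pd are change probabilities on the four pendant edges and pm on the
    internal edge (all hypothetical parameters of the assumed model).
    """
    probs = {}
    for a, b, c, d in product((0, 1), repeat=4):
        total = 0.0
        # Sum over the unobserved states u, v at the two interior vertices.
        for u, v in product((0, 1), repeat=2):
            total += (0.5 * flip(pa, u, a) * flip(pb, u, b) * flip(pm, u, v)
                      * flip(pc, v, c) * flip(pd, v, d))
        probs[(a, b, c, d)] = total
    return probs


def theta(probs, i, j):
    """Return 1 - 2 * P(species i and j show different states)."""
    differ = sum(p for pattern, p in probs.items() if pattern[i] != pattern[j])
    return 1.0 - 2.0 * differ


def invariants(probs):
    """For each candidate topology, a quadratic function that vanishes if it is true."""
    a, b, c, d = range(4)

    def t(i, j):
        return theta(probs, i, j)

    return {
        "ab|cd": t(a, c) * t(b, d) - t(a, d) * t(b, c),
        "ac|bd": t(a, b) * t(c, d) - t(a, d) * t(b, c),
        "ad|bc": t(a, b) * t(c, d) - t(a, c) * t(b, d),
    }


if __name__ == "__main__":
    # Two long and two short pendant edges, with a short internal edge.
    for topology, value in invariants(pattern_probs(0.4, 0.05, 0.4, 0.05, 0.05)).items():
        print(topology, value)
```

With exact pattern probabilities the "ab|cd" entry is zero up to floating-point rounding and the other two are not. With real data the probabilities would be replaced by observed pattern frequencies, so deciding which function is "close enough" to zero becomes a statistical question that this sketch does not address.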

Bibliography

Cavender, J. A. (1989), “Mechanized Derivation of Linear Invariants,” Molecular Biology and Evolution, 6, 301–316. [Using assumptions no stronger than those of LAKE (1987a), the author calculates all linear invariants for rooted phylogenies with four species.]
Cavender, J. A. (1990), “Necessary Conditions for the Method of Inferring Phylogeny by Linear Invariants,” Mathematical Biosciences, submitted. [The sufficient conditions of CAVENDER (1989) for deriving linear invariants are also necessary.]
Cavender, J. A., AND Felsenstein, J. (1987), “Invariants of Phylogenies in a Simple Case with Discrete States,” Journal of Classification, 4, 57–71. [The authors develop quadratic invariants (K- and L-invariants) for two-state character data involving four species.]
Drolet, S., AND Sankoff, D. (1990), “Quadratic Tree Invariants for Multivalued Characters,” Journal of Theoretical Biology, 144, 117–129. [The authors generalize the work of CAVENDER AND FELSENSTEIN (1987) to obtain quadratic invariants for character data involving four species and having more than two states.]
Felsenstein, J. (1978), “Cases in which Parsimony or Compatibility Methods will be Positively Misleading,” Systematic Zoology, 27, 401–410. [The author examines conditions under which methods of phylogenetic inference will fail to converge to a correct phylogeny as more and more data are accumulated.]
Felsenstein, J. (1982), “Numerical Methods for Inferring Evolutionary Trees,” Quarterly Review of Biology, 57, 379–404. [The author surveys methods of inferring phylogenies from character or distance data.]
Felsenstein, J. (1988), “Phylogenies from Molecular Sequences: Inference and Reliability,” Annual Review of Genetics, 22, 521–565. [The author surveys methods of inferring and evaluating phylogenies from sequence data.]
Felsenstein, J. (1990), “Counting Phylogenetic Invariants,” manuscript. [The author counts the invariants that exist in cases involving four-state characters, four species, and different models of nucleotide substitution.]
Hendy, M. D., AND Penny, D. (1989), “A Framework for the Quantitative Study of Evolutionary Trees,” Systematic Zoology, 38, 297–309. [The authors extend the work of FELSENSTEIN (1978) by finding new conditions under which parsimony methods will fail to converge to a correct phylogeny as more and more data are accumulated.]
Lake, J. A. (1987a), “A Rate-independent Technique for Analysis of Nucleic Acid Sequences: Evolutionary Parsimony,” Molecular Biology and Evolution, 4, 167–191. [The author develops linear invariants for four-state character data involving four species.]
Lake, J. A. (1987b), “Origin of the Eukaryotic Nucleus Determined by Rate-invariant Analysis of rRNA Sequences,” Nature, 331, 184–186. [The author applies the method of evolutionary parsimony (LAKE 1987a) to propose a new parkaryotic-karyotic classification.]
Lake, J. A. (1990), “Comparative Simulations of Evolutionary Parsimony and Augmented Distance Matrix Phylogenetic Reconstruction Algorithms,” manuscript. [The author concludes that, in general, evolutionary parsimony (LAKE 1987a) is a more robust algorithm than those for maximum parsimony or the augmented distance method of Kimura.]
Pearl, J., AND Tarsi, M. (1986), “Structuring Causal Trees,” Journal of Complexity, 2, 60–77. [The problem is to infer treelike models of complex phenomena where the leaves represent observable random binary variables, and the interior vertices represent hidden causes which explain interleaf dependencies. The authors derive a relationship on which the invariants of CAVENDER AND FELSENSTEIN (1987) are based.]
Pearl, J. (1986), “Fusion, Propagation, and Structuring in Belief Networks,” Artificial Intelligence, 29, 241–288. [Section 3 of this paper, entitled “Structuring Causal Trees,” includes most of the material found in PEARL AND TARSI (1986).]
Sankoff, D. (1990), “Designer Invariants for Large Phylogenies,” Molecular Biology and Evolution, to appear. [For two-state character data, the author develops quadratic invariants for phylogenies of five species, or for individual edges in phylogenies of any larger size.]
Sidow, A., AND Wilson, A. C. (1989), “Compositional Parsimony in the Statistical Testing of DNA Trees,” Second International Symposium on Macromolecules, Genes, and Computers, Waterville Valley, NH, USA, August 1989. [The authors extend the method of evolutionary parsimony (LAKE 1987a) to account for heterogeneity in the compositions of bases in DNA sequences.]

Author information

Authors and Affiliations

Computer Science, Memorial Univ. Newfoundland, St. John’s, NL, A1C 5S7, Canada
William H. E. Day

Editor information

Editors and Affiliations

Institut für Statistik und Wirtschaftsmathematik, Rheinisch-Westfälische Technische Hochschule Aachen, Wüllnerstraße 3, D-5100, Aachen, Germany
Hans-Hermann Bock
Institut für Medizinische Biometrie, Philipps-Universität Marburg, Bunsenstraße 3, D-3500, Marburg, Germany
Peter Ihm

Cite this paper

Day, W.H.E. (1991). Estimating Phylogenies with Invariant Functions of Data. In: Bock, H.-H., Ihm, P. (eds) Classification, Data Analysis, and Knowledge Organization. Studies in Classification, Data Analysis, and Knowledge Organization. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-76307-6_33

DOI: https://doi.org/10.1007/978-3-642-76307-6_33
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-53483-9
Online ISBN: 978-3-642-76307-6
eBook Packages: Springer Book Archive