Biological Procedures Online, Volume 10, Issue 1, pp 66–73

Characterizing gene family evolution

Open Access
Article

Abstract

Gene families are widely used in comparative genomics, molecular evolution, and systematics. However, they are constructed in different ways, and their data are analyzed and interpreted differently under different assumptions, which sometimes leads to divergent conclusions. In systematics, concepts such as monophyly and the dichotomy between homoplasy and homology have been central to the analysis of phylogenies. We critique the traditional use of these concepts as applied to gene families and give examples of the incorrect inferences to which they may lead. We contrast the operational definitions that have emerged within functional genomics with the common formal definitions derived from systematics. Lastly, we question the utility of layers of homology and the meaning of homology at the character-state level in the context of sequence evolution. Building on this critique, we present an idealized strategy for characterizing gene family evolution for both systematic and functional purposes, incorporating recent methodological improvements.

Indexing terms

Genomics; Evolution, molecular; Phylogeny; Sequence homology

Copyright information

© Springer 2008

Authors and Affiliations

  1. Department of Molecular Biology, University of Wyoming, Laramie, USA