Abstract
Phylogenetic profiling involves the comparison of phylogenetic data across gene families. It is possible to construct phylogenetic trees, or related data structures, for specific gene families using a wide variety of tools and approaches. These data are then compared to determine which families have correlated or coupled evolution. The underlying assumption is that in certain cases these couplings allow us to infer that the two families are functionally related, that is, that their functions in the cell are coupled. Although this technique can be applied to noncoding genes, it is more commonly used to assess the function of protein-coding genes. Examples of functionally related proteins include subunits of protein complexes and enzymes that perform consecutive steps along biochemical pathways. We hypothesize that the deletion of one of the families from a genome would then indirectly affect the function of the other. Dozens of different implementations of the phylogenetic profiling technique have been developed over the past decade. These range from the first simple approaches, which describe phylogenetic profiles as binary vectors, to the most complex ones, which attempt to model the coevolution of protein families on a phylogenetic tree. We discuss a set of these implementations and present the software and databases that are available to perform phylogenetic profiling.
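As a rough illustration of the simplest variant mentioned above, the sketch below encodes each gene family as a binary presence/absence vector across genomes and scores pairs of families for coupling. The family names, the eight-genome profiles, and the function names are hypothetical. The two scores, Hamming distance and a hypergeometric tail probability, are chosen here only as examples of how binary profiles might be compared, and are not necessarily the measures used in the chapter.

```python
from itertools import combinations
from math import comb

# Hypothetical data: one entry per genome, 1 = the family has a homolog there.
profiles = {
    "famA": [1, 1, 0, 1, 0, 0, 1, 1],
    "famB": [1, 1, 0, 1, 0, 0, 1, 0],
    "famC": [0, 0, 1, 0, 1, 1, 0, 1],
}

def hamming(p, q):
    """Number of genomes in which the two profiles disagree."""
    return sum(a != b for a, b in zip(p, q))

def hypergeom_pvalue(p, q):
    """Probability of at least the observed co-occurrence count,
    assuming the two families are distributed independently across genomes."""
    n = len(p)                              # genomes considered
    x, y = sum(p), sum(q)                   # genomes containing each family
    k = sum(a and b for a, b in zip(p, q))  # genomes containing both
    tail = sum(comb(x, i) * comb(n - x, y - i)
               for i in range(k, min(x, y) + 1))
    return tail / comb(n, y)

def rank_pairs(profiles):
    """All family pairs, most strongly coupled (smallest p-value) first."""
    scored = [
        (hypergeom_pvalue(p, q), hamming(p, q), f, g)
        for (f, p), (g, q) in combinations(profiles.items(), 2)
    ]
    return sorted(scored)

if __name__ == "__main__":
    for pval, dist, f, g in rank_pairs(profiles):
        print(f"{f}-{g}: p={pval:.3f}, Hamming={dist}")
```

With real data, the number of genomes would be far larger and the p-values would need correcting for the number of family pairs tested. Tree-aware methods replace the independence assumption across genomes with an explicit model of gain and loss along the phylogeny.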
Acknowledgments
The author wishes to acknowledge the UCLA-DOE Institute for Genomics and Proteomics for support.
Copyright information
© 2012 Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Pellegrini, M. (2012). Using Phylogenetic Profiles to Predict Functional Relationships. In: van Helden, J., Toussaint, A., Thieffry, D. (eds) Bacterial Molecular Networks. Methods in Molecular Biology, vol 804. Springer, New York, NY. https://doi.org/10.1007/978-1-61779-361-5_9
DOI: https://doi.org/10.1007/978-1-61779-361-5_9
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-61779-360-8
Online ISBN: 978-1-61779-361-5
eBook Packages: Springer Protocols