Improving Multi-Relief for Detecting Specificity Residues from Multiple Sequence Alignments

Marchiori, Elena

doi:10.1007/978-3-642-12211-8_14


Conference paper


Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 6023)

Abstract

A challenging problem in bioinformatics is the detection of residues that account for protein function specificity, both to gain deeper insight into the nature of functional specificity and to guide protein engineering experiments aimed at switching the specificity of an enzyme, regulator or transporter. The majority of state-of-the-art algorithms for this task use multiple sequence alignments (MSAs) to identify residue positions that are conserved within protein subfamilies and divergent between them. In this study, we focus on a recent method based on this approach, called multi-RELIEF. We analyze and modify the two core parts of the method in order to improve its predictive performance. A parametric generalization of the popular RELIEF machine learning algorithm for weighting residues is introduced and incorporated in multi-RELIEF. The ensemble criterion of multi-RELIEF for merging the weights of multiple runs is simplified. Finally, the method used by multi-RELIEF for exploiting tertiary structure information is modified by incorporating prior information describing the confidence of the original scores assigned to residues. Extensive computational experiments on six real-life datasets show improvement of both robustness and detection capability of the new multi-RELIEF over the original method.
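
To make the idea of weighting residues concrete, the following is a minimal sketch of the basic, textbook RELIEF update applied to alignment columns for two subfamilies. It is not the parametric generalization introduced in the paper, nor the multi-RELIEF ensemble over multiple runs, neither of which is specified in this abstract. The function names (`hamming`, `relief_weights`), the 0/1 mismatch score, the choice to visit every sequence once instead of sampling, and the toy alignment are all assumptions made for illustration.

```python
def hamming(a, b):
    """Number of alignment positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def relief_weights(seqs, labels):
    """Basic RELIEF over alignment columns (illustrative sketch only).

    seqs:   equal-length aligned sequences
    labels: subfamily label of each sequence
    Returns one weight per column, in [-1, 1].
    """
    n_cols = len(seqs[0])
    weights = [0.0] * n_cols
    for i, (seq, lab) in enumerate(zip(seqs, labels)):
        others = [k for k in range(len(seqs)) if k != i]
        hits = [k for k in others if labels[k] == lab]
        misses = [k for k in others if labels[k] != lab]
        if not hits or not misses:
            continue  # no neighbour available in one of the two groups
        # Nearest sequence from the same subfamily (hit) and from another (miss).
        hit = seqs[min(hits, key=lambda k: hamming(seq, seqs[k]))]
        miss = seqs[min(misses, key=lambda k: hamming(seq, seqs[k]))]
        for j in range(n_cols):
            # Reward columns that differ from the miss, penalise those that
            # differ from the hit (assumed 0/1 mismatch score).
            weights[j] += (seq[j] != miss[j]) - (seq[j] != hit[j])
    return [w / len(seqs) for w in weights]


# Hypothetical toy alignment: column 0 separates the subfamilies,
# column 2 varies within each subfamily, columns 1 and 3 are fully conserved.
toy_seqs = ["ACDK", "ACEK", "GCDK", "GCEK"]
toy_labels = ["A", "A", "B", "B"]
print(relief_weights(toy_seqs, toy_labels))  # -> [1.0, 0.0, -1.0, 0.0]
```

In this toy case the column that is conserved within each subfamily and divergent between them receives the highest weight, the fully conserved columns score zero, and the column that varies within subfamilies is penalised.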

Author information

Authors and Affiliations

Radboud University Nijmegen, The Netherlands
Elena Marchiori


Editor information

Editors and Affiliations

Institute for High-Performance Computing and Networking (ICAR), Italian National Research Council (CNR), Via P. Bucci 41C, 87036, Rende, (CS), Italy
Clara Pizzuti
Department of Molecular Physiology and Biophysics, Vanderbilt University, Center for Human Genetics Research, 519 Light Hall, 37232, Nashville, TN, USA
Marylyn D. Ritchie
Department of Animal Production Epidemiology and Ecology, University of Torino, Molecular Biotechnology Center, Via Leonardo da Vinci 44, 10095, Grugliasco, (TO), Italy
Mario Giacobini


About this paper

Cite this paper

Marchiori, E. (2010). Improving Multi-Relief for Detecting Specificity Residues from Multiple Sequence Alignments. In: Pizzuti, C., Ritchie, M.D., Giacobini, M. (eds) Evolutionary Computation, Machine Learning and Data Mining in Bioinformatics. EvoBIO 2010. Lecture Notes in Computer Science, vol 6023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12211-8_14


DOI: https://doi.org/10.1007/978-3-642-12211-8_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-12210-1
Online ISBN: 978-3-642-12211-8
eBook Packages: Computer Science, Computer Science (R0)
