OPTIMA: A New Score Function for the Detection of Remote Homologs

Kann, Maricel; Goldstein, Richard A.

doi:10.1007/978-3-540-44827-3_5

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 2666)

Abstract

A new method has been developed for deriving a score function that detects remote relationships between protein sequences. The new score function, OPTIMA, was obtained by maximizing a function of merit that measures success in recognizing the homologs of a newly sequenced protein among the thousands of non-homologous sequences in the databases. We find that the score function obtained in this manner performs better than standard score functions at identifying distant homologies.
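
As a rough illustration of what a "function of merit" for homolog recognition might look like, the sketch below scores how well a set of alignment scores separates known homologs from non-homologs. The abstract does not specify the merit function used for OPTIMA, so the pairwise ranking measure here (equivalent to the area under a ROC curve), the name `ranking_merit`, and all numeric scores are assumptions made purely for illustration.

```python
from typing import Sequence


def ranking_merit(homolog_scores: Sequence[float],
                  nonhomolog_scores: Sequence[float]) -> float:
    """Fraction of (homolog, non-homolog) pairs in which the homolog scores higher.

    Ties count as half a win. This equals the area under the ROC curve and is
    only a stand-in for the chapter's function of merit (an assumption).
    """
    if not homolog_scores or not nonhomolog_scores:
        raise ValueError("both score lists must be non-empty")
    wins = 0.0
    for h in homolog_scores:
        for n in nonhomolog_scores:
            if h > n:
                wins += 1.0
            elif h == n:
                wins += 0.5
    return wins / (len(homolog_scores) * len(nonhomolog_scores))


# Hypothetical alignment scores for one query under two candidate score
# functions; the numbers are invented for illustration only.
baseline = ranking_merit([52.0, 31.0, 24.0], [30.0, 28.0, 25.0, 20.0])
candidate = ranking_merit([55.0, 36.0, 29.0], [30.0, 27.0, 25.0, 21.0])
```

Under a measure of this kind, a score function would be preferred when it ranks more true homologs above the non-homologous background. An optimization procedure could then adjust the score function's parameters to increase the merit over a training set.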

Author information

Authors and Affiliations

Department of Chemistry, University of Michigan, Ann Arbor, MI, 48109-1055, USA
Maricel Kann & Richard A. Goldstein

Editor information

Editors and Affiliations

Topic Chairs,
Concettina Guerra
Center for Molecular Biology and Computer Science Department, Brown University, 115 Waterman St., Providence, RI 02912, USA
Sorin Istrail

About this paper

Cite this paper

Kann, M., Goldstein, R.A. (2003). OPTIMA: A New Score Function for the Detection of Remote Homologs. In: Guerra, C., Istrail, S. (eds) Mathematical Methods for Protein Structure Analysis and Design. Lecture Notes in Computer Science, vol 2666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44827-3_5

DOI: https://doi.org/10.1007/978-3-540-44827-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40104-9
Online ISBN: 978-3-540-44827-3
eBook Packages: Springer Book Archive