Abstract
We have developed a new method for deriving a score function that detects remote relationships between protein sequences. The new score function, OPTIMA, was obtained by maximizing a function of merit that measures success in recognizing homologs of a newly sequenced protein among thousands of non-homologous sequences in the databases. We find that the score function obtained in this manner performs better than standard score functions at identifying distant homologs.
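The abstract does not specify the function of merit or the optimizer, so the following is only a minimal Python sketch of the general idea, not the authors' procedure. It assumes a toy four-letter alphabet, ungapped scoring, a rank-based merit (the fraction of homolog/decoy pairs in which the homolog scores higher), and a greedy coordinate ascent over off-diagonal substitution scores. All names (`score`, `merit`, `optimize`) and all sequences are hypothetical.

```python
from itertools import combinations

ALPHABET = "ACDE"  # toy alphabet (assumption); real proteins use 20 amino acids


def score(a, b, matrix):
    """Ungapped score of two equal-length sequences under a substitution matrix."""
    return sum(matrix[x, y] for x, y in zip(a, b))


def merit(matrix, query, homologs, decoys):
    """Fraction of (homolog, decoy) pairs in which the homolog outscores the decoy."""
    wins = sum(
        score(query, h, matrix) > score(query, d, matrix)
        for h in homologs
        for d in decoys
    )
    return wins / (len(homologs) * len(decoys))


def optimize(matrix, query, homologs, decoys, max_sweeps=10):
    """Greedy coordinate ascent over off-diagonal entries.

    A symmetric +1 or -1 change is kept only if the merit strictly rises.
    """
    best = dict(matrix)  # work on a copy; the input matrix is left untouched
    best_merit = merit(best, query, homologs, decoys)
    for _ in range(max_sweeps):
        improved = False
        for x, y in combinations(ALPHABET, 2):
            for delta in (1, -1):
                trial = dict(best)
                trial[x, y] = trial[y, x] = best[x, y] + delta
                m = merit(trial, query, homologs, decoys)
                if m > best_merit:
                    best, best_merit, improved = trial, m, True
                    break
        if not improved:
            break
    return best, best_merit


# Starting point: identity matrix (match = 1, mismatch = 0).
identity = {(x, y): int(x == y) for x in ALPHABET for y in ALPHABET}

query = "AACCDDEE"
homologs = ["CCAADDEE"]  # A<->C swaps stand in for conservative substitutions
decoys = ["AACCEEDD"]    # ties with the homolog under the identity matrix

start_merit = merit(identity, query, homologs, decoys)
tuned, tuned_merit = optimize(identity, query, homologs, decoys)
print(start_merit, tuned_merit)
```

With this toy data the identity matrix cannot separate the homolog from the decoy, while the tuned matrix rewards the A/C substitution and ranks the homolog first. A realistic setting would need gapped alignments, many queries, and a smoother merit function, none of which are modeled here.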
© 2003 Springer-Verlag Berlin Heidelberg
Cite this paper
Kann, M., Goldstein, R.A. (2003). OPTIMA: A New Score Function for the Detection of Remote Homologs. In: Guerra, C., Istrail, S. (eds) Mathematical Methods for Protein Structure Analysis and Design. Lecture Notes in Computer Science, vol 2666. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-44827-3_5
DOI: https://doi.org/10.1007/978-3-540-44827-3_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-40104-9
Online ISBN: 978-3-540-44827-3
eBook Packages: Springer Book Archive