Abstract
Protein secondary structure prediction is a major task in bioinformatics, and various pattern recognition and machine learning methods have been applied to it. Predicting β-sheet structures is particularly challenging because they span several discontinuous regions of an amino acid sequence. In this paper, we propose a dynamic programming algorithm for a class of antiparallel β-sheets; the approach can be extended to more general classes of β-sheets. Experimental results on real data show that our prediction algorithm achieves good accuracy. We also show a relation between the proposed algorithm and a grammar-based method. Furthermore, we prove that prediction of planar β-sheet structures is NP-hard.
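The abstract does not spell out the recurrence, so the following is only a minimal sketch of how dynamic programming can score one very simple antiparallel arrangement: a two-strand ladder in which residue i pairs with residue j, i+1 with j-1, and so on. It is not the algorithm proposed in the paper. The names `pair_score`, `best_hairpin`, `BETA_FAVORING`, and `min_loop`, as well as the toy scoring rule, are assumptions made for illustration.

```python
# Toy assumption: residues treated as beta-favoring for scoring purposes.
# This set and the +1/-1 rule are illustrative, not taken from the paper.
BETA_FAVORING = set("VIYFWT")


def pair_score(a, b):
    """Hypothetical contact score for two residues facing each other."""
    return 1 if a in BETA_FAVORING and b in BETA_FAVORING else -1


def best_hairpin(seq, min_loop=2):
    """Return (score, i, j, n_pairs) for the best antiparallel ladder.

    The ladder consists of pairs (i, j), (i+1, j-1), ... (0-based),
    with at least `min_loop` unpaired residues inside the innermost pair.
    Returns (0, -1, -1, 0) if no ladder has a positive score.
    """
    n = len(seq)
    # ladder[(i, j)] = (score, n_pairs) of the best ladder whose
    # outermost pair is (i, j).
    ladder = {}
    best = (0, -1, -1, 0)
    # Process shorter spans first so inner ladders are already solved.
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score, pairs = pair_score(seq[i], seq[j]), 1
            inner = ladder.get((i + 1, j - 1))
            # Extend inward only when the inner ladder helps the score.
            if inner is not None and inner[0] > 0:
                score += inner[0]
                pairs += inner[1]
            ladder[(i, j)] = (score, pairs)
            if score > best[0]:
                best = (score, i, j, pairs)
    return best


if __name__ == "__main__":
    print(best_hairpin("VTVGGVTV"))
```

The table is filled in order of increasing span, so each entry reuses the already-computed entry for the pair just inside it. This takes quadratic time in the sequence length. A sheet with more than two strands, or with strands that are not adjacent in the sequence, needs a richer state than this sketch carries.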
Copyright information
© 2008 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kato, Y., Akutsu, T., Seki, H. (2008). Prediction of Protein Beta-Sheets: Dynamic Programming versus Grammatical Approach. In: Chetty, M., Ngom, A., Ahmad, S. (eds) Pattern Recognition in Bioinformatics. PRIB 2008. Lecture Notes in Computer Science, vol 5265. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88436-1_6
DOI: https://doi.org/10.1007/978-3-540-88436-1_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-88434-7
Online ISBN: 978-3-540-88436-1
eBook Packages: Computer Science, Computer Science (R0)