Towards 3D Modeling of Interacting TM Helix Pairs Based on Classification of Helix Pair Sequence

  • Witold Dyrka
  • Jean-Christophe Nebel
  • Malgorzata Kotulska
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6282)


Spatial structures of transmembrane proteins are difficult to obtain, whether experimentally or by computational methods. Recognizing the conformations of helix-helix contacts, which provide the structural skeleton of many transmembrane proteins, is essential for modeling them. The majority of helix-helix interactions in transmembrane proteins can be accurately clustered into a few classes on the basis of their 3D shape. We propose a Stochastic Context-Free Grammar framework, combined with an evolutionary algorithm, to represent sequence-level features of these classes. The descriptors were tested on independent test sets and typically achieved areas under the ROC curve of 0.60-0.70; some reached 0.77.
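As a rough illustration of the two ingredients named above, the following sketch scores sequences with a stochastic context-free grammar using the inside algorithm and then summarizes class separation as the area under the ROC curve. It is a minimal sketch under stated assumptions, not the authors' implementation: the grammar, its probabilities, the two-letter reduced alphabet (`s` for small residues, `b` for bulky ones), and the function names `inside_probability` and `roc_auc` are all hypothetical, and the evolutionary search for grammar rules is not shown.

```python
from collections import defaultdict

# Hypothetical toy grammar in Chomsky normal form (not taken from the paper).
# Rule probabilities for the single nonterminal S sum to 1.
BINARY = {("S", "S", "S"): 0.4}                # S -> S S
LEXICAL = {("S", "s"): 0.5, ("S", "b"): 0.1}   # S -> 's' | 'b'


def inside_probability(seq, binary=BINARY, lexical=LEXICAL, start="S"):
    """Probability that `start` derives `seq`, summed over all parse trees."""
    n = len(seq)
    if n == 0:
        return 0.0
    # chart[i][j][A] = probability that nonterminal A derives seq[i..j]
    chart = [[defaultdict(float) for _ in range(n)] for _ in range(n)]
    for i, symbol in enumerate(seq):
        for (lhs, terminal), p in lexical.items():
            if terminal == symbol:
                chart[i][i][lhs] += p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span - 1
            for k in range(i, j):              # split point between the two children
                left, right = chart[i][k], chart[k + 1][j]
                for (lhs, b, c), p in binary.items():
                    if b in left and c in right:
                        chart[i][j][lhs] += p * left[b] * right[c]
    return chart[0][n - 1].get(start, 0.0)


def roc_auc(pos_scores, neg_scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair (Mann-Whitney formulation).
    """
    wins = 0.0
    for p in pos_scores:
        for q in neg_scores:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos_scores) * len(neg_scores))


if __name__ == "__main__":
    # Made-up sequences, for illustration only.
    positives = ["ss", "sss"]
    negatives = ["bb", "bsb"]
    pos = [inside_probability(x) for x in positives]
    neg = [inside_probability(x) for x in negatives]
    print("AUC on toy data:", roc_auc(pos, neg))
```

In practice the raw inside probability shrinks with sequence length, so a real classifier would normally compare it against a null model or use fixed-length inputs; the sketch omits that step for brevity.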


stochastic context-free grammar, evolutionary algorithm, helix-helix interaction, transmembrane protein



Copyright information

© Springer-Verlag Berlin Heidelberg 2010

Authors and Affiliations

  • Witold Dyrka (1)
  • Jean-Christophe Nebel (2)
  • Malgorzata Kotulska (1)
  1. Institute of Biomedical Engineering and Instrumentation, Wroclaw University of Technology, Wroclaw, Poland
  2. Faculty of Computing, Information Systems and Mathematics, Kingston University, Kingston-upon-Thames, United Kingdom
