Rapid and Accurate Protein Side Chain Prediction with Local Backbone Information

  • Jing Zhang
  • Xin Gao
  • Jinbo Xu
  • Ming Li
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4955)


High-accuracy protein structure modeling demands side chain prediction that is both accurate and very fast, since the procedure must be called repeatedly at each step of structure refinement. Many known side chain prediction programs, such as SCWRL and TreePack, rest on the premise that global information and a pairwise energy function must be used to achieve high accuracy. These programs are too slow for settings in which side chain packing has to be performed thousands of times, such as protein structure refinement and protein design.

We present the unexpected finding that local backbone information can determine side chain conformations accurately. LocalPack, our side chain packing program based only on local information, achieves accuracy equal to that of SCWRL and TreePack yet runs 4-14 times faster, providing a key missing piece in our efforts toward high-accuracy protein structure modeling.


Keywords: side chain prediction, local backbone features, multiclass Support Vector Machines





Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Jing Zhang (1, 2)
  • Xin Gao (1)
  • Jinbo Xu (3)
  • Ming Li (1)

  1. David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Canada
  2. The Institute for Theoretical Computer Science, Department of Computer Science and Technology, Tsinghua University, Beijing, China
  3. Toyota Technological Institute at Chicago, Chicago, USA
