A Probabilistic Graphical Model for Ab Initio Folding

  • Feng Zhao
  • Jian Peng
  • Joe DeBartolo
  • Karl F. Freed
  • Tobin R. Sosnick
  • Jinbo Xu
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5541)


Despite significant progress in recent years, ab initio folding remains one of the most challenging problems in structural biology. This paper presents a probabilistic graphical model for ab initio folding that employs Conditional Random Fields (CRFs) and directional statistics to model the relationship between the primary sequence of a protein and its three-dimensional structure. Unlike the widely used fragment assembly method and the lattice model for protein folding, our graphical model can explore protein conformations in a continuous space according to their probability. The probability of a protein conformation reflects its stability and is estimated from the PSI-BLAST sequence profile and the predicted secondary structure. Experimental results indicate that this new method compares favorably with the fragment assembly method and the lattice model.
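
To make the idea of exploring conformations in a continuous space more concrete, the following is a minimal sketch and not the method of the paper. It assumes a Cα-only chain with a fixed 3.8 Å distance between consecutive atoms, and it draws each pseudo bond angle (theta) and pseudo dihedral angle (tau) independently from a von Mises distribution. The abstract does not say which directional distribution is used or how the CRF conditions it on the sequence, so the von Mises choice, the helix-like mean angles, the concentration `kappa`, and all function names here are illustrative assumptions.

```python
import math
import random

CA_DIST = 3.8  # assumed fixed C-alpha to C-alpha distance, in angstroms


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def _unit(a):
    n = math.sqrt(a[0] ** 2 + a[1] ** 2 + a[2] ** 2)
    return (a[0] / n, a[1] / n, a[2] / n)


def place_next(a, b, c, theta, tau):
    """Place a new atom d after c, at distance CA_DIST from c, with
    bond angle (b, c, d) equal to theta and dihedral set by tau."""
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))   # normal to the plane of a, b, c
    m = _cross(n, bc)                   # completes the local orthonormal frame
    x = -CA_DIST * math.cos(theta)
    y = CA_DIST * math.sin(theta) * math.cos(tau)
    z = CA_DIST * math.sin(theta) * math.sin(tau)
    return tuple(c[i] + x * bc[i] + y * m[i] + z * n[i] for i in range(3))


def sample_angles(rng, mu_theta, mu_tau, kappa):
    """Draw one (theta, tau) pair; independent von Mises draws are a
    stand-in for whatever directional distribution a real model uses."""
    theta = rng.vonmisesvariate(mu_theta, kappa)
    # Keep theta away from 0 and pi so three consecutive atoms are never collinear.
    theta = min(max(theta, 0.2), math.pi - 0.2)
    tau = rng.vonmisesvariate(mu_tau, kappa)
    return theta, tau


def sample_trace(n_residues, mu_theta=math.radians(91.0),
                 mu_tau=math.radians(50.0), kappa=50.0, seed=0):
    """Sample a C-alpha trace with angles drawn from a continuous distribution.
    The default means are illustrative helix-like values, not fitted parameters."""
    rng = random.Random(seed)
    theta0, _ = sample_angles(rng, mu_theta, mu_tau, kappa)
    # The first three atoms lie in the xy-plane and fix the starting frame.
    trace = [
        (0.0, 0.0, 0.0),
        (CA_DIST, 0.0, 0.0),
        (CA_DIST - CA_DIST * math.cos(theta0), CA_DIST * math.sin(theta0), 0.0),
    ]
    while len(trace) < n_residues:
        theta, tau = sample_angles(rng, mu_theta, mu_tau, kappa)
        trace.append(place_next(trace[-3], trace[-2], trace[-1], theta, tau))
    return trace[:n_residues]
```

Calling `sample_trace(12, seed=1)` returns twelve 3D points. In a sequence-dependent model the means and concentration would vary per position rather than being fixed as they are in this sketch.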


Keywords: protein structure prediction; ab initio folding; conditional random fields (CRFs); directional statistics; fragment assembly; lattice model





Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Feng Zhao (1)
  • Jian Peng (1)
  • Joe DeBartolo (2)
  • Karl F. Freed (3)
  • Tobin R. Sosnick (2)
  • Jinbo Xu (1)

  1. Toyota Technological Institute at Chicago, Chicago, USA
  2. Department of Biochemistry and Molecular Biology, the University of Chicago, Chicago, USA
  3. Department of Chemistry, the University of Chicago, Chicago, USA
