Journal of Computer-Aided Molecular Design, Volume 26, Issue 4, pp 387–396

A collaborative environment for developing and validating predictive tools for protein biophysical characteristics

  • Michael A. Johnston
  • Damien Farrell
  • Jens Erik Nielsen


The exchange of information between experimentalists and theoreticians is crucial to improving the predictive ability of theoretical methods and hence our understanding of the related biology. However, many barriers prevent the flow of information between the two disciplines. Enabling effective collaboration requires that experimentalists can easily apply computational tools to their data and share their data with theoreticians, and that both the experimental data and the computational results are accessible to the wider community. We present a prototype collaborative environment for developing and validating predictive tools for protein biophysical characteristics. The environment is built on two central components: a new Python-based integration module, which allows theoreticians to provide and manage remote access to their programs, and PEATDB, a program for storing and sharing experimental data from protein biophysical characterisation studies. We demonstrate our approach by integrating PEATSA, a web-based service for predicting changes in protein biophysical characteristics, into PEATDB. Furthermore, we illustrate how the resulting environment aids method development using the Potapov dataset of experimentally measured ΔΔG_fold values, previously employed to validate and train protein stability prediction algorithms.
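As a rough illustration of what validating a stability predictor against experimentally measured ΔΔG values can involve, the sketch below compares predicted and measured values using the Pearson correlation coefficient and the root-mean-square error. The choice of these two metrics, the function names, and the numbers are assumptions made for illustration only; the values are placeholders, not taken from the Potapov dataset, and the code is not part of PEATDB or PEATSA.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


def rmse(predicted, measured):
    """Root-mean-square error between predicted and measured values."""
    n = len(predicted)
    if n != len(measured) or n == 0:
        raise ValueError("need two non-empty sequences of equal length")
    return math.sqrt(sum((p - m) ** 2 for p, m in zip(predicted, measured)) / n)


# Hypothetical placeholder values (kcal/mol), for illustration only.
measured_ddg = [1.0, 2.0, 3.0, 4.0]
predicted_ddg = [1.5, 2.5, 3.5, 4.5]

print("Pearson r:", pearson_r(measured_ddg, predicted_ddg))
print("RMSE:", rmse(predicted_ddg, measured_ddg))
```

A predictor can correlate well with experiment while still carrying a systematic offset, which is why the sketch reports an error measure alongside the correlation.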


Protein stability; Prediction; Protein design; Data analysis; Data integration; Molecular modelling



Funding: Science Foundation Ireland (SFI) President of Ireland Young Researcher award (Grant 04/YI1/M537 to J.E.N.). SFI Research Frontiers award (Grant 08/RFP/BIC1140 to J.E.N.).

Supplementary material

Supplementary material 1 (MP4 12257 kb)



Copyright information

© Springer Science+Business Media B.V. 2012

Authors and Affiliations

  • Michael A. Johnston (1), corresponding author
  • Damien Farrell (1)
  • Jens Erik Nielsen (1)

  1. School of Biomolecular and Biomedical Science, Centre for Synthesis and Chemical Biology, UCD Conway Institute, University College Dublin, Dublin 4, Ireland
