Archiving of Integrative Structural Models

  • Helen M. Berman
  • Jill Trewhella
  • Brinda Vallat
  • John D. Westbrook
Part of the Advances in Experimental Medicine and Biology book series (AEMB, volume 1105)


Integrative or hybrid structural biology involves the determination of three-dimensional structures of macromolecular assemblies by combining information from a variety of experimental and computational methods. Archiving the results of integrative/hybrid modeling has complex requirements, and existing archiving mechanisms are insufficient to meet them. Three concepts important for archiving integrative/hybrid models are presented in this chapter: (1) building a federated network of structural model and experimental data archives, (2) developing a common set of data standards, and (3) creating mechanisms for interoperation and data exchange among the repositories in the federation. Methods proposed for achieving these objectives are also discussed.


Protein Data Bank; Integrative/hybrid modeling methods; PDBx/mmCIF; Data standards; Data exchange; Structural biology federation



This work was supported by NSF EAGER grant DBI-1519158. We thank our collaborators Andrej Sali and Benjamin Webb for their contributions towards the development of the I/H methods data dictionary and the PDB-Dev prototype system. We thank all our colleagues in the I/H methods Task Force and the members of the wwPDB for their help and support with this project.



Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Helen M. Berman (1; corresponding author)
  • Jill Trewhella (2, 3)
  • Brinda Vallat (4)
  • John D. Westbrook (5)

  1. RCSB Protein Data Bank, Department of Chemistry and Chemical Biology, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, USA
  2. School of Life and Environmental Sciences, The University of Sydney, Sydney, Australia
  3. Department of Chemistry, University of Utah, Salt Lake City, USA
  4. RCSB, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, USA
  5. RCSB Protein Data Bank, Institute for Quantitative Biomedicine, Rutgers, The State University of New Jersey, Piscataway, USA