Computation of 3D queries for ROCS based virtual screens

  • Gregory J. Tawa
  • J. Christian Baber
  • Christine Humblet


Rapid overlay of chemical structures (ROCS) is a method that aligns molecules based on shape and/or chemical similarity. It is often used in 3D ligand-based virtual screening. Given a query consisting of a single conformation of an active molecule, ROCS can generate highly enriched hit lists. Typically, the chosen query conformation is a minimum-energy structure. Can better enrichment be obtained using conformations other than the minimum-energy structure? To answer this question, a methodology called CORAL (COnformational analysis, Rocs ALignment) has been developed. For a given set of molecule conformations, it computes optimized conformations for ROCS screening. It does so by clustering all conformations of a chosen molecule set using pairwise ROCS combo scores. The best representative conformation of each cluster is the one with the highest average overlap with the other conformations in that cluster. These best representative conformations are then used for virtual screening. CORAL was tested by performing virtual screening experiments with the 40 DUD (Directory of Useful Decoys) data sets, using both CORAL and minimum-energy queries. The recognition capability of each query was quantified as the area under the ROC curve (AUC). The results show that the CORAL AUC values are, on average, larger than the minimum-energy AUC values. This demonstrates that better ROCS enrichments can indeed be obtained with conformations other than the minimum-energy structure. As a result, CORAL analysis can be a valuable first step in virtual screening workflows that use ROCS.
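The two quantitative steps named in the abstract, choosing a cluster's best representative conformation and scoring a query by AUC, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `pick_representatives` and `roc_auc` are hypothetical, the similarity matrix holds made-up numbers standing in for pairwise ROCS combo scores, and the cluster labels are assumed to be given, since the abstract does not specify the clustering algorithm.

```python
from typing import Dict, List, Sequence


def pick_representatives(sim: Sequence[Sequence[float]],
                         labels: Sequence[int]) -> Dict[int, int]:
    """For each cluster, return the index of the conformation with the
    highest average similarity to the other members of that cluster.

    sim    -- symmetric matrix of pairwise scores (assumed precomputed)
    labels -- cluster label of each conformation (assumed given)
    """
    clusters: Dict[int, List[int]] = {}
    for idx, lab in enumerate(labels):
        clusters.setdefault(lab, []).append(idx)

    reps: Dict[int, int] = {}
    for lab, members in clusters.items():
        if len(members) == 1:
            # A singleton cluster is represented by its only member.
            reps[lab] = members[0]
            continue

        def mean_overlap(i: int) -> float:
            # Average score against every other member (self excluded).
            others = [sim[i][j] for j in members if j != i]
            return sum(others) / len(others)

        reps[lab] = max(members, key=mean_overlap)
    return reps


def roc_auc(active_scores: Sequence[float],
            decoy_scores: Sequence[float]) -> float:
    """Area under the ROC curve, computed as the probability that a
    randomly chosen active outscores a randomly chosen decoy
    (ties count as one half)."""
    wins = 0.0
    for a in active_scores:
        for d in decoy_scores:
            if a > d:
                wins += 1.0
            elif a == d:
                wins += 0.5
    return wins / (len(active_scores) * len(decoy_scores))


# Made-up scores for four conformations; the first three share a cluster.
sim = [
    [2.0, 1.5, 1.0, 0.2],
    [1.5, 2.0, 1.6, 0.3],
    [1.0, 1.6, 2.0, 0.1],
    [0.2, 0.3, 0.1, 2.0],
]
labels = [0, 0, 0, 1]
representatives = pick_representatives(sim, labels)

# Made-up screening scores for three actives and two decoys.
auc = roc_auc([0.9, 0.8, 0.4], [0.5, 0.3])
```

In this toy input, conformation 1 has the largest mean score against its cluster mates and is selected for cluster 0. The pairwise AUC formulation is quadratic in the number of scored molecules, which is acceptable for illustration; a rank-based calculation would be preferable for full decoy sets.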


Ligand-based virtual screening · ROCS · Optimized query conformation · ROC curve analysis · Statistical significance · Virtual screening workflow



The authors would like to thank Will Somers and Tarek Mansour of Wyeth Chemical Sciences for their support, Dave Diller for manuscript suggestions, Ramaswamy Nilikantan for help with the diversity analysis and Youping Huang for help in performing the statistical analysis.



Copyright information

© Springer Science+Business Media B.V. 2009

Authors and Affiliations

  • Gregory J. Tawa (1)
  • J. Christian Baber (2)
  • Christine Humblet (1)
  1. Chemical Sciences, Wyeth Research, Princeton, USA
  2. Chemical Sciences, Wyeth Research, Cambridge, USA
