Journal of Computer-Aided Molecular Design

Volume 20, Issue 3, pp 179–190

Permuting input for more effective sampling of 3D conformer space

  • Giorgio Carta
  • Valeria Onnis
  • Andrew J. S. Knox
  • Darren Fayne
  • David G. Lloyd


SMILES strings and other classic 2D structural formats offer a convenient way to represent molecules as a simple connection table, with the inherent advantages of ease of handling and storage. In virtual screening, the chemical databases to be screened are often initially represented by canonicalised SMILES strings, which can be filtered and pre-processed in a number of ways to yield molecules that occupy regions of chemical space similar to those of compounds active against a therapeutic target. A wide variety of software exists to convert molecules into SMILES format, including Mol2smi (Daylight Inc.), MOE (Chemical Computing Group) and Babel (Openeye Scientific Software). Depending on the algorithm employed, the atoms of a SMILES string defining a molecule can be ordered differently. Upon conversion to 3D coordinates, these differently ordered strings yield ostensibly the same molecule.
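To make the point about atom ordering concrete, the following minimal Python sketch writes several SMILES strings for one small molecule by renumbering the atoms of its connection table. It is an illustration only, not the algorithm used by any of the programs named above: the helper names `write_smiles` and `permute` are hypothetical, the example molecule (2-aminoethanol, heavy atoms only) is chosen here for convenience, and the writer handles only acyclic molecules with single bonds and implicit hydrogens.

```python
from itertools import permutations


def write_smiles(atoms, bonds, start=0):
    """Depth-first SMILES writer (hypothetical helper).

    Assumes an acyclic heavy-atom graph with single bonds only.
    """
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)

    def visit(atom, parent):
        children = [n for n in neighbours[atom] if n != parent]
        text = atoms[atom]
        # Every child except the last is written as a parenthesised branch.
        for child in children[:-1]:
            text += "(" + visit(child, atom) + ")"
        if children:
            text += visit(children[-1], atom)
        return text

    return visit(start, None)


def permute(atoms, bonds, perm):
    """Renumber the atoms so that new index i holds old atom perm[i]."""
    new_atoms = [atoms[old] for old in perm]
    position = {old: new for new, old in enumerate(perm)}
    new_bonds = [(position[a], position[b]) for a, b in bonds]
    return new_atoms, new_bonds


# Connection table for 2-aminoethanol (heavy atoms only): N-C-C-O
atoms = ["N", "C", "C", "O"]
bonds = [(0, 1), (1, 2), (2, 3)]

variants = set()
for perm in permutations(range(len(atoms))):
    p_atoms, p_bonds = permute(atoms, bonds, perm)
    variants.add(write_smiles(p_atoms, p_bonds, start=0))

print(sorted(variants))
# ['C(CN)O', 'C(N)CO', 'NCCO', 'OCCN']
```

All four strings describe the same connectivity; only the order in which the atoms are listed differs, which is the kind of input variation the abstract refers to.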

In this work we show how different permutations of a SMILES string can affect conformer generation, and hence the reliability and repeatability of the results. Furthermore, we propose a novel procedure for conformer generation that takes advantage of permuting the input strings (both SMILES and other 2D formats), leading to more effective sampling of conformational space in the output. The procedure also implements fingerprint and principal component analysis steps to post-process and visualise the results.
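The abstract does not say which fingerprint or which PCA implementation is used, so the sketch below should be read only as a generic illustration of the post-processing idea: each conformer is described by a bit vector, and the vectors are projected onto their first principal component so that groups of similar conformers separate along one axis. The bit vectors are invented for the example, the function name `first_principal_component` is hypothetical, and power iteration stands in for whatever eigen-solver a real analysis package would use.

```python
import math


def first_principal_component(rows, iterations=200):
    """Return (unit direction, scores) of the leading principal component.

    Uses power iteration on the sample covariance matrix (an assumption
    made for this sketch, not a method stated in the source).
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centred = [[r[j] - means[j] for j in range(d)] for r in rows]
    cov = [[sum(c[i] * c[j] for c in centred) / (n - 1) for j in range(d)]
           for i in range(d)]

    # Non-uniform starting vector, normalised to unit length.
    v = [1.0 + 0.1 * j for j in range(d)]
    length = math.sqrt(sum(x * x for x in v))
    v = [x / length for x in v]

    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # no variance in the data at all
            break
        v = [x / norm for x in w]

    scores = [sum(c[j] * v[j] for j in range(d)) for c in centred]
    return v, scores


# Made-up 4-bit fingerprints for six conformers (illustrative data only).
fingerprints = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 1, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 0],
]

direction, scores = first_principal_component(fingerprints)
for index, score in enumerate(scores):
    print(f"conformer {index}: PC1 score {score:+.3f}")
```

With this toy data the first three conformers receive positive scores and the last three negative ones, so the two groups fall on opposite sides of the first principal axis. In practice further components would be computed and plotted against each other.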


Keywords: Conformers, Docking, Drug discovery, Fingerprints, Scoring, SMILES, Virtual screening





This work was supported through funding from Science Foundation Ireland and the Irish Health Research Board.



Copyright information

© Springer Science+Business Media B.V. 2006

Authors and Affiliations

  • Giorgio Carta (1)
  • Valeria Onnis (1)
  • Andrew J. S. Knox (2)
  • Darren Fayne (1)
  • David G. Lloyd (1)

  1. Molecular Design Group, School of Biochemistry and Immunology, Trinity College Dublin, Dublin 2, Ireland
  2. School of Pharmacy and Pharmaceutical Sciences, Trinity College Dublin, Dublin 2, Ireland
