pp 1-24

Programmatic Retrieval of Small Molecule Information from PubChem Using PUG-REST

  • Sunghwan Kim (email author)
  • Paul A. Thiessen
  • Evan E. Bolton
Part of the Methods in Pharmacology and Toxicology book series


PubChem is an open archive that contains information on small molecules as well as other chemical entities such as lipids, carbohydrates, and (chemically modified) amino acid and nucleic acid sequences (including siRNA and miRNA). Developed and maintained by the US National Institutes of Health, PubChem is a chemical information hub that collects chemical information from various data sources and disseminates it to the public free of charge. PubChem provides multiple programmatic access routes, including E-Utilities, Power User Gateway (PUG), PUG-SOAP, and PUG-REST. This chapter describes how to access PubChem programmatically through PUG-REST. The syntax of the PUG-REST request URL is explained with many examples that cover various tasks, and a series of Perl scripts is provided to demonstrate how these URLs can be included in actual programs.
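As a rough illustration of what "including a PUG-REST URL in a program" can look like, the sketch below assembles a request URL from its parts and, optionally, retrieves it. It is written in Python rather than the Perl used for the chapter's scripts, and it is not taken from them. The base address and the domain/namespace/identifiers/operation/output layout follow PubChem's public PUG-REST documentation, not the text above; the helper names `build_pug_rest_url` and `fetch` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base address of the PUG-REST service (an assumption taken from PubChem's
# public documentation, not from the abstract above).
PUG_REST_BASE = "https://pubchem.ncbi.nlm.nih.gov/rest/pug"


def build_pug_rest_url(domain, namespace, identifiers, operation, output):
    """Assemble <base>/<domain>/<namespace>/<identifiers>/<operation>/<output>.

    Hypothetical helper for illustration only.
    """
    # Several identifiers are sent as one comma-separated path segment.
    if isinstance(identifiers, (list, tuple)):
        ids = ",".join(str(i) for i in identifiers)
    else:
        ids = str(identifiers)
    # Percent-encode anything unsafe in a URL path (spaces, slashes), keep commas.
    ids = quote(ids, safe=",")
    return "/".join([PUG_REST_BASE, domain, namespace, ids, operation, output])


def fetch(url, timeout=30):
    """Send the request and return the response body as text (needs network)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Molecular formula of CID 2244 (aspirin) as plain text.
    url = build_pug_rest_url("compound", "cid", 2244,
                             "property/MolecularFormula", "TXT")
    print(url)
    # print(fetch(url))  # uncomment to contact PubChem
```

Keeping URL construction separate from the network call makes the request syntax easy to inspect and test offline. A script that sends many requests should also pace them according to PubChem's usage policy.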


Cheminformatics; E-Utilities; Programmatic access; PubChem; PUG-REST; Representational state transfer (REST)



This research was supported in part by the Intramural Research Program of the National Library of Medicine, National Institutes of Health, the US Department of Health and Human Services.

Supplementary material (6 kb)
Data 1 contains the Perl scripts shown in Figs. 7, 8, 9, 10, 11, 12, 13, and 14 (ZIP 5 kb)



Copyright information

© Springer Science+Business Media, LLC, part of Springer Nature 2018

Authors and Affiliations

  • Sunghwan Kim (1), email author
  • Paul A. Thiessen (1)
  • Evan E. Bolton (1)
  1. National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, US Department of Health and Human Services, Bethesda, USA
