Representation and Searching of Chemical Structure Information in Patents

Current Challenges in Patent Information Retrieval

Part of the book series: The Information Retrieval Series ((INRE,volume 37))

Abstract

This chapter describes the techniques that are used to represent and to search for molecular structures in chemical patents. There are two types of structure: specific structures that describe individual molecules and generic structures that describe sets of structurally related molecules. Methods for representing and searching specific structures have been well established for many years, and the techniques are also applicable, albeit with substantial modification, to the processing of generic structures.

Author information

Corresponding author

Correspondence to Peter Willett.

Copyright information

© 2017 Springer-Verlag GmbH Germany

About this chapter

Cite this chapter

Downs, G.M., Holliday, J.D., Willett, P. (2017). Representation and Searching of Chemical Structure Information in Patents. In: Lupu, M., Mayer, K., Kando, N., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 37. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53817-3_15

  • DOI: https://doi.org/10.1007/978-3-662-53817-3_15

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-662-53816-6

  • Online ISBN: 978-3-662-53817-3

  • eBook Packages: Computer Science, Computer Science (R0)
