Domain Adaptation of General Natural Language Processing Tools for a Patent Claim Visualization System

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8201)

Abstract

In this study we present a first step towards the domain adaptation of Natural Language Processing (NLP) tools, which we use in the pipeline of a system that creates a dependency claim graph (DCG). Our system takes advantage of patterns that occur in the patent domain, notably the tendency of patent claims to combine technical terminology with legal rhetorical structure. Such patterns generally make the sentences difficult for people to understand, but our system can leverage them to assist the cognitive process of understanding the innovation described in the claim. We present this set of patterns, together with an extensive evaluation showing that, even for this relatively difficult genre, the results are at least 90% correct as judged by both expert and non-expert users. The assessment of each generated DCG is based on completeness, connection, and a set of pre-defined relations.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Andersson, L., Lupu, M., Hanbury, A. (2013). Domain Adaptation of General Natural Language Processing Tools for a Patent Claim Visualization System. In: Lupu, M., Kanoulas, E., Loizides, F. (eds) Multidisciplinary Information Retrieval. IRFC 2013. Lecture Notes in Computer Science, vol 8201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-41057-4_8

  • DOI: https://doi.org/10.1007/978-3-642-41057-4_8

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-41056-7

  • Online ISBN: 978-3-642-41057-4

  • eBook Packages: Computer Science, Computer Science (R0)
