Multilingual Patent Text Retrieval Evaluation: CLEF–IP

  • Florina Piroi
  • Allan Hanbury
Chapter
Part of The Information Retrieval Series book series (INRE, volume 41)

Abstract

The CLEF–IP evaluation lab ran between 2009 and 2013 with a twofold stated purpose: (a) to encourage research in the area of patent retrieval with a focus on cross-language retrieval, and (b) to provide a large and clean data set of patent-related data, in the three main European languages, for experimentation. In its first year, CLEF–IP organized one task only, a text retrieval task that modelled the “Search for Prior Art” done by experts at patent offices. In the following years the types of CLEF–IP tasks broadened to include patent text classification, patent image retrieval and classification, and (formal) structure recognition. With each new task, the test collection was extended to accommodate it. In this chapter we give an overview of the evaluation tasks dealing with the textual content of the patents. The Intellectual Property (IP) domain is one where specific expertise is critical; implementing Information Retrieval (IR) approaches to support some of its tasks cannot be done without the use of this domain know-how. Even when such know-how is at hand, retrieval results, in general, do not come close to the expectations of patent experts.

Notes

Acknowledgements

We give our thanks to the following people:

‐ the advisory board members who helped shape the evaluation lab in its early years: Gianni Amati, Atsushi Fujii, Makoto Iwayama, Kalervo Järvelin, Noriko Kando, Javier Pose Rodríguez, Mark Sanderson, Henk Thomas, Anthony Trippe, Christa Womser-Hacker

‐ the numerous patent experts who helped us understand many of the patent system’s subtleties

‐ previous CLEF–IP organizers and co-organizers: Giovanna Roda, John Tait, Veronika Zenz, Mihai Lupu, Igor Filippov, Walid Magdy, Alan P. Sexton

‐ the participating teams who submitted such a variety of solutions to the proposed tasks (see CLEF–IP workshop notes).

CLEF–IP was supported over the years by Matrixware GmbH, Vienna, and the Information Retrieval Facility, Vienna, as the first data and infrastructure providers, by the PROMISE EU Network of Excellence (FP7-258191), and by the FFG FIT-IT Impex project (No. 825846).

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Institute of Information Systems Engineering, TU Wien, Vienna, Austria
