Evaluating Information Retrieval in the Intellectual Property Domain: The CLEF–IP Campaign

Piroi, Florina; Zenz, Veronika

doi:10.1007/978-3-642-19231-9_4

Part of the book series: The Information Retrieval Series (INRE, volume 29)

Abstract

The CLEF–IP track ran for the first time within the CLEF 2009 campaign. The purpose of the track was twofold: (a) to encourage and facilitate research in the area of patent retrieval by providing a large, clean data set for experimentation; (b) to create a large test collection of patents in the three main European languages for the evaluation of cross-lingual information access. The track focused on the task of prior art search; a second task, patent classification, was added in 2010. The participating teams deployed a variety of information retrieval techniques, adapted or custom-made, to tackle this specific domain and its tasks. This chapter reports on the activities undertaken to provide a set of topics for the two tasks and to extract the relevance assessments for those topics, and on the evaluation of the effectiveness of the retrieval methods employed.

Notes

1. It is our direct experience that these explanations helped IR researchers the most in understanding the relationships between the different kinds of patent documents constituting a patent.
2. For EP patents, documents at different stages have the same numeric identifier. For other patent offices this is not always the case. For example, the patent document US-6689545-B2 represents a US granted patent whose application document has the publication number US-2003011722-A1.
3. For a complete list of kind codes used by various patent offices see http://tinyurl.com/EPO-kindcodes.
4. See http://www.wipo.int/classifications/ipc/en/.
5. Although the MAREC collection was created after the first CLEF–IP campaign was set up in 2009, the documents in the CLEF–IP’09 corpus are included in the MAREC collection and use the same DTD.
6. http://www.wipo.int/pct/en/.
7. http://www.alfresco.com/.
8. http://docasu.sourceforge.net/.
9. trec_eval version 8.0, http://trec.nist.gov/trec_eval.
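
The identifier convention described in note 2 can be illustrated with a short sketch. It assumes that document identifiers follow the office-number-kind pattern seen in the example US-6689545-B2; the regular expression, the names `PatentDocId`, `parse_doc_id` and `ep_same_patent`, and the EP identifiers in the usage note are illustrative assumptions, not part of the CLEF–IP tooling.

```python
import re
from typing import NamedTuple


class PatentDocId(NamedTuple):
    office: str  # two-letter patent office code, e.g. "EP" or "US"
    number: str  # numeric identifier, kept as a string
    kind: str    # kind code, e.g. "A1" or "B2"


# Assumed layout: OFFICE-NUMBER-KIND, as in the example "US-6689545-B2".
_DOC_ID = re.compile(r"^([A-Z]{2})-(\d+)-([A-Z]\d?)$")


def parse_doc_id(doc_id: str) -> PatentDocId:
    """Split an identifier into office, number and kind code."""
    match = _DOC_ID.match(doc_id.strip())
    if match is None:
        raise ValueError(f"unrecognised document identifier: {doc_id!r}")
    return PatentDocId(*match.groups())


def ep_same_patent(a: str, b: str) -> bool:
    """True if both documents are EP documents sharing a numeric identifier.

    Following note 2, only EP documents can be grouped this way; for other
    offices the function returns False, meaning "cannot tell from the
    number alone", not "different patents".
    """
    pa, pb = parse_doc_id(a), parse_doc_id(b)
    return pa.office == pb.office == "EP" and pa.number == pb.number
```

With made-up identifiers, `ep_same_patent("EP-1000000-A1", "EP-1000000-B1")` holds, while the US pair from note 2 is not matched because the granted patent and its application publication carry different numbers.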

Acknowledgements

We thank Matrixware Information Systems GmbH for making available the patent corpus for this track, and for co-organizing the first evaluation campaign. We also thank Judy Hickey and Henk Tomas for sharing their know-how on prior art searches and patent life-cycles with us.

Author information

Authors and Affiliations

Information Retrieval Facility, Vienna, Austria
Florina Piroi
max.recall information systems, Vienna, Austria
Veronika Zenz

Corresponding author

Correspondence to Florina Piroi.

Editor information

Editors and Affiliations

Information Retrieval Facility, Donau-City Straße 1, Vienna, 1220, Austria
Mihai Lupu
Information Retrieval Facility, Donau-City Straße 1, Vienna, 1220, Austria
Katja Mayer
Information Retrieval Facility, Donau-City Straße 1, Vienna, 1220, Austria
John Tait
3LP Advisors, Post Rd. 7003, Dublin, 43016, Ohio, USA
Anthony J. Trippe

Appendix

Table 4.2 Indexing the data. Below, x indicates a field is used, – not used, x! indicates special treatment and ? indicates a lack of information on field usage

Table 4.3 Query generation, retrieval systems, and ranking. Below, x indicates a field is used, – not used, x! indicates special treatment and ? indicates a lack of information on field usage

Table 4.4 Systems, methods and document fields used in the Classification task

About this chapter

Cite this chapter

Piroi, F., Zenz, V. (2011). Evaluating Information Retrieval in the Intellectual Property Domain: The CLEF–IP Campaign. In: Lupu, M., Mayer, K., Tait, J., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 29. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19231-9_4

DOI: https://doi.org/10.1007/978-3-642-19231-9_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19230-2
Online ISBN: 978-3-642-19231-9
eBook Packages: Computer Science, Computer Science (R0)
