
Building Queries for Prior-Art Search

  • Parvaz Mahdabi
  • Mostafa Keikha
  • Shima Gerani
  • Monica Landoni
  • Fabio Crestani
Part of the Lecture Notes in Computer Science book series (LNCS, volume 6653)

Abstract

Prior-art search is a critical step in the examination procedure of a patent application. This study explores automatic query generation from patent documents to facilitate the time-consuming and labor-intensive search for relevant patents. It is essential for this task to identify discriminative terms in different fields of a query patent, which enables us to distinguish relevant patents from non-relevant patents. To this end we investigate the distribution of terms occurring in different fields of the query patent and compare the distributions with the rest of the collection using language modeling estimation techniques. We experiment with term weighting based on the Kullback-Leibler divergence between the query patent and the collection and also with parsimonious language model estimation. Both of these techniques promote words that are common in the query patent and are rare in the collection. We also incorporate the classification assigned to patent documents into our model, to exploit available human judgements in the form of a hierarchical classification. Experimental results show that the retrieval using the generated queries is effective, particularly in terms of recall, while patent description is shown to be the most useful source for extracting query terms.
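The KL-based term weighting mentioned above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes unigram maximum-likelihood estimates, scores each term by its pointwise contribution P(t|Q) · log(P(t|Q) / P(t|C)) to the divergence, and uses hypothetical names (`kl_term_scores`, `top_query_terms`, `eps`) that do not come from the paper.

```python
import math
from collections import Counter


def kl_term_scores(query_tokens, collection_tokens, eps=1e-9):
    """Score each term of the query patent by its pointwise KL contribution.

    score(t) = P(t|Q) * log(P(t|Q) / P(t|C)), with maximum-likelihood
    estimates for both models (an assumption made for this sketch).
    """
    q_counts = Counter(query_tokens)
    c_counts = Counter(collection_tokens)
    q_total = sum(q_counts.values())
    c_total = sum(c_counts.values())

    scores = {}
    for term, tf in q_counts.items():
        p_q = tf / q_total
        # eps is a hypothetical floor for terms missing from the collection.
        p_c = max(c_counts[term] / c_total, eps) if c_total else eps
        scores[term] = p_q * math.log(p_q / p_c)
    return scores


def top_query_terms(query_tokens, collection_tokens, k=3):
    """Return the k highest-scoring terms (ties broken alphabetically)."""
    scores = kl_term_scores(query_tokens, collection_tokens)
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]


if __name__ == "__main__":
    # Toy data: "rotor" and "blade" are frequent in the query patent but
    # rare in the collection; "the" is frequent in both.
    query = ["rotor", "rotor", "blade", "the", "the"]
    collection = ["the"] * 50 + ["rotor", "blade"] + ["a"] * 48
    print(top_query_terms(query, collection, k=2))
```

Terms that are frequent in the query patent but rare in the collection receive high positive scores, while terms that are relatively more frequent in the collection score below zero and drop out of the generated query.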



Copyright information

© Springer-Verlag Berlin Heidelberg 2011

Authors and Affiliations

  • Parvaz Mahdabi (1)
  • Mostafa Keikha (1)
  • Shima Gerani (1)
  • Monica Landoni (1)
  • Fabio Crestani (1)

  1. Faculty of Informatics, University of Lugano, Lugano, Switzerland
