Document Image Classification, with a Specific View on Applications of Patent Images

Csurka, Gabriela

doi:10.1007/978-3-662-53817-3_12

Chapter
First Online: 26 March 2017

Part of the book series: The Information Retrieval Series (INRE, volume 37)

Abstract

The main focus of this chapter is document image classification and retrieval, where we analyse and compare different parameters for the run-length histogram and Fisher vector-based image representations. We conduct an exhaustive experimental study using different document image data sets, including the MARG benchmarks, two data sets built on customer data and the images from the patent image classification task of CLEF-IP 2011. The aim of the study is to give guidelines on how best to choose the parameters so that the same features perform well on different tasks. As an example of such a need, we describe the image-based patent retrieval tasks of CLEF-IP 2011, where we used the same image representation to predict the image type and retrieve relevant patents.
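
To make the run-length histogram mentioned above more concrete, here is a minimal sketch of one way such a descriptor can be computed on a binarised page image. It is an illustration only: the logarithmic bin edges, the choice of horizontal and vertical directions, the default of 9 bins and the function names `run_lengths` and `run_length_histogram` are assumptions made for this example, not the parameters or implementation studied in the chapter.

```python
import numpy as np


def run_lengths(line):
    """Return (values, lengths) of the consecutive runs in a 1-D array."""
    line = np.asarray(line)
    if line.size == 0:
        return line, np.array([], dtype=int)
    # Positions where the value changes mark the start of a new run.
    change = np.flatnonzero(line[1:] != line[:-1]) + 1
    starts = np.concatenate(([0], change))
    ends = np.concatenate((change, [line.size]))
    return line[starts], ends - starts


def run_length_histogram(binary_image, n_bins=9):
    """L1-normalised histogram of run lengths (assumed layout, see lead-in).

    Runs are collected along rows and columns, separately for background (0)
    and foreground (non-zero) pixels, and quantised on a logarithmic scale:
    1, 2, 3-4, 5-8, ... with all longer runs sharing the last bin.
    """
    img = (np.asarray(binary_image) > 0).astype(np.intp)
    edges = 2 ** np.arange(n_bins - 1)          # 1, 2, 4, 8, ...
    hist = np.zeros((2, 2, n_bins))             # (direction, pixel value, bin)
    for d, view in enumerate((img, img.T)):     # rows, then columns
        for line in view:
            values, lengths = run_lengths(line)
            bins = np.searchsorted(edges, lengths, side="left")
            # Unbuffered add so repeated (value, bin) pairs are all counted.
            np.add.at(hist[d], (values, bins), 1)
    flat = hist.ravel()
    total = flat.sum()
    return flat / total if total > 0 else flat
```

With these assumed settings the descriptor has 2 directions x 2 pixel values x 9 bins = 36 dimensions; richer variants could, for example, compute it per image region and concatenate the results.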

Author information

Authors and Affiliations

Xerox Research Centre Europe, 6 chemin de Maupertuis, 38240, Meylan, France
Gabriela Csurka

Corresponding author

Correspondence to Gabriela Csurka.

Editor information

Editors and Affiliations

Institute for Software Engineering & Interactive Systems, Vienna University of Technology, Vienna, Austria
Mihai Lupu
Research Platform Responsible Research and Innovation in Academic Practice, University of Vienna, Vienna, Austria
Katja Mayer
Information & Society Research Division, National Institute of Informatics, Tokyo, Japan
Noriko Kando
Patinformatics, LLC, Dublin, Ohio, USA
Anthony J. Trippe

About this chapter

Cite this chapter

Csurka, G. (2017). Document Image Classification, with a Specific View on Applications of Patent Images. In: Lupu, M., Mayer, K., Kando, N., Trippe, A. (eds) Current Challenges in Patent Information Retrieval. The Information Retrieval Series, vol 37. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-662-53817-3_12

DOI: https://doi.org/10.1007/978-3-662-53817-3_12
Published: 26 March 2017
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-662-53816-6
Online ISBN: 978-3-662-53817-3
eBook Packages: Computer Science, Computer Science (R0)
