SPODS: A Dataset of Color-Official Documents and Detection of Logo, Stamp, and Signature

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 10481)

Abstract

Office automation is an active area of research that involves the archival and retrieval of official documents. Developing a system for this purpose requires an extensive benchmark dataset consisting of various types of official documents. However, real-world official documents are hard to make available because they are mostly confidential. In the absence of such benchmark datasets, it is difficult to evaluate newly developed algorithms. Hence, efforts have been made to build a dataset consisting of different categories of documents that resemble real-world official documents. In this work, we present the scanned pseudo-official dataset (SPODS), which we created and have made available online. Official documents are usually distinguished by the presence of a logo, stamp, signature, date, etc. The paper also presents a new approach for detecting logos, stamps, and signatures using spectral filtering and part-based features. A comparative study of the performance of the proposed method and existing algorithms on the SPODS dataset demonstrates the effectiveness of the proposed technique.

Notes

  1. The dataset is available at http://www.facweb.iitkgp.ernet.in/~jay/spods/.

Acknowledgments

This work is partially sponsored by the Ministry of Communications & Information Technology, Govt. of India; Ref.: MCIT 11(19)/2010-HCC (TDIL) dt. 28-12-2010.

Author information

Corresponding author

Correspondence to Amit Vijay Nandedkar.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Nandedkar, A.V., Mukherjee, J., Sural, S. (2017). SPODS: A Dataset of Color-Official Documents and Detection of Logo, Stamp, and Signature. In: Mukherjee, S., et al. Computer Vision, Graphics, and Image Processing. ICVGIP 2016. Lecture Notes in Computer Science, vol 10481. Springer, Cham. https://doi.org/10.1007/978-3-319-68124-5_19

  • DOI: https://doi.org/10.1007/978-3-319-68124-5_19

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-68123-8

  • Online ISBN: 978-3-319-68124-5

  • eBook Packages: Computer Science, Computer Science (R0)
