A novel method for binarization of scene text images and its application in text identification

  • Ranjit Ghoshal
  • Anandarup Roy
  • Ayan Banerjee
  • Bibhas Chandra Dhara
  • Swapan K. Parui
Theoretical Advances

Abstract

The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images. For our present study, we use the publicly available ICDAR 2011 Born Digital data set. We introduce a new concept, the variance map of a gray-level image, for detecting text boundaries in an image. Based on this boundary information, the image is binarized by means of adaptive thresholding. This binarization procedure produces a number of connected components. Second, these connected components are examined in order to identify possible text components. In this context, we propose a number of shape-based features that distinguish between text and non-text components. We treat text component identification as a one-class classification problem, because ground truth information is available only for the text class in the ICDAR 2011 Born Digital data set. The ground truth text components are used to estimate a statistical distribution of the shape-based features. Here, we observe that the features need not all follow a single family of distributions. Therefore, we construct a joint distribution using a multivariate Gaussian copula, which allows different marginal distributions to be coupled. As our experiments suggest, the copula-based method is superior to a multivariate Gaussian distribution in describing the feature distribution. Finally, a connected component of unknown class is subjected to the trained statistical model, and a hypothesis test is performed to decide whether it is a possible text component. For a comparative study, we consider a number of state-of-the-art methods. Our proposed approach significantly outperforms most of these methods in terms of recall, precision and F-measure in both the binarization and text identification tasks.
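
As an illustration of how a Gaussian copula couples marginals from different families, the following minimal sketch evaluates a bivariate joint log-density. It is not the authors' implementation: the two marginals (one normal, one exponential), their parameters, the correlation `rho` and all function names are assumptions chosen only for the example, and the abstract does not state which marginal families or features were actually used.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the copula transform


def _clamp(u, eps=1e-12):
    """Keep a probability strictly inside (0, 1) so inv_cdf is defined."""
    return min(max(u, eps), 1.0 - eps)


def gaussian_copula_log_density(u1, u2, rho):
    """Log-density of the bivariate Gaussian copula with correlation rho."""
    z1 = STD_NORMAL.inv_cdf(_clamp(u1))
    z2 = STD_NORMAL.inv_cdf(_clamp(u2))
    one_minus = 1.0 - rho * rho
    quad = (rho * rho * (z1 * z1 + z2 * z2) - 2.0 * rho * z1 * z2) / (2.0 * one_minus)
    return -0.5 * math.log(one_minus) - quad


def joint_log_density(x1, x2, rho, mu=0.5, sigma=0.1, rate=2.0):
    """Joint log-density of two hypothetical features.

    Feature 1 is assumed normal(mu, sigma); feature 2 is assumed
    exponential(rate). Both choices are illustrative assumptions.
    """
    # Marginal 1: normal CDF and log-PDF.
    u1 = NormalDist(mu, sigma).cdf(x1)
    log_f1 = -0.5 * ((x1 - mu) / sigma) ** 2 - math.log(sigma * math.sqrt(2.0 * math.pi))
    # Marginal 2: exponential CDF and log-PDF (x2 >= 0).
    u2 = 1.0 - math.exp(-rate * x2)
    log_f2 = math.log(rate) - rate * x2
    # Joint density = copula density * product of marginal densities.
    return gaussian_copula_log_density(u1, u2, rho) + log_f1 + log_f2
```

With `rho = 0` the copula term vanishes and the joint density reduces to the product of the marginals. A one-class decision rule could then, for instance, compare this score against a threshold estimated from the text-class training components, although the specific hypothesis test used in the article is not given in the abstract.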

Keywords

  • Scene image binarization
  • Scene text identification
  • Connected component
  • One-class classifier
  • Gaussian copula

Copyright information

© Springer-Verlag London Ltd., part of Springer Nature 2018

Authors and Affiliations

  • Ranjit Ghoshal (1)
  • Anandarup Roy (2)
  • Ayan Banerjee (3)
  • Bibhas Chandra Dhara (4)
  • Swapan K. Parui (5)

  1. St. Thomas’ College of Engineering and Technology, Kolkata, India
  2. Usha Martin University, Ranchi, India
  3. Lexmark Research and Development Corporation, Kolkata, India
  4. Department of Information Technology, Jadavpur University, Kolkata, India
  5. CVPR Unit, Indian Statistical Institute, Kolkata, India
