A novel method for binarization of scene text images and its application in text identification

Ghoshal, Ranjit; Roy, Anandarup; Banerjee, Ayan; Dhara, Bibhas Chandra; Parui, Swapan K.

doi:10.1007/s10044-018-0687-2

Theoretical Advances
Published: 14 February 2018

Pattern Analysis and Applications, Volume 22, pages 1361–1375 (2019)

Abstract

The aim of this article is twofold. First, we propose an effective methodology for binarization of scene images. For the present study, we use the publicly available ICDAR 2011 Born Digital data set. We introduce the concept of a variance map of a gray-level image for detecting text boundaries. Based on this boundary information, the image is binarized by adaptive thresholding, which produces a number of connected components. Second, these connected components are examined to identify possible text components. To this end, we propose a number of shape-based features that distinguish text from non-text components. We treat text component identification as a one-class classification problem, because ground truth information is available only for the text class in the ICDAR 2011 Born Digital data set. The ground truth text components are used to estimate the statistical distribution of the shape-based features. We observe that the features do not all follow a single family of distributions. We therefore construct a joint distribution using a multivariate Gaussian copula, which allows marginal distributions from different families to be coupled. Our experiments suggest that the copula-based model describes the feature distribution better than a multivariate Gaussian distribution. Finally, a connected component of unknown class is evaluated under the trained statistical model, and a hypothesis test decides whether it is a possible text component. For comparison, we consider a number of state-of-the-art methods. The proposed approach significantly outperforms most of these methods in terms of recall, precision and F-measure in both the binarization and text identification tasks.
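The one-class step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the shape features, the marginal families, or the test statistic, so everything below is an assumption made for illustration. The sketch uses two hypothetical features with synthetic values, rank-based empirical marginals in place of fitted parametric ones, a bivariate Gaussian copula whose correlation is estimated on the normal scores, and a chi-square test with 2 degrees of freedom on the resulting Mahalanobis distance.

```python
import bisect
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, used for the probit transform


def empirical_cdf(sample):
    """Build a rank-based CDF whose values stay strictly inside (0, 1)."""
    ordered = sorted(sample)
    n = len(ordered)

    def cdf(x):
        lo = bisect.bisect_left(ordered, x)   # training values strictly below x
        hi = bisect.bisect_right(ordered, x)  # training values at or below x
        return ((lo + hi) / 2 + 0.5) / (n + 1)

    return cdf


def fit_copula(xs, ys):
    """Fit empirical marginals and the Gaussian copula correlation rho."""
    cdf_x, cdf_y = empirical_cdf(xs), empirical_cdf(ys)
    # Map each feature to normal scores through its own marginal CDF.
    zx = [STD_NORMAL.inv_cdf(cdf_x(v)) for v in xs]
    zy = [STD_NORMAL.inv_cdf(cdf_y(v)) for v in ys]
    mx, my = sum(zx) / len(zx), sum(zy) / len(zy)
    cov = sum((a - mx) * (b - my) for a, b in zip(zx, zy))
    var_x = sum((a - mx) ** 2 for a in zx)
    var_y = sum((b - my) ** 2 for b in zy)
    rho = cov / math.sqrt(var_x * var_y)
    return cdf_x, cdf_y, rho


def p_value(x, y, model):
    """P-value of the squared Mahalanobis distance of the normal scores.

    Under the fitted model the distance is chi-square distributed with
    2 degrees of freedom, whose survival function is exp(-d2 / 2).
    """
    cdf_x, cdf_y, rho = model
    z1 = STD_NORMAL.inv_cdf(cdf_x(x))
    z2 = STD_NORMAL.inv_cdf(cdf_y(y))
    d2 = (z1 * z1 - 2 * rho * z1 * z2 + z2 * z2) / (1 - rho * rho)
    return math.exp(-d2 / 2)


def is_text_like(x, y, model, alpha=0.05):
    """Accept a component as possible text unless the test rejects it."""
    return p_value(x, y, model) >= alpha


# Synthetic stand-ins for two shape features of ground truth text components
# (hypothetical values, not taken from the ICDAR data).
feature_a = [i / 20 for i in range(1, 41)]
feature_b = [a * a + 0.3 * math.sin(7 * i)
             for i, a in enumerate(feature_a, start=1)]
model = fit_copula(feature_a, feature_b)
```

Calling is_text_like(a, b, model) on the feature pair of an unseen component then gives the accept or reject decision. Because each feature passes through its own marginal before the correlation is estimated, features from different distribution families can share one dependence model, which is the property of the copula that the abstract relies on.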

Author information

Authors and Affiliations

St. Thomas’ College of Engineering and Technology, Kolkata, 700023, India
Ranjit Ghoshal
Usha Martin University, 12th Mile, Ranchi Khunti Road, Ranchi, Jharkhand, 835221, India
Anandarup Roy
Lexmark Research and Development Corporation, Kolkata, India
Ayan Banerjee
Department of Information Technology, Jadavpur University, Kolkata, 700098, India
Bibhas Chandra Dhara
CVPR Unit, Indian Statistical Institute, 203 B. T. Road, Kolkata, 700108, India
Swapan K. Parui

Corresponding author

Correspondence to Ranjit Ghoshal.

About this article

Cite this article

Ghoshal, R., Roy, A., Banerjee, A. et al. A novel method for binarization of scene text images and its application in text identification. Pattern Anal Applic 22, 1361–1375 (2019). https://doi.org/10.1007/s10044-018-0687-2

Received: 21 March 2017
Accepted: 18 January 2018
Published: 14 February 2018
Issue Date: November 2019
DOI: https://doi.org/10.1007/s10044-018-0687-2
