A Robust Approach to Extraction of Texts from Camera Captured Images

Banerjee, Sudipto; Mullick, Koustav; Bhattacharya, Ujjwal

doi:10.1007/978-3-319-05167-3_3

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8357)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

We present a simple but robust approach to extracting text from camera-captured images. We first identify highly specular pixels and obtain the connected components of this set. The pixels belonging to each such component are binarized separately using Otsu’s well-known method. We next smooth the whole image and compute its Canny edge representation. The bounding rectangle of each connected component of the Canny edge image is obtained, and components whose bounding boxes overlap pairwise are merged. Otsu’s thresholding is then applied separately to each part of the input image defined by the resulting bounding boxes. Although Otsu’s thresholding does not generally perform acceptably on camera-captured images, we observed that it is suitable when applied separately to individual regions in this way. The binarized specular components obtained at the initial stage replace the corresponding regions of the latter binarized image. Finally, a set of postprocessing operations removes certain non-text components from the binarized image.
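The region-wise thresholding idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the bounding boxes are already available (the specular-pixel detection, smoothing, Canny edge detection, connected-component labelling and postprocessing steps are omitted), it represents an 8-bit grayscale image as a list of rows, and the names `otsu_threshold`, `merge_overlapping` and `binarize_regions` are hypothetical. Marking the brighter class as 1 and leaving pixels outside every box at 0 are also assumptions.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), inclusive corners (assumed convention)


def otsu_threshold(values: List[int]) -> int:
    """Return the threshold t (0..255) that maximizes between-class variance.

    Values <= t form the low class, values > t the high class.
    """
    hist = [0] * 256
    for v in values:
        hist[v] += 1
    total = len(values)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0 = 0    # pixel count of the low class
    sum0 = 0  # gray-value sum of the low class
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance undefined
        m0, m1 = sum0 / w0, (sum_all - sum0) / w1
        var = w0 * w1 * (m0 - m1) ** 2  # proportional to between-class variance
        if var > best_var:
            best_var, best_t = var, t
    return best_t


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two inclusive axis-aligned boxes share at least one pixel."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def merge_overlapping(boxes: List[Box]) -> List[Box]:
    """Repeatedly replace any overlapping pair by its enclosing box."""
    merged = list(boxes)
    changed = True
    while changed:
        changed = False
        for i in range(len(merged)):
            for j in range(i + 1, len(merged)):
                if boxes_overlap(merged[i], merged[j]):
                    a, b = merged[i], merged[j]
                    merged[i] = (min(a[0], b[0]), min(a[1], b[1]),
                                 max(a[2], b[2]), max(a[3], b[3]))
                    del merged[j]
                    changed = True
                    break
            if changed:
                break  # restart the scan, since the merged box may reach others
    return merged


def binarize_regions(image: List[List[int]], boxes: List[Box]) -> List[List[int]]:
    """Apply Otsu's threshold separately inside each merged bounding box."""
    out = [[0] * len(row) for row in image]  # pixels outside all boxes stay 0
    for x0, y0, x1, y1 in merge_overlapping(boxes):
        region = [image[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
        t = otsu_threshold(region)
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                out[y][x] = 1 if image[y][x] > t else 0
    return out


image = [[10, 200, 50, 60],
         [10, 200, 50, 60]]
local = binarize_regions(image, [(0, 0, 1, 1), (2, 0, 3, 1)])
global_t = otsu_threshold([v for row in image for v in row])
```

In this toy image the right-hand box has low contrast (gray values 50 and 60). A single global Otsu threshold places both values in the same class, whereas thresholding each box separately splits them, which is the kind of behaviour the abstract attributes to applying Otsu's method region by region.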

References

Otsu, N.: A threshold selection method from gray-level histograms. IEEE Trans. Syst., Man Cybern. 9(1), 62–66 (1979)
Kittler, J., Illingworth, J., Foglein, J.: Threshold selection based on a simple image statistic. Comp. Vision Graph. Image Proc. 30(2), 125–147 (1985)
Sauvola, J.J., Pietikainen, M.: Adaptive document image binarization. Patt. Recog. 33(2), 225–236 (2000)
Niblack, W.: An Introduction to Digital Image Processing. Prentice Hall, New York (1986)
Stathis, P., Kavallieratou, E., Papamarkos, N.: An evaluation technique for binarization algorithms. J. Univ. Comp. Sci. 14(18), 3011–3030 (2008)
Peng, X., Setlur, S., Govindaraju, V., Sitaram, R.: Markov random field based binarization for hand-held devices captured document images. In: Proceedings of Indian Conference on Comp. Vision Graph. Image Processing, pp. 71–76 (2010)
Lucas, S.M., Panaretos, A., Sosa, L., Tang, A., Wong, S., Young, R.: ICDAR 2003 robust reading competitions. In: Proceedings of the 7th International Conference on Document Analysis and Recognition, pp. 682–687 (2003)
Shafer, S.A.: Using color to separate reflection components. Color Res. Appl. 10, 210–218 (1985)
He, Y., et al.: Enhancement of camera-based whiteboard images. In: XVII-DRR, SPIE Proceedings Series, vol. 7534, pp. 1–10 (2010)
Canny, J.: A computational approach to edge detection. IEEE Trans. Patt. Anal. Mach. Intell. 8(6), 679–698 (1986)
Roy Chowdhury, A., Bhattacharya, U., Parui, S.K.: Text detection of two major Indian scripts in natural scene images. In: Iwamura, M., Shafait, F. (eds.) CBDAR 2011. LNCS, vol. 7139, pp. 42–57. Springer, Heidelberg (2012)
Roy Chowdhury, A., Bhattacharya, U., Parui, S.K.: Scene text detection using sparse stroke information and MLP. In: Proceedings of International Conference on Pattern Recognition, pp. 294–297 (2012)
Kasar, T., et al.: Font and background color independent text binarization. In: Proceedings of CBDAR, pp. 3–9 (2007)
Epshtein, B., Ofek, E., Wexler, Y.: Detecting text in natural scenes with stroke width transform. In: Proceedings of CVPR, pp. 2963–2970 (2010)
Borgefors, G.: Distance transformations in digital images. Comp. Vis. Graph. Image Proc. 34, 344–371 (1986)
Chen, H., et al.: Robust text detection in natural images with edge-enhanced maximally stable extremal regions. In: Proceedings of ICIP (2011)
Merino-Gracia, C., Lenc, K., Mirmehdi, M.: A head-mounted device for recognizing text in natural scenes. In: Proceedings of CBDAR, pp. 27–32 (2011)
Zhang, J., Kasturi, R.: Text detection using edge gradient and graph spectrum. In: Proceedings of ICPR, pp. 3979–3982 (2010)

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Heritage Institute of Technology, Kolkata, India
Sudipto Banerjee & Koustav Mullick
Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India
Ujjwal Bhattacharya

Corresponding author

Correspondence to Ujjwal Bhattacharya.

Editor information

Editors and Affiliations

Graduate School of Engineering, Osaka Prefecture University, Osaka, Japan
Masakazu Iwamura
The University of Western Australia, Crawley, West Australia, Australia
Faisal Shafait

About this paper

Cite this paper

Banerjee, S., Mullick, K., Bhattacharya, U. (2014). A Robust Approach to Extraction of Texts from Camera Captured Images. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_3

DOI: https://doi.org/10.1007/978-3-319-05167-3_3
Published: 19 March 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05166-6
Online ISBN: 978-3-319-05167-3
eBook Packages: Computer Science, Computer Science (R0)
