Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus

Ye, Qixiang; Doermann, David

doi:10.1007/978-3-319-05167-3_4

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8357)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

In this paper, we propose an approach to scene text detection that leverages both the appearance and the consensus of connected components. Component appearance is modeled with an SVM-based dictionary classifier, and component consensus is represented with color and spatial layout features. Responses of the dictionary classifier are integrated with the consensus features into a discriminative model, where the importance of each feature is determined by a text-level training procedure. In text detection, hypotheses are generated on component pairs, and an iterative extension procedure aggregates the hypotheses into text objects. During detection, the discriminative model both performs classification and controls the extension. Experiments show that the proposed approach reaches the state of the art in both detection accuracy and computational efficiency and, in particular, performs best on low-resolution text in cluttered backgrounds.
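
The detection loop outlined in the abstract (pair hypotheses, a discriminative score over appearance and consensus, iterative extension) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `Component` fields, the three consensus features, the linear scoring form, the `max_gap` proximity rule, and the merge-on-shared-component extension are all assumptions made for illustration. The `appearance` field stands in for the dictionary classifier response, and `weights` and `bias` are passed in by hand where the paper would learn them with its text-level training procedure.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import List, Sequence


@dataclass(frozen=True)
class Component:
    """A connected component with illustrative (assumed) attributes."""
    x: float           # centre x of the bounding box
    y: float           # centre y of the bounding box
    height: float      # bounding-box height
    color: float       # mean grey level in [0, 1]
    appearance: float  # stand-in for a character-classifier response in [0, 1]


def consensus_features(group: Sequence[Component]) -> List[float]:
    """Colour and layout agreement of a group; each value lies in [0, 1]."""
    colors = [c.color for c in group]
    heights = [c.height for c in group]
    ys = [c.y for c in group]
    mean_height = sum(heights) / len(heights)
    color_sim = 1.0 - (max(colors) - min(colors))
    height_sim = min(heights) / max(heights)
    alignment = max(0.0, 1.0 - (max(ys) - min(ys)) / mean_height)
    return [color_sim, height_sim, alignment]


def score(group: Sequence[Component], weights: Sequence[float], bias: float) -> float:
    """Linear score over mean appearance plus consensus features (assumed form)."""
    appearance = sum(c.appearance for c in group) / len(group)
    features = [appearance] + consensus_features(group)
    return sum(w * f for w, f in zip(weights, features)) + bias


def detect_text(components: Sequence[Component], weights: Sequence[float],
                bias: float, max_gap: float = 2.0) -> List[List[int]]:
    """Return groups of component indices accepted as text objects."""
    def near(a: Component, b: Component) -> bool:
        # Hypothetical proximity rule: horizontal gap bounded by component height.
        return abs(a.x - b.x) <= max_gap * max(a.height, b.height)

    # Step 1: generate hypotheses on component pairs with a positive score.
    hypotheses = []
    for i, j in combinations(range(len(components)), 2):
        pair = [components[i], components[j]]
        if near(*pair) and score(pair, weights, bias) > 0:
            hypotheses.append({i, j})

    # Step 2: iterative extension. Merge two hypotheses that share a component
    # only if the merged group still scores positive, so the same model that
    # classifies also controls how far a hypothesis may grow.
    merged = True
    while merged:
        merged = False
        for a, b in combinations(range(len(hypotheses)), 2):
            if hypotheses[a] & hypotheses[b]:
                union = hypotheses[a] | hypotheses[b]
                if score([components[k] for k in union], weights, bias) > 0:
                    hypotheses[a] = union
                    del hypotheses[b]
                    merged = True
                    break  # indices changed; restart the scan
    return [sorted(h) for h in hypotheses]
```

Calling `detect_text(components, weights, bias)` with four weights (one for appearance, three for consensus) returns lists of component indices, one list per detected text object. Reusing a single score for both acceptance and extension mirrors the abstract's statement that the discriminative model performs classification and controls the extension, but the stopping rule here is only one plausible reading.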

Notes

1. “Connected component” is shortened to “component” in what follows.

Acknowledgement

The partial support of this research by DARPA through BBN/DARPA Award HR0011-08-C-0004 under subcontract 9500009235 and by the US Government through NSF Award IIS-0812111 is gratefully acknowledged.

Author information

Authors and Affiliations

Institute of Advanced Computer Studies, University of Maryland, College Park, USA
Qixiang Ye & David Doermann

Corresponding author

Correspondence to Qixiang Ye.

Editor information

Editors and Affiliations

Graduate School of Engineering, Osaka Prefecture University, Osaka, Japan
Masakazu Iwamura
The University of Western Australia, Crawley, West Australia, Australia
Faisal Shafait

About this paper

Cite this paper

Ye, Q., Doermann, D. (2014). Scene Text Detection via Integrated Discrimination of Component Appearance and Consensus. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2013. Lecture Notes in Computer Science, vol 8357. Springer, Cham. https://doi.org/10.1007/978-3-319-05167-3_4

DOI: https://doi.org/10.1007/978-3-319-05167-3_4
Published: 19 March 2014
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-05166-6
Online ISBN: 978-3-319-05167-3
eBook Packages: Computer Science, Computer Science (R0)
