Abstract
Recognizing text in natural images has recently received growing attention. OCR on natural images is far more complex than OCR on scanned documents. Text in real-world environments appears in arbitrary colors, font sizes and font types, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently, no publicly available dataset covers all aspects of natural image OCR. We propose a comprehensive, well-annotated, configurable dataset for optical character recognition in natural images, intended for the evaluation and comparison of approaches to natural image text OCR. The rich annotations of the proposed NEOCR dataset make new and more precise evaluations possible, which give more detailed information on where improvements are most required in natural image text OCR.
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Nagy, R., Dicker, A., Meyer-Wegener, K. (2012). NEOCR: A Configurable Dataset for Natural Image Text Recognition. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_12
DOI: https://doi.org/10.1007/978-3-642-29364-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29363-4
Online ISBN: 978-3-642-29364-1
eBook Packages: Computer Science, Computer Science (R0)