NEOCR: A Configurable Dataset for Natural Image Text Recognition

Nagy, Robert; Dicker, Anders; Meyer-Wegener, Klaus

doi:10.1007/978-3-642-29364-1_12

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7139)

Included in the following conference series:

International Workshop on Camera-Based Document Analysis and Recognition

Abstract

Recently, growing attention has been paid to recognizing text in natural images. Natural image text OCR is far more complex than OCR in scanned documents. Text in real-world environments appears in arbitrary colors, font sizes and font types, and is often affected by perspective distortion, lighting effects, textures or occlusion. Currently, no publicly available dataset covers all aspects of natural image OCR. We propose a comprehensive, well-annotated, configurable dataset for optical character recognition in natural images, intended for the evaluation and comparison of approaches tackling natural image text OCR. Based on the rich annotations of the proposed NEOCR dataset, new and more precise evaluations are now possible, which give more detailed information on where improvements are most needed in natural image text OCR.

Author information

Authors and Affiliations

Computer Science 6 (Data Management), University of Erlangen-Nürnberg, Martensstr. 3., Erlangen, Germany
Robert Nagy, Anders Dicker & Klaus Meyer-Wegener

Editor information

Editors and Affiliations

Graduate School of Engineering, Dept. of Computer Science and Intelligent Systems, Osaka Prefecture University, 1-1 Gakuencho, Naka Sakai, 599-8531, Osaka, Japan
Masakazu Iwamura
German Research Center for Artificial Intelligence, Multimedia Analysis and Data Mining Competence Center, Trippstadter Str. 122, 67663, Kaiserslautern, Germany
Faisal Shafait

About this paper

Cite this paper

Nagy, R., Dicker, A., Meyer-Wegener, K. (2012). NEOCR: A Configurable Dataset for Natural Image Text Recognition. In: Iwamura, M., Shafait, F. (eds) Camera-Based Document Analysis and Recognition. CBDAR 2011. Lecture Notes in Computer Science, vol 7139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-29364-1_12

DOI: https://doi.org/10.1007/978-3-642-29364-1_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-29363-4
Online ISBN: 978-3-642-29364-1
eBook Packages: Computer Science, Computer Science (R0)