Abstract
Written words are among the most indispensable carriers of information in human life, so analyzing and understanding their meaning is important. Compared with general visual elements, text conveys rich, high-level semantic information, which helps a computer better understand semantic content. With the rapid development of computer technology, research on text recognition has made great progress. Recognizing text in natural scenes, however, remains limited: scene images contain more interference and complexity than plain text, which makes recognizing the text in them challenging. This paper focuses on text recognition in natural scene images and studies a recognition method based on deep learning. First, a text recognition model is built in Keras using the Dense Convolutional Network (DenseNet) architecture and an existing standard test data set. Second, each character is classified using Softmax outputs, so that automatically learned contextual features replace manually defined ones, which improves recognition efficiency and accuracy. Finally, text recognition on natural scene images is realized. The method is also suitable for problems encountered in medical images.
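The per-character Softmax classification step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the alphabet, the helper names (`softmax`, `classify_characters`, `one_hot_scores`) and the made-up class scores are assumptions for illustration only. In the paper's setting the scores would come from the DenseNet model rather than being constructed by hand.

```python
import math

# Hypothetical character set; the abstract does not list the one used in the paper.
ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"

def softmax(scores):
    """Turn raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify_characters(score_rows):
    """Pick the most probable character for each row of class scores."""
    chars = []
    for scores in score_rows:
        probs = softmax(scores)
        best = max(range(len(probs)), key=probs.__getitem__)
        chars.append(ALPHABET[best])
    return "".join(chars)

def one_hot_scores(ch, strength=5.0):
    """Build made-up scores that favour one character (illustration only)."""
    return [strength if c == ch else 0.0 for c in ALPHABET]

# Stand-in for network outputs: one score row per character position.
word = classify_characters([one_hot_scores(c) for c in "exit"])
```

Each row of scores is normalized independently, so the predicted word is simply the sequence of highest-probability characters, one per position.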
Acknowledgments
This study was supported by the Scientific Research Foundation of CUIT (KYTZ201718).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Li, Y., Luo, C., Yang, H., Wu, T. (2019). Convolutional Neural Networks for Scene Image Recognition. In: Sun, X., Pan, Z., Bertino, E. (eds) Artificial Intelligence and Security. ICAIS 2019. Lecture Notes in Computer Science, vol 11632. Springer, Cham. https://doi.org/10.1007/978-3-030-24274-9_42
DOI: https://doi.org/10.1007/978-3-030-24274-9_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-24273-2
Online ISBN: 978-3-030-24274-9
eBook Packages: Computer Science (R0)