Abstract
Poor results in video text detection are mainly caused by low frame quality, which is affected by factors such as blur, complex backgrounds and uneven illumination; these are among the main challenges encountered in image enhancement. This paper proposes a technique that enhances frame quality, both for better human perception and for text detection in video frames. A set of effective CNN denoisers is designed and trained to denoise an image. Using the variable splitting technique, these robust denoisers are plugged into model-based optimization methods within the half-quadratic splitting (HQS) framework to handle image deblurring and super-resolution problems. To detect text in the denoised frames, we use state-of-the-art methods, namely MSER (Maximally Stable Extremal Regions) and SWT (Stroke Width Transform). Experiments on our own database and on the ICDAR and YVT databases evaluate the proposed work in terms of precision, recall and F-measure.
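The plug-in idea mentioned in the abstract can be illustrated with a minimal sketch of an HQS deblurring loop. It is not the authors' implementation: it assumes a known blur kernel and circular convolution, and the trained CNN denoisers are replaced by a hypothetical Gaussian low-pass stand-in (`gaussian_denoiser`). The function names, the penalty schedule `mus` and the weight `lam` are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np


def psf_to_otf(kernel, shape):
    """Zero-pad a small blur kernel to `shape` and shift its centre to (0, 0)."""
    padded = np.zeros(shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def gaussian_denoiser(image, sigma):
    """Placeholder prior (assumption): circular Gaussian smoothing.

    A trained CNN denoiser would be called here instead; `sigma` is simply
    reused as the blur width in pixels for this stand-in.
    """
    fy = np.fft.fftfreq(image.shape[0])[:, None]
    fx = np.fft.fftfreq(image.shape[1])[None, :]
    response = np.exp(-2.0 * (np.pi * sigma) ** 2 * (fx ** 2 + fy ** 2))
    return np.real(np.fft.ifft2(response * np.fft.fft2(image)))


def hqs_deblur(blurred, kernel, denoiser=gaussian_denoiser, lam=0.01, mus=None):
    """Alternate a closed-form data step with a denoising step (HQS).

    Objective: 0.5 * ||y - k * x||^2 + lam * prior(x), split with z = x.
    """
    if mus is None:
        # Illustrative schedule: increasing penalty, hence weaker denoising.
        mus = np.geomspace(0.01, 1.0, 8)
    otf = psf_to_otf(kernel, blurred.shape)
    data_num = np.conj(otf) * np.fft.fft2(blurred)
    data_den = np.abs(otf) ** 2
    z = blurred.astype(float)
    for mu in mus:
        # x-step: quadratic problem solved exactly in the Fourier domain.
        x_hat = (data_num + mu * np.fft.fft2(z)) / (data_den + mu)
        x = np.real(np.fft.ifft2(x_hat))
        # z-step: denoise x at noise level sqrt(lam / mu).
        z = denoiser(x, np.sqrt(lam / mu))
    return z
```

Calling `hqs_deblur(frame, kernel)` on a 2-D grayscale array returns an array of the same shape. Swapping in a different prior only requires passing another callable with the signature `denoiser(image, sigma)`, which is the modularity that variable splitting provides.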
Acknowledgment
The work carried out in this paper was supported by the High Performance Computing Lab, under the UPE Grant, Department of Studies in Computer Science, University of Mysore, Mysore.
Copyright information
© 2018 Springer International Publishing AG, part of Springer Nature
About this paper
Cite this paper
Sunil, C., Chethan, H.K., Raghunandan, K.S., Hemantha Kumar, G. (2018). A Deep Convolution Neural Network Based Model for Enhancing Text Video Frames for Detection. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2017. Advances in Intelligent Systems and Computing, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-76348-4_42
DOI: https://doi.org/10.1007/978-3-319-76348-4_42
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76347-7
Online ISBN: 978-3-319-76348-4
eBook Packages: Engineering, Engineering (R0)