A Deep Convolution Neural Network Based Model for Enhancing Text Video Frames for Detection

Sunil, C.; Chethan, H. K.; Raghunandan, K. S.; Hemantha Kumar, G.

doi:10.1007/978-3-319-76348-4_42

Conference paper
First Online: 22 March 2018

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 736)

Abstract

Poor results in video text detection are mainly caused by low frame quality, which is affected by factors such as blur, complex backgrounds and illumination; these are among the challenges encountered in image enhancement. This paper proposes a technique that enhances the image quality of video frames, both for better human perception and for text detection. A set of smart and effective CNN denoisers is designed and trained to denoise an image. Using a variable splitting technique, these robust denoisers are plugged into model-based optimization methods within the half quadratic splitting (HQS) framework to handle image deblurring and super-resolution problems. To detect text in the denoised frames, we use state-of-the-art methods such as MSER (Maximally Stable Extremal Regions) and SWT (Stroke Width Transform). Experiments on our own database and on the ICDAR and YVT databases demonstrate the proposed work in terms of precision, recall and F-measure.
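
The idea of plugging a denoiser into HQS can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a known blur kernel, circular image boundaries and NumPy, and it swaps the trained CNN denoiser for a simple neighbourhood-averaging placeholder. The function names and the parameters `mu` and `denoise_strength` are hypothetical, chosen only for illustration. Each iteration alternates a closed-form data-fidelity step in the Fourier domain with a denoising step acting as the image prior.

```python
import numpy as np


def psf_to_otf(kernel, shape):
    """Zero-pad a small blur kernel to `shape` and shift its centre to (0, 0)."""
    padded = np.zeros(shape)
    kh, kw = kernel.shape
    padded[:kh, :kw] = kernel
    padded = np.roll(padded, (-(kh // 2), -(kw // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def placeholder_denoiser(image, strength):
    """Stand-in for a trained CNN denoiser (assumption, not the paper's model).

    Blends the image with its 3x3 neighbourhood mean, using circular boundaries.
    """
    neighbourhood_mean = sum(
        np.roll(image, (dy, dx), axis=(0, 1))
        for dy in (-1, 0, 1)
        for dx in (-1, 0, 1)
    ) / 9.0
    return (1.0 - strength) * image + strength * neighbourhood_mean


def hqs_deblur(blurred, kernel, mu=0.05, denoise_strength=0.2, iterations=10):
    """Non-blind deblurring by half quadratic splitting with a plug-in denoiser."""
    otf = psf_to_otf(kernel, blurred.shape)
    data_term = np.conj(otf) * np.fft.fft2(blurred)
    denominator = np.abs(otf) ** 2 + mu
    z = blurred.copy()  # auxiliary variable introduced by the splitting
    for _ in range(iterations):
        # Data step: minimise ||y - k * x||^2 + mu * ||x - z||^2 in closed form.
        x = np.real(np.fft.ifft2((data_term + mu * np.fft.fft2(z)) / denominator))
        # Prior step: the denoiser plays the role of the image prior.
        z = placeholder_denoiser(x, denoise_strength)
    return x
```

In a fuller pipeline, `placeholder_denoiser` would be replaced by a trained network, and the restored frame would then be passed to a text detector such as MSER or SWT.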

Acknowledgment

The work carried out in this paper was supported by the High Performance Computing Lab under the UPE Grant, Department of Studies in Computer Science, University of Mysore, Mysore.

Author information

Authors and Affiliations

Department of Computer Science and Engineering, Maharaja Research Foundation, Maharaja Institute of Technology, Mysore, Karnataka, India
C. Sunil & H. K. Chethan
Department of Studies in Computer Science, University of Mysore, Mysore, Karnataka, India
K. S. Raghunandan & G. Hemantha Kumar

Corresponding author

Correspondence to C. Sunil.

Editor information

Editors and Affiliations

Machine Intelligence Research Labs, Auburn, Washington, USA
Ajith Abraham
Department of Computer Science, South Asian University, Chanakyapuri, Delhi, India
Pranab Kr. Muhuri
Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka, Durian Tunggal, Melaka, Malaysia
Azah Kamilah Muda
Machine Intelligence Research Labs, Auburn, Washington, USA
Niketa Gandhi

About this paper

Cite this paper

Sunil, C., Chethan, H.K., Raghunandan, K.S., Hemantha Kumar, G. (2018). A Deep Convolution Neural Network Based Model for Enhancing Text Video Frames for Detection. In: Abraham, A., Muhuri, P., Muda, A., Gandhi, N. (eds) Intelligent Systems Design and Applications. ISDA 2017. Advances in Intelligent Systems and Computing, vol 736. Springer, Cham. https://doi.org/10.1007/978-3-319-76348-4_42

DOI: https://doi.org/10.1007/978-3-319-76348-4_42
Published: 22 March 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-76347-7
Online ISBN: 978-3-319-76348-4
eBook Packages: Engineering, Engineering (R0)
