A Video Text Detection Method Based on Key Text Points

Li, Zhi; Liu, Guizhong; Qian, Xueming; Wang, Chen; Ma, Yana; Yang, Yang

doi:10.1007/978-3-642-15702-8_26

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 6297)

Included in the following conference series:

Pacific-Rim Conference on Multimedia

Abstract

This paper proposes a novel video text detection method based on key text points (KTPs). For text detection, each keyframe is decomposed by a wavelet transform. The KTPs are determined from the three resulting high-frequency subbands and merged by morphological operations. An anti-texture-direction-projection method is proposed for text line localization and verification. A fast text tracking scheme is also proposed, in which text detection is performed only on the first keyframe of each text line within its duration. The frames in which a text line appears and disappears are determined by a fast search method. Experimental results show that the proposed text detection method is robust to the font size, style, color, and alignment of the text, and that the proposed text tracking greatly speeds up text detection.
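
The abstract does not state which wavelet is used or how a key text point is selected from the subbands. As a rough illustration only, the sketch below assumes a one-level Haar transform and marks a coefficient position as a candidate KTP when the magnitudes in all three high-frequency subbands exceed a single threshold. The function names `haar_dwt2` and `key_text_points`, the threshold rule, and the choice of Haar are all assumptions made for this example, not the authors' method; the morphological merging step is omitted.

```python
def haar_dwt2(img):
    """One-level 2-D Haar transform of a grayscale image.

    `img` is a list of rows with even height and width. Returns four
    half-size arrays: approximation, horizontal-edge detail,
    vertical-edge detail, and diagonal detail.
    """
    rows, cols = len(img), len(img[0])
    approx, horiz, vert, diag = [], [], [], []
    for r in range(0, rows, 2):
        a_row, h_row, v_row, d_row = [], [], [], []
        for c in range(0, cols, 2):
            p, q = img[r][c], img[r][c + 1]          # top-left, top-right
            s, t = img[r + 1][c], img[r + 1][c + 1]  # bottom-left, bottom-right
            a_row.append((p + q + s + t) / 2.0)  # local average (low-pass)
            h_row.append((p + q - s - t) / 2.0)  # top minus bottom: horizontal edges
            v_row.append((p - q + s - t) / 2.0)  # left minus right: vertical edges
            d_row.append((p - q - s + t) / 2.0)  # diagonal difference
        approx.append(a_row)
        horiz.append(h_row)
        vert.append(v_row)
        diag.append(d_row)
    return approx, horiz, vert, diag


def key_text_points(img, threshold):
    """Boolean mask of candidate KTPs (hypothetical selection rule).

    Assumption: a position is kept only if all three detail subbands
    respond more strongly than `threshold` at that position.
    """
    _, horiz, vert, diag = haar_dwt2(img)
    return [
        [abs(h) > threshold and abs(v) > threshold and abs(d) > threshold
         for h, v, d in zip(h_row, v_row, d_row)]
        for h_row, v_row, d_row in zip(horiz, vert, diag)
    ]
```

Under this assumed rule, a plain vertical or horizontal edge excites only one detail subband and is rejected, while corner-like structures, which are common in character strokes, excite all three.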

Author information

Authors and Affiliations

School of Electronic and Information Engineering, Xi’an Jiaotong University, Xi’an, China
Zhi Li, Guizhong Liu, Xueming Qian, Chen Wang, Yana Ma & Yang Yang

Editor information

Editors and Affiliations

School of Computer Science, University of Nottingham, Jubilee Campus, NG8 1BB, Nottingham, UK
Guoping Qiu
The Centre for Multimedia Signal Processing, The Hong Kong Polytechnic University, Hong Kong, China
Kin Man Lam
Faculty of System Design, Tokyo Metropolitan University, 6-6, Asahigaoka, 191-0065, Hino-city, Tokyo
Hitoshi Kiya
Shanghai Key Laboratory of Intelligent Information Processing, Department of Computer Science & Engineering, Fudan University, Shanghai, China
Xiang-Yang Xue
Department of Electrical Engineering, University of Southern California, 90089-2564, Los Angeles, CA
C.-C. Jay Kuo
LIACS Media Lab, Leiden University,
Michael S. Lew

Cite this paper

Li, Z., Liu, G., Qian, X., Wang, C., Ma, Y., Yang, Y. (2010). A Video Text Detection Method Based on Key Text Points. In: Qiu, G., Lam, K.M., Kiya, H., Xue, XY., Kuo, CC.J., Lew, M.S. (eds) Advances in Multimedia Information Processing - PCM 2010. PCM 2010. Lecture Notes in Computer Science, vol 6297. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-15702-8_26

DOI: https://doi.org/10.1007/978-3-642-15702-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-15701-1
Online ISBN: 978-3-642-15702-8
eBook Packages: Computer Science, Computer Science (R0)
