Video Preprocessing

Lu, Tong; Palaiahnakote, Shivakumara; Tan, Chew Lim; Liu, Wenyin

doi:10.1007/978-1-4471-6515-6_2

Part of the book series: Advances in Computer Vision and Pattern Recognition (ACVPR)

Abstract

Text in video varies in font style, size, color, orientation, and brightness, so video preprocessing is needed to reduce the complexity of the steps that follow: video text detection, localization, segmentation, recognition, and script identification. This chapter gives a brief overview of the preprocessing techniques commonly used in video text detection. After introducing image preprocessing operators, we discuss several color-based and texture-based preprocessing techniques. Because image segmentation plays an important role in video text detection, we then introduce several image segmentation approaches. Finally, we introduce motion analysis, which can improve the efficiency or accuracy of video text detection by tracking text across consecutive frames. Most of the operators and methods introduced here have been implemented in MATLAB or OpenCV (Open Source Computer Vision Library), and readers can use these open-source implementations for practice.
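
As a rough illustration of the kind of preprocessing operators the abstract refers to, the sketch below converts a color frame to grayscale and binarizes it with a histogram-based global threshold (Otsu's method). It is not taken from the chapter: the function names, the BT.601 luminance weights, and the choice of Otsu thresholding are assumptions made for illustration, and the code uses only the Python standard library rather than MATLAB or OpenCV.

```python
# Illustrative sketch only; names and parameter choices are assumptions,
# not the chapter's own implementation.

def to_gray(rgb_frame):
    """Convert rows of (R, G, B) tuples to 8-bit luminance (BT.601 weights)."""
    return [
        [int(round(0.299 * r + 0.587 * g + 0.114 * b)) for (r, g, b) in row]
        for row in rgb_frame
    ]


def otsu_threshold(gray):
    """Return the gray level that maximizes between-class variance."""
    hist = [0] * 256
    for row in gray:
        for value in row:
            hist[value] += 1

    total = sum(hist)
    sum_all = sum(level * count for level, count in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0      # number of pixels with value <= t
    sum0 = 0    # sum of gray levels of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0:
            continue        # no pixels in the dark class yet
        if w1 == 0:
            break           # every pixel is already in the dark class
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        score = w0 * w1 * (mean0 - mean1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(gray, threshold):
    """Mark pixels brighter than the threshold as 1, all others as 0."""
    return [[1 if value > threshold else 0 for value in row] for row in gray]


if __name__ == "__main__":
    # Tiny synthetic frame: bright "text" pixels on a dark background.
    dark, bright = (20, 20, 20), (230, 230, 230)
    frame = [[dark, bright, dark], [bright, bright, dark]]
    gray = to_gray(frame)
    t = otsu_threshold(gray)
    print(binarize(gray, t))
```

On real frames, a library routine would normally replace these loops; the pure-Python version is only meant to make the two steps explicit.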

Author information

Authors and Affiliations

Department of Computer Science and Technology, Nanjing University, Nanjing, China
Tong Lu
Faculty of CSIT, University of Malaya, Kuala Lumpur, Malaysia
Shivakumara Palaiahnakote
National University of Singapore, Singapore, Singapore
Chew Lim Tan
Multimedia Software Engineering Research Center, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
Wenyin Liu

About this chapter

Cite this chapter

Lu, T., Palaiahnakote, S., Tan, C.L., Liu, W. (2014). Video Preprocessing. In: Video Text Detection. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-6515-6_2

DOI: https://doi.org/10.1007/978-1-4471-6515-6_2
Published: 30 June 2014
Publisher Name: Springer, London
Print ISBN: 978-1-4471-6514-9
Online ISBN: 978-1-4471-6515-6
eBook Packages: Computer Science, Computer Science (R0)
