Abstract
This chapter introduces methods for text binarization and recognition as post-processing steps for text detection. Binarization separates text from the background of a detected text block. As an example, the chapter presents a fusion method that combines wavelet and gradient bands of the text lines and then applies k-means clustering to every row and column to binarize the image. Because video and natural scene images suffer from low resolution and complex backgrounds, it is hard to develop binarization methods that preserve character shapes without losing text pixels. The chapter therefore also presents a method for character shape reconstruction using the ring radius transform. For each pixel in the edge image of the input character, the method computes a radius value, defined as the distance to the nearest white pixel. Medial axis pixels are then found horizontally and vertically by selecting the maximum radius values between the strokes. These medial axis values help fill the gaps between end points while preserving the shape of the character.
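The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the chapter's implementation and rests on several assumptions: the wavelet-gradient fusion step is skipped and a precomputed 2-D `fused` map is taken as input; the row-wise and column-wise k-means results are combined with a logical AND (the abstract does not say how they are combined); the radius is the Euclidean distance to the nearest edge pixel, computed by brute force; and medial axis candidates are taken between every pair of consecutive edge pixels without checking whether the span lies inside a stroke. All function names are hypothetical, and gap filling itself is not shown.

```python
import numpy as np

def kmeans2_1d(values, iters=20):
    """Two-cluster k-means on 1-D values; True marks the higher-mean cluster."""
    values = np.asarray(values, dtype=float)
    lo, hi = values.min(), values.max()
    if lo == hi:                       # constant line: nothing to separate
        return np.zeros(values.shape, dtype=bool)
    for _ in range(iters):
        high = np.abs(values - hi) < np.abs(values - lo)
        new_lo, new_hi = values[~high].mean(), values[high].mean()
        if new_lo == lo and new_hi == hi:
            break
        lo, hi = new_lo, new_hi
    return high

def binarize_rows_cols(fused):
    """Cluster every row and every column, then keep pixels both call text.
    Assumption: text has high fused response; AND is an illustrative choice."""
    fused = np.asarray(fused, dtype=float)
    by_row = np.array([kmeans2_1d(r) for r in fused])
    by_col = np.array([kmeans2_1d(c) for c in fused.T]).T
    return by_row & by_col

def ring_radius(edge):
    """Distance from each pixel to the nearest white (edge) pixel.
    Brute force, O(H*W*N) memory: only suitable for small character images."""
    edge = np.asarray(edge, dtype=bool)
    ys, xs = np.nonzero(edge)
    if ys.size == 0:
        raise ValueError("edge image contains no white pixels")
    rows, cols = np.indices(edge.shape)
    d2 = (rows[..., None] - ys) ** 2 + (cols[..., None] - xs) ** 2
    return np.sqrt(d2.min(axis=-1))

def _axis_along_rows(radius, edge):
    """Mark the maximum-radius pixel between consecutive edge pixels in each row."""
    out = np.zeros(edge.shape, dtype=bool)
    for r in range(edge.shape[0]):
        cols = np.flatnonzero(edge[r])
        for a, b in zip(cols[:-1], cols[1:]):
            if b - a > 1:              # at least one pixel lies between the edges
                out[r, a + 1 + np.argmax(radius[r, a + 1:b])] = True
    return out

def medial_axis_candidates(radius, edge):
    """Union of the horizontal and vertical scans."""
    edge = np.asarray(edge, dtype=bool)
    horizontal = _axis_along_rows(radius, edge)
    vertical = _axis_along_rows(radius.T, edge.T).T
    return horizontal | vertical
```

For two parallel vertical edges, the horizontal scan marks the column midway between them, which is the behaviour the abstract describes for a straight stroke. For larger images, a library distance transform would be a more practical replacement for the brute-force `ring_radius`.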
Copyright information
© 2014 Springer-Verlag London
About this chapter
Cite this chapter
Lu, T., Palaiahnakote, S., Tan, C.L., Liu, W. (2014). Post-processing of Video Text Detection. In: Video Text Detection. Advances in Computer Vision and Pattern Recognition. Springer, London. https://doi.org/10.1007/978-1-4471-6515-6_5
DOI: https://doi.org/10.1007/978-1-4471-6515-6_5
Publisher Name: Springer, London
Print ISBN: 978-1-4471-6514-9
Online ISBN: 978-1-4471-6515-6
eBook Packages: Computer Science, Computer Science (R0)