Abstract
Overlaid text is text superimposed on video frames after capture to give viewers additional information. Such text can be used to automatically index and search the files in a video library by their content. In this paper, we propose a novel algorithm to detect and extract overlaid text in digital video, which gives users a deeper understanding of the video content. The proposed algorithm uses a support vector machine (SVM) as its machine learning component to filter and extract text more accurately. It also uses multi-resolution processing, which enables it to extract embedded text of different font sizes from the same video frame. Detecting text in video sequences enables automatic indexing of videos based on the text embedded in their frames. Embedded text also helps deaf and hard-of-hearing users follow the content of a video, and it helps people who want to watch video in sound-sensitive environments.
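To illustrate why multi-resolution processing can pick up text of different font sizes in one frame, the following is a minimal sketch and not the implementation described in this paper. It slides a fixed-size window over successively halved copies of a grayscale frame and maps every accepted window back to original-frame coordinates, so larger text is caught at coarser levels. All names here (`downsample`, `edge_density`, `detect_multiresolution`, the `is_text` callback, and the default window size and level count) are hypothetical. The `is_text` callback stands in for the classifier stage; in the setting of the abstract a trained SVM would take its place.

```python
from typing import Callable, List, Tuple

Image = List[List[float]]        # grayscale frame as rows of pixel intensities
Box = Tuple[int, int, int, int]  # (x, y, width, height) in original-frame pixels


def downsample(img: Image) -> Image:
    """Halve the resolution by averaging each 2x2 block of pixels."""
    h, w = len(img) // 2, len(img[0]) // 2
    return [
        [
            (img[2 * y][2 * x] + img[2 * y][2 * x + 1]
             + img[2 * y + 1][2 * x] + img[2 * y + 1][2 * x + 1]) / 4.0
            for x in range(w)
        ]
        for y in range(h)
    ]


def edge_density(img: Image, x0: int, y0: int, size: int) -> float:
    """Mean absolute horizontal gradient inside a size x size window.

    A simple illustrative feature; it is an assumption, not the paper's.
    """
    total = 0.0
    for y in range(y0, y0 + size):
        for x in range(x0, x0 + size - 1):
            total += abs(img[y][x + 1] - img[y][x])
    return total / (size * (size - 1))


def detect_multiresolution(
    frame: Image,
    is_text: Callable[[Image, int, int, int], bool],
    window: int = 8,
    levels: int = 3,
) -> List[Box]:
    """Run a fixed-size window detector on each level of an image pyramid."""
    boxes: List[Box] = []
    img, scale = frame, 1
    for _ in range(levels):
        if len(img) < window or len(img[0]) < window:
            break  # this level is too small to hold a single window
        # Non-overlapping windows keep the sketch short.
        for y in range(0, len(img) - window + 1, window):
            for x in range(0, len(img[0]) - window + 1, window):
                if is_text(img, x, y, window):
                    # Map the window back to original-frame coordinates.
                    boxes.append((x * scale, y * scale,
                                  window * scale, window * scale))
        img = downsample(img)
        scale *= 2
    return boxes
```

A caller might pass `lambda img, x, y, s: edge_density(img, x, y, s) > 200.0` as `is_text`; the threshold is arbitrary and only serves to make the sketch self-contained.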
Copyright information
© 2018 Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Kumari, L., Dey, V., Raheja, J.L. (2018). A Three-Layer Approach for Overlay Text Extraction in Video Stream. In: Pant, M., Ray, K., Sharma, T., Rawat, S., Bandyopadhyay, A. (eds) Soft Computing: Theories and Applications. Advances in Intelligent Systems and Computing, vol 584. Springer, Singapore. https://doi.org/10.1007/978-981-10-5699-4_9
DOI: https://doi.org/10.1007/978-981-10-5699-4_9
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5698-7
Online ISBN: 978-981-10-5699-4
eBook Packages: Engineering, Engineering (R0)