Abstract
As the number of digital multimedia libraries grows rapidly, so does the need to index, browse and retrieve their content efficiently. In this context, text appearing in images is an important source of information for indexing and retrieval. Such text is often superimposed on a complex image background, which makes it difficult for a commercial optical character recognition (OCR) engine to recognize. A text segmentation step, including background removal and binarization, is therefore needed to achieve a satisfactory OCR recognition rate. In this paper, an unsupervised learning method for text segmentation in images with complex backgrounds is presented. First, the colors of the text and the background are determined using a color quantizer. Then, the pixel color and the standard deviation of the wavelet-transformed image are used as features to distinguish text pixels from non-text pixels. A slightly modified k-means algorithm classifies the pixels into text and background and thereby produces a binarized text image. The segmentation result is fed into commercial OCR software to assess the segmentation quality. The performance of our approach is demonstrated by experimental results for a set of video frames.
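The pipeline outlined above can be illustrated with a minimal sketch. It is not the authors' implementation, and several details are assumptions that the abstract does not state: a uniform color quantizer, a one-level Haar wavelet, a 3x3 block window for the standard deviation, and a plain two-cluster k-means whose only modification is that it is seeded with the estimated background and text colors. All function names (dominant_colors, haar_details, local_std, kmeans2, segment) are hypothetical. The sketch further assumes that the most frequent color belongs to the background and that the image contains at least two quantized colors.

```python
from collections import Counter
from statistics import pstdev


def dominant_colors(pixels, step=64):
    """Uniform quantizer (assumption): centres of the two most populated colour bins."""
    bins = Counter(tuple(c // step for c in p) for p in pixels)
    top = [b for b, _ in bins.most_common(2)]
    return [tuple(c * step + step // 2 for c in b) for b in top]


def haar_details(gray):
    """One-level 2-D Haar transform; keeps the three detail coefficients per 2x2 block."""
    h, w = len(gray) // 2, len(gray[0]) // 2
    out = [[None] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            a, b = gray[2 * i][2 * j], gray[2 * i][2 * j + 1]
            c, d = gray[2 * i + 1][2 * j], gray[2 * i + 1][2 * j + 1]
            out[i][j] = ((a - b + c - d) / 2,   # differences between columns
                         (a + b - c - d) / 2,   # differences between rows
                         (a - b - c + d) / 2)   # diagonal differences
    return out


def local_std(details, i, j, r=1):
    """Standard deviation of the detail coefficients in a (2r+1)x(2r+1) block window."""
    vals = [v
            for y in range(max(0, i - r), min(len(details), i + r + 1))
            for x in range(max(0, j - r), min(len(details[0]), j + r + 1))
            for v in details[y][x]]
    return pstdev(vals)


def dist2(u, v):
    """Squared Euclidean distance between two feature vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def kmeans2(features, seeds, iters=20):
    """Two-cluster k-means started from the given seeds instead of random centres."""
    centres = [list(s) for s in seeds]
    labels = []
    for _ in range(iters):
        labels = [min((0, 1), key=lambda k: dist2(x, centres[k])) for x in features]
        new = []
        for k in (0, 1):
            members = [x for x, lab in zip(features, labels) if lab == k]
            new.append([sum(col) / len(members) for col in zip(*members)]
                       if members else centres[k])
        if new == centres:  # assignments are stable
            break
        centres = new
    return labels


def segment(img):
    """img: rows of (r, g, b) tuples with even width and height.
    Returns a binary mask of the same size (1 = text, 0 = background)."""
    h, w = len(img), len(img[0])
    gray = [[sum(p) / 3 for p in row] for row in img]
    det = haar_details(gray)
    # Feature vector per pixel: colour plus local wavelet standard deviation.
    feats = [tuple(img[y][x]) + (local_std(det, y // 2, x // 2),)
             for y in range(h) for x in range(w)]
    # Assumption: the most frequent quantized colour is the background.
    bg, text = dominant_colors([p for row in img for p in row])
    seeds = [bg + (0.0,), text + (max(f[3] for f in feats),)]
    labels = kmeans2(feats, seeds)
    return [labels[y * w:(y + 1) * w] for y in range(h)]
```

Calling segment on a small frame, given as a list of rows of RGB tuples, returns the binarized mask that would then be passed to an OCR engine. The color and wavelet features are left unscaled here, so color differences dominate the distance; a practical system would probably normalize them.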
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gllavata, J., Ewerth, R., Stefi, T., Freisleben, B. (2004). Unsupervised Text Segmentation Using Color and Wavelet Features. In: Enser, P., Kompatsiaris, Y., O’Connor, N.E., Smeaton, A.F., Smeulders, A.W.M. (eds) Image and Video Retrieval. CIVR 2004. Lecture Notes in Computer Science, vol 3115. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-27814-6_28
DOI: https://doi.org/10.1007/978-3-540-27814-6_28
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22539-3
Online ISBN: 978-3-540-27814-6
eBook Packages: Springer Book Archive