Abstract
This paper describes a new approach to the segmentation of characters in images on Web pages. In common with the authors' previous work on this subject, the approach attempts to emulate the human ability to differentiate between colours. Pixels of similar colour are first grouped using a colour distance defined in a perceptually uniform colour space (as opposed to the commonly used RGB). The resulting colour connected components are then grouped to form larger (character-like) regions with the aid of a fuzzy propinquity measure. This measure expresses the likelihood of merging two components based on two features. The first feature is the colour distance between the components in the L*a*b* colour space. The second feature expresses the topological relationship between the two components. The results indicate better performance than the authors' previous method, and performance comparable to (possibly better than) other existing methods.
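As an illustration of the first feature, the following is a minimal sketch of a colour distance computed in L*a*b* rather than in RGB. The abstract does not state how pixels are converted or which difference formula is used, so everything here is an assumption: 8-bit sRGB input, a D65 reference white, and the plain Euclidean (CIE 1976) colour difference. The function names `srgb_to_lab` and `delta_e_ab` are hypothetical and not taken from the paper.

```python
import math

# Assumption: pixels are 8-bit sRGB and the reference white is D65.
_WHITE_D65 = (0.95047, 1.0, 1.08883)


def _linearise(c8):
    """Undo the sRGB transfer curve for one 8-bit channel value."""
    c = c8 / 255.0
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def _f(t):
    """Piecewise cube-root function used in the CIE L*a*b* definition."""
    delta = 6.0 / 29.0
    if t > delta ** 3:
        return t ** (1.0 / 3.0)
    return t / (3.0 * delta ** 2) + 4.0 / 29.0


def srgb_to_lab(rgb):
    """Convert an (R, G, B) triple in 0..255 to (L*, a*, b*)."""
    r, g, b = (_linearise(c) for c in rgb)
    # Linear sRGB to CIE XYZ (D65 primaries).
    x = 0.4124564 * r + 0.3575761 * g + 0.1804375 * b
    y = 0.2126729 * r + 0.7151522 * g + 0.0721750 * b
    z = 0.0193339 * r + 0.1191920 * g + 0.9503041 * b
    fx, fy, fz = (_f(v / w) for v, w in zip((x, y, z), _WHITE_D65))
    return (116.0 * fy - 16.0, 500.0 * (fx - fy), 200.0 * (fy - fz))


def delta_e_ab(rgb1, rgb2):
    """Euclidean distance between two sRGB colours measured in L*a*b*."""
    lab1 = srgb_to_lab(rgb1)
    lab2 = srgb_to_lab(rgb2)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(lab1, lab2)))
```

Under these assumptions, two pixels could be treated as "similar" when `delta_e_ab` falls below some threshold. The threshold, and the way this distance is combined with the topological feature in the fuzzy propinquity measure, are described in the paper itself and are not reproduced here.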
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Antonacopoulos, A., Karatzas, D. (2002). Fuzzy Segmentation of Characters in Web Images Based on Human Colour Perception. In: Lopresti, D., Hu, J., Kashi, R. (eds) Document Analysis Systems V. DAS 2002. Lecture Notes in Computer Science, vol 2423. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45869-7_35
DOI: https://doi.org/10.1007/3-540-45869-7_35
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-44068-0
Online ISBN: 978-3-540-45869-2
eBook Packages: Springer Book Archive