Abstract
This paper presents a new entropy-based segmentation algorithm for images of historical documents. The algorithm produces high-quality images and also improves OCR (Optical Character Recognition) results for typed documents. It adapts its settings to achieve better image quality by changing the logarithmic base that defines the entropy. For this purpose, a measure of image fidelity is applied, together with information inherent to document images.
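To illustrate what changing the logarithmic base of an entropy means, the following is a minimal sketch, not the paper's algorithm. It computes the Shannon entropy of a gray-level histogram with a selectable base. The function name `histogram_entropy` and the toy histogram are assumptions made for illustration only; the abstract does not say how the base is chosen or how the entropy is turned into a threshold.

```python
import math


def histogram_entropy(hist, base=2.0):
    """Shannon entropy of a histogram of counts, in units set by `base`.

    Hypothetical helper for illustration; not taken from the paper.
    """
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    entropy = 0.0
    for count in hist:
        if count > 0:  # empty bins contribute nothing (0 * log 0 := 0)
            p = count / total
            entropy -= p * math.log(p, base)
    return entropy


# Toy 4-level histogram (assumed data): probabilities 1/2, 1/4, 1/8, 1/8.
hist = [8, 4, 2, 2]
h_base2 = histogram_entropy(hist, base=2)  # approximately 1.75
h_base4 = histogram_entropy(hist, base=4)  # approximately 0.875
```

For a fixed histogram, switching from base 2 to base b divides the entropy by log2(b), so the base acts as a scale factor on the entropy value. Any procedure that compares the entropy against a fixed quantity will therefore respond to the choice of base.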
Copyright information
© 2004 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
de Mello, C.A.B. (2004). Image Segmentation of Historical Documents: Using a Quality Index. In: Campilho, A., Kamel, M. (eds) Image Analysis and Recognition. ICIAR 2004. Lecture Notes in Computer Science, vol 3212. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30126-4_26
DOI: https://doi.org/10.1007/978-3-540-30126-4_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23240-7
Online ISBN: 978-3-540-30126-4
eBook Packages: Springer Book Archive