Abstract
Document images obtained from scanners or photocopiers usually have a black margin that interferes with subsequent stages of page segmentation. The margins must therefore be removed at the initial stage of a document processing application. This paper presents an algorithm for document margin removal based on detecting document corners from projection profiles. The algorithm makes no restrictive assumptions about the input document image: it requires neither that all four margins be present nor that the corners be right angles. For tilted documents, it also detects and corrects the skew. In our experiments, the algorithm was successfully applied to all images in our databases of French and Arabic documents, which contain more than two hundred images with different types of layouts, noise, and intensity levels.
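To make the notion of a projection profile concrete, the following is a minimal sketch, not the algorithm of the paper. It assumes a binary image stored as a list of rows in which 1 marks a dark pixel, and it assumes an axis-aligned page, so it neither detects corners nor corrects skew. The function names and the `threshold` parameter are hypothetical and chosen only for illustration.

```python
def projection_profiles(image):
    """Return (row_profile, col_profile): dark-pixel counts per row and per column."""
    row_profile = [sum(row) for row in image]
    col_profile = [sum(col) for col in zip(*image)]
    return row_profile, col_profile


def margin_run(profile, full_length, threshold=0.9):
    """Count leading profile entries whose dark fraction is at least `threshold`.

    `threshold` is an illustrative assumption, not a value from the paper.
    """
    count = 0
    for value in profile:
        if value / full_length < threshold:
            break
        count += 1
    return count


def crop_black_margins(image, threshold=0.9):
    """Strip near-solid dark rows/columns from each edge of an unskewed page."""
    height, width = len(image), len(image[0])
    rows, cols = projection_profiles(image)
    top = margin_run(rows, width, threshold)
    bottom = margin_run(reversed(rows), width, threshold)
    left = margin_run(cols, height, threshold)
    right = margin_run(reversed(cols), height, threshold)
    return [row[left:width - right] for row in image[top:height - bottom]]


# A page with a black margin on the top and left only.
page = [
    [1, 1, 1, 1, 1, 1],
    [1, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0, 0],
]
cropped = crop_black_margins(page)
```

The example page has only two of the four margins, which a per-edge scan handles naturally. A tilted page would blur the sharp steps in both profiles, which is the case the corner-based approach described in the abstract is meant to address.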
Copyright information
© 2009 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Haji, M.M., Bui, T.D., Suen, C.Y. (2009). Simultaneous Document Margin Removal and Skew Correction Based on Corner Detection in Projection Profiles. In: Foggia, P., Sansone, C., Vento, M. (eds) Image Analysis and Processing – ICIAP 2009. ICIAP 2009. Lecture Notes in Computer Science, vol 5716. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04146-4_109
DOI: https://doi.org/10.1007/978-3-642-04146-4_109
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04145-7
Online ISBN: 978-3-642-04146-4
eBook Packages: Computer Science (R0)