
Simultaneous Document Margin Removal and Skew Correction Based on Corner Detection in Projection Profiles

  • M. Mehdi Haji
  • Tien D. Bui
  • Ching Y. Suen
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5716)

Abstract

Document images obtained from scanners or photocopiers usually have a black margin that interferes with subsequent stages of page segmentation algorithms, so the margins must be removed at the initial stage of a document processing application. This paper presents an algorithm for document margin removal based on detecting document corners from projection profiles. The algorithm makes no restrictive assumptions about the input document image: it needs neither all four margins to be present nor the corners to be right angles. For tilted documents, it is able to detect and correct the skew. In our experiments, the algorithm was successfully applied to all document images in our databases of French and Arabic documents, which contain more than two hundred images with different types of layouts, noise, and intensity levels.
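To make the notion of a projection profile concrete, the following is a minimal sketch and not the algorithm of the paper. It assumes an axis-aligned (unskewed) grayscale page stored as a list of rows, and it replaces the paper's corner detection with a plain intensity threshold. The names `projection_profiles`, `bright_span`, and `make_test_image`, and the threshold value 128, are hypothetical choices made for illustration only.

```python
from typing import List, Tuple

# Grayscale image as a list of rows; 0 = black, 255 = white (assumed convention).
Image = List[List[int]]


def projection_profiles(image: Image) -> Tuple[List[float], List[float]]:
    """Return (row_profile, col_profile): the mean intensity of each row and column."""
    height, width = len(image), len(image[0])
    row_profile = [sum(row) / width for row in image]
    col_profile = [
        sum(image[r][c] for r in range(height)) / height for c in range(width)
    ]
    return row_profile, col_profile


def bright_span(profile: List[float], threshold: float) -> Tuple[int, int]:
    """Return the first and last indices whose profile value exceeds the threshold.

    This simple thresholding stands in for the corner detection described in
    the abstract; it only works when the page is not tilted.
    """
    bright = [i for i, value in enumerate(profile) if value > threshold]
    if not bright:
        raise ValueError("no profile entry exceeds the threshold")
    return bright[0], bright[-1]


def make_test_image() -> Image:
    """Build an 8x10 synthetic scan: a white page with black margins on the
    top (2 rows), bottom (1 row), and left (3 columns), and no right margin."""
    height, width = 8, 10
    image = [[0] * width for _ in range(height)]
    for r in range(2, 7):
        for c in range(3, width):
            image[r][c] = 255
    return image


if __name__ == "__main__":
    rows, cols = projection_profiles(make_test_image())
    top, bottom = bright_span(rows, threshold=128)
    left, right = bright_span(cols, threshold=128)
    print((top, bottom), (left, right))  # (2, 6) (3, 9)
```

In this synthetic case the page is recovered even though the right margin is absent, which mirrors the abstract's point that not all four margins need to be present. Handling tilted pages, as the paper does, requires more than a threshold and is outside the scope of this sketch.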

Keywords

Document margin; layout analysis; projection profile; corner detection; skew correction

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • M. Mehdi Haji (1)
  • Tien D. Bui (1)
  • Ching Y. Suen (1)
  1. Centre for Pattern Recognition and Machine Intelligence, Concordia University, Montreal, Canada
