Handwritten and Printed Text Separation: Linearity and Regularity Assessment

  • Sameh Hamrouni
  • Florence Cloppet
  • Nicole Vincent
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8814)

Abstract

In this paper, we address the issue of discerning handwriting from machine-printed text in real documents. (This work is funded by the PiXL project, supported by the “Fonds national pour la Société Numérique” of the French State: http://valconum.fr/index.php/les-projets/pixl.) We present a reliable method based on a novel set of features, invariant to translation and scaling, that fall into two categories: linearity and regularity. Specifically, a novel linearity measure derived from the histogram of straight line segment lengths is introduced. The resulting framework is independent of the document layout and supports any language written in the Latin script. Its performance is assessed on a dataset of real documents comprising heterogeneous administrative images. Experimental results demonstrate its accuracy, with a recognition rate of up to 90%.
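
The abstract does not give the exact formula for the linearity measure, so the following is only an illustrative sketch of the general idea: build a histogram of straight line segment lengths and summarise how much of a stroke lies in long segments. It assumes the stroke has already been reduced to a polyline (for instance by polygonal approximation) and that segment lengths are divided by a reference size such as the component height. The function names, the bin count, the length weighting, and the final score are all hypothetical choices made for this example, not the authors' definitions.

```python
import math

def segment_lengths(polyline):
    """Euclidean lengths of consecutive segments in a list of (x, y) points."""
    return [math.dist(p, q) for p, q in zip(polyline, polyline[1:])]

def length_histogram(lengths, scale, n_bins=4):
    """Length-weighted histogram of segment lengths divided by `scale`.

    Relative lengths are clipped to [0, 1]. Each segment contributes its
    share of the total stroke length, so the bins sum to 1.
    `scale` is an assumed reference size (e.g. the component height).
    """
    hist = [0.0] * n_bins
    total = sum(lengths)
    if total == 0 or scale <= 0:
        return hist
    for length in lengths:
        rel = min(length / scale, 1.0)
        idx = min(int(rel * n_bins), n_bins - 1)
        hist[idx] += length / total
    return hist

def linearity_score(hist):
    """Hypothetical score: share of stroke length in the upper half of the bins."""
    return sum(hist[len(hist) // 2:])

# A printed-like "L" shape made of two long straight segments.
printed_like = [(0, 0), (0, 10), (6, 10)]
hist = length_histogram(segment_lengths(printed_like), scale=10)
print(hist, linearity_score(hist))
```

Because only segment lengths relative to `scale` are used, shifting the polyline or resizing it together with `scale` leaves the histogram unchanged, which mirrors the translation and scaling invariance claimed in the abstract.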

Keywords

Recognition Rate, Document Image, Straight Line Segment, Text Line, Optical Character Recognition
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer International Publishing Switzerland 2014

Authors and Affiliations

  • Sameh Hamrouni (1), corresponding author
  • Florence Cloppet (1)
  • Nicole Vincent (1)
  1. LIPADE, University of Paris Descartes, Paris, France
