Handwritten and Printed Text Separation: Linearity and Regularity Assessment
In this paper, we address the problem of distinguishing handwriting from machine-printed text in real documents. We present a reliable method based on a novel set of features belonging to two categories, linearity and regularity, both invariant to translation and scaling. Specifically, we introduce a novel linearity measure derived from the histogram of straight line segment lengths. The resulting framework is independent of the document layout and supports any Latin-script language. Its performance is assessed on a dataset of real documents comprising heterogeneous administrative images. Experimental results demonstrate its accuracy, with a recognition rate of up to 90%. (This work is funded by the PiXL project, supported by the “Fonds national pour la Société Numérique” of the French State. http://valconum.fr/index.php/les-projets/pixl)
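To make the idea of a linearity measure built from straight segment lengths more concrete, the following is a minimal sketch and not the measure defined in the paper. It assumes a binarized image given as rows of 0/1 pixels, considers only horizontal runs of foreground pixels as "straight segments", and scores a region by the fraction of foreground pixels lying in runs of at least `min_length` pixels. The function names, the `min_length` threshold, and the scoring rule are all hypothetical; in particular, a fixed pixel threshold is not scale-invariant, so a real implementation would need to normalize it (for example by text height).

```python
from collections import Counter


def horizontal_run_lengths(image):
    """Return the lengths of all maximal horizontal runs of foreground (1) pixels."""
    runs = []
    for row in image:
        length = 0
        for pixel in row:
            if pixel:
                length += 1
            elif length:
                # A background pixel closes the current run.
                runs.append(length)
                length = 0
        if length:
            # Close a run that reaches the end of the row.
            runs.append(length)
    return runs


def run_length_histogram(image):
    """Histogram mapping run length -> number of runs of that length."""
    return Counter(horizontal_run_lengths(image))


def linearity_score(image, min_length=4):
    """Hypothetical score: share of foreground pixels in runs >= min_length."""
    runs = horizontal_run_lengths(image)
    total = sum(runs)
    if total == 0:
        return 0.0
    long_pixels = sum(r for r in runs if r >= min_length)
    return long_pixels / total


# Toy inputs (assumed, not taken from the paper's dataset).
printed_like = [
    [1, 1, 1, 1, 1, 0],
    [0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 0],
]
handwritten_like = [
    [1, 0, 1, 1, 0, 1],
    [0, 1, 0, 1, 1, 0],
    [1, 1, 0, 0, 1, 0],
]

print(run_length_histogram(printed_like))
print(linearity_score(printed_like), linearity_score(handwritten_like))
```

In this toy setting the region made of long straight strokes scores higher than the one made of short fragmented strokes, which is the intuition behind using segment-length statistics to separate printed from handwritten text.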
Keywords: Recognition Rate, Document Image, Straight Line Segment, Text Line, Optical Character Recognition