Abstract
Discriminating handwritten from machine-printed text in a scanned document image is an important step because the available Optical Character Recognizers (OCRs) are domain specific. This paper proposes a novel approach that discriminates handwritten and machine-printed word components based on their structure. In the binarized form of a word component, the informative foreground overlays a null background, so 0-to-1 and 1-to-0 transitions occur at the contour of the component structure. The count and occurrence of these transitions are used to discriminate handwritten from machine-printed word components. The proposed method is simple and robust. Extensive experiments were conducted over a wide range of data samples (English words).
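The transition counting described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the scan direction, the exact features, or the decision rule, so everything below is an assumption for illustration only: the function names (`row_transitions`, `column_transitions`, `total_transitions`), the choice to scan both rows and columns, and the toy 5x7 component are hypothetical and are not taken from the paper.

```python
from typing import List

# Binarized word component: 1 = foreground (ink), 0 = null background.
Binary = List[List[int]]


def row_transitions(image: Binary) -> List[int]:
    """Count 0-to-1 and 1-to-0 changes along each row (left to right)."""
    return [sum(1 for a, b in zip(row, row[1:]) if a != b) for row in image]


def column_transitions(image: Binary) -> List[int]:
    """Count 0-to-1 and 1-to-0 changes along each column (top to bottom)."""
    columns = [list(col) for col in zip(*image)]
    return row_transitions(columns)


def total_transitions(image: Binary) -> int:
    """Sum of row-wise and column-wise transition counts (an assumed feature)."""
    return sum(row_transitions(image)) + sum(column_transitions(image))


# Hypothetical toy component, not data from the paper.
word = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]

rows = row_transitions(word)        # [0, 4, 4, 4, 0]
cols = column_transitions(word)     # [0, 2, 4, 0, 4, 2, 0]
total = total_transitions(word)     # 24
```

Each transition marks a point where a scan line crosses the contour of the component, so the per-row and per-column counts form a simple structural profile. How such a profile is turned into a handwritten or machine-printed decision is not stated in the abstract and is left open here.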
Acknowledgments
This work was supported by Visvesvaraya Technological University (VTU) under the Research Grant Scheme 2010-11 (Ref. No. VTU/Aca./2011-12/A-9/13097). The authors acknowledge the support provided by VTU.
Copyright information
© 2013 Springer India
About this paper
Cite this paper
Narayan, S., Gowda, S.D. (2013). Discrimination of Handwritten and Machine Printed Text in Scanned Document Images. In: Chakravarthi, V., Shirur, Y., Prasad, R. (eds) Proceedings of International Conference on VLSI, Communication, Advanced Devices, Signals & Systems and Networking (VCASAN-2013). Lecture Notes in Electrical Engineering, vol 258. Springer, India. https://doi.org/10.1007/978-81-322-1524-0_46
DOI: https://doi.org/10.1007/978-81-322-1524-0_46
Publisher Name: Springer, India
Print ISBN: 978-81-322-1523-3
Online ISBN: 978-81-322-1524-0
eBook Packages: Engineering, Engineering (R0)