Abstract
The field of document image processing has seen rapid growth and increasingly widespread relevance in recent years. Advances in computing technology have kept pace with the rapid growth in the volume of image data across applications. One such application of document image processing is optical character recognition (OCR). Preprocessing is a prerequisite stage in the processing of document images: it transforms the document into a form suitable for subsequent stages. In this paper, various preprocessing techniques are proposed for the enhancement of degraded document images. The implemented algorithms handle a variety of degradations, including the foxing effect, uneven illumination, the show-through effect, stain marks, and pen and other scratch marks. The techniques are based on noise degradation models generated from the attributes of noisy pixels commonly found in degraded or ancient document images. These noise models are then used to detect the noisy regions of the image that undergo enhancement. The enhancement procedures include local normalization, convolution using statistical measures such as the mean and standard deviation, and Sauvola’s adaptive binarization technique. The outcomes of the preprocessing procedure are very promising, and the approach is adaptable to various degraded document scenarios.
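To make the last enhancement step concrete, the following is a minimal sketch of Sauvola's adaptive binarization, not the authors' implementation. It computes a per-pixel threshold T = m * (1 + k * (s / R - 1)) from the local mean m and local standard deviation s of a sliding window. The function names, the window size, and the values k = 0.2 and R = 128 are assumptions chosen for illustration (they are commonly used defaults for 8-bit images); the abstract does not state the parameters used in the paper. The sketch is written in dependency-free Python for clarity; in practice an array library would be used instead.

```python
import math


def integral_image(img):
    """Summed-area table with an extra zero row and column at the top/left."""
    h, w = len(img), len(img[0])
    sat = [[0.0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0.0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def window_sum(sat, y0, x0, y1, x1):
    """Sum of the pixels in rows y0..y1-1 and columns x0..x1-1."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


def sauvola_binarize(img, window=15, k=0.2, R=128.0):
    """Binarize a grayscale image given as a list of rows of 0..255 values.

    window, k and R are illustrative assumptions, not values from the paper.
    Returns an image of the same size with 255 for background and 0 for ink.
    """
    h, w = len(img), len(img[0])
    half = window // 2
    sat = integral_image(img)
    sat_sq = integral_image([[v * v for v in row] for row in img])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        # Clip the window at the image borders.
        y0, y1 = max(0, y - half), min(h, y + half + 1)
        for x in range(w):
            x0, x1 = max(0, x - half), min(w, x + half + 1)
            n = (y1 - y0) * (x1 - x0)
            mean = window_sum(sat, y0, x0, y1, x1) / n
            var = window_sum(sat_sq, y0, x0, y1, x1) / n - mean * mean
            std = math.sqrt(max(var, 0.0))  # guard against tiny negatives
            threshold = mean * (1.0 + k * (std / R - 1.0))
            out[y][x] = 255 if img[y][x] > threshold else 0
    return out
```

Because the threshold drops below the local mean in flat regions (where s is small), uniform background such as a stained or unevenly lit area tends to be mapped to white, while high-contrast strokes keep a threshold close to the local mean. The integral images keep the cost per pixel constant regardless of the window size.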
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Shobha Rani, N., Sajan Jain, A., Kiran, H.R. (2019). A Unified Preprocessing Technique for Enhancement of Degraded Document Images. In: Pandian, D., Fernando, X., Baig, Z., Shi, F. (eds) Proceedings of the International Conference on ISMAC in Computational Vision and Bio-Engineering 2018 (ISMAC-CVB). ISMAC 2018. Lecture Notes in Computational Vision and Biomechanics, vol 30. Springer, Cham. https://doi.org/10.1007/978-3-030-00665-5_23
DOI: https://doi.org/10.1007/978-3-030-00665-5_23
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-00664-8
Online ISBN: 978-3-030-00665-5
eBook Packages: Engineering, Engineering (R0)