The Method of Clearing Printed and Handwritten Texts from Noise

  • S. Chernenko
  • S. Lychko
  • M. Kovalkova
  • Y. Esina
  • V. Timofeev
  • K. Varshamov
  • A. Karlov
  • A. Pozdeev
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 1156)

Abstract

The article reviews existing methods and algorithms for removing noise from printed and handwritten texts and proposes an alternative approach. Among the solutions analyzed, a group of methods based on adaptive thresholding is singled out. Our method for removing noise from printed and handwritten documents is based on an ensemble of a convolutional neural network with a U-Net architecture and a multilayer perceptron. Applying the convolutional neural network and the multilayer perceptron in sequence proves highly efficient on small training sets. Applying our method to the entire test sample achieved an image cleaning degree of 93%. In the future, these methods can be deployed in libraries, hospitals, and news companies, where people work with non-digitized papers that need to be digitized.
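
To make the adaptive threshold conversion (adaptive thresholding) baseline concrete, here is a minimal pure-Python sketch in the spirit of integral-image thresholding. It is not the authors' method, which combines a U-Net and a multilayer perceptron; it only illustrates the kind of classical technique the review groups together. The function name `adaptive_threshold`, the window size, and the percentage `t` are illustrative assumptions, not values taken from the paper.

```python
def adaptive_threshold(image, window=3, t=15):
    """Binarize a grayscale image given as a list of rows (values 0-255).

    A pixel becomes black (0) when it is more than t percent darker than
    the mean of the window x window neighbourhood around it; otherwise it
    becomes white (255). Parameter defaults are assumptions for this sketch.
    """
    height = len(image)
    width = len(image[0])

    # Integral image with an extra zero row and column:
    # integral[y][x] holds the sum of all pixels in image[0:y][0:x].
    integral = [[0] * (width + 1) for _ in range(height + 1)]
    for y in range(height):
        row_sum = 0
        for x in range(width):
            row_sum += image[y][x]
            integral[y + 1][x + 1] = integral[y][x + 1] + row_sum

    half = window // 2
    result = [[255] * width for _ in range(height)]
    for y in range(height):
        # Clip the window at the image borders.
        y0, y1 = max(0, y - half), min(height, y + half + 1)
        for x in range(width):
            x0, x1 = max(0, x - half), min(width, x + half + 1)
            count = (y1 - y0) * (x1 - x0)
            # Window sum from four lookups in the integral image.
            total = (integral[y1][x1] - integral[y0][x1]
                     - integral[y1][x0] + integral[y0][x0])
            # Compare the pixel with (1 - t/100) * local mean, without division.
            if image[y][x] * count * 100 < total * (100 - t):
                result[y][x] = 0
    return result


# Usage: a light page (200) with one dark "ink" pixel (50) in the centre.
page = [[200] * 5 for _ in range(5)]
page[2][2] = 50
binary = adaptive_threshold(page)
```

Because the threshold follows the local mean, slow changes in background brightness (stains, uneven lighting) shift the threshold with them, which is why this family of methods is a common starting point for document cleaning.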

Keywords

  • Computer vision
  • Artificial intelligence
  • CNN
  • Neural networks
  • Segmentation
  • OCR
  • U-net
  • Backpropagation


Copyright information

© Springer Nature Switzerland AG 2020
