Improving the Perceptual Quality of Document Images Using Deep Neural Network

  • Ram Krishna Pandey
  • A. G. Ramakrishnan
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11555)

Abstract

Given a low-resolution binary document image, we aim to improve its perceptual quality and thereby its readability. We propose a simple deep-learning-based model that combines convolution with transposed convolution and sub-pixel layers to construct the high-resolution image. The proposed architecture applies across the three scripts tested, namely Tamil, Kannada and Roman. To show that the reconstructed output is more readable, we use the character-level accuracy of an optical character recognizer (OCR) as the objective criterion. The results obtained with our CTCS architecture show significant improvement in terms of both the subjective criterion of human readability and the objective criterion of OCR character-level accuracy.
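
The objective criterion named above, OCR character-level accuracy, can be illustrated with a minimal sketch. The abstract does not give the exact formula, so the definition used here is an assumption: accuracy is taken as one minus the Levenshtein (edit) distance between the ground-truth text and the OCR output, divided by the ground-truth length. The function names `edit_distance` and `character_accuracy` are hypothetical and are not taken from the paper.

```python
def edit_distance(reference: str, hypothesis: str) -> int:
    """Levenshtein distance: fewest insertions, deletions and substitutions
    needed to turn `hypothesis` into `reference`."""
    # previous[j] holds the distance between the reference prefix seen so far
    # and the first j characters of the hypothesis.
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # character missing from the OCR output
                current[j - 1] + 1,      # extra character in the OCR output
                previous[j - 1] + cost,  # match or substitution
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, hypothesis: str) -> float:
    """Assumed definition: 1 - edit_distance / len(reference), floored at 0."""
    if not reference:
        return 1.0 if not hypothesis else 0.0
    errors = edit_distance(reference, hypothesis)
    return max(0.0, 1.0 - errors / len(reference))


if __name__ == "__main__":
    ground_truth = "readability"
    ocr_output = "readabi1ity"  # one substituted character
    print(character_accuracy(ground_truth, ocr_output))
```

Under this assumed definition, the same ground-truth text would be scored against the OCR output of the low-resolution input and of the reconstructed image, and the two accuracies compared. Note that Python strings are compared per Unicode code point, so for Tamil and Kannada a single visible character may count as several units unless the text is first segmented into graphemes.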

Keywords

Readability, Binary document image, Super-resolution, Deep learning, OCR

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Department of Electrical Engineering, Indian Institute of Science, Bangalore, India
