
TextNet for Text-Related Image Quality Assessment

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 11140)

Abstract

With the rapid growth of consumer photos, annotating and retrieving such images by the text they contain is becoming increasingly important, and it relies on optical character recognition (OCR) techniques. To predict OCR accuracy, however, text-related image quality assessment (TIQA) is necessary and of great value, especially in online business processes. Focusing on textual content, TIQA aims to compute the quality score of an image by predicting the degree of degradation in its textual regions.

To assess text-related quality on detected textlines, this paper proposes a deep neural network, TextNet, which mainly consists of three parts: an encoder, a decoder, and a prediction layer. The decoder combines the encoded feature maps with the decoded maps through deconvolution and concatenation. The prediction layer performs textline detection and quality assessment with a new loss function. Under the TIQA framework, the overall text-related image quality is computed by pooling the quality scores of all detected textlines through weighted averaging. Experimental results show that the proposed framework performs well in jointly assessing text-related image quality and detecting textlines, even for previously unseen scene images.
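The abstract leaves the architectural and pooling details to the full paper; the sketch below only illustrates the two ideas it names and is not the authors' implementation. It assumes a PyTorch-style model, a decoder stage that upsamples by transposed convolution (deconvolution) and concatenates the matching encoder feature map, and textline areas as the pooling weights; all names, channel sizes, and the area weighting are hypothetical.

import torch
import torch.nn as nn


class DecoderStage(nn.Module):
    """One decoder step: deconvolve the decoded map, then concatenate the encoder map."""

    def __init__(self, in_ch, skip_ch, out_ch):
        super().__init__()
        # Transposed convolution (deconvolution) doubles the spatial resolution.
        self.deconv = nn.ConvTranspose2d(in_ch, out_ch, kernel_size=2, stride=2)
        # A 3x3 convolution fuses the concatenated encoder and decoder features.
        self.fuse = nn.Conv2d(out_ch + skip_ch, out_ch, kernel_size=3, padding=1)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, decoded, encoder_skip):
        up = self.relu(self.deconv(decoded))
        merged = torch.cat([up, encoder_skip], dim=1)  # concatenate along channels
        return self.relu(self.fuse(merged))


def pooled_image_quality(textline_scores, textline_weights):
    """Image-level quality as a weighted average of per-textline quality scores."""
    scores = torch.as_tensor(textline_scores, dtype=torch.float32)
    weights = torch.as_tensor(textline_weights, dtype=torch.float32)
    return float((scores * weights).sum() / weights.sum())


# Fuse a coarse decoded map (1/16 resolution) with a finer encoder map (1/8 resolution).
stage = DecoderStage(in_ch=256, skip_ch=128, out_ch=128)
decoded = torch.randn(1, 256, 20, 20)
encoder_skip = torch.randn(1, 128, 40, 40)
fused = stage(decoded, encoder_skip)  # shape: (1, 128, 40, 40)

# Pool three detected textlines into one image score, weighting by (assumed) textline area.
print(pooled_image_quality([0.9, 0.4, 0.7], [1200.0, 300.0, 800.0]))

Here the weighting simply lets larger textlines dominate the image-level score; the paper may define the weights differently, so treat the area weighting as a placeholder.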


Notes

  1. https://pan.baidu.com/s/1sRPuedHEwdvUYVcGh86uqg.



Author information

Correspondence to Hongyu Li.



Copyright information

© 2018 Springer Nature Switzerland AG

About this paper


Cite this paper

Li, H., Qiu, J., Zhu, F. (2018). TextNet for Text-Related Image Quality Assessment. In: Kůrková, V., Manolopoulos, Y., Hammer, B., Iliadis, L., Maglogiannis, I. (eds) Artificial Neural Networks and Machine Learning – ICANN 2018. Lecture Notes in Computer Science, vol 11140. Springer, Cham. https://doi.org/10.1007/978-3-030-01421-6_27


  • DOI: https://doi.org/10.1007/978-3-030-01421-6_27


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-01420-9

  • Online ISBN: 978-3-030-01421-6

  • eBook Packages: Computer Science, Computer Science (R0)
