An End-to-End Preprocessor Based on Adversarial Learning for Mongolian Historical Document OCR

Su, Xiangdong; Xu, Huali; Zhang, Yue; Kang, Yanke; Gao, Guanglai; Batusiren

doi:10.1007/978-3-030-29894-4_21

Conference paper
First Online: 23 August 2019

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 11672)

Abstract

In Mongolian historical document recognition, preprocessing mainly involves image binarization and denoising. It is a challenging task that greatly affects recognition accuracy. Since image binarization and denoising are both image-to-image tasks, this paper proposes an end-to-end preprocessor for Mongolian historical document OCR. The preprocessor is trained in an adversarial learning fashion and deals with binarization and denoising simultaneously. Its input is a color image of a Mongolian document, and its output is a clean binary image that can be used for word recognition. The preprocessor was trained on a limited dataset and performed better than the combination of binarization and denoising methods used in earlier Mongolian historical document OCR systems.
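For orientation, the kind of two-stage pipeline that an end-to-end preprocessor is meant to replace can be sketched as follows. This is a minimal illustration only: the abstract does not say which binarization and denoising methods the earlier systems used, so Otsu's global threshold and a 3x3 majority (binary median) filter are assumptions chosen for simplicity, and all function names are hypothetical.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each in 0..255


def to_gray(image: List[List[Pixel]]) -> List[List[int]]:
    """Convert RGB pixels to 8-bit luminance using BT.601 weights."""
    return [[(299 * r + 587 * g + 114 * b) // 1000 for (r, g, b) in row]
            for row in image]


def otsu_threshold(gray: List[List[int]]) -> int:
    """Return the threshold t that maximizes between-class variance.

    Pixels with value <= t form the dark class (assumed to be ink).
    """
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w0, sum0 = 0, 0
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        if w0 == 0:
            continue  # no dark pixels yet
        w1 = total - w0
        if w1 == 0:
            break  # every pixel is already in the dark class
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_t, best_var = t, var_between
    return best_t


def binarize(gray: List[List[int]]) -> List[List[int]]:
    """Stage 1 (assumed): mark ink as 1 and background as 0."""
    t = otsu_threshold(gray)
    return [[1 if v <= t else 0 for v in row] for row in gray]


def majority_filter(binary: List[List[int]]) -> List[List[int]]:
    """Stage 2 (assumed): 3x3 majority filter with replicated borders."""
    h, w = len(binary), len(binary[0])
    out = [[0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            count = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    rr = min(max(r + dr, 0), h - 1)
                    cc = min(max(c + dc, 0), w - 1)
                    count += binary[rr][cc]
            out[r][c] = 1 if count >= 5 else 0
    return out


def two_stage_preprocess(image: List[List[Pixel]]) -> List[List[int]]:
    """Color image in, cleaned binary image out, via two separate stages."""
    return majority_filter(binarize(to_gray(image)))
```

In this sketch each stage is tuned and applied independently, so an error in the threshold carries straight into the denoising step. The preprocessor described in the abstract instead learns a single mapping from the color image to the clean binary image.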

Acknowledgment

This work was funded by the National Natural Science Foundation of China (Grant No. 61762069), the Natural Science Foundation of Inner Mongolia Autonomous Region (Grant No. 2017BS0601, Grant No. 2018MS06025), and the program of higher-level talents of Inner Mongolia University (Grant No. 21500-5165161).

Author information

Authors and Affiliations

College of Computer Science, Inner Mongolia University, Hohhot, China
Xiangdong Su, Huali Xu, Yue Zhang, Yanke Kang, Guanglai Gao & Batusiren

Corresponding authors

Correspondence to Xiangdong Su, Huali Xu or Yue Zhang.

Editor information

Editors and Affiliations

Department of Computing, Macquarie University, Sydney, NSW, Australia
Abhaya C. Nayak
RIKEN Center for Integrative Medical Sciences, Yokohama, Japan
Alok Sharma

About this paper

Cite this paper

Su, X., Xu, H., Zhang, Y., Kang, Y., Gao, G., Batusiren (2019). An End-to-End Preprocessor Based on Adversarial Learning for Mongolian Historical Document OCR. In: Nayak, A., Sharma, A. (eds) PRICAI 2019: Trends in Artificial Intelligence. PRICAI 2019. Lecture Notes in Computer Science, vol 11672. Springer, Cham. https://doi.org/10.1007/978-3-030-29894-4_21

DOI: https://doi.org/10.1007/978-3-030-29894-4_21
Published: 23 August 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29893-7
Online ISBN: 978-3-030-29894-4
eBook Packages: Computer Science, Computer Science (R0)