Abstract
Detecting characters in historical documents is challenging because the characters are often cursive and connected, yet it is a fundamental step in recognizing such documents. In this paper, we propose a two-dimensional context box proposal network. The network has three parts: feature extraction, two-dimensional context, and box proposal regression. For feature extraction, we employ VGG16 to extract features from an input image. The extracted features are then processed by the two-dimensional context part, which uses two Bidirectional Long Short-Term Memory (BLSTM) networks to capture meaningful context in the vertical and horizontal directions. Finally, bounding boxes are predicted from the output of the two-dimensional context part. We tested the proposed system on 28 books of Kuzushiji documents. The experimental results show the effectiveness of the proposed two-dimensional context for character detection. Our system achieved F1 scores of 88.50% and 92.46% on the validation and testing sets, respectively, outperforming the baseline system (Connectionist Text Proposal Network).
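To make the reported metric concrete, the following is a minimal sketch of how an F1 score for character detection can be computed from predicted and ground-truth boxes. The abstract does not state the matching protocol, so the greedy one-to-one matching, the IoU threshold of 0.5, and the names `iou` and `detection_f1` are illustrative assumptions, not the evaluation procedure used in the paper.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max); this layout is an assumption.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f1(predicted: List[Box], ground_truth: List[Box],
                 iou_threshold: float = 0.5) -> float:
    """F1 score under greedy one-to-one matching (hypothetical protocol)."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predicted:
        # Pick the still-unmatched ground-truth box that overlaps most.
        best = max(unmatched, key=lambda gt: iou(pred, gt), default=None)
        if best is not None and iou(pred, best) >= iou_threshold:
            unmatched.remove(best)  # each ground-truth box matches once
            true_positives += 1
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(ground_truth)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 0, 30, 10)]
    preds = [(0, 0, 10, 10), (50, 50, 60, 60)]
    print(detection_f1(preds, truth))
```

In this toy case one of two predictions matches one of two ground-truth characters, so precision and recall are both 0.5. A stricter or looser IoU threshold changes which predictions count as correct and therefore the score.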
References
Kuzushiji challenge. http://codh.rois.ac.jp/char-shape/
Nguyen, H.T., Ly, N.T., Nguyen, K.C., Nguyen, C.T., Nakagawa, M.: Attempts to recognize anomalously deformed Kana in Japanese historical documents. In: Proceedings of the 2017 Workshop on Historical Document Imaging and Processing, Kyoto, Japan, pp. 31–36, November 2017
Le, A.D., Clanuwat, T., Kitamoto, A.: A human-inspired recognition system for pre-modern Japanese historical documents. IEEE Access 7(1), 84163–84169 (2019)
Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., Ha, D.: Deep learning for classical Japanese literature. In: NeurIPS 2018 Workshop on Machine Learning for Creativity and Design, December 2018
Zhou, X., et al.: EAST: an efficient and accurate scene text detector. In: CVPR (2017)
Tian, Z., Huang, W., He, T., He, P., Qiao, Yu.: Detecting text in natural image with connectionist text proposal network. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9912, pp. 56–72. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46484-8_4
Liu, W., et al.: SSD: single shot multibox detector. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9905, pp. 21–37. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46448-0_2
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Le, A.D. (2019). Detecting Kuzushiji Characters from Historical Documents by Two-Dimensional Context Box Proposal Network. In: Dang, T., Küng, J., Takizawa, M., Bui, S. (eds) Future Data and Security Engineering. FDSE 2019. Lecture Notes in Computer Science, vol 11814. Springer, Cham. https://doi.org/10.1007/978-3-030-35653-8_53
DOI: https://doi.org/10.1007/978-3-030-35653-8_53
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-35652-1
Online ISBN: 978-3-030-35653-8
eBook Packages: Computer Science, Computer Science (R0)