Detecting Kuzushiji Characters from Historical Documents by Two-Dimensional Context Box Proposal Network

  • Conference paper
Future Data and Security Engineering (FDSE 2019)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 11814)

Abstract

Detecting characters in historical documents is challenging because the characters are cursive and connected, yet it is a fundamental step in recognizing such documents. In this paper, we propose a two-dimensional context box proposal network. The network has three parts: feature extraction, two-dimensional context, and box proposal regression. For feature extraction, we employ VGG16 to extract features from an input image. The extracted features are then processed by the two-dimensional context part, in which two Bidirectional Long Short-Term Memory (BLSTM) networks capture meaningful context in the vertical and horizontal directions. Finally, bounding boxes are predicted from the output of the two-dimensional context part. We tested the proposed system on 28 books of Kuzushiji documents. The experimental results show the effectiveness of the proposed two-dimensional context for character detection. Our system achieved F1 scores of 88.50% and 92.46% on the validation and test sets, respectively, outperforming the baseline system (Connectionist Text Proposal Network).
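The F1 scores quoted in the abstract combine precision and recall over detected character boxes. The abstract does not state how a predicted box is matched to a ground-truth box, so the following is only a minimal sketch under stated assumptions: a prediction counts as correct when its intersection over union (IoU) with a still-unmatched ground-truth box is at least 0.5, and matching is greedy in the order the predictions are given. The names `iou` and `detection_f1` and the 0.5 threshold are illustrative choices, not taken from the paper.

```python
from typing import List, Tuple

# A box is (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def detection_f1(
    predicted: List[Box],
    ground_truth: List[Box],
    iou_threshold: float = 0.5,  # assumed threshold, not from the paper
) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) using greedy one-to-one matching."""
    unmatched = list(ground_truth)
    true_positives = 0
    for pred in predicted:
        # Find the unmatched ground-truth box that overlaps this prediction most.
        best_index, best_iou = -1, 0.0
        for index, truth in enumerate(unmatched):
            overlap = iou(pred, truth)
            if overlap > best_iou:
                best_index, best_iou = index, overlap
        if best_index >= 0 and best_iou >= iou_threshold:
            true_positives += 1
            unmatched.pop(best_index)  # each ground-truth box is matched once

    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(ground_truth) if ground_truth else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total > 0 else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    truth = [(0, 0, 10, 10), (20, 0, 30, 10)]
    preds = [(0, 0, 10, 10), (100, 100, 110, 110), (21, 0, 31, 10)]
    print(detection_f1(preds, truth))
```

In the toy example, two of the three predictions match a ground-truth box, so precision is 2/3, recall is 1, and F1 is 0.8. A stricter threshold or a different matching order could change these counts, which is why the evaluation protocol needs to be fixed before comparing systems.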

Author information

Corresponding author

Correspondence to Anh Duc Le.

Copyright information

© 2019 Springer Nature Switzerland AG

About this paper

Cite this paper

Le, A.D. (2019). Detecting Kuzushiji Characters from Historical Documents by Two-Dimensional Context Box Proposal Network. In: Dang, T., Küng, J., Takizawa, M., Bui, S. (eds) Future Data and Security Engineering. FDSE 2019. Lecture Notes in Computer Science, vol 11814. Springer, Cham. https://doi.org/10.1007/978-3-030-35653-8_53

  • DOI: https://doi.org/10.1007/978-3-030-35653-8_53

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-35652-1

  • Online ISBN: 978-3-030-35653-8

  • eBook Packages: Computer Science, Computer Science (R0)
