Fast and Accurate Tree-Based Clustering for Japanese/Chinese Character Recognition

  • Yuichi Abe
  • Takahiro Sasaki
  • Hideaki Goto
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8157)


Recognizing text in natural scene images is essential for developing systems such as assistive devices for visually impaired people. Multilingual scene text recognition is also becoming important for wearable camera devices with a language translation feature. Since computational resources are limited on such mobile devices, a fast and accurate Optical Character Recognition (OCR) algorithm is needed. Nearest Neighbor (NN) search is widely used in feature vector-based OCR systems, and its speed must be improved. In this paper, we develop an OCR scheme that combines a tree-based clustering technique with Linear Discriminant Analysis (LDA), aiming at real-time Japanese/Chinese character recognition. Experimental results on the ETL9B dataset show that our proposed method is 94.6% faster than our previous method and also outperforms other techniques, at a mere 0.24% accuracy drop from the full linear search.
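
The contrast between full linear NN search and a tree-based clustered search can be illustrated with a minimal Python sketch. It is not the authors' method: the split rule is plain 2-means, the LDA projection is omitted, and the function names (`linear_nn`, `build`, `tree_nn`), the leaf size, and the grid of toy templates are all assumptions made for illustration.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def linear_nn(query, templates):
    """Full linear search: compare the query with every (vector, label) template."""
    return min(templates, key=lambda t: sq_dist(query, t[0]))

def assign(templates, c0, c1):
    """Send each template to the nearer of two centroids (ties go left)."""
    left = [t for t in templates if sq_dist(t[0], c0) <= sq_dist(t[0], c1)]
    right = [t for t in templates if sq_dist(t[0], c0) > sq_dist(t[0], c1)]
    return left, right

def two_means(templates, iters=10):
    """Assumed split rule: a few Lloyd iterations of plain 2-means (no LDA)."""
    vecs = [t[0] for t in templates]
    c0 = vecs[0]
    c1 = max(vecs, key=lambda v: sq_dist(v, c0))  # farthest vector from c0
    for _ in range(iters):
        left, right = assign(templates, c0, c1)
        if not left or not right:
            break
        c0 = mean([t[0] for t in left])
        c1 = mean([t[0] for t in right])
    # Final assignment uses the same centroids that the search will use.
    left, right = assign(templates, c0, c1)
    return c0, c1, left, right

def build(templates, leaf_size=4):
    """Recursively split the templates into a binary tree of clusters."""
    if len(templates) <= leaf_size:
        return {"leaf": templates}
    c0, c1, left, right = two_means(templates)
    if not left or not right:  # degenerate split: keep as one leaf
        return {"leaf": templates}
    return {"c": (c0, c1),
            "kids": (build(left, leaf_size), build(right, leaf_size))}

def tree_nn(query, node):
    """Approximate search: follow the nearer centroid, then scan one leaf."""
    while "leaf" not in node:
        c0, c1 = node["c"]
        go_left = sq_dist(query, c0) <= sq_dist(query, c1)
        node = node["kids"][0 if go_left else 1]
    return linear_nn(query, node["leaf"])

# Toy data: 64 hypothetical "character classes" on an 8x8 grid.
templates = [([float(i % 8), float(i // 8)], f"class{i}") for i in range(64)]
tree = build(templates)
query = [2.2, 4.9]
print(linear_nn(query, templates)[1], tree_nn(query, tree)[1])
```

In this sketch the tree search scans only one small leaf instead of all templates, which is where the speed-up comes from. Because it never backtracks, it can miss the true nearest neighbor for queries near a cluster boundary, which is the kind of accuracy drop relative to full linear search that the abstract quantifies.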


Keywords: Fast Nearest Neighbor search, Linear Discriminant Analysis (LDA), real-time character recognition, Approximate Nearest Neighbor (ANN) search, multilingual OCR



Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Yuichi Abe (1)
  • Takahiro Sasaki (1)
  • Hideaki Goto (2)
  1. Graduate School of Information Sciences, Tohoku University, Japan
  2. Cyberscience Center, Tohoku University, Japan
