An Arabic Optical Character Recognition System Using Restricted Boltzmann Machines

  • Abdullah M. Rashwan
  • Mohamed S. Kamel
  • Fakhri Karray
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8259)

Abstract

Most state-of-the-art Arabic Optical Character Recognition (OCR) systems use Hidden Markov Models (HMMs) to model Arabic characters. Much attention has been paid to providing the HMM system with new features, pre-processing, or post-processing modules to improve performance. In this paper, we present an Arabic OCR system that uses Restricted Boltzmann Machines (RBMs) to model Arabic characters. The recently announced ALTEC dataset for typewritten OCR systems is used to train and test the system. The results show a 26% increase in the average word accuracy rate and an 8% increase in the average character accuracy rate compared to the HMM system.

Copyright information

© Springer-Verlag Berlin Heidelberg 2013

Authors and Affiliations

  • Abdullah M. Rashwan (1)
  • Mohamed S. Kamel (1)
  • Fakhri Karray (1)
  1. University of Waterloo, Canada
