Handwritten Bangla character and numeral recognition using convolutional neural network for low-memory GPU

Abstract

This work proposes a convolutional neural network (CNN) architecture for recognizing isolated handwritten Bangla characters and numerals on a low-memory GPU. The architecture has fewer trainable parameters than standard deep architectures, which makes it possible to train on a low-memory GPU. Features from several CNN layers are fused to handle the multi-scale nature of characters. Spatial pyramid pooling over the fused features produces a fixed-size feature vector, which helps reduce the number of parameters in the model. Extensive experiments were conducted on several versions of the publicly available Bangla character dataset CMATERdb. The proposed architecture yields results competitive with fine-tuned standard deep architectures such as AlexNet, VGGNet, and GoogLeNet.
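
To illustrate why spatial pyramid pooling yields a fixed-size vector, the following is a minimal pure-Python sketch, not the authors' implementation. The function name `spatial_pyramid_pool`, the pyramid levels (1, 2, 4), the use of max pooling, and the bin-boundary rule are all assumptions chosen for illustration; the abstract does not specify them. The fused feature map is represented here simply as a list of channels, each a 2-D list.

```python
def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a C x H x W feature map (nested lists) into a flat vector.

    For each pyramid level n (hypothetical choice), every channel is split
    into an n x n grid of bins and the maximum of each bin is kept. The
    output length is C * sum(n * n for n in levels), independent of H and W.
    """
    pooled = []
    for channel in feature_map:
        h, w = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                # Floor for the start and ceiling for the end, so the bins
                # cover the whole map and are never empty.
                r0 = (i * h) // n
                r1 = ((i + 1) * h + n - 1) // n
                for j in range(n):
                    c0 = (j * w) // n
                    c1 = ((j + 1) * w + n - 1) // n
                    pooled.append(
                        max(channel[r][c]
                            for r in range(r0, r1)
                            for c in range(c0, c1))
                    )
    return pooled


# Two maps with the same channel count but different spatial sizes
# produce vectors of the same length.
small = [[[float(r * 7 + c) for c in range(7)] for r in range(5)] for _ in range(2)]
large = [[[float(r * 8 + c) for c in range(8)] for r in range(8)] for _ in range(2)]
vec_small = spatial_pyramid_pool(small)
vec_large = spatial_pyramid_pool(large)
print(len(vec_small), len(vec_large))
```

Because the output length depends only on the number of channels and the pyramid levels, a classifier layer placed after the pooling sees the same input size for any feature-map resolution, so its weight count does not grow with the spatial size of the map.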

Author information

Corresponding author

Correspondence to Prateek Keserwani.

Ethics declarations

Conflict of interest

The authors declared that they have no conflict of interest.

About this article

Cite this article

Keserwani, P., Ali, T. & Roy, P.P. Handwritten Bangla character and numeral recognition using convolutional neural network for low-memory GPU. Int. J. Mach. Learn. & Cyber. 10, 3485–3497 (2019). https://doi.org/10.1007/s13042-019-00938-1

Keywords

  • Convolutional neural network
  • Bangla characters and numerals
  • Low-memory GPU