Handwritten Bangla character and numeral recognition using convolutional neural network for low-memory GPU
Abstract
In this work, a convolutional neural network (CNN) based architecture is proposed for recognizing isolated handwritten Bangla characters and numerals on a low-memory GPU. The merit of the proposed architecture is that it has fewer trainable parameters than standard deep architectures, which makes it possible to train on a low-memory GPU. Features from various layers of the CNN are fused to handle the multi-scale nature of a character. Spatial pyramid pooling on the fused features produces a fixed-size feature vector, which helps reduce the number of parameters of the proposed model. Extensive experiments have been conducted on various versions of the publicly available Bangla character dataset CMATERdb. The proposed architecture yields results competitive with fine-tuned standard deep architectures such as AlexNet, VGGNet, and GoogLeNet.
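To illustrate why spatial pyramid pooling yields a fixed-size vector regardless of the feature-map size, the following is a minimal pure-Python sketch. It is not the authors' implementation: the pyramid levels (1, 2, 4), the use of max pooling, and the helper `make_map` are assumptions chosen only for illustration, since the abstract does not specify them.

```python
import math


def spatial_pyramid_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool each channel over an n x n grid for every n in `levels`.

    feature_map: list of channels, each a list of rows (H x W floats).
    Returns a flat list of length C * sum(n * n for n in levels).
    The levels and the max operator are illustrative assumptions.
    """
    pooled = []
    for channel in feature_map:
        height, width = len(channel), len(channel[0])
        for n in levels:
            for i in range(n):
                # Row span of bin i; floor/ceil keeps bins non-empty when height >= n.
                r0, r1 = (i * height) // n, math.ceil((i + 1) * height / n)
                for j in range(n):
                    c0, c1 = (j * width) // n, math.ceil((j + 1) * width / n)
                    pooled.append(
                        max(channel[r][c] for r in range(r0, r1) for c in range(c0, c1))
                    )
    return pooled


def make_map(channels, height, width):
    """Hypothetical helper: build a deterministic dummy feature map."""
    return [
        [[float(k + r * width + c) for c in range(width)] for r in range(height)]
        for k in range(channels)
    ]


# Two feature maps with different spatial sizes but the same channel count.
small = spatial_pyramid_pool(make_map(3, 8, 8))
large = spatial_pyramid_pool(make_map(3, 13, 17))
print(len(small), len(large))
```

Both calls return vectors of the same length, 3 × (1 + 4 + 16) = 63, so a classifier layer placed after the pooling sees a fixed input width whatever the spatial size of the fused feature map.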
Keywords
Convolutional neural network, Bangla characters and numerals, Low-memory GPU
Notes
Compliance with ethical standards
Conflict of interest
The authors declare that they have no conflict of interest.