Performance evaluation of classifiers for the recognition of offline handwritten Gurmukhi characters and numerals: a study

Abstract

Classification is the process of assigning patterns to one of a number of classes using statistical properties and artificial intelligence techniques. It is one of the central problems in application development and in efficient data analysis. Because they can learn adaptively and handle complex computations, classifiers are well suited to pattern recognition problems. This paper presents a comparative study of several classifiers and the results they achieve on the recognition of offline handwritten Gurmukhi characters and numerals. The classifiers evaluated in this study are k-nearest neighbors, linear support vector machine (SVM), RBF-SVM, Naive Bayes, decision tree, convolutional neural network and random forest. For the experimental work, the authors used a balanced data set of 13,000 samples comprising 7000 characters and 6000 numerals. To assess the performance of the classifiers, the authors used the Waikato Environment for Knowledge Analysis, an open source tool for machine learning. Performance is assessed on several parameters: accuracy rate, size of the data set, time taken to train the model, false acceptance rate, false rejection rate and area under the receiver operating characteristic curve. The paper also compares the correctness of the tests obtained with the selected classifiers. The experimental results indicate that the classifiers considered in this study have complementary strengths and should be combined in a hybrid manner to achieve higher accuracy rates. From the experiments, their comparison and analysis, it is concluded that the random forest classifier performs better than the other recently used classifiers for the recognition of offline handwritten Gurmukhi characters and numerals, with a recognition accuracy of 87.9% on 13,000 samples.
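As an illustration of the evaluation parameters named in the abstract, the following minimal Python sketch derives the accuracy rate, false acceptance rate (FAR) and false rejection rate (FRR) from a confusion matrix. The abstract does not state how the authors define FAR and FRR for a multi-class problem, so the one-vs-rest definitions used here (FAR = FP / (FP + TN), FRR = FN / (FN + TP)) are an assumption, and the function name `evaluate` and the toy matrix are hypothetical. The study itself used the Waikato Environment for Knowledge Analysis, not this code.

```python
from typing import Dict, List


def evaluate(confusion: List[List[int]]) -> Dict[str, object]:
    """Compute accuracy plus one-vs-rest FAR and FRR for each class.

    confusion[i][j] is the number of samples whose true class is i
    and whose predicted class is j. The one-vs-rest definitions are
    an assumption, not taken from the paper.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(n_classes))

    far: List[float] = []
    frr: List[float] = []
    for c in range(n_classes):
        tp = confusion[c][c]
        # Samples of class c that were assigned to another class.
        fn = sum(confusion[c]) - tp
        # Samples of other classes that were assigned to class c.
        fp = sum(confusion[r][c] for r in range(n_classes)) - tp
        tn = total - tp - fn - fp
        far.append(fp / (fp + tn) if (fp + tn) else 0.0)
        frr.append(fn / (fn + tp) if (fn + tp) else 0.0)

    return {"accuracy": correct / total, "far": far, "frr": frr}


# Toy 3-class confusion matrix with made-up counts (not results from the paper).
toy = [
    [8, 1, 1],
    [2, 7, 1],
    [0, 2, 8],
]
result = evaluate(toy)
print(result)
```

Reporting FAR and FRR per class alongside overall accuracy shows which classes a classifier confuses, which a single accuracy figure hides.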

Author information

Corresponding author

Correspondence to Munish Kumar.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest in this work.

About this article

Cite this article

Kumar, M., Jindal, M.K., Sharma, R.K. et al. Performance evaluation of classifiers for the recognition of offline handwritten Gurmukhi characters and numerals: a study. Artif Intell Rev 53, 2075–2097 (2020). https://doi.org/10.1007/s10462-019-09727-2

Keywords

  • Artificial intelligence
  • Classification algorithms
  • Supervised learning
  • Performance measurement
  • Comparative studies