Online Bangla handwritten word recognition using HMM and language model

Abstract

This paper proposes a model for recognizing online handwritten Bangla words. The word recognition system comprises several modules: preprocessing of the word samples, segmentation of words into basic strokes, recognition of the basic strokes using a multilayer perceptron, and recognition of words using a Hidden Markov Model (HMM) aided by a Language Model (LM). For stroke recognition, two different feature extraction techniques (point-based and curvature-based procedures) are combined using a late fusion technique. The top 5 stroke recognition choices are used to construct the HMM for predicting the word sample. An N-gram LM is applied as a post-processing step to rectify the HMM outcomes where required. A total of 50 different words, each with 110 instances, are used to evaluate the proposed model. The overall stroke-level and word-level recognition accuracies obtained by this model are 95.4% and 90.3%, respectively. The proposed model can be extended to recognize online handwritten words written in other scripts such as Devanagari, Assamese, and Gurumukhi. The methodologies described in the manuscript can also be applied to offline word recognition.
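The fusion, top-5 selection, and LM post-processing steps named in the abstract can be illustrated with a minimal Python sketch. The abstract does not state the fusion rule, the N-gram order, or how the LM and HMM scores are combined, so everything here is an assumption for illustration: late fusion is taken to be a weighted average of two classifiers' posterior vectors, the LM is a bigram table with a probability floor for unseen pairs, and the function names (`late_fusion`, `top_k`, `bigram_log_prob`, `rescore`) are hypothetical rather than the authors' implementation.

```python
import math


def late_fusion(p_point, p_curv, w=0.5):
    """Fuse two posterior vectors over the same stroke classes.

    Assumption: fusion is a weighted average followed by renormalization.
    """
    if len(p_point) != len(p_curv):
        raise ValueError("posterior vectors must have the same length")
    fused = [w * a + (1.0 - w) * b for a, b in zip(p_point, p_curv)]
    total = sum(fused)
    return [f / total for f in fused]


def top_k(probs, k=5):
    """Return (class_index, probability) pairs for the k most probable classes."""
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


def bigram_log_prob(symbols, bigram, floor=1e-6):
    """Log-probability of a symbol sequence under a bigram table.

    '<s>' marks the sequence start; unseen pairs get the floor probability.
    """
    prev = "<s>"
    log_p = 0.0
    for sym in symbols:
        log_p += math.log(bigram.get((prev, sym), floor))
        prev = sym
    return log_p


def rescore(candidates, bigram, lm_weight=1.0):
    """Pick the (symbols, hmm_log_score) candidate with the best combined score."""
    return max(
        candidates,
        key=lambda c: c[1] + lm_weight * bigram_log_prob(c[0], bigram),
    )


# Toy example with three stroke classes and two candidate sequences.
fused = late_fusion([0.6, 0.3, 0.1], [0.2, 0.7, 0.1])
best_two = top_k(fused, k=2)

toy_bigram = {("<s>", "a"): 0.5, ("a", "b"): 0.4, ("a", "c"): 0.01}
toy_candidates = [(("a", "b"), -1.0), (("a", "c"), -0.9)]
best = rescore(toy_candidates, toy_bigram)
```

In the toy example the second candidate has the slightly better HMM score, but the bigram table makes the pair ("a", "c") unlikely, so the combined score selects the first candidate. This mirrors the idea of using the LM to correct an HMM output.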




Author information


Corresponding author

Correspondence to Ram Sarkar.


Cite this article

Sen, S., Bhattacharyya, A., Mitra, M. et al. Online Bangla handwritten word recognition using HMM and language model. Neural Comput & Applic 32, 9939–9951 (2020). https://doi.org/10.1007/s00521-019-04518-w


Keywords

  • Online handwriting
  • Word segmentation
  • Word recognition
  • Stroke recognition