SVM with Inverse Fringe as Feature for Improving Accuracy of Telugu OCR Systems

  • Amit Patel
  • Burra Sukumar
  • Chakravarthy Bhagvati
Conference paper
Part of the Advances in Intelligent Systems and Computing book series (AISC, volume 518)

Abstract

Designing a high-accuracy OCR system is difficult because overall performance depends on each of its component modules: an improvement in any one module changes the accuracy and quality of the whole system. In our current work, we have developed an OCR system for Telugu (the Drishti System). In this paper we propose an SVM classifier with the inverse fringe as a feature for Telugu OCR, with the aim of raising the recognition accuracy of that system. Several researchers have shown that support vector machines (SVMs) deliver high performance on Indic OCR, and SVMs have been applied to Telugu OCR and tested with different features. In our experiments, we used the fringe distance and its complementary version, the inverse fringe, as features for the SVM. These two features have been used to develop the working model of Telugu OCR with an accuracy approaching 90%. The performance is shown to be good over more than 300 classes. With the inverse fringe as the feature, the system with 325 classes was trained on 15,543 labeled Telugu characters and tested on 75,335 unlabeled Telugu characters, giving an accuracy of 99.50%. The SVM-based classifier was also tested on our corpus of scanned document images, comprising more than 4,500 pages and about 5,000,000 symbols, to evaluate end-to-end system performance. The results show that the SVM classifier gives an improvement of approximately 1.24% over the existing Telugu OCR (Drishti System).
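
As a rough illustration of the two features named in the abstract, the sketch below computes a fringe map and an inverse fringe map for a small binary glyph. It rests on assumptions that the abstract does not confirm: the fringe map is taken to label each pixel with its city-block distance to the nearest black (foreground) pixel, and the inverse fringe map is taken to be the same computation on the complemented image (distance to the nearest white pixel). The function names, the 0/1 glyph encoding, and the distance metric are all hypothetical choices made for this example, not the authors' implementation.

```python
from collections import deque


def fringe_map(glyph, target=1):
    """Label every pixel with its city-block distance to the nearest
    pixel whose value equals `target` (assumed: 1 = black, 0 = white)."""
    rows, cols = len(glyph), len(glyph[0])
    dist = [[None] * cols for _ in range(rows)]
    queue = deque()

    # Seed a multi-source breadth-first search with every target pixel.
    for r in range(rows):
        for c in range(cols):
            if glyph[r][c] == target:
                dist[r][c] = 0
                queue.append((r, c))

    # Expand outward one 4-connected step at a time.
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))

    # If the image has no target pixel at all, cap with an upper bound
    # (an arbitrary choice for this sketch).
    cap = rows + cols
    return [[cap if v is None else v for v in row] for row in dist]


def inverse_fringe_map(glyph):
    """Assumed definition: the fringe map of the complemented image,
    i.e. distance from each pixel to the nearest white pixel."""
    return fringe_map(glyph, target=0)


def to_feature_vector(distance_map):
    """Flatten a 2-D map row by row into a 1-D feature vector."""
    return [v for row in distance_map for v in row]


glyph = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
fringe = fringe_map(glyph)
inverse = inverse_fringe_map(glyph)
features = to_feature_vector(inverse)
```

For this 3×3 glyph with a single black pixel in the centre, the fringe map grows with distance from the centre, while the inverse fringe map is non-zero only on the black pixel itself. A flattened vector of this kind, computed on size-normalised glyphs, is the sort of input an SVM classifier could consume.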

Keywords

Fringe map; Telugu script; Telugu OCR; System performance; Indian scripts

Copyright information

© Springer Nature Singapore Pte Ltd. 2018

Authors and Affiliations

  • Amit Patel (1)
  • Burra Sukumar (2)
  • Chakravarthy Bhagvati (2)
  1. Rajiv Gandhi University of Knowledge Technologies, IIIT Nuzvid, Nuzvid, Krishna, India
  2. School of Computer and Information Sciences, University of Hyderabad, Hyderabad, India