Comparative Analysis of Gabor and Discriminating Feature Extraction Techniques for Script Identification

Rani, Rajneesh; Dhir, Renu; Lehal, G. S.

doi:10.1007/978-3-642-19403-0_27

Part of the book series: Communications in Computer and Information Science (CCIS, volume 139)

Included in the following conference series:

International Conference on Information Systems for Indian Languages

Abstract

Considerable success has been achieved in developing monolingual OCR systems for Indian scripts. But in a country like India, where many languages and scripts exist, a single document commonly contains words from more than one script. A script identification system is therefore required to select the appropriate OCR. This paper presents a comparative analysis of two feature extraction techniques for word-level script identification. Discriminating features and Gabor filter based features are computed for Punjabi words and English numerals. The extracted features are classified with k-NN and SVM classifiers to identify the script, and the recognition rates are compared. It has been observed that selecting an appropriate value of k, an appropriate kernel function, and an appropriate combination of feature extraction and classification scheme yields a significant drop in error rate.
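
As a rough illustration of the Gabor-plus-k-NN pipeline the abstract describes, the following minimal sketch computes one energy feature per filter orientation and classifies by majority vote among the nearest training samples. It is not the authors' implementation: the kernel size, wavelength, sigma, number of orientations, the synthetic striped images standing in for word images, and the labels `script_a` and `script_b` are all assumptions made for the example.

```python
import math
from collections import Counter


def gabor_kernel(theta, size=7, wavelength=4.0, sigma=2.0):
    """Real (cosine) Gabor kernel at orientation theta, shifted to zero mean.

    size, wavelength and sigma are illustrative values, not the paper's.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Coordinate along the direction in which the carrier oscillates.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
            row.append(envelope * math.cos(2.0 * math.pi * xr / wavelength))
        kernel.append(row)
    mean = sum(sum(row) for row in kernel) / (size * size)
    return [[v - mean for v in row] for row in kernel]


def filter_energy(image, kernel):
    """Mean absolute filter response over all fully overlapping positions."""
    h, w, k = len(image), len(image[0]), len(kernel)
    total, count = 0.0, 0
    for i in range(h - k + 1):
        for j in range(w - k + 1):
            acc = 0.0
            for a in range(k):
                for b in range(k):
                    acc += image[i + a][j + b] * kernel[a][b]
            total += abs(acc)
            count += 1
    return total / count


def gabor_features(image, orientations=4):
    """One energy value per orientation (0, 45, 90, 135 degrees by default)."""
    return [filter_energy(image, gabor_kernel(n * math.pi / orientations))
            for n in range(orientations)]


def knn_predict(train, query, k=3):
    """Majority label among the k training vectors closest to query."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    nearest = sorted(train, key=lambda item: sq_dist(item[0], query))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def striped_image(vertical, phase, size=16):
    """Synthetic stand-in for a word image: stripes two pixels wide."""
    return [[1.0 if (((c if vertical else r) + phase) // 2) % 2 == 0 else 0.0
             for c in range(size)] for r in range(size)]


# Hypothetical two-class training set: vertical vs. horizontal stripes.
train = []
for phase in range(3):
    train.append((gabor_features(striped_image(True, phase)), "script_a"))
    train.append((gabor_features(striped_image(False, phase)), "script_b"))

print(knn_predict(train, gabor_features(striped_image(True, 3))))
print(knn_predict(train, gabor_features(striped_image(False, 3))))
```

In a real setting the feature vectors would come from segmented word images, and the value of k (or, for an SVM, the kernel function) would be tuned on held-out data, which is the kind of selection the abstract credits with reducing the error rate.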

Author information

Authors and Affiliations

Department of CSE, NIT, Jalandhar, Punjab, India
Rajneesh Rani & Renu Dhir
Department of CSE, Punjabi University, Patiala, Punjab, India
G. S. Lehal

Editor information

Editors and Affiliations

Department of Computer Science, Punjabi University, Patiala, India
Chandan Singh, Gurpreet Singh Lehal, Jyotsna Sengupta, Dharam Veer Sharma & Vishal Goyal

About this paper

Cite this paper

Rani, R., Dhir, R., Lehal, G.S. (2011). Comparative Analysis of Gabor and Discriminating Feature Extraction Techniques for Script Identification. In: Singh, C., Singh Lehal, G., Sengupta, J., Sharma, D.V., Goyal, V. (eds) Information Systems for Indian Languages. ICISIL 2011. Communications in Computer and Information Science, vol 139. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-19403-0_27

DOI: https://doi.org/10.1007/978-3-642-19403-0_27
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-19402-3
Online ISBN: 978-3-642-19403-0
eBook Packages: Computer Science, Computer Science (R0)
