A System for Recognition of Named Entities in Odia Text Corpus Using Machine Learning Algorithm

  • Conference paper
Computational Intelligence in Data Mining - Volume 1

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 31)

Abstract

This paper presents a novel approach to recognizing named entities in an Odia corpus. Developing a named entity recognition (NER) system for Odia with a Support Vector Machine (SVM) is a challenging task in intelligent computing. NER aims to classify each word in a document into one of a set of predefined named entity (NE) classes, in a linear and non-linear fashion. A baseline NER system is first developed from named-entity-annotated corpora and a set of features. Language-specific rules are then added to the system to recognize specific NE classes. Gazetteers and context patterns are also added to improve performance, since identifying rules and context patterns requires language-based knowledge. The required lexical databases were used to prepare the rules and identify the context patterns for Odia. Experimental results show that this approach achieves higher accuracy than previous approaches.
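As a rough illustration of the word-level setup the abstract describes, the sketch below builds a feature dictionary for each token from the word itself, its affixes, its neighbours, a gazetteer lookup, and a simple context-pattern cue. It is not the authors' feature set: the names `PERSON_GAZETTEER`, `LOCATION_CONTEXT`, `token_features`, and `sentence_features` are hypothetical, and the tokens are English placeholders standing in for Odia words.

```python
from typing import Dict, List, Set

# Hypothetical toy resources (assumptions for illustration only);
# the paper's actual gazetteers and context patterns are not listed in the abstract.
PERSON_GAZETTEER: Set[str] = {"Bishwa", "Ranjan"}
LOCATION_CONTEXT: Set[str] = {"city", "district"}  # cue words that may follow a place name


def token_features(tokens: List[str], i: int) -> Dict[str, object]:
    """Build a feature dictionary for tokens[i] from the word and its context."""
    word = tokens[i]
    prev_word = tokens[i - 1] if i > 0 else "<BOS>"
    next_word = tokens[i + 1] if i < len(tokens) - 1 else "<EOS>"
    return {
        "word": word,
        "prefix3": word[:3],            # leading characters as a crude affix feature
        "suffix3": word[-3:],           # trailing characters as a crude affix feature
        "is_digit": word.isdigit(),
        "prev_word": prev_word,
        "next_word": next_word,
        "in_person_gazetteer": word in PERSON_GAZETTEER,
        "location_context": next_word.lower() in LOCATION_CONTEXT,
    }


def sentence_features(tokens: List[str]) -> List[Dict[str, object]]:
    """Return one feature dictionary per token in the sentence."""
    return [token_features(tokens, i) for i in range(len(tokens))]


if __name__ == "__main__":
    for feats in sentence_features(["Ranjan", "visited", "Cuttack", "city"]):
        print(feats)
```

Feature dictionaries of this kind could then be vectorized and passed to an SVM classifier together with the annotated NE labels, for example with scikit-learn's `DictVectorizer` and `LinearSVC`; that pairing is an assumption here, not something the abstract specifies.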

Author information

Corresponding author

Correspondence to Bishwa Ranjan Das.

Copyright information

© 2015 Springer India

About this paper

Cite this paper

Das, B.R., Patnaik, S., Baboo, S., Dash, N.S. (2015). A System for Recognition of Named Entities in Odia Text Corpus Using Machine Learning Algorithm. In: Jain, L., Behera, H., Mandal, J., Mohapatra, D. (eds) Computational Intelligence in Data Mining - Volume 1. Smart Innovation, Systems and Technologies, vol 31. Springer, New Delhi. https://doi.org/10.1007/978-81-322-2205-7_30

  • DOI: https://doi.org/10.1007/978-81-322-2205-7_30

  • Publisher Name: Springer, New Delhi

  • Print ISBN: 978-81-322-2204-0

  • Online ISBN: 978-81-322-2205-7

  • eBook Packages: Engineering (R0)
