Developing Hindi POS Tagger for Homoeopathy Clinical Language

Sukhadeve, Pramod P.; Dwivedi, Sanjay K.

doi:10.1007/978-3-642-27317-9_32


Part of the book series: Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering (LNICST, volume 86)

Included in the following conference series:

International Conference on Computer Science and Information Technology

Abstract

Part-of-speech (POS) tagging is one of the most basic preprocessing tasks for machine translation in NLP. The tagging problem in natural language processing is to assign every word in a text its particular part of speech. In this paper, we first present different approaches and some of the grammatical rules for tagging homoeopathy clinical sentences. We then describe our approach to developing a Hindi tagger using homoeopathy clinical sentences; for this purpose we have developed a corpus that at present comprises 250 sentences, with 20060 words and 3420 tokens. The accuracy of the POS tagger, calculated using the standard formula, is 89.55%.
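
The abstract does not spell out the "standard formula". A common reading, assumed here, is the percentage of tokens whose assigned tag matches a hand-annotated reference tag. The sketch below illustrates that calculation only; the function name `tagging_accuracy` and the example tags are hypothetical and are not taken from the paper or its corpus.

```python
def tagging_accuracy(predicted, gold):
    """Return the percentage of tokens whose predicted tag equals the gold tag.

    Assumed formula: 100 * (correctly tagged tokens) / (total tokens).
    """
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


# Hypothetical tags for a five-token sentence (illustrative only).
gold_tags = ["NN", "PSP", "NN", "JJ", "VM"]
predicted_tags = ["NN", "PSP", "NN", "NN", "VM"]

# Four of the five tags match, so this prints 80.00%.
print(f"{tagging_accuracy(predicted_tags, gold_tags):.2f}%")
```

Under this reading, the reported 89.55% would mean that roughly nine in ten evaluated tokens received the same tag as the reference annotation.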

Author information

Authors and Affiliations

Department of Computer Science, Babasaheb Bhimrao Ambedkar University, Lucknow, India
Pramod P. Sukhadeve & Sanjay K. Dwivedi

Editor information

Editors and Affiliations

Jackson State University, Jackson, MS, USA
Natarajan Meghanathan
University of Calcutta, Calcutta, India
Nabendu Chaki
Wireilla Net Solutions PTY Ltd., Melbourne, VIC, Australia
Dhinaharan Nagamalai

About this paper

Cite this paper

Sukhadeve, P.P., Dwivedi, S.K. (2012). Developing Hindi POS Tagger for Homoeopathy Clinical Language. In: Meghanathan, N., Chaki, N., Nagamalai, D. (eds) Advances in Computer Science and Information Technology. Computer Science and Information Technology. CCSIT 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 86. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27317-9_32

DOI: https://doi.org/10.1007/978-3-642-27317-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27316-2
Online ISBN: 978-3-642-27317-9
eBook Packages: Computer Science, Computer Science (R0)
