Abstract
Part-of-speech (POS) tagging is one of the most basic preprocessing tasks for machine translation in natural language processing (NLP). The tagging problem is to find a way to label every word in a text with a particular part of speech. In this paper, we first present different tagging approaches and some of the grammatical rules for tagging homoeopathy clinical sentences. We then describe the development of our Hindi tagger for homoeopathy clinical sentences. For this purpose we have developed a corpus that currently comprises 250 sentences, with 20060 words and 3420 tokens. The accuracy of POS tagging, calculated using the standard formula, is 89.55%.
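The abstract does not spell out the standard formula. A common definition of tagging accuracy is the percentage of tokens whose assigned tag matches the reference (gold) tag, and the minimal sketch below assumes that definition. The function name `tagging_accuracy`, the tag labels, and the toy data are hypothetical illustrations, not taken from the paper or its corpus.

```python
def tagging_accuracy(predicted, gold):
    """Return the percentage of tokens whose predicted tag equals the gold tag."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot compute accuracy on an empty sequence")
    # Count positions where the tagger output agrees with the reference tag.
    correct = sum(p == g for p, g in zip(predicted, gold))
    return 100.0 * correct / len(gold)


# Hypothetical toy data, not drawn from the paper's corpus.
gold_tags = ["NN", "VB", "JJ", "NN", "PSP"]
predicted_tags = ["NN", "VB", "NN", "NN", "PSP"]

accuracy = tagging_accuracy(predicted_tags, gold_tags)
print(f"{accuracy:.2f}%")  # 4 of 5 tags match, so this prints 80.00%
```

Under this definition, a reported accuracy of 89.55% would mean that roughly nine out of every ten evaluated tokens received the reference tag.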
Copyright information
© 2012 ICST Institute for Computer Science, Social Informatics and Telecommunications Engineering
About this paper
Cite this paper
Sukhadeve, P.P., Dwivedi, S.K. (2012). Developing Hindi POS Tagger for Homoeopathy Clinical Language. In: Meghanathan, N., Chaki, N., Nagamalai, D. (eds) Advances in Computer Science and Information Technology. Computer Science and Information Technology. CCSIT 2012. Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering, vol 86. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-27317-9_32
DOI: https://doi.org/10.1007/978-3-642-27317-9_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-27316-2
Online ISBN: 978-3-642-27317-9
eBook Packages: Computer Science (R0)