Abstract
In a multilingual country such as India, machine translation and cross-lingual search are highly relevant problems. WordNets are crucial linguistic resources for text processing and its applications, including machine learning, machine translation, information extraction, information retrieval, and natural language understanding, and research in these areas relies heavily on them. This paper reports on the categorization of the synsets of the Hindi WordNet (version 1.2), the challenges faced during this work, and the solutions adopted. Many concepts are common to most languages, and linking these concepts across languages can provide an indispensable resource for natural language processing and language technology. The Hindi WordNet was created ab initio, whereas the other Indian-language WordNets are being created from it through the expansion approach. The Hindi WordNet therefore forms the foundation of the other Indian-language WordNets, which are based on it and are being linked to it.
Acknowledgments
The support of the Dept. of Information Technology, Govt. of India, for the WordNet development effort through the Indradhanush project is gratefully acknowledged.
Appendix: Some of the Universal Synsets Selected by IndoWordNet Members
Id | Synset | Id | Synset | Id | Synset | Id | Synset |
27 | moorkha (fool) | 273 | masooDaa (gum) | 561 | tarjanii (index finger) | 779 | jyeshTha (name of a lunar month) |
30 | yogya (capable) | 278 | hiiraa (diamond) | 562 | anaanikaa (ring finger) | 780 | saavana (name of a lunar month) |
31 | sabhya (decent) | 293 | Boonda (drop) | 568 | arahara (toor pulse) | 781 | bhaadrapada (name of a lunar month) |
44 | apamaan (insult) | 298 | khoona (blood) | 591 | maaranaa (to hit) | 782 | pousha (name of a lunar month) |
46 | sammaana (respect) | 332 | pinjaraa (cage) | 592 | latiyaanaa (kick) | 787 | chaadara (bed sheet) |
47 | iishwar (God) | 335 | choohaa (mouse) | 600 | jaala (net) | 788 | asatya (lie) |
51 | achaanaka (suddenly) | 344 | sariiyaa (rod) | 604 | mastaka (forehead) | 792 | dhanusha (bow) |
75 | sadguNa (virtue) | 345 | kalama (pen) | 605 | cheharaa (face) | 798 | dvaara (door) |
120 | prema (love) | 346 | kaagaja (paper) | 610 | tanaa (stem) | 801 | paradaa (curtain) |
121 | sneha (love) | 370 | baansurii (flute) | 617 | kala (yesterday) | 802 | bichhounaa |
129 | anuvaad (translation) | 373 | paasa men (nearby) | 623 | laala ranga (red) | 806 | kambala (blanket) |
142 | ghriNaa (hatred) | 406 | sitaara (name of a string instrument) | 624 | haraa (green) | 808 | maagha (name of a lunar month) |
155 | oura (other) | 409 | taanapooraa (name of a string instrument) | 625 | niilaa (blue) | 822 | puraskaara (prize) |
171 | aparadha (crime) | 417 | motaa (plump) | 630 | poornimaa (full moon) | 849 | sinha (lion) |
172 | aparaadhii (criminal) | 422 | paira (leg) | 631 | madhya raatri (midnight) | 858 | aadamii (man) |
203 | baraamadaa (veranda) | 444 | biskuta (biscuit) | 632 | kala (tomorrow) | 890 | taaraa (star) |
208 | graha (home) | 452 | mandira (temple) | 635 | makkhii (fly) | 893 | khulaa (uncovered) |
217 | lohaa (iron) | 464 | moorti (statue) | 642 | tahanii (twig) | 965 | qaanoona (law) |
225 | budha (Mercury) | 470 | rangamancha (theater stage) | 643 | konpala (foliage) | 976 | taalaa (lock) |
226 | shukra (Venus) | 473 | dhaagaa (thread) | 644 | shaakhaa (tree branch) | 984 | darshaka (spectator) |
227 | brihaspati (Jupiter) | 474 | dhaatu (metal) | 647 | haddii (bone) | 985 | naaka (nose) |
228 | shani (Saturn) | 476 | duma (tail) | 648 | chaitra (name of a lunar month) | 987 | kaana (ear) |
229 | varuNa (Neptune) | 491 | bhujaa (arm) | 652 | avastha (state) | 990 | nathunaa (ala) |
231 | pradesha (province) | 492 | poshaaka (clothing) | 661 | pasalii (rib) | 991 | niraashaa (hopelessness) |
233 | zillaa (district) | 504 | kalaaii (wrist) | 667 | polaa (hollow) | 1011 | naraka (hell) |
235 | ronaa (cry) | 505 | kuhanii (elbow) | 679 | haara (defeat) | 1012 | damaa (asthma) |
236 | gaanna (sing) | 511 | ulkaa (meteoroid) | 720 | mitrataa (friendship) | 1013 | prasannataa (cheerfulness) |
247 | satyavaadii (honest) | 521 | jhandaa (flag) | 749 | koyala (cuckoo) | 1017 | Indradhanusha (rainbow) |
268 | riiDa (spine) | 526 | vaakaii (actually) | 752 | pankha (wing) | 1029 | fena (foam) |
269 | shakti (strength) | 528 | spashta (clear) | 753 | choncha (beak) | 1036 | mahaavata (mahout) |
270 | munha (mouth) | 531 | praaNa (spirit) | 758 | phana (hood) | 1038 | Imalii (tamarind) |
271 | daanta (tooth) | 532 | naayikaa (heroine) | 762 | magar (crocodile) | 1045 | shaanti (peace) |
272 | taaloo (palate) | 558 | choolhaa (stove) | 778 | vaishaakha (name of a lunar month) | 1046 | ……… |
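Since the other Indian-language WordNets are built from and linked to the Hindi WordNet, the synset Id in the table above is what identifies a concept across languages. The following is a minimal Python sketch, not the IndoWordNet tooling, that reads rows in the layout of this appendix into records keyed by Id. The names `Synset`, `parse_row`, and `build_index` are hypothetical, and the row format (Id and entry cells separated by `|`, with an optional English gloss in parentheses) is assumed from the table as printed here.

```python
import re
from typing import Dict, List, NamedTuple, Optional


class Synset(NamedTuple):
    synset_id: int
    lemma: str             # transliterated Hindi headword as printed in the table
    gloss: Optional[str]   # English gloss, or None when the row gives none


# Assumed cell layout: "lemma (gloss)", where the parenthesised gloss is optional.
CELL = re.compile(r"^(?P<lemma>[^()]+?)\s*(?:\((?P<gloss>.*)\))?$")


def parse_row(row: str) -> List[Synset]:
    """Split one appendix row into its (Id, entry) pairs."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    synsets = []
    for id_text, entry in zip(cells[0::2], cells[1::2]):
        if not id_text.isdigit():
            continue  # header row ("Id | Synset | ...")
        if not any(ch.isalpha() for ch in entry):
            continue  # elided entry; leave the gap rather than guess
        match = CELL.match(entry)
        if match is None:
            continue
        synsets.append(Synset(int(id_text), match["lemma"].strip(), match["gloss"]))
    return synsets


def build_index(rows: List[str]) -> Dict[int, Synset]:
    """Index synsets by Id, the key shared across linked WordNets."""
    return {s.synset_id: s for row in rows for s in parse_row(row)}


# Two rows copied from the appendix table.
rows = [
    "121 | sneha (love) | 370 | baansurii (flute) | 617 | kala (yesterday) | 802 | bichhounaa |",
    "203 | baraamadaa (veranda) | 444 | biskuta (biscuit) | 632 | kala (tomorrow) | 890 | taaraa (star) |",
]
index = build_index(rows)
print(index[617], index[632])
```

Keying on the Id rather than the headword matters even within Hindi: in the table, kala appears under both Id 617 (yesterday) and Id 632 (tomorrow), so a lookup by lemma alone would conflate two distinct concepts.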
Copyright information
© 2017 Springer Science+Business Media Singapore
About this chapter
Cite this chapter
Kashyap, L., Joshi, S.R., Bhattacharyya, P. (2017). Insights on Hindi WordNet Coming from the IndoWordNet. In: Dash, N., Bhattacharyya, P., Pawar, J. (eds) The WordNet in Indian Languages. Springer, Singapore. https://doi.org/10.1007/978-981-10-1909-8_2
DOI: https://doi.org/10.1007/978-981-10-1909-8_2
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-1907-4
Online ISBN: 978-981-10-1909-8
eBook Packages: Social Sciences (R0)