Semantic Representation of Malayalam Text Documents in Cricket Domain Using WordNet

Kumar, Sreedhi Deleep; Reshma, E. U.; Sunitha, C.; Ganesh, Amal

doi:10.1007/978-3-030-03146-6_49


Part of the book series: Lecture Notes on Data Engineering and Communications Technologies (LNDECT, volume 26)

Included in the following conference series:

International Conference on Intelligent Data Communication Technologies and Internet of Things


Abstract

Semantic representation is an abstract language for representing the meaning of text. Representing sentences semantically is useful in applications such as question answering, information extraction, summarization, and machine translation. Various methods exist for representing text documents, but only limited work has been done for the Malayalam language. A specific domain (the cricket domain) is chosen so as to obtain better results in semantic representation. A Malayalam lexical database (WordNet) is used as the resource for obtaining the required information; WordNet is a hierarchical lexical knowledge base, and such a resource can be built for any language. In this project, a semantic representation is extracted from a single Malayalam text document, yielding an abstractive representation of the given input. The extraction proceeds through several stages. Tokenization separates sentences into word tokens, and POS tagging labels these tokens as nouns, verbs, adjectives, and so on. The tagged tokens then undergo morphological analysis, which finds the stem word for each token. After the analysis, details about the stem words are obtained by looking them up in WordNet. Next, semantic triplets (Subject, Object, Predicate) are extracted from each sentence. These triplets are used to build the semantic representation, in which the verb is taken as the root element. The aim of this project is the semantic representation of Malayalam text documents pertaining to the cricket domain using the WordNet database.
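
The stages named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the lexicon, the WordNet-like dictionary, the function names, and the English sample sentence are all hypothetical stand-ins for a real Malayalam POS tagger, morphological analyzer, and WordNet, and the "first noun is the subject, second noun is the object" rule is a naive assumption made only to keep the example short.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical toy lexicon: surface form -> (stem, POS tag).
# It stands in for a real POS tagger and morphological analyzer.
TOY_LEXICON: Dict[str, Tuple[str, str]] = {
    "kohli": ("kohli", "NOUN"),
    "scored": ("score", "VERB"),
    "many": ("many", "ADJ"),
    "runs": ("run", "NOUN"),
}

# Hypothetical stand-in for WordNet entries, keyed by stem.
TOY_WORDNET: Dict[str, Dict[str, str]] = {
    "kohli": {"category": "player"},
    "run": {"category": "scoring unit"},
    "score": {"category": "action"},
}


def tokenize(sentence: str) -> List[str]:
    """Split a sentence into lowercase word tokens, dropping a final period."""
    return sentence.lower().rstrip(".").split()


def tag_and_stem(tokens: List[str]) -> List[Tuple[str, str]]:
    """Return (stem, POS) per token; unknown tokens keep their form, tagged UNK."""
    return [TOY_LEXICON.get(tok, (tok, "UNK")) for tok in tokens]


def extract_triplet(
    analyzed: List[Tuple[str, str]]
) -> Optional[Tuple[str, str, str]]:
    """Naive heuristic: first noun = subject, second noun = object,
    first verb = predicate. Returns (subject, object, predicate) or None."""
    nouns = [stem for stem, pos in analyzed if pos == "NOUN"]
    verbs = [stem for stem, pos in analyzed if pos == "VERB"]
    if len(nouns) < 2 or not verbs:
        return None
    return nouns[0], nouns[1], verbs[0]


def represent(triplet: Tuple[str, str, str]) -> Dict[str, object]:
    """Build a small structure with the verb as root, attaching any
    information found for each stem in the toy WordNet."""
    subject, obj, predicate = triplet
    return {
        "root": predicate,
        "subject": {"stem": subject, "info": TOY_WORDNET.get(subject, {})},
        "object": {"stem": obj, "info": TOY_WORDNET.get(obj, {})},
    }


if __name__ == "__main__":
    analyzed = tag_and_stem(tokenize("Kohli scored many runs."))
    triplet = extract_triplet(analyzed)
    if triplet is not None:
        print(represent(triplet))
```

In this sketch the verb stem becomes the root of the representation and the subject and object hang off it, mirroring the statement that the verb is taken as the root element. A real system would replace the dictionary lookups with language-specific components.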



Author information

Authors and Affiliations

Computer Science Department, Vidya Academy of Science and Technology, Thrissur, India
Sreedhi Deleep Kumar, E. U. Reshma, C. Sunitha & Amal Ganesh


Corresponding author

Correspondence to Sreedhi Deleep Kumar.

Editor information

Editors and Affiliations

Department of ECE, Karunya University, Coimbatore, India
Jude Hemanth
Department of Electrical and Computer Engineering, Ryerson Communications Lab, Ryerson University, Toronto, ON, Canada
Xavier Fernando
Faculty of Engineering, Department of Telecommunication Engineering, Czech Technical University, Prague, Czech Republic
Pavel Lafata
School of Science, Joondalup Campus, Edith Cowan University, Joondalup, WA, Australia
Zubair Baig


About this paper

Cite this paper

Kumar, S.D., Reshma, E.U., Sunitha, C., Ganesh, A. (2019). Semantic Representation of Malayalam Text Documents in Cricket Domain Using WordNet. In: Hemanth, J., Fernando, X., Lafata, P., Baig, Z. (eds) International Conference on Intelligent Data Communication Technologies and Internet of Things (ICICI) 2018. ICICI 2018. Lecture Notes on Data Engineering and Communications Technologies, vol 26. Springer, Cham. https://doi.org/10.1007/978-3-030-03146-6_49


DOI: https://doi.org/10.1007/978-3-030-03146-6_49
Published: 21 December 2018
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-03145-9
Online ISBN: 978-3-030-03146-6
eBook Packages: Intelligent Technologies and Robotics (R0)
