Text Summarization with Automatic Keyword Extraction in Telugu e-Newspapers

Naidu, Reddy; Bharti, Santosh Kumar; Babu, Korra Sathya; Mohapatra, Ramesh Kumar

doi:10.1007/978-981-10-5544-7_54

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 77)

Abstract

Summarization is the process of shortening a text document into a summary that keeps the main points of the original. Extractive summarizers select the sentences from the given text that best express its message. Most extractive summarization techniques revolve around finding keywords and extracting the sentences that contain more keywords than the rest. Keyword extraction is usually done by selecting relevant words that occur more frequently than others, with emphasis on the important ones. Manual extraction or annotation of keywords is a tedious, error-prone process that demands considerable effort and time. In this work, we propose an algorithm that automatically extracts keywords for text summarization in Telugu e-newspaper datasets. The proposed method is evaluated by comparing the experimental results for articles with similar titles in five different Telugu e-newspapers to check the similarity and consistency of the summarized results.
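The general idea the abstract describes, picking frequent words as keywords and then keeping the sentences that contain the most of them, can be illustrated with a minimal sketch. This is not the chapter's algorithm, which is not reproduced on this page; the function names, the stopword set, and the `top_k` and `n_sentences` parameters are all assumptions made for illustration. Tokens are split on whitespace rather than with a `\w+` pattern, since that pattern can break Telugu words at combining vowel signs.

```python
import re
import string
from collections import Counter

# ASCII punctuation plus the danda marks; an illustrative choice, not from the chapter.
PUNCT = string.punctuation + "\u0964\u0965"

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]

def tokenize(sentence):
    # Whitespace tokens with surrounding punctuation stripped, lowercased.
    tokens = (w.strip(PUNCT).lower() for w in sentence.split())
    return [t for t in tokens if t]

def extract_keywords(text, stopwords, top_k):
    # Keywords are the top_k most frequent non-stopword tokens.
    counts = Counter(t for t in tokenize(text) if t not in stopwords)
    return [word for word, _ in counts.most_common(top_k)]

def summarize(text, stopwords, top_k, n_sentences):
    # Score each sentence by how many keyword tokens it contains,
    # keep the n_sentences best, and return them in document order.
    keywords = set(extract_keywords(text, stopwords, top_k))
    sentences = split_sentences(text)
    scores = [sum(1 for t in tokenize(s) if t in keywords) for s in sentences]
    best = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(best[:n_sentences])]

# Hypothetical English sample; a Telugu article and stopword list would be used in practice.
sample = ("The river flooded the town. Rescue teams reached the town by boat. "
          "The weather office issued a warning. Teams from the town shared food.")
stop = {"the", "by", "a", "from"}
summary = summarize(sample, stop, top_k=2, n_sentences=2)
```

With this sample, the two keywords are "town" and "teams", so the summary keeps the second and fourth sentences. A real Telugu pipeline would likely need a language-specific stopword list and some form of stemming or morphological analysis, which this sketch leaves out.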

Author information

Authors and Affiliations

National Institute of Technology, Rourkela, Odisha, 769008, India
Reddy Naidu, Santosh Kumar Bharti, Korra Sathya Babu & Ramesh Kumar Mohapatra

Corresponding author

Correspondence to Reddy Naidu.

Editor information

Editors and Affiliations

Department of Computer Science and Engineering, PVP Siddhartha Institute of Technology, Vijayawada, Andhra Pradesh, India
Suresh Chandra Satapathy
Department of Electronics and Communication Engineering, Shri Ramswaroop Memorial Group of Professional Colleges, Lucknow, Uttar Pradesh, India
Vikrant Bhateja
Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata, West Bengal, India
Swagatam Das

About this paper

Cite this paper

Naidu, R., Bharti, S.K., Babu, K.S., Mohapatra, R.K. (2018). Text Summarization with Automatic Keyword Extraction in Telugu e-Newspapers. In: Satapathy, S., Bhateja, V., Das, S. (eds) Smart Computing and Informatics. Smart Innovation, Systems and Technologies, vol 77. Springer, Singapore. https://doi.org/10.1007/978-981-10-5544-7_54

DOI: https://doi.org/10.1007/978-981-10-5544-7_54
Published: 21 December 2017
Publisher Name: Springer, Singapore
Print ISBN: 978-981-10-5543-0
Online ISBN: 978-981-10-5544-7
eBook Packages: Engineering, Engineering (R0)
