Text Summarization with Automatic Keyword Extraction in Telugu e-Newspapers

  • Conference paper
Smart Computing and Informatics

Part of the book series: Smart Innovation, Systems and Technologies (SIST, volume 77)

Abstract

Summarization is the process of shortening a text document into a summary that preserves the main points of the original. Extractive summarizers select the sentences from the given text that best express its message. Most extractive summarization techniques revolve around finding keywords and extracting the sentences that contain more keywords than the rest. Keyword extraction is usually done by selecting relevant words that occur more frequently than others, with emphasis on the important ones. Manual extraction or annotation of keywords is tedious, error-prone, and demands considerable effort and time. In this work, we propose an algorithm that automatically extracts keywords for text summarization in Telugu e-newspaper datasets. The proposed method is evaluated by comparing the summaries of articles with similar titles from five different Telugu e-newspapers to check the similarity and consistency of the summarized results.
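
The abstract does not spell out the proposed algorithm, so the following is only a minimal sketch of the general idea it describes: pick the most frequent non-stopword tokens as keywords, then keep the sentences that contain the most keywords. All function names, the `stopwords`, `top_k`, and `num_sentences` parameters, and the whitespace-based tokenizer are assumptions made for illustration; they are not taken from the paper. Tokens are split on whitespace rather than with a `\w+` pattern because Telugu vowel signs are combining marks that such a pattern would break apart.

```python
import re
from collections import Counter

# Punctuation stripped from token edges (an assumed, minimal set).
PUNCT = ".,;:!?\"'()[]{}"


def split_sentences(text):
    """Split text into sentences after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Split on whitespace and strip edge punctuation; keeps Telugu words intact."""
    tokens = (tok.strip(PUNCT).lower() for tok in sentence.split())
    return [t for t in tokens if t]


def extract_keywords(text, stopwords=frozenset(), top_k=5):
    """Return the top_k most frequent tokens that are not stopwords."""
    counts = Counter(
        tok
        for sentence in split_sentences(text)
        for tok in tokenize(sentence)
        if tok not in stopwords
    )
    return [word for word, _ in counts.most_common(top_k)]


def summarize(text, stopwords=frozenset(), top_k=5, num_sentences=2):
    """Keep the num_sentences sentences with the most keyword occurrences."""
    sentences = split_sentences(text)
    keywords = set(extract_keywords(text, stopwords, top_k))
    # Score each sentence by how many of its tokens are keywords.
    scored = [
        (sum(1 for tok in tokenize(s) if tok in keywords), i)
        for i, s in enumerate(sentences)
    ]
    # Highest score first; earlier sentences win ties.
    best = sorted(scored, key=lambda pair: (-pair[0], pair[1]))[:num_sentences]
    # Emit the chosen sentences in their original document order.
    return [sentences[i] for i in sorted(i for _, i in best)]
```

A real system for Telugu would likely need a language-specific stopword list and some morphological normalization, since raw frequency counts treat every inflected form as a distinct word.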

Corresponding author

Correspondence to Reddy Naidu.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Naidu, R., Bharti, S.K., Babu, K.S., Mohapatra, R.K. (2018). Text Summarization with Automatic Keyword Extraction in Telugu e-Newspapers. In: Satapathy, S., Bhateja, V., Das, S. (eds) Smart Computing and Informatics. Smart Innovation, Systems and Technologies, vol 77. Springer, Singapore. https://doi.org/10.1007/978-981-10-5544-7_54

  • DOI: https://doi.org/10.1007/978-981-10-5544-7_54

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-5543-0

  • Online ISBN: 978-981-10-5544-7

  • eBook Packages: Engineering, Engineering (R0)
