Improved RAKE Models to Extract Keywords from Hindi Documents

  • Conference paper
Information Systems Design and Intelligent Applications

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 672)

Abstract

In this paper, we propose several improved versions of the rapid automatic keyword extraction (RAKE) algorithm for extracting keywords from Hindi documents. RAKE requires a stopword list to generate its set of candidate keywords, and because no such list was available for Hindi, we constructed a Hindi stopword list for this purpose. We identified weaknesses in RAKE's keyword scoring measures and propose several models, namely N-RAKE, SD-RAKE, NSD-RAKE, and WOS-RAKE, to improve on its effectiveness. In general, our modifications yield better results than the original RAKE.
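To make the role of the stopword list concrete, the following is a minimal sketch of the baseline RAKE procedure as it is commonly described (candidate phrases split at stopwords and punctuation, word score = degree / frequency, phrase score = sum of its word scores). It is not the authors' implementation and does not cover N-RAKE, SD-RAKE, NSD-RAKE, or WOS-RAKE, whose definitions are not given here. The function names and the tiny `SAMPLE_HINDI_STOPWORDS` set are illustrative assumptions, not the stopword list constructed in the paper.

```python
import re
from collections import defaultdict

# Illustrative only: a handful of common Hindi function words.
# This is NOT the stopword list built by the paper's authors.
SAMPLE_HINDI_STOPWORDS = {"का", "की", "के", "में", "है", "हैं", "और", "से"}


def candidate_phrases(text, stopwords):
    """Split text into candidate phrases at punctuation and stopwords."""
    phrases = []
    # Phrase delimiters: the Devanagari danda plus common punctuation.
    for fragment in re.split(r"[।.!?,;:\n]+", text):
        current = []
        for word in fragment.split():
            if word in stopwords:
                # A stopword closes the phrase collected so far.
                if current:
                    phrases.append(tuple(current))
                current = []
            else:
                current.append(word)
        if current:
            phrases.append(tuple(current))
    return phrases


def word_scores(phrases):
    """Score each word as degree / frequency over the candidate phrases."""
    freq = defaultdict(int)
    degree = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            # Degree counts co-occurrences within the phrase, including the word itself.
            degree[word] += len(phrase)
    return {word: degree[word] / freq[word] for word in freq}


def rake(text, stopwords):
    """Return (phrase, score) pairs sorted by descending score."""
    phrases = candidate_phrases(text, stopwords)
    scores = word_scores(phrases)
    ranked = {p: sum(scores[w] for w in p) for p in set(phrases)}
    return sorted(ranked.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    sample = "भारत की राजधानी दिल्ली है। दिल्ली में कई ऐतिहासिक स्थल हैं।"
    for phrase, score in rake(sample, SAMPLE_HINDI_STOPWORDS):
        print(" ".join(phrase), score)
```

Because the degree / frequency score grows with phrase length, long candidate phrases tend to rank highest under this baseline. The sketch splits on whitespace rather than a regex word class so that Devanagari vowel signs stay attached to their words.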

Author information

Corresponding author

Correspondence to Sifatullah Siddiqi.

Copyright information

© 2018 Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Siddiqi, S., Sharan, A. (2018). Improved RAKE Models to Extract Keywords from Hindi Documents. In: Bhateja, V., Nguyen, B., Nguyen, N., Satapathy, S., Le, DN. (eds) Information Systems Design and Intelligent Applications. Advances in Intelligent Systems and Computing, vol 672. Springer, Singapore. https://doi.org/10.1007/978-981-10-7512-4_47

  • DOI: https://doi.org/10.1007/978-981-10-7512-4_47

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-7511-7

  • Online ISBN: 978-981-10-7512-4

  • eBook Packages: Engineering, Engineering (R0)
