Comparative Analysis of Hindi Text Summarization for Multiple Documents by Padding of Ancillary Features

Performance Management of Integrated Systems and its Applications in Software Engineering

Part of the book series: Asset Analytics (ASAN)

Abstract

The volume of textual material is enormous and grows every day. The data available on the Internet comprise Web pages, news articles, status updates, and blogs, most of which are unstructured. There is a pressing need to reduce such text to shorter, focused summaries that capture the salient details, so that users can navigate it more effectively and check whether a longer document contains the information they are looking for. Text summarization is the task of generating a shorter version of the original text. The need arises because readers often lack the time to read a document in full. Automatic summarization methods are therefore needed both to help discover relevant information and to consume it faster. To address this time constraint, this work proposes an extractive text summarization technique that selects important sentences from a text document to convey the gist of its content. A fuzzy technique is used to generate extractive summaries from multiple documents using feature sets of eight and eleven features. The eleven-feature set combines the eight existing features (term frequency-inverse sentence frequency, sentence length, sentence location in the document, similarity between sentences, numerical data, title overlap, subject-object-verb (SOV) qualifier, and lexical similarity) with three ancillary features (proper nouns, Hindi cue phrases, and thematic words). Applying the fuzzy technique with eleven features gave better summarization results than applying it with eight features: precision increased by 3–5% across the different datasets. The datasets were Hindi news articles collected from online sources.
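As a rough illustration of the kind of pipeline the abstract describes, the following Python sketch scores sentences on three of the listed features (term frequency-inverse sentence frequency, sentence location, and sentence length), maps each score through a simple fuzzy "high" membership function, and keeps the top-scoring sentences. It is not the authors' system: the abstract does not give their membership functions, rules, or defuzzification step, so the ramp breakpoints, the averaging of memberships, and all function names here are assumptions made for illustration. Precision is computed in the usual way, as the fraction of selected sentences that also appear in a reference summary.

```python
import math
from collections import Counter


def normalize(values):
    """Scale non-negative values into [0, 1] by dividing by the maximum."""
    hi = max(values) if values else 0.0
    return [v / hi if hi > 0 else 0.0 for v in values]


def tf_isf_scores(sentences):
    """Mean TF-ISF per sentence, with ISF = log(N / sentences containing the term)."""
    n = len(sentences)
    tokenized = [s.split() for s in sentences]  # whitespace tokens (assumption)
    sent_freq = Counter(t for toks in tokenized for t in set(toks))
    scores = []
    for toks in tokenized:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        total = sum(tf[t] * math.log(n / sent_freq[t]) for t in tf)
        scores.append(total / len(toks))
    return scores


def position_scores(n):
    """Earlier sentences score higher (a common heuristic, assumed here)."""
    return [(n - i) / n for i in range(n)]


def length_scores(sentences):
    """Sentence length in tokens, relative to the longest sentence."""
    return normalize([len(s.split()) for s in sentences])


def membership_high(x):
    """Ramp membership for 'high': 0 below 0.3, 1 above 0.7 (hypothetical breakpoints)."""
    return min(1.0, max(0.0, (x - 0.3) / 0.4))


def summarize(sentences, k):
    """Return the indices of the k best sentences, in document order."""
    if not sentences:
        return []
    features = zip(
        normalize(tf_isf_scores(sentences)),
        position_scores(len(sentences)),
        length_scores(sentences),
    )
    # Aggregate by averaging memberships; a real fuzzy system would apply rules.
    scores = [sum(membership_high(f) for f in fs) / 3 for fs in features]
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


def precision(selected, reference):
    """Fraction of selected sentence indices that are in the reference summary."""
    selected, reference = set(selected), set(reference)
    return len(selected & reference) / len(selected) if selected else 0.0


if __name__ == "__main__":
    doc = [
        "दिल्ली में आज भारी बारिश हुई ।",
        "बारिश से कई सड़कों पर पानी भर गया ।",
        "मौसम विभाग ने कल भी बारिश की चेतावनी दी ।",
        "लोग घर पर रहे ।",
    ]
    chosen = summarize(doc, 2)
    print(chosen, precision(chosen, [0, 2]))
```

Adding further features, such as the three ancillary ones named in the abstract, would amount to adding more score functions to the `zip` call and adjusting the divisor in the aggregation accordingly.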

Author information

Corresponding author

Correspondence to Archana N. Gulati.

Copyright information

© 2020 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Gulati, A.N., Sawarkar, S.D. (2020). Comparative Analysis of Hindi Text Summarization for Multiple Documents by Padding of Ancillary Features. In: Pant, M., Sharma, T., Basterrech, S., Banerjee, C. (eds) Performance Management of Integrated Systems and its Applications in Software Engineering. Asset Analytics. Springer, Singapore. https://doi.org/10.1007/978-981-13-8253-6_22
