Comparative Analysis of Hindi Text Summarization for Multiple Documents by Padding of Ancillary Features

Gulati, Archana N.; Sawarkar, Sudhir D.

doi:10.1007/978-981-13-8253-6_22

Part of the book series: Asset Analytics (ASAN)

Abstract

The volume of textual material is enormous and grows every day. The data available on the Internet comprises Web pages, news articles, status updates, and blogs, which are unstructured. There is a great need to reduce this text to shorter, focused summaries that capture the salient details, so that users can navigate it more effectively and check whether the larger documents contain the information they are looking for. Text summarization generates a shorter version of the original text. The need for summarization arises because readers do not always have time to read a document in full. Automatic text summarization methods are greatly needed to cope with the ever-growing amount of text available online, both to help discover relevant information and to consume it faster. To address this time constraint, this work proposes an extractive text summarization technique that selects important sentences from a text document to convey the gist of its content. A fuzzy technique is used to generate an extractive summary from multiple documents using feature sets of eight and eleven features. The eleven-feature set combines the eight existing features (term frequency-inverse sentence frequency, length of the sentence in the document, location of the sentence in the document, similarity between sentences, numerical data, title overlap, subject-object-verb (SOV) qualifier, lexical similarity) with three ancillary features (proper nouns, Hindi cue phrases, thematic words). Applying the fuzzy technique with eleven features gave better summarization results than applying it with eight features: precision increased by 3–5% across the datasets. The datasets were Hindi news articles from online sources.
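The abstract does not give the feature formulas or the fuzzy rule base, so the following is only a minimal Python sketch of feature-based extractive scoring under stated assumptions. It uses three of the named features (TF-ISF, sentence location, title overlap), splits sentences on the Devanagari danda, tokenizes on whitespace, and replaces the fuzzy inference step with a simple average of ramp-shaped "high" memberships. All function names, the ramp thresholds (0.2 and 0.8), and the averaging rule are hypothetical choices for illustration, not the authors' method.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Split on the danda and on ? and ! (a simplification)."""
    return [s.strip() for s in re.split(r"[।?!]", text) if s.strip()]


def tf_isf_scores(token_lists):
    """Mean TF-ISF weight per sentence, scaled to [0, 1]."""
    n = len(token_lists)
    sentence_freq = Counter(w for toks in token_lists for w in set(toks))
    raw = []
    for toks in token_lists:
        tf = Counter(toks)
        weight = sum(c * math.log(n / sentence_freq[w]) for w, c in tf.items())
        raw.append(weight / len(toks) if toks else 0.0)
    top = max(raw, default=0.0)
    return [r / top if top > 0 else 0.0 for r in raw]


def title_overlap(title_tokens, tokens):
    """Fraction of distinct title words that appear in the sentence."""
    title = set(title_tokens)
    return len(title & set(tokens)) / len(title) if title else 0.0


def high(x):
    """Assumed ramp membership for 'high': 0 below 0.2, 1 above 0.8."""
    return max(0.0, min(1.0, (x - 0.2) / 0.6))


def score_sentences(text, title):
    """Return (sentence, score) pairs; score is the mean 'high' membership."""
    sentences = split_sentences(text)
    token_lists = [s.split() for s in sentences]
    n = len(sentences)
    tfisf = tf_isf_scores(token_lists)
    title_tokens = title.split()
    scored = []
    for i, (sent, toks) in enumerate(zip(sentences, token_lists)):
        features = [
            tfisf[i],                           # TF-ISF
            1.0 - i / n,                        # location: earlier is higher
            title_overlap(title_tokens, toks),  # title overlap
        ]
        scored.append((sent, sum(high(f) for f in features) / len(features)))
    return scored


def summarize(text, title, k=2):
    """Pick the k best-scoring sentences and keep their original order."""
    scored = score_sentences(text, title)
    best = sorted(range(len(scored)), key=lambda i: scored[i][1], reverse=True)[:k]
    return [scored[i][0] for i in sorted(best)]
```

Calling `summarize(text, title, k)` on the concatenated text of several articles returns `k` sentences in document order. A real fuzzy system would replace the averaging step with a rule base and defuzzification, and the remaining features would be added to the `features` list in the same way.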

Author information

Authors and Affiliations

Department of Computer Engineering, Datta Meghe College of Engineering, Mumbai, Maharashtra, India
Archana N. Gulati
Datta Meghe College of Engineering, Mumbai, Maharashtra, India
Sudhir D. Sawarkar

Corresponding author

Correspondence to Archana N. Gulati.

Editor information

Editors and Affiliations

Department of Applied Science and Engineering, Indian Institute of Technology Roorkee, Roorkee, Uttarakhand, India
Millie Pant
Amity School of Engineering & Technology, Amity University Rajasthan, Jaipur, Rajasthan, India
Tarun K. Sharma
Department of Computer Science, Czech Technical University in Prague, Ostrava, Praha, Czech Republic
Sebastián Basterrech
Amity Institute of Information Technology, Amity University Rajasthan, Jaipur, Rajasthan, India
Chitresh Banerjee

About this chapter

Cite this chapter

Gulati, A.N., Sawarkar, S.D. (2020). Comparative Analysis of Hindi Text Summarization for Multiple Documents by Padding of Ancillary Features. In: Pant, M., Sharma, T., Basterrech, S., Banerjee, C. (eds) Performance Management of Integrated Systems and its Applications in Software Engineering. Asset Analytics. Springer, Singapore. https://doi.org/10.1007/978-981-13-8253-6_22

DOI: https://doi.org/10.1007/978-981-13-8253-6_22
Published: 11 September 2019
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8252-9
Online ISBN: 978-981-13-8253-6
eBook Packages: Business and Management, Business and Management (R0)
