Abstract
The volume of textual material is enormous and grows every day. The data available on the Internet comprises web pages, news articles, status updates, and blogs, most of which are unstructured. There is a great need to reduce this text to shorter, focused summaries that capture the salient details, so that users can navigate it more effectively and check whether a larger document contains the information they are looking for. Text summarization generates a shorter version of the original text. The need arises because readers often lack the time to read a document in full. Automatic text summarization methods are therefore needed to cope with the ever-growing amount of text available online, both to help discover relevant information and to consume it faster. To address this time constraint, this work proposes an extractive text summarization technique that selects important sentences from a text document to convey the gist of the information it contains. A fuzzy technique is used to generate extractive summaries from multiple documents using feature sets of eight and eleven features. The eleven-feature set combines the eight existing features (term frequency-inverse sentence frequency, length of the sentence in the document, location of the sentence in the document, similarity between sentences, numerical data, title overlap, subject object verb (SOV) qualifier, lexical similarity) with three ancillary features (proper nouns, Hindi cue phrases, thematic words). Applying the fuzzy technique with eleven features gave better summarization results than applying it with eight features: precision increased by 3–5% across the different datasets. The datasets were Hindi news articles from online sources.
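The pipeline the abstract describes (score each sentence on several features, combine the scores with fuzzy rules, keep the top-ranked sentences) might be sketched as follows. This is only an illustrative sketch and not the chapter's implementation: it uses three of the named features (TF-ISF, sentence location, title overlap), and the whitespace tokenizer, the triangular membership functions, the four rules, and every function name are assumptions made for the example.

```python
import math
from collections import Counter


def tokenize(sentence):
    # Assumption: whitespace tokenization; real Hindi text needs a proper tokenizer.
    return sentence.lower().split()


def tf_isf(sentences):
    """Mean TF-ISF per sentence, scaled to [0, 1] by the largest value."""
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)
    # Sentence frequency: number of sentences in which each term occurs.
    sf = Counter(term for toks in tokens for term in set(toks))
    raw = []
    for toks in tokens:
        tf = Counter(toks)
        score = sum(tf[t] * math.log(n / sf[t]) for t in tf)
        raw.append(score / len(toks) if toks else 0.0)
    top = max(raw) or 1.0
    return [r / top for r in raw]


def title_overlap(sentence, title):
    """Fraction of title terms that also appear in the sentence."""
    title_terms = set(tokenize(title))
    if not title_terms:
        return 0.0
    return len(title_terms & set(tokenize(sentence))) / len(title_terms)


def triangular(x, a, b, c):
    """Triangular membership function with feet a and c and peak b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x):
    # Hypothetical partition of [0, 1] into three fuzzy sets.
    return {
        "low": triangular(x, -0.5, 0.0, 0.5),
        "medium": triangular(x, 0.0, 0.5, 1.0),
        "high": triangular(x, 0.5, 1.0, 1.5),
    }


def importance(features):
    """Combine fuzzified features with a small, hypothetical rule base."""
    f = {name: fuzzify(value) for name, value in features.items()}
    # Each rule is (firing strength, crisp output); AND = min, OR = max.
    rules = [
        (min(f["tf_isf"]["high"], f["title"]["high"]), 1.0),
        (max(f["tf_isf"]["high"], f["position"]["high"]), 0.8),
        (max(f["tf_isf"]["medium"], f["title"]["medium"]), 0.5),
        (min(f["tf_isf"]["low"], f["title"]["low"]), 0.0),
    ]
    total = sum(strength for strength, _ in rules)
    # Weighted-average defuzzification; 0.0 when no rule fires.
    return sum(s * out for s, out in rules) / total if total else 0.0


def summarize(sentences, title, k):
    """Return the k highest-scoring sentences in their original order."""
    n = len(sentences)
    tfisf = tf_isf(sentences)
    scores = []
    for i, sentence in enumerate(sentences):
        features = {
            "tf_isf": tfisf[i],
            "position": (n - i) / n,  # earlier sentences score higher
            "title": title_overlap(sentence, title),
        }
        scores.append(importance(features))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(ranked)]
```

Calling `summarize(sentences, title, k)` on a list of sentences returns `k` of them in document order. Extending the sketch to the eight or eleven features named above would mean adding one normalized score per feature and rules that reference it.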
Copyright information
© 2020 Springer Nature Singapore Pte Ltd.
About this chapter
Cite this chapter
Gulati, A.N., Sawarkar, S.D. (2020). Comparative Analysis of Hindi Text Summarization for Multiple Documents by Padding of Ancillary Features. In: Pant, M., Sharma, T., Basterrech, S., Banerjee, C. (eds) Performance Management of Integrated Systems and its Applications in Software Engineering. Asset Analytics. Springer, Singapore. https://doi.org/10.1007/978-981-13-8253-6_22
DOI: https://doi.org/10.1007/978-981-13-8253-6_22
Publisher Name: Springer, Singapore
Print ISBN: 978-981-13-8252-9
Online ISBN: 978-981-13-8253-6
eBook Packages: Business and Management (R0)