Sādhanā, 44:110

A novel approach for text summarization using optimal combination of sentence scoring methods

  • Pradeepika Verma
  • Hari Om

Abstract

This paper introduces a novel multi-document summarization scheme based on metaheuristic optimization that generates a summary by extracting salient and relevant sentences from a collection of documents. The scheme uses a metaheuristic approach, teaching–learning-based optimization (TLBO), to find an optimal combination of sentence scoring methods, together with their respective optimal weights, for sentence extraction. The proposed scheme is also compared with two summarization methods that use different metaheuristic approaches, and the experimental results show its efficacy.
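
The idea of tuning weights for several sentence scoring methods with teaching–learning-based optimization can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`tlbo_maximize`, `top_k`), the toy score matrix, the reference set and the overlap-based fitness are all assumptions made for illustration. The sketch only optimizes weights in [0, 1]; a weight near zero effectively drops a scoring method, which is a simplification of choosing a combination of methods.

```python
import random

def tlbo_maximize(fitness, dim, pop_size=20, iters=50, seed=0):
    """Basic TLBO (teacher phase, then learner phase) over the box [0, 1]^dim."""
    rng = random.Random(seed)
    clip = lambda v: min(1.0, max(0.0, v))
    pop = [[rng.random() for _ in range(dim)] for _ in range(pop_size)]
    fit = [fitness(x) for x in pop]

    def try_replace(i, cand):
        # Greedy acceptance: keep the candidate only if it improves learner i.
        f = fitness(cand)
        if f > fit[i]:
            pop[i], fit[i] = cand, f

    for _ in range(iters):
        # Teacher phase: move each learner towards the best solution and
        # away from the (scaled) population mean.
        teacher = pop[max(range(pop_size), key=fit.__getitem__)]
        mean = [sum(x[d] for x in pop) / pop_size for d in range(dim)]
        for i in range(pop_size):
            tf = rng.choice((1, 2))  # teaching factor
            cand = [clip(pop[i][d] + rng.random() * (teacher[d] - tf * mean[d]))
                    for d in range(dim)]
            try_replace(i, cand)
        # Learner phase: learn from a random peer; move away from a worse
        # peer and towards a better one.
        for i in range(pop_size):
            j = rng.choice([k for k in range(pop_size) if k != i])
            sign = 1.0 if fit[i] > fit[j] else -1.0
            cand = [clip(pop[i][d] + sign * rng.random() * (pop[i][d] - pop[j][d]))
                    for d in range(dim)]
            try_replace(i, cand)

    best = max(range(pop_size), key=fit.__getitem__)
    return pop[best], fit[best]

def top_k(weights, score_matrix, k):
    """Indices of the k sentences with the highest weighted combined score."""
    scores = [sum(w * s for w, s in zip(weights, row)) for row in score_matrix]
    return sorted(range(len(scores)), key=lambda i: -scores[i])[:k]

# Toy data (hypothetical): rows are sentences, columns are three scoring methods.
SCORES = [[0.9, 0.1, 0.4], [0.2, 0.8, 0.3], [0.1, 0.2, 0.9], [0.7, 0.3, 0.6]]
REFERENCE = {0, 3}  # hypothetical set of sentences a good summary should contain

def overlap_fitness(weights):
    # Assumed fitness: how many selected sentences appear in the reference set.
    return len(REFERENCE & set(top_k(weights, SCORES, k=2)))

best_weights, best_fit = tlbo_maximize(overlap_fitness, dim=3)
print(best_weights, best_fit)
```

In practice the fitness would be computed from summary quality criteria rather than a hand-picked reference set; the overlap count is used here only to keep the example self-contained.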

Keywords

Multi-document summarization; word embedding; TLBO; cohesion; readability; non-redundancy 

Copyright information

© Indian Academy of Sciences 2019

Authors and Affiliations

  1. Department of Computer Science and Engineering, Indian Institute of Technology (Indian School of Mines), Dhanbad, India