Owing to the phenomenal growth in communication technology, most of us hardly have time to read books, and the habit of reading is slowly diminishing as people's lives grow busier. For visually challenged people, the situation is even worse. To address this impediment, we propose a methodology that aims to be more accurate than the existing ones. To spare readers the effort of going through the complete text every time, we modify the Weighted TF_IDF (Term Frequency Inverse Document Frequency) algorithm to summarize books into relevant keywords. We then compare the modified algorithm with four existing algorithms: TextRank, Luhn's algorithm, LexRank, and Latent Semantic Analysis (LSA). From the comparative analysis, we find that Weighted TF_IDF is an efficient algorithm for automating text summarization, and that it produces an effective summary, which is then converted from text to speech. Thus, the proposed algorithm would be highly useful for blind people.
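As a rough illustration of the general idea behind TF-IDF-based extractive summarization, the following minimal sketch scores each sentence by the sum of its terms' TF-IDF weights and keeps the highest-scoring sentences in their original order. It is not the modified Weighted TF_IDF algorithm of this work, whose details are not given in the abstract. The function names (`split_sentences`, `tokenize`, `tfidf_summary`), the naive sentence splitter, and the choice to treat each sentence as a "document" for the inverse document frequency are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (assumption): break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence and keep alphabetic tokens only."""
    return re.findall(r"[a-z']+", sentence.lower())


def tfidf_summary(text, n_sentences=2):
    """Return the n_sentences highest-scoring sentences, in original order.

    Plain TF-IDF, with each sentence treated as one 'document'.
    This is an illustrative baseline, not the paper's modified algorithm.
    """
    sentences = split_sentences(text)
    tokens = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: in how many sentences does each term occur?
    df = Counter(term for toks in tokens for term in set(toks))

    scores = []
    for toks in tokens:
        if not toks:
            scores.append(0.0)
            continue
        tf = Counter(toks)
        # Sum of (term frequency) * (inverse document frequency) over the sentence.
        score = sum(
            (count / len(toks)) * math.log(n / df[term])
            for term, count in tf.items()
        )
        scores.append(score)

    # Pick the top-scoring indices, then restore document order.
    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:n_sentences]
    return [sentences[i] for i in sorted(top)]


if __name__ == "__main__":
    sample = "The cat sat. The cat sat. The cat sat on a purple mat."
    print(tfidf_summary(sample, n_sentences=1))
```

In this sketch, a term that appears in every sentence has an inverse document frequency of zero and so contributes nothing, which is why sentences containing rarer terms rank higher. The selected sentences could then be passed to any text-to-speech engine.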
This research was supported by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program.
Basheer, S., Anbarasi, M., Sakshi, D.G. et al. Efficient text summarization method for blind people using text mining techniques. Int J Speech Technol (2020). https://doi.org/10.1007/s10772-020-09712-z
- Text ranking algorithm