
Efficient text summarization method for blind people using text mining techniques

International Journal of Speech Technology

Abstract

Owing to the phenomenal growth in communication technology, most of us hardly have time to read books. The habit of reading is slowly diminishing because of people's busy lives. For visually challenged people, the situation is even worse. To address this impediment, we develop a methodology that is better and more accurate than the existing ones. To save the effort of reading the complete text every time, we modify the Weighted TF_IDF (Term Frequency-Inverse Document Frequency) algorithm to summarize books into relevant keywords. We then compare the modified algorithm with existing algorithms: TextRank, Luhn’s algorithm, LexRank, and Latent Semantic Analysis (LSA). From the comparative analysis, we find that Weighted TF_IDF is an efficient algorithm for automating text summarization and producing an effective summary, which is then converted from text to speech. Thus, the proposed algorithm would be highly useful for blind people.
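As a rough illustration of the kind of keyword scoring the abstract refers to, the following is a minimal sketch of plain (unmodified) TF-IDF keyword extraction. It is not the authors' weighted variant, whose details are not given in the abstract; the function names `tokenize` and `tfidf_keywords`, the tokenization rule, and the sample `chapters` are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_keywords(documents, doc_index, top_k=3):
    """Return the top_k highest-scoring TF-IDF terms of one document.

    Plain TF-IDF only: tf = count / document length, idf = ln(N / df).
    The paper's modified weighting is NOT reproduced here.
    """
    tokenized = [tokenize(doc) for doc in documents]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    tokens = tokenized[doc_index]
    if not tokens:
        return []

    scores = {}
    for term, count in Counter(tokens).items():
        tf = count / len(tokens)
        idf = math.log(n_docs / doc_freq[term])
        scores[term] = tf * idf

    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [term for term, _ in ranked[:top_k]]


# Hypothetical sample text standing in for chapters of a book.
chapters = [
    "the whale swam and the whale dived near the ship",
    "the captain steered the ship through the storm",
    "the crew sang while the storm passed over the ship",
]
print(tfidf_keywords(chapters, doc_index=0, top_k=1))
```

Terms that occur in every sample chapter (such as "the" and "ship") receive an IDF of zero and fall to the bottom of the ranking, which is why TF-IDF tends to surface words that are distinctive for a given passage.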

Acknowledgements

This research was supported by the Deanship of Scientific Research at Princess Nourah bint Abdulrahman University through the Fast-track Research Funding Program.

Author information

Corresponding author

Correspondence to M. Anbarasi.

About this article

Cite this article

Basheer, S., Anbarasi, M., Sakshi, D.G. et al. Efficient text summarization method for blind people using text mining techniques. Int J Speech Technol 23, 713–725 (2020). https://doi.org/10.1007/s10772-020-09712-z

  • DOI: https://doi.org/10.1007/s10772-020-09712-z
