AI-Chatbot Using Deep Learning to Assist the Elderly

  • Guido Tascini
Part of the Contemporary Systems Thinking book series (CST)


Bots and chatbots, both Artificial Intelligence software systems, have recently appeared online. They create a conversation between a virtual agent and the user. This paper describes an Artificial Intelligence chatbot that converses with elderly persons who have age-related problems. The chatbot understands natural language and learns from interactions, increasing its knowledge; keeps track of commitments and medicines; connects remotely with doctors and family; controls the transmission of physiological parameters; and entertains the elder. It does this with machine learning algorithms. To learn functions with high-level abstractions, such as natural language, we adopt deep architectures, which are composed of multiple levels of non-linear operations, such as neural nets with many hidden layers. We used the learning algorithm for Deep Belief Networks (DBN) proposed by Hinton et al. Experiments confirm the effectiveness of its training strategy, which initializes the weights in a region near a good local minimum.
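
The greedy layer-wise strategy behind DBN training can be illustrated with a minimal sketch. It is not the chapter's implementation: the class `RBM`, the function `pretrain_dbn`, the layer sizes, the learning rate, the epoch count, and the toy data are all assumptions chosen for illustration. Each layer is a binary restricted Boltzmann machine trained with one-step contrastive divergence (CD-1); once a layer is trained, its hidden activations become the input of the next layer. The resulting weights would then serve as the starting point for supervised fine-tuning, which is not shown.

```python
import math
import random


def sigmoid(x):
    # Numerically stable logistic function.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


class RBM:
    """Binary restricted Boltzmann machine (hypothetical helper class)."""

    def __init__(self, n_visible, n_hidden, rng):
        self.rng = rng
        # Small random weights; biases start at zero.
        self.W = [[rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
                  for _ in range(n_visible)]
        self.b_v = [0.0] * n_visible
        self.b_h = [0.0] * n_hidden

    def hidden_probs(self, v):
        return [sigmoid(self.b_h[j] + sum(v[i] * self.W[i][j] for i in range(len(v))))
                for j in range(len(self.b_h))]

    def visible_probs(self, h):
        return [sigmoid(self.b_v[i] + sum(h[j] * self.W[i][j] for j in range(len(h))))
                for i in range(len(self.b_v))]

    def sample(self, probs):
        return [1.0 if self.rng.random() < p else 0.0 for p in probs]

    def cd1_update(self, v0, lr):
        # Positive phase: hidden probabilities given the data vector.
        h0 = self.hidden_probs(v0)
        # Negative phase: one Gibbs step (sample hidden, reconstruct visible).
        v1 = self.visible_probs(self.sample(h0))
        h1 = self.hidden_probs(v1)
        for i in range(len(v0)):
            for j in range(len(h0)):
                self.W[i][j] += lr * (v0[i] * h0[j] - v1[i] * h1[j])
            self.b_v[i] += lr * (v0[i] - v1[i])
        for j in range(len(h0)):
            self.b_h[j] += lr * (h0[j] - h1[j])


def pretrain_dbn(data, layer_sizes, epochs=50, lr=0.1, seed=0):
    """Greedy layer-wise pretraining: train one RBM per layer, bottom up."""
    rng = random.Random(seed)
    rbms, inputs = [], data
    for n_hidden in layer_sizes:
        rbm = RBM(len(inputs[0]), n_hidden, rng)
        for _ in range(epochs):
            for v in inputs:
                rbm.cd1_update(v, lr)
        # Freeze this layer; its hidden probabilities feed the next RBM.
        inputs = [rbm.hidden_probs(v) for v in inputs]
        rbms.append(rbm)
    return rbms


# Toy binary data (an assumption, not data from the chapter).
toy = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 4
stack = pretrain_dbn(toy, layer_sizes=[3, 2])
top_features = stack[1].hidden_probs(stack[0].hidden_probs(toy[0]))
```

Passing hidden probabilities rather than sampled binary states to the next layer is a common simplification; a real system would also train on mini-batches with a vectorized library rather than on plain Python lists.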


  1. Auli, M., Galley, M., Quirk, C., & Zweig, G. (2013). Joint language and translation modeling with recurrent neural networks. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) (pp. 1044–1054), Seattle, Washington, 18–21 October 2013.
  2. Bahdanau, D., Cho, K., & Bengio, Y. (2014). Neural machine translation by jointly learning to align and translate. arXiv:1409.0473.
  3. Bengio, Y., Ducharme, R., Vincent, P., & Jauvin, C. (2003). A neural probabilistic language model. Journal of Machine Learning Research, 3, 1137–1155.
  4. Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of deep networks. In B. Schölkopf, J. Platt, & T. Hofmann (Eds.), Advances in neural information processing systems 19 (pp. 153–160). Cambridge: MIT Press.
  5. Bordes, A., Chopra, S., & Weston, J. (2014). Question answering with subgraph embeddings. In Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP 2014), October 25–29, 2014, Doha (pp. 615–620).
  6. Cho, K., Merrienboer, B., Gulcehre, C., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv:1406.1078.
  7. Ciresan, D., Meier, U., & Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 3642–3649).
  8. Dahl, G. E., Yu, D., Deng, L., & Acero, A. (2012). Context-dependent pre-trained deep neural networks for large vocabulary speech recognition. IEEE Transactions on Audio, Speech, and Language Processing, 20(1), 30–42 (Special Issue on Deep Learning for Speech and Language Processing).
  9. Deng, L., & Li, X. (2013). Machine learning paradigms for speech recognition: An overview. IEEE Transactions on Audio, Speech, and Language Processing, 21(5), 1060–1089.
  10. Devlin, J., Zbib, R., Huang, Z., Lamar, T., Schwartz, R., & Makhoul, J. (2014). Fast and robust neural network joint models for statistical machine translation. In Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (ACL 2014) (pp. 1370–1380).
  11. Durrani, N., Haddow, B., Koehn, P., & Heafield, K. (2014). Edinburgh’s phrase-based machine translation systems for WMT-14. In Proceedings of the Ninth Workshop on Statistical Machine Translation (pp. 97–104).
  12. Graves, A. (2013). Generating sequences with recurrent neural networks. arXiv:1308.0850.
  13. Hermann, K. M., & Blunsom, P. (2014). Multilingual distributed representations without word alignment. In Proceedings of International Conference on Learning Representations (ICLR 2014). arXiv:1312.6173.
  14. Hinton, G. E., Deng, L., Yu, D., Dahl, G. E., Mohamed, A., Jaitly, N., et al. (2012). Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups. IEEE Signal Processing Magazine, 29(6), 82–97.
  15. Hinton, G. E., Osindero, S., & Teh, Y. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18, 1527–1554.
  16. Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
  17. Kalchbrenner, N., & Blunsom, P. (2013). Recurrent continuous translation models. In Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing (EMNLP 2013) (pp. 1700–1709), Seattle, Washington, 18–21 October 2013.
  18. Larochelle, H., Bengio, Y., Louradour, J., & Lamblin, P. (2009). Exploring strategies for training deep neural networks. Journal of Machine Learning Research, 10(Jan), 1–40.
  19. Larochelle, H., Erhan, D., Courville, A., Bergstra, J., & Bengio, Y. (2007). An empirical evaluation of deep architectures on problems with many factors of variation. In Proceedings of the 24th International Conference on Machine Learning (ICML 2007).
  20. Larochelle, H., Erhan, D., & Vincent, P. (2009). Deep learning using robust interdependent codes. In D. A. Van Dyk & M. Welling (Eds.), Proceedings of the 12th International Conference on Artificial Intelligence and Statistics (AISTATS 2009) (pp. 312–319). JMLR Proceedings 5.
  21. Le, Q. V., Ranzato, M. A., Monga, R., Devin, M., Chen, K., Corrado, G. S., et al. (2012). Building high-level features using large scale unsupervised learning. In Proceedings of the 29th International Conference on Machine Learning (ICML 2012) (pp. 507–514).
  22. Li, J., Galley, M., Brockett, C., Spithourakis, G. P., Gao, J., & Dolan, B. (2016). A persona-based neural conversation model. In Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (pp. 994–1003).
  23. Mikolov, T., Chen, K., Corrado, G., & Dean, J. (2013). Efficient estimation of word representations in vector space. arXiv:1301.3781.
  24. Pascanu, R., Mikolov, T., & Bengio, Y. (2012). On the difficulty of training recurrent neural networks. arXiv:1211.5063.
  25. Pouget-Abadie, J., Bahdanau, D., van Merrienboer, B., Cho, K., & Bengio, Y. (2014). Overcoming the curse of sentence length for neural machine translation using automatic segmentation. arXiv:1409.1257.
  26. Salakhutdinov, R., & Larochelle, H. (2010). Efficient learning of deep Boltzmann machines. In Proceedings of the 13th International Conference on Artificial Intelligence and Statistics (AISTATS 2010) (pp. 693–700).
  27. Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Proceedings of the 27th International Conference on Neural Information Processing Systems (NIPS 2014) (pp. 3104–3112).
  28. Vincent, P., Larochelle, H., Bengio, Y., & Manzagol, P.-A. (2008). Extracting and composing robust features with denoising autoencoders. In Proceedings of the 25th International Conference on Machine Learning (ICML 2008) (pp. 1096–1103). New York: ACM.
  29. Vincent, P., Larochelle, H., Lajoie, I., Bengio, Y., & Manzagol, P.-A. (2010). Stacked denoising autoencoders: Learning useful representations in a deep network with a local denoising criterion. Journal of Machine Learning Research, 11(Dec), 3371–3408.
  30. Yao, K., Zweig, G., & Peng, B. (2015). Attention with intention for a neural network conversation model. arXiv:1510.08565v3.
  31. Yih, W., Chang, M.-W., Meek, C., & Pastusiak, A. (2013). Question answering using enhanced lexical semantic models. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (pp. 1744–1753), Sofia, August 4–9, 2013.
  32. Web Resource

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Guido Tascini (1)
  1. Centro Studi e Ricerca “G. B. Carducci”, Fermo, Italy
