Deep Learning and Bayesian Networks for Labelling User Activity Context Through Acoustic Signals

  • Francisco J. Rodríguez Lera
  • Francisco Martín Rico
  • Vicente Matellán
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 10338)


Context awareness in autonomous robots is usually achieved by combining localization information, object identification, human interaction and time of day. We argue that gathering environmental sounds can improve context recognition. To that end, we have designed, developed and tested an Environment Recognition Component (ERC) that provides an extra input to our Context-Awareness Component (CAC) and increases the rate at which users’ activities are labeled correctly. The ERC uses convolutional neural networks to classify acoustic signals and passes the result to the CAC, which infers the user activity using a hierarchical Bayesian network. The work described in this paper evaluates the results of the labeling process in two HRI scenarios: when the robot and the user share a room, and when the human and the robot are in different rooms. The results show better accuracy when the acoustic signals processed by the ERC are used.
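As a rough illustration of why an extra acoustic input can change the inferred activity, the sketch below fuses a location observation with a sound label using Bayes' rule. It is a minimal sketch under stated assumptions, not the authors' implementation: the network here is flat rather than hierarchical, the sound label stands in for the output of a CNN classifier such as the ERC, and every activity name, evidence value and probability is hypothetical.

```python
# Hypothetical illustration: activities, evidence values and probabilities
# are invented for this sketch and do not come from the paper.

# Prior belief over user activities, P(activity).
PRIOR = {"cooking": 0.3, "watching_tv": 0.4, "sleeping": 0.3}

# Likelihood of the observed room given the activity, P(room | activity).
P_ROOM = {
    "cooking": {"kitchen": 0.9, "living_room": 0.1},
    "watching_tv": {"kitchen": 0.2, "living_room": 0.8},
    "sleeping": {"kitchen": 0.1, "living_room": 0.9},
}

# Likelihood of the sound label given the activity, P(sound | activity).
# In a real system the label would come from an acoustic classifier.
P_SOUND = {
    "cooking": {"kettle": 0.7, "tv_audio": 0.2, "silence": 0.1},
    "watching_tv": {"kettle": 0.1, "tv_audio": 0.8, "silence": 0.1},
    "sleeping": {"kettle": 0.05, "tv_audio": 0.05, "silence": 0.9},
}


def posterior(room, sound=None):
    """Return P(activity | evidence), assuming room and sound are
    conditionally independent given the activity."""
    scores = {}
    for activity, prior in PRIOR.items():
        score = prior * P_ROOM[activity][room]
        if sound is not None:
            score *= P_SOUND[activity][sound]
        scores[activity] = score
    total = sum(scores.values())
    return {activity: score / total for activity, score in scores.items()}


def most_likely(room, sound=None):
    """Return the activity label with the highest posterior probability."""
    post = posterior(room, sound)
    return max(post, key=post.get)


if __name__ == "__main__":
    print(most_likely("living_room"))             # location only
    print(most_likely("living_room", "silence"))  # location plus sound
```

With these made-up numbers, location alone favours one label while adding the sound evidence shifts the posterior to another, which is the kind of disambiguation the acoustic input is meant to provide.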


Acoustic Signal, Automatic Speech Recognition, Convolutional Neural Network, Autonomous Robot, Convolutional Deep Neural Network
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer International Publishing AG 2017

Authors and Affiliations

  • Francisco J. Rodríguez Lera (1)
  • Francisco Martín Rico (2)
  • Vicente Matellán (3)

  1. AI Robolab, University of Luxembourg, Luxembourg, Luxembourg
  2. Universidad Rey Juan Carlos, Madrid, Spain
  3. Robotics Group, Universidad de León, León, Spain
