Automatic Analysis of Speech and Acoustic Events for Ambient Assisted Living

  • Alexey Karpov
  • Alexander Ronzhin
  • Irina Kipyatkova
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 9176)


We present a prototype of an ambient assisted living (AAL) environment with multimodal user interaction. In our research, the AAL environment is a single studio room of more than 60 square meters that contains several tables, chairs, and a sink, and is equipped with four stationary microphones and two omni-directional video cameras. In this paper, we focus mainly on audio signal processing techniques for monitoring the assistive smart space and on recognition of speech and non-speech acoustic events, which supports automatic analysis of human activities and detection of possible emergency situations in which the user needs urgent help. Acoustic modeling in our audio recognition system is based on first-order Hidden Markov Models with Gaussian Mixture Models. The recognition vocabulary includes 12 non-speech acoustic events for different types of human activities plus 5 useful spoken commands (keywords), including a subset of alarm audio events. We have collected an audio-visual corpus containing about 1.3 h of audio data from 5 testers, who performed the proposed test scenarios, and we report the results of practical experiments with the system in this paper.
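To make the acoustic modeling step concrete, the following is a minimal sketch of how a bank of Hidden Markov Models with Gaussian mixture emissions can label an audio segment: each class has its own model, the forward algorithm scores the segment under every model, and the highest-scoring class wins. It is not the authors' implementation. The function names, the two class labels (`event_a`, `event_b`), the two-state left-to-right topology, the one-dimensional features, and all numeric parameters are illustrative assumptions; a real system would use trained parameters and multi-dimensional spectral features.

```python
import math

NEG_INF = -math.inf


def safe_log(p):
    """Log of a probability, mapping 0 to -inf instead of raising."""
    return math.log(p) if p > 0.0 else NEG_INF


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) that tolerates -inf entries."""
    m = max(values)
    if m == NEG_INF:
        return NEG_INF
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - mu) ** 2 / v)
        for xi, mu, v in zip(x, mean, var)
    )


def log_gmm(x, mixture):
    """Log density of a GMM given as a list of (weight, mean, var) tuples."""
    return log_sum_exp(
        [safe_log(w) + log_gaussian_diag(x, mu, var) for w, mu, var in mixture]
    )


def hmm_log_likelihood(frames, model):
    """Forward algorithm in the log domain: log P(frames | model)."""
    log_init = model["log_init"]
    log_trans = model["log_trans"]
    emissions = model["emissions"]
    n = len(log_init)
    alpha = [log_init[s] + log_gmm(frames[0], emissions[s]) for s in range(n)]
    for x in frames[1:]:
        alpha = [
            log_sum_exp([alpha[p] + log_trans[p][s] for p in range(n)])
            + log_gmm(x, emissions[s])
            for s in range(n)
        ]
    return log_sum_exp(alpha)


def classify(frames, models):
    """Return the label whose model assigns the highest likelihood."""
    return max(models, key=lambda name: hmm_log_likelihood(frames, models[name]))


def make_left_to_right_model(state_means):
    """Hypothetical 2-state left-to-right model with toy parameters."""
    trans = [[0.6, 0.4], [0.0, 1.0]]
    return {
        "log_init": [safe_log(1.0), safe_log(0.0)],
        "log_trans": [[safe_log(p) for p in row] for row in trans],
        # Two mixture components per state, unit variance (assumed values).
        "emissions": [
            [(0.5, [m - 0.2], [1.0]), (0.5, [m + 0.2], [1.0])]
            for m in state_means
        ],
    }


models = {
    "event_a": make_left_to_right_model([0.0, 1.0]),
    "event_b": make_left_to_right_model([5.0, 6.0]),
}

segment = [[0.1], [0.2], [0.9], [1.1]]  # toy 1-D feature frames
print(classify(segment, models))
```

Working in the log domain avoids numerical underflow on long segments, and the same scoring loop applies unchanged to spoken keywords and non-speech events as long as each has its own model.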


Ambient assisted living, Assistive technology, Multimodal user interfaces, Universal access, Human-Computer interaction, Automatic speech recognition, Acoustic event detection



This research is partially supported by the Council for Grants of the President of Russia (Projects No. MD-3035.2015.8 and MK-5209.2015.8), by the Russian Foundation for Basic Research (Projects No. 15-07-04415 and 15-07-04322), and by the Government of the Russian Federation (Grant No. 074-U01).



Copyright information

© Springer International Publishing Switzerland 2015

Authors and Affiliations

  • Alexey Karpov (1, 2)
  • Alexander Ronzhin (1)
  • Irina Kipyatkova (1)
  1. St. Petersburg Institute for Informatics and Automation of the Russian Academy of Sciences, St. Petersburg, Russia
  2. ITMO University, Saint-Petersburg, Russian Federation
