Abstract
In the framework of Ambient Assisted Living, home automation may be a solution for helping elderly people who live alone at home. This study is part of the Sweet-Home project, which aims to develop a new home automation system based on voice command to improve the support and well-being of people losing their autonomy. The goal of the study is voice command recognition, with a focus on two aspects: distant speech recognition and sentence spotting. Several ASR techniques were evaluated on a realistic corpus acquired in a 4-room flat equipped with microphones set in the ceiling. This French distant speech corpus was recorded with 21 speakers who acted out scenarios of activities of daily living. Techniques acting at the decoding stage, such as our novel approach called Driven Decoding Algorithm (DDA), gave better speech recognition results than the baseline and other approaches. This solution, which uses the two channels with the best SNR and a priori knowledge (voice commands and distress sentences), increased the recognition rate without introducing false alarms. More generally, a short overview then outlines the research challenges that speech technologies must address for Ambient Assisted Living and Augmentative and Alternative Communication, and the current research avenues in this domain.
Notes
- 2. camera-contact.com.
Acknowledgments
This work is part of the Sweet-Home project supported by the French National Research Agency (Agence Nationale de la Recherche / ANR-09-VERS-011).
Copyright information
© 2015 Springer International Publishing Switzerland
About this chapter
Cite this chapter
Vacher, M., Lecouteux, B., Portet, F. (2015). On Distant Speech Recognition for Home Automation. In: Holzinger, A., Röcker, C., Ziefle, M. (eds) Smart Health. Lecture Notes in Computer Science, vol 8700. Springer, Cham. https://doi.org/10.1007/978-3-319-16226-3_7
DOI: https://doi.org/10.1007/978-3-319-16226-3_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-16225-6
Online ISBN: 978-3-319-16226-3
eBook Packages: Computer Science (R0)