On Distant Speech Recognition for Home Automation

  • Chapter
Smart Health

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 8700)

Abstract

In the framework of Ambient Assisted Living, home automation may be a solution for helping elderly people living alone at home. This study is part of the Sweet-Home project, which aims to develop a new voice-command home automation system to improve the support and well-being of people losing their autonomy. The goal of the study is voice command recognition, with a focus on two aspects: distant speech recognition and sentence spotting. Several ASR techniques were evaluated on a realistic corpus acquired in a four-room flat equipped with microphones set in the ceiling. This distant-speech French corpus was recorded with 21 speakers who acted out scenarios of activities of daily living. Techniques acting at the decoding stage, such as our novel approach called the Driven Decoding Algorithm (DDA), gave better speech recognition results than the baseline and other approaches. This solution, which uses the two channels with the best SNR and a priori knowledge (voice commands and distress sentences), increased the recognition rate without introducing false alarms. Finally, a short overview outlines the research challenges that speech technologies must take up for Ambient Assisted Living and Augmentative and Alternative Communication, and the current research avenues in this domain.

Acknowledgments

This work is part of the Sweet-Home project supported by the French National Research Agency (Agence Nationale de la Recherche / ANR-09-VERS-011).

Author information

Corresponding author

Correspondence to Michel Vacher.

Copyright information

© 2015 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Vacher, M., Lecouteux, B., Portet, F. (2015). On Distant Speech Recognition for Home Automation. In: Holzinger, A., Röcker, C., Ziefle, M. (eds) Smart Health. Lecture Notes in Computer Science, vol 8700. Springer, Cham. https://doi.org/10.1007/978-3-319-16226-3_7

  • DOI: https://doi.org/10.1007/978-3-319-16226-3_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-16225-6

  • Online ISBN: 978-3-319-16226-3

  • eBook Packages: Computer Science (R0)
