Recognition of Distant Voice Commands for Home Applications in Portuguese

Matos, Miguel; Abad, Alberto; Astudillo, Ramón; Trancoso, Isabel

doi:10.1007/978-3-319-13623-3_19


Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8854)


Abstract

This paper presents a set of exploratory experiments that analyse and evaluate the performance of baseline speech processing components in European Portuguese for distant voice command recognition in domestic environments. The analysis, conducted in a multi-channel, multi-room scenario, shows the importance of adequate room detection and channel selection strategies for obtaining acceptable performance. Two computationally inexpensive channel selection measures have been investigated for room detection, channel selection and cluster selection. Experimental results show that the strategies based on the envelope-variance measure consistently outperform the other methods investigated and, in particular, that channel selection strategies can be more convenient than baseline beamforming methods, such as delay-and-sum, in this type of multi-room scenario.
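
The abstract does not spell out how the envelope-variance measure is computed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that reverberation smears the energy envelope of speech and lowers its dynamic range, so the microphone whose compressed envelope varies most is taken as the least distorted. The single full-band envelope, the frame sizes, and every function name here (frame_energies, envelope_variance, select_channel, toy_reverb, toy_bursts) are assumptions made for illustration; published formulations of the measure typically work on sub-band envelopes.

```python
import math
import random
from statistics import pvariance


def frame_energies(signal, frame_len=200, hop=100):
    """Short-time energy per frame: a crude full-band envelope (assumption)."""
    return [
        sum(x * x for x in signal[start:start + frame_len])
        for start in range(0, len(signal) - frame_len + 1, hop)
    ]


def envelope_variance(signal, frame_len=200, hop=100, eps=1e-12):
    """Variance of the mean-normalised, cube-root-compressed envelope."""
    log_env = [math.log(e + eps) for e in frame_energies(signal, frame_len, hop)]
    # Subtracting the mean in the log domain removes a constant channel gain.
    mean_log = sum(log_env) / len(log_env)
    # Dividing the log by 3 before exponentiating is cube-root compression.
    compressed = [math.exp((v - mean_log) / 3.0) for v in log_env]
    return pvariance(compressed)


def select_channel(channels):
    """Return (index of the channel with the highest score, all scores)."""
    scores = [envelope_variance(ch) for ch in channels]
    best = max(range(len(scores)), key=lambda i: scores[i])
    return best, scores


def toy_reverb(signal, delay=50, gain=0.9):
    """Hypothetical stand-in for a reverberant room: a feedback comb filter."""
    out = list(signal)
    for n in range(delay, len(out)):
        out[n] += gain * out[n - delay]
    return out


def toy_bursts(n_samples=8000, burst=1000, seed=0):
    """Synthetic 'speech': alternating loud and near-silent noise segments."""
    rng = random.Random(seed)
    return [
        rng.uniform(-1.0, 1.0) * (1.0 if (n // burst) % 2 == 0 else 0.01)
        for n in range(n_samples)
    ]
```

Calling select_channel([toy_reverb(toy_bursts()), toy_bursts()]) scores a smeared and a clean copy of the same synthetic signal. Unlike delay-and-sum beamforming, this kind of selection needs no time-delay estimation between microphones, which is one plausible reason it suits microphones spread across several rooms.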



Author information

Authors and Affiliations

L2F - Spoken Language Systems Lab, INESC-ID, Lisboa, Portugal
Miguel Matos, Alberto Abad, Ramón Astudillo & Isabel Trancoso
IST - Instituto Superior Técnico, University of Lisbon, Portugal
Miguel Matos, Alberto Abad & Isabel Trancoso


Editor information

Editors and Affiliations

ETSIT, Las Palmas de Gran Canaria, Spain
Juan Luis Navarro Mesa, Eduardo Hernández Pérez, Pedro Quintana Morales, Antonio Ravelo García & Iván Guerra Moreno
University of Zaragoza, Spain
Alfonso Ortega
Dep. of Electronics, Telecommunications and Informatics Engineering, University of Aveiro, Portugal
António Teixeira
ATVS Biometric Recognition Group, Universidad Autónoma de Madrid, Spain
Doroteo T. Toledano

About this paper

Cite this paper

Matos, M., Abad, A., Astudillo, R., Trancoso, I. (2014). Recognition of Distant Voice Commands for Home Applications in Portuguese. In: Navarro Mesa, J.L., et al. Advances in Speech and Language Technologies for Iberian Languages. Lecture Notes in Computer Science, vol 8854. Springer, Cham. https://doi.org/10.1007/978-3-319-13623-3_19

DOI: https://doi.org/10.1007/978-3-319-13623-3_19
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-13622-6
Online ISBN: 978-3-319-13623-3
eBook Packages: Computer Science, Computer Science (R0)
