Spoken Commands in a Smart Home: An Iterative Approach to the Sphinx Algorithm

  • Conference paper
MICAI 2007: Advances in Artificial Intelligence (MICAI 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4827)

Abstract

We present an algorithm that decodes commands spoken in an intelligent environment through iterative vocabulary reduction. Current research in speech recognition focuses primarily on optimizing algorithms for single-pass decoding with large vocabularies. While this is ideal for processing conversational speech, alternative methods should be explored for other domains of speech, specifically commands issued verbally in an intelligent environment. Such commands have both an explicitly defined structure and a vocabulary limited to valid task descriptions. We propose that a multiple-pass, context-driven decoding scheme that uses dictionary pruning yields improved accuracy when commands are structured and the vocabulary is reduced. Each iteration incorporates the hypothesis of the previous iteration into its decoding scheme by removing unlikely words from the current language model. We have applied this decoding method to a comprehensive set of spoken commands using Sphinx-4, an Automatic Speech Recognition (ASR) engine based on Hidden Markov Models (HMMs). When decoding with an HMM, multiple previous states are used to determine the current state, so that context aids recognition. Our results show that, within a fixed domain, multiple-pass decoding yields improved recognition accuracy. Further research must be conducted to optimize practical context-driven decoding and to apply the method to larger domains, primarily those of intelligent environments.
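The pruning loop described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation and does not use Sphinx-4. The per-slot score tables, the function names `decode_pass` and `iterative_decode`, and the `keep` parameter are all assumptions introduced for illustration. Each pass ranks the active words for every slot of a three-word command, and the next pass keeps only the best-ranked candidates.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical per-slot scores standing in for a recognizer's output.
# Each dict maps a candidate word to an assumed (made-up) likelihood.
SLOT_SCORES: List[Dict[str, float]] = [
    {"turn": 0.5, "open": 0.3, "close": 0.2},
    {"on": 0.4, "off": 0.35, "the": 0.25},
    {"light": 0.6, "door": 0.3, "fan": 0.1},
]


def decode_pass(slot_scores: List[Dict[str, float]],
                vocab: Set[str]) -> List[List[Tuple[str, float]]]:
    """Rank the active words of each slot, best first.

    Scores are renormalised over the words still in `vocab`.
    """
    ranked = []
    for scores in slot_scores:
        active = {w: s for w, s in scores.items() if w in vocab}
        if not active:
            raise ValueError("a slot has no active words left")
        total = sum(active.values())
        ranked.append(sorted(((w, s / total) for w, s in active.items()),
                             key=lambda pair: pair[1], reverse=True))
    return ranked


def iterative_decode(slot_scores: List[Dict[str, float]],
                     vocab: Optional[Set[str]] = None,
                     keep: int = 2,
                     max_passes: int = 5):
    """Decode repeatedly, pruning unlikely words between passes.

    Returns the final hypothesis and a history of
    (hypothesis, vocabulary size) pairs, one per pass.
    """
    if vocab is None:
        vocab = {w for scores in slot_scores for w in scores}
    vocab = set(vocab)
    history = []
    hypothesis: List[str] = []
    for _ in range(max_passes):
        ranked = decode_pass(slot_scores, vocab)
        hypothesis = [slot[0][0] for slot in ranked]
        history.append((hypothesis, len(vocab)))
        # Keep only the `keep` best candidates per slot for the next pass.
        pruned = {w for slot in ranked for w, _ in slot[:keep]}
        if pruned == vocab:  # nothing left to prune: stop iterating
            break
        vocab = pruned
    return hypothesis, history


hypothesis, history = iterative_decode(SLOT_SCORES, keep=2)
print(hypothesis)
print([size for _, size in history])
```

Because the toy scores are fixed, the top hypothesis does not change between passes here. The sketch only shows the control flow of decoding, pruning, and re-decoding. In a real recognizer, re-decoding against a smaller dictionary and language model changes the search itself, which is where any accuracy gain would come from.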

Editor information

Alexander Gelbukh, Ángel Fernando Kuri Morales

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Denkowski, M., Hannon, C., Sanchez, A. (2007). Spoken Commands in a Smart Home: An Iterative Approach to the Sphinx Algorithm. In: Gelbukh, A., Kuri Morales, Á.F. (eds) MICAI 2007: Advances in Artificial Intelligence. MICAI 2007. Lecture Notes in Computer Science, vol 4827. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-76631-5_98

  • DOI: https://doi.org/10.1007/978-3-540-76631-5_98

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-76630-8

  • Online ISBN: 978-3-540-76631-5

  • eBook Packages: Computer Science, Computer Science (R0)
