“Your Word is my Command”: Google Search by Voice: A Case Study

  • Chapter
Advances in Speech Recognition

Abstract

An important goal at Google is to make spoken access ubiquitously available. Achieving ubiquity requires two things: availability (i.e., built into every possible interaction where speech input or output can make sense) and performance (i.e., works so well that the modality adds no friction to the interaction).

This chapter is a case study of the development of Google Search by Voice, a step toward our long-term vision of ubiquitous access. While integrating speech input into Google search is a significant step toward that vision, it has posed many problems in both the performance of core speech technologies and the design of effective user interfaces. Work is ongoing, and the problems are no doubt far from solved. Nonetheless, we have at minimum reached a level of performance at which usage of voice search is growing rapidly and many users do become repeat users.



Author information

Corresponding author

Correspondence to Johan Schalkwyk.

Copyright information

© 2010 Springer Science+Business Media, LLC

About this chapter

Cite this chapter

Schalkwyk, J. et al. (2010). “Your Word is my Command”: Google Search by Voice: A Case Study. In: Neustein, A. (ed.) Advances in Speech Recognition. Springer, Boston, MA. https://doi.org/10.1007/978-1-4419-5951-5_4

  • DOI: https://doi.org/10.1007/978-1-4419-5951-5_4

  • Publisher Name: Springer, Boston, MA

  • Print ISBN: 978-1-4419-5950-8

  • Online ISBN: 978-1-4419-5951-5

  • eBook Packages: Engineering, Engineering (R0)
