Spoken Language Understanding for Natural Interaction: The Siri Experience

Bellegarda, Jerome R.

doi:10.1007/978-1-4614-8280-2_1

Abstract

Recent advances in software integration and efforts toward more personalization and context awareness have brought closer the long-standing vision of the ubiquitous intelligent personal assistant. This has become particularly salient in the context of smartphones and electronic tablets, where natural language interaction has the potential to considerably enhance mobile experience. Far beyond merely offering more options in terms of user interface, this trend may well usher in a genuine paradigm shift in man-machine communication. This contribution reviews the two major semantic interpretation frameworks underpinning natural language interaction, along with their respective advantages and drawbacks. It then discusses the choices made in Siri, Apple’s personal assistant on the iOS platform, and speculates on how the current implementation might evolve in the near future to best mitigate any downside.

Author information

Authors and Affiliations

Apple Inc., One Infinite Loop, Cupertino, CA, 95014, USA
Jerome R. Bellegarda

Corresponding author

Correspondence to Jerome R. Bellegarda.

Editor information

Editors and Affiliations

IMMI-CNRS, Orsay, France
Joseph Mariani
LIMSI-CNRS, Orsay, France
Sophie Rosset
IMMI-CNRS, Orsay, France
Martine Garnier-Rizet
LIMSI-CNRS, Orsay, France
Laurence Devillers

About this paper

Cite this paper

Bellegarda, J.R. (2014). Spoken Language Understanding for Natural Interaction: The Siri Experience. In: Mariani, J., Rosset, S., Garnier-Rizet, M., Devillers, L. (eds) Natural Interaction with Robots, Knowbots and Smartphones. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8280-2_1

DOI: https://doi.org/10.1007/978-1-4614-8280-2_1
Published: 28 August 2013
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8279-6
Online ISBN: 978-1-4614-8280-2
eBook Packages: Engineering, Engineering (R0)
