Abstract
Recent advances in software integration and efforts toward more personalization and context awareness have brought closer the long-standing vision of the ubiquitous intelligent personal assistant. This has become particularly salient in the context of smartphones and electronic tablets, where natural language interaction has the potential to considerably enhance the mobile experience. Far beyond merely offering more options in terms of user interface, this trend may well usher in a genuine paradigm shift in man-machine communication. This contribution reviews the two major semantic interpretation frameworks underpinning natural language interaction, along with their respective advantages and drawbacks. It then discusses the choices made in Siri, Apple's personal assistant on the iOS platform, and speculates on how the current implementation might evolve in the near future to best mitigate any downside.
Copyright information
© 2014 Springer Science+Business Media New York
About this paper
Cite this paper
Bellegarda, J.R. (2014). Spoken Language Understanding for Natural Interaction: The Siri Experience. In: Mariani, J., Rosset, S., Garnier-Rizet, M., Devillers, L. (eds) Natural Interaction with Robots, Knowbots and Smartphones. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8280-2_1
DOI: https://doi.org/10.1007/978-1-4614-8280-2_1
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8279-6
Online ISBN: 978-1-4614-8280-2
eBook Packages: Engineering, Engineering (R0)