Spoken Language Understanding for Natural Interaction: The Siri Experience

  • Conference paper
Natural Interaction with Robots, Knowbots and Smartphones

Abstract

Recent advances in software integration and efforts toward more personalization and context awareness have brought closer the long-standing vision of the ubiquitous intelligent personal assistant. This has become particularly salient in the context of smartphones and electronic tablets, where natural language interaction has the potential to considerably enhance mobile experience. Far beyond merely offering more options in terms of user interface, this trend may well usher in a genuine paradigm shift in man-machine communication. This contribution reviews the two major semantic interpretation frameworks underpinning natural language interaction, along with their respective advantages and drawbacks. It then discusses the choices made in Siri, Apple's personal assistant on the iOS platform, and speculates on how the current implementation might evolve in the near future to best mitigate any downside.

Author information

Corresponding author

Correspondence to Jerome R. Bellegarda.

Copyright information

© 2014 Springer Science+Business Media New York

About this paper

Cite this paper

Bellegarda, J.R. (2014). Spoken Language Understanding for Natural Interaction: The Siri Experience. In: Mariani, J., Rosset, S., Garnier-Rizet, M., Devillers, L. (eds) Natural Interaction with Robots, Knowbots and Smartphones. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8280-2_1

  • DOI: https://doi.org/10.1007/978-1-4614-8280-2_1

  • Publisher Name: Springer, New York, NY

  • Print ISBN: 978-1-4614-8279-6

  • Online ISBN: 978-1-4614-8280-2

  • eBook Packages: Engineering, Engineering (R0)
