Abstract
Natural language interfaces are becoming increasingly ubiquitous. By allowing for more natural communication, reducing the complexity of interacting with machines, and enabling non-expert users, these interfaces have found homes in numerous common products. However, they still have considerable room to develop before they accurately reflect human speech patterns. Intuitive spoken communication is often accompanied by gestural information that most speech interfaces currently lack. Excluding gestural data reduces a machine’s ability to interpret deictic information and understand some semantic intent. To allow for truly intuitive communication between humans and machines, a natural language interface must understand not only speech but also gestural data. This paper outlines the limitations of some of the most common speech-only natural language processing algorithms and systems in use today, with a focus on extra-linguistic aspects of communication, including gestural information. It then presents current research trends designed to compensate for these gaps by incorporating extra-linguistic information, and evaluates both the success of each trend and the promise of continued research. Additionally, a model multimodal interface that incorporates language and gesture data is presented in order to demonstrate the effectiveness of such an interface. The gestural portion of this interface is included to compensate for some of the limitations of speech-only natural language interfaces; combining the two modalities reduces those limitations and increases the interface’s success. This presentation discusses how the two interfaces work together and specifies how the limitations of the speech interface are addressed through the inclusion of a gestural system.
Copyright information
© 2018 Springer International Publishing AG
About this paper
Cite this paper
Meszaros, E.L., Chandarana, M., Trujillo, A., Allen, B.D. (2018). Compensating for Limitations in Speech-Based Natural Language Processing with Multimodal Interfaces in UAV Operation. In: Chen, J. (eds) Advances in Human Factors in Robots and Unmanned Systems. AHFE 2017. Advances in Intelligent Systems and Computing, vol 595. Springer, Cham. https://doi.org/10.1007/978-3-319-60384-1_18
DOI: https://doi.org/10.1007/978-3-319-60384-1_18
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-60383-4
Online ISBN: 978-3-319-60384-1
eBook Packages: Engineering (R0)