
Compensating for Limitations in Speech-Based Natural Language Processing with Multimodal Interfaces in UAV Operation

  • Conference paper
  • First Online:
Advances in Human Factors in Robots and Unmanned Systems (AHFE 2017)

Part of the book series: Advances in Intelligent Systems and Computing ((AISC,volume 595))


Abstract

Natural language interfaces are becoming increasingly ubiquitous. By allowing for more natural communication, reducing the complexity of interacting with machines, and enabling non-expert users, these interfaces have found homes in numerous common products. However, natural language interfaces still have considerable room for growth before they fully reflect human speech patterns. Intuitive speech communication is often accompanied by gestural information that most speech interfaces currently lack. Excluding gestural data reduces a machine’s ability to interpret deictic information and to understand some semantic intent. To allow truly intuitive communication between humans and machines, a natural language interface must understand not only speech but also gesture. This paper outlines the limitations and restrictions of some of the most popular speech-only natural language processing algorithms and systems in use today, with a focus on extra-linguistic aspects of communication, including gestural information. Current research trends designed to compensate for these gaps by incorporating extra-linguistic information are then presented, and the success of each trend is evaluated, along with the promise of continued research. Additionally, a model multimodal interface that incorporates language and gesture data is presented to demonstrate the effectiveness of such an interface. The gestural portion of this interface compensates for some of the limitations of speech-only natural language interfaces; combining the two modalities thereby reduces the limitations of natural language interfaces and increases their success. This paper discusses how the two interfaces work together and specifies how the speech interface’s limitations are addressed through the inclusion of a gestural system.
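To make the deictic-resolution problem concrete, the following is a minimal illustrative sketch, not the authors' implementation: it shows how a speech command containing a deictic word ("fly there") can only be grounded by fusing it with a pointing gesture. All names, types, and coordinates here are hypothetical.

```python
# Hypothetical sketch: fusing a speech command with gesture data so that
# deictic targets ("there", "that") can be resolved to concrete locations.
# Not drawn from the paper's system; all identifiers are illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class SpeechCommand:
    verb: str    # e.g. "fly"
    target: str  # e.g. "there" -- deictic, unresolvable from speech alone

@dataclass
class Gesture:
    kind: str        # e.g. "point"
    location: tuple  # ground coordinates indicated by the gesture

DEICTIC_WORDS = {"there", "here", "that", "this"}

def fuse(speech: SpeechCommand, gesture: Optional[Gesture]) -> dict:
    """Resolve a deictic target in the speech command using gesture data."""
    if speech.target in DEICTIC_WORDS:
        # Speech alone cannot ground the target; a pointing gesture must.
        if gesture is None or gesture.kind != "point":
            raise ValueError(f"deictic target {speech.target!r} needs a pointing gesture")
        return {"action": speech.verb, "goal": gesture.location}
    # Non-deictic targets (e.g. a named waypoint) need no gesture.
    return {"action": speech.verb, "goal": speech.target}

cmd = fuse(SpeechCommand("fly", "there"), Gesture("point", (12.5, -3.0)))
print(cmd)  # {'action': 'fly', 'goal': (12.5, -3.0)}
```

A speech-only system would have to reject or guess at "fly there"; the fused command carries the coordinates the gesture supplied, which is the gap the multimodal interface is designed to close.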



Author information


Correspondence to Erica L. Meszaros.



Copyright information

© 2018 Springer International Publishing AG

About this paper

Cite this paper

Meszaros, E.L., Chandarana, M., Trujillo, A., Allen, B.D. (2018). Compensating for Limitations in Speech-Based Natural Language Processing with Multimodal Interfaces in UAV Operation. In: Chen, J. (eds) Advances in Human Factors in Robots and Unmanned Systems. AHFE 2017. Advances in Intelligent Systems and Computing, vol 595. Springer, Cham. https://doi.org/10.1007/978-3-319-60384-1_18


  • DOI: https://doi.org/10.1007/978-3-319-60384-1_18

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-60383-4

  • Online ISBN: 978-3-319-60384-1

  • eBook Packages: Engineering (R0)
