Evaluating the Utility of Auditory Perspective-Taking in Robot Speech Presentations

  • Conference paper
Auditory Display (CMMR 2009, ICAD 2009)

Abstract

In speech interactions, people routinely reason about each other’s auditory perspective and change their manner of speaking accordingly, adjusting their voice to overcome noise or distance, or pausing for especially loud sounds and resuming when conditions are more favorable for the listener. In this paper we report the findings of a listening study motivated both by this observation and by a prototype auditory interface for a mobile robot that monitors the aural parameters of its environment and infers its user’s listening requirements. The results provide significant empirical evidence of the utility of simulated auditory perspective-taking and of the inferred use of loudness and/or pauses to overcome the potential of ambient noise to mask synthetic speech.

Copyright information

© 2010 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brock, D., McClimens, B., Wasylyshyn, C., Trafton, J.G., McCurry, M. (2010). Evaluating the Utility of Auditory Perspective-Taking in Robot Speech Presentations. In: Ystad, S., Aramaki, M., Kronland-Martinet, R., Jensen, K. (eds) Auditory Display. CMMR 2009, ICAD 2009. Lecture Notes in Computer Science, vol 5954. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-12439-6_14

  • DOI: https://doi.org/10.1007/978-3-642-12439-6_14

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-12438-9

  • Online ISBN: 978-3-642-12439-6

  • eBook Packages: Computer Science, Computer Science (R0)
