Abstract
We tackle the novel problem of predicting when a user is likely to begin speaking to a humanoid robot. Human speakers usually take the state of their addressee into consideration when choosing when to begin speaking, and our idea is to exploit this convention in a system that interprets audio input. The proposed method makes this prediction with machine learning, using the robot's behaviors (such as its posture, motion, and utterances) as input features. We create a data set manually annotated by three human participants, who indicate in real time whether or not they would be likely to begin speaking to the robot. We collect the parts to which all three annotators give the same label and use these parts as the training and evaluation data for machine learning. In an experimental evaluation, our model correctly predicted 88.5% of these commonly labeled parts in an open test. This result is similar to the results of a cross-validation, which suggests that our model does not depend on a specific training data set. A possible application of the model is the elimination of environmental noises that occur at times when a cooperative user is not likely to begin speaking to the robot.
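The agreement-based data selection described in the abstract can be illustrated with a minimal sketch. It assumes, purely for illustration, that each annotator's real-time judgment has been discretized into one binary label per frame (1 = likely to begin speaking, 0 = not likely). The function names `unanimous_labels` and `accuracy_on_agreed` and the toy label sequences are hypothetical and do not come from the chapter. The chapter's actual features, classifier, and segmentation are not reproduced here.

```python
from typing import List, Optional, Sequence


def unanimous_labels(annotations: Sequence[Sequence[int]]) -> List[Optional[int]]:
    """Return the shared label for each frame where all annotators agree, else None.

    `annotations` holds one label sequence per annotator (hypothetical format:
    one binary label per frame).
    """
    n_frames = len(annotations[0])
    if any(len(labels) != n_frames for labels in annotations):
        raise ValueError("all annotators must label the same number of frames")
    return [
        frame[0] if len(set(frame)) == 1 else None
        for frame in zip(*annotations)
    ]


def accuracy_on_agreed(predictions: Sequence[int],
                       agreed: Sequence[Optional[int]]) -> float:
    """Fraction of correct predictions, counted only on unanimously labeled frames."""
    pairs = [(p, y) for p, y in zip(predictions, agreed) if y is not None]
    if not pairs:
        raise ValueError("no frames with unanimous labels")
    return sum(p == y for p, y in pairs) / len(pairs)


# Toy data (invented for illustration): three annotators, five frames.
annotator_a = [1, 1, 0, 0, 1]
annotator_b = [1, 0, 0, 0, 1]
annotator_c = [1, 1, 0, 1, 1]

agreed = unanimous_labels([annotator_a, annotator_b, annotator_c])
model_predictions = [1, 1, 0, 0, 0]  # stand-in for a classifier's output
score = accuracy_on_agreed(model_predictions, agreed)
```

Frames on which the annotators disagree are excluded from both training and scoring in this sketch, which mirrors the abstract's statement that only the commonly labeled parts are used.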
Copyright information
© 2014 Springer Science+Business Media New York
About this paper
Cite this paper
Sugiyama, T., Komatani, K., Sato, S. (2014). Predicting When People Will Speak to a Humanoid Robot. In: Mariani, J., Rosset, S., Garnier-Rizet, M., Devillers, L. (eds) Natural Interaction with Robots, Knowbots and Smartphones. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8280-2_17
DOI: https://doi.org/10.1007/978-1-4614-8280-2_17
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8279-6
Online ISBN: 978-1-4614-8280-2
eBook Packages: Engineering, Engineering (R0)