Part of the book series: Modern Acoustics and Signal Processing (MASP)

Abstract

Audition is often described by physiologists as the most important sense in humans, owing to its essential role in communication and socialization. Quite surprisingly, however, interest in this modality for robotics arose only in the 2000s, brought to the fore by cognitive robotics and human–robot interaction. Since then, numerous contributions have been made to the field of robot audition, ranging from sound localization to scene analysis. Binaural approaches were investigated first, but were then largely abandoned because of mixed results. Nevertheless, recent years have witnessed a renewed interest in binaural active audition, that is, in the opportunities and challenges opened up by the coupling of binaural sensing and robot motion. This chapter proposes a comprehensive state of the art of binaural approaches to robot audition. Though the literature on binaural audition and, more generally, on acoustics and signal processing is a fundamental source of knowledge, the tasks, constraints, and environments of robotics raise issues of their own. These are reviewed first, followed by the most prominent contributions, platforms, and projects. Two lines of research in binaural active audition, conducted by the present authors, are then outlined, one of which is tightly connected to the psychology of perception.
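
For concreteness, here is a minimal, hedged sketch (in Python with NumPy) of the kind of computation a binaural localization front end performs: the azimuth of a broadband source is estimated from a two-microphone signal with PHAT-weighted generalized cross-correlation time-delay estimation (cf. [38]) and a far-field, free-field model. The sampling rate, microphone spacing, and all function names are illustrative assumptions, not the method developed in the chapter.

    # Illustrative sketch only: GCC-PHAT time-delay estimation mapped to azimuth.
    # Assumptions (not from the chapter): free field, far field, 0.18 m mic spacing.
    import numpy as np

    def gcc_phat_tdoa(x, y, fs, max_tau):
        """Return t_x - t_y (seconds), the delay of channel x relative to y,
        estimated by PHAT-weighted generalized cross-correlation."""
        n = len(x) + len(y)
        X, Y = np.fft.rfft(x, n=n), np.fft.rfft(y, n=n)
        cross = X * np.conj(Y)
        cross /= np.abs(cross) + 1e-12          # PHAT weighting: keep phase only
        cc = np.fft.irfft(cross, n=n)
        k = int(fs * max_tau)                   # physically plausible lag range
        cc = np.concatenate((cc[-k:], cc[:k + 1]))
        return (np.argmax(np.abs(cc)) - k) / fs

    def azimuth_from_tdoa(tau, d=0.18, c=343.0):
        """Far-field model tau = d*sin(theta)/c; azimuth positive to the right."""
        return np.degrees(np.arcsin(np.clip(tau * c / d, -1.0, 1.0)))

    # Synthetic check: a source 30 degrees to the right delays the left channel.
    fs, d, c = 16000, 0.18, 343.0
    sig = np.random.randn(fs)                   # 1 s of broadband noise
    lag = int(round(d * np.sin(np.radians(30.0)) / c * fs))
    left, right = np.roll(sig, lag), sig
    tau = gcc_phat_tdoa(left, right, fs, max_tau=d / c)
    print(azimuth_from_tdoa(tau))               # ~28.5 deg (30 deg, up to sample quantization)

On a robot, such an estimator would typically run per time frame; in the spirit of active audition as discussed in the chapter, head rotations can then be exploited to resolve the front-back ambiguity inherent in a single interaural delay.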

Notes

  1. Single-sensor approaches exist, such as [76, 83], but are rarely addressed in the literature.

  2. http://www.ai.mit.edu/projects/humanoid-robotics-group/cog/

  3. http://winnie.kuis.kyoto-u.ac.jp/SIG/

  4. HRI-JP audition for robots with Kyoto University, http://winnie.kuis.kyoto-u.ac.jp/HARK/. As in Ariel's Song in Shakespeare's The Tempest, "hark" is an archaic English word for "listen".

  5. http://www.icub.org/projects.php

  6. http://perception.inrialpes.fr/POP/

  7. http://humavips.inrialpes.fr/

  8. http://perception.inrialpes.fr/~Deleforge/CAMIL_Dataset/index.html

  9. http://ravel.humavips.eu

  10. http://projects.laas.fr/BINAAHR

References

  1. J. Aloimonos, I. Weiss, and A. Bandyopadhyay. Active vision. Intl. J. Computer Vision, 1:333–356, 1988.

  2. S. Argentieri and P. Danès. Broadband variations of the MUSIC high-resolution method for sound source localization in robotics. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2007, pages 2009–2014, 2007.

  3. E. Arnaud, H. Christensen, Y.-C. Lu, J. Barker, V. Khalidov, M. Hansard, B. Holveck, H. Mathieu, R. Narasimha, E. Taillant, F. Forbes, and R. Horaud. The CAVA corpus: Synchronised stereoscopic and binaural datasets with head movements. In ACM/IEEE Intl. Conf. Multimodal Interfaces, ICMI'08, 2008.

  4. M. Aytekin, C. Moss, and J. Simon. A sensorimotor approach to sound localization. Neural Computation, 20:603–635, 2008.

  5. P. Azad, T. Gockel, and R. Dillmann. Computer Vision: Principles and Practice. Elektor Electronics, 2008.

  6. R. Bajcsy. Active perception. Proc. of the IEEE, 76:966–1005, 1988.

  7. Y. Bar-Shalom and X. Li. Estimation and Tracking: Principles, Techniques and Software. Artech House, 1993.

  8. M. Bernard, S. N'Guyen, P. Pirim, B. Gas, and J.-A. Meyer. Phonotaxis behavior in the artificial rat Psikharpax. In Intl. Symp. Robotics and Intelligent Sensors, IRIS'2010, pages 118–122, Nagoya, Japan, 2010.

  9. M. Bernard, P. Pirim, A. de Cheveigné, and B. Gas. Sensorimotor learning of sound localization from an auditory evoked behavior. In IEEE Intl. Conf. Robotics and Automation, ICRA'2012, pages 91–96, St. Paul, MN, 2012.

  10. J. Blauert, D. Kolossa, K. Obermayer, and K. Adiloglu. Further challenges and the road ahead. In J. Blauert, editor, The Technology of Binaural Listening, chapter 18. Springer, Berlin-Heidelberg-New York NY, 2013.

  11. W. Brimijoin, D. McShefferty, and M. Akeroyd. Undirected head movements of listeners with asymmetrical hearing impairment during a speech-in-noise task. Hearing Research, 283:162–168, 2012.

  12. R. Brooks, C. Breazeal, N. Marjanović, B. Scassellati, and M. Williamson. The Cog project: Building a humanoid robot. In C. Nehaniv, editor, Computations for Metaphors, Analogy, and Agents, volume 1562 of LNCS, pages 52–87. Springer, 1999.

  13. Y. Chen and Y. Rui. Real-time speaker tracking using particle filter sensor fusion. Proc. of the IEEE, 92:485–494, 2004.

  14. H. Christensen and J. Barker. Using location cues to track speaker changes from mobile binaural microphones. In Interspeech'2009, Brighton, UK, 2009.

  15. H. Christensen, J. Barker, Y.-C. Lu, J. Xavier, R. Caseiro, and H. Araújo. POPeye: Real-time binaural sound-source localisation on an audio-visual robot head. In Conf. Natural Computing and Intelligent Robotics, 2009.

  16. Computing Community Consortium. A Roadmap for US Robotics: From Internet to Robotics, 2009. http://www.us-robotics.us/reports/CCC%20Report.pdf.

  17. M. Cooke, Y. Lu, Y. Lu, and R. Horaud. Active hearing, active speaking. In Intl. Symp. Auditory and Audiological Res., 2007.

  18. M. Cooke, A. Morris, and P. Green. Recognizing occluded speech. In ESCA Tutorial and Research Worksh. Auditory Basis of Speech Perception, pages 297–300, Keele University, United Kingdom, 1996.

  19. M. Cooke, A. Morris, and P. Green. Missing data techniques for robust speech recognition. In Intl. Conf. Acoustics, Speech, and Signal Processing, ICASSP'1997, pages 863–866, Munich, Germany, 1997.

  20. B. Cornelis, M. Moonen, and J. Wouters. Binaural voice activity detection for MWF-based noise reduction in binaural hearing aids. In European Signal Processing Conf., EUSIPCO'2011, Barcelona, Spain, 2011.

  21. P. Danès and J. Bonnal. Information-theoretic detection of broadband sources in a coherent beamspace MUSIC scheme. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2010, pages 1976–1981, Taipei, Taiwan, 2010.

  22. A. Deleforge and R. Horaud. Learning the direction of a sound source using head motions and spectral features. Technical Report 7529, INRIA, 2011.

  23. A. Deleforge and R. Horaud. The Cocktail-Party robot: Sound source separation and localisation with an active binaural head. In IEEE/ACM Intl. Conf. Human Robot Interaction, HRI'2012, Boston, MA, 2012.

  24. J. Gibson. The Ecological Approach to Visual Perception. Erlbaum, 1982.

  25. M. Giuliani, C. Lenz, T. Müller, M. Rickert, and A. Knoll. Design principles for safety in human-robot interaction. Intl. J. Social Robotics, 2:253–274, 2010.

  26. A. Handzel, S. Andersson, M. Gebremichael, and P. Krishnaprasad. A biomimetic apparatus for sound-source localization. In IEEE Conf. Decision and Control, CDC'2003, volume 6, pages 5879–5884, Maui, HI, 2003.

  27. A. Handzel and P. Krishnaprasad. Biomimetic sound-source localization. IEEE Sensors J., 2:607–616, 2002.

  28. S. Hashimoto, S. Narita, H. Kasahara, A. Takanishi, S. Sugano, K. Shirai, T. Kobayashi, H. Takanobu, T. Kurata, K. Fujiwara, T. Matsuno, T. Kawasaki, and K. Hoashi. Humanoid robot - development of an information assistant robot, Hadaly. In IEEE Intl. Worksh. Robot and Human Communication, RO-MAN'1997, pages 106–111, 1997.

  29. J. Hörnstein, M. Lopes, J. Santos-Victor, and F. Lacerda. Sound localization for humanoid robots - building audio-motor maps based on the HRTF. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2006, pages 1170–1176, Beijing, China, 2006.

  30. J. Huang, T. Supaongprapa, I. Terakura, F. Wang, N. Ohnishi, and N. Sugie. A model-based sound localization system and its application to robot navigation. Robotics and Autonomous Syst., 27:199–209, 1999.

  31. G. Ince, K. Nakadai, T. Rodemann, Y. Hasegawa, H. Tsujino, and J. Imura. Ego noise suppression of a robot using template subtraction. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2009, pages 199–204, Saint Louis, MO, 2009.

  32. G. Ince, K. Nakadai, T. Rodemann, J. Imura, K. Nakamura, and H. Nakajima. Incremental learning for ego noise estimation of a robot. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2011, pages 131–136, San Francisco, CA, 2011.

  33. G. Ince, K. Nakadai, T. Rodemann, H. Tsujino, and J. Imura. Multi-talker speech recognition under ego-motion noise using missing feature theory. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2010, pages 982–987, Taipei, Taiwan, 2010.

  34. R. Irie. Multimodal sensory integration for localization in a humanoid robot. In IJCAI Worksh. Computational Auditory Scene Analysis, pages 54–58, Nagoya, Aichi, Japan, 1997.

  35. A. Ito, T. Kanayama, M. Suzuki, and S. Makino. Internal noise suppression for speech recognition by small robots. In Interspeech'2005, pages 2685–2688, Lisbon, Portugal, 2005.

  36. M. Ji, S. Kim, H. Kim, K. Kwak, and Y. Cho. Reliable speaker identification using multiple microphones in ubiquitous robot companion environment. In IEEE Intl. Conf. Robot & Human Interactive Communication, RO-MAN'2007, pages 673–677, Jeju Island, Korea, 2007.

  37. H.-D. Kim, J. Kim, K. Komatani, T. Ogata, and H. Okuno. Target speech detection and separation for humanoid robots in sparse dialogue with noisy home environments. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2008, pages 1705–1711, Nice, France, 2008.

  38. C. Knapp and G. Carter. The generalized correlation method for estimation of time delay. IEEE Trans. Acoustics, Speech and Signal Processing, 24:320–327, 1976.

  39. C. Knapp and G. Carter. Time delay estimation in the presence of relative motion. In IEEE Intl. Conf. Acoustics, Speech, and Signal Processing, ICASSP'1977, pages 280–283, Storrs, CT, 1977.

  40. Y. Kubota, M. Yoshida, K. Komatani, T. Ogata, and H. Okuno. Design and implementation of a 3D auditory scene visualizer: Towards auditory awareness with face tracking. In IEEE Intl. Symp. Multimedia, ISM'2008, pages 468–476, Berkeley, CA, 2008.

  41. M. Kumon and Y. Noda. Active soft pinnae for robots. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2011, pages 112–117, San Francisco, CA, 2011.

  42. M. Kumon, R. Shimoda, and Z. Iwai. Audio servo for robotic systems with pinnae. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2005, pages 885–890, Edmonton, Canada, 2005.

  43. S. Kurotaki, N. Suzuki, K. Nakadai, H. Okuno, and H. Amano. Implementation of active direction-pass filter on dynamically reconfigurable processor. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2005, pages 3175–3180, Edmonton, Canada, 2005.

  44. Q. Lin, E. E. Jan, and J. Flanagan. Microphone arrays and speaker identification. IEEE Trans. Speech and Audio Processing, 2:622–629, 1994.

  45. R. Lippmann and B. A. Carlson. Using missing feature theory to actively select features for robust speech recognition with interruptions, filtering, and noise. In Eurospeech'1997, pages 863–866, Rhodes, Greece, 1997.

  46. Y.-C. Lu and M. Cooke. Motion strategies for binaural localisation of speech sources in azimuth and distance by artificial listeners. Speech Comm., 53:622–642, 2011.

  47. V. Lunati, J. Manhès, and P. Danès. A versatile system-on-a-programmable-chip for array processing and binaural robot audition. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2012, pages 998–1003, Vilamoura, Portugal, 2012.

  48. D. Marr. Vision: A Computational Investigation into the Human Representation and Processing of Visual Information. W. H. Freeman, 1982.

  49. E. Martinson and B. Fransen. Dynamically reconfigurable microphone arrays. In IEEE Intl. Conf. Robotics and Automation, ICRA'2011, pages 5636–5641, Shanghai, China, 2011.

  50. Y. Matsusaka, T. Tojo, S. Kubota, K. Furukawa, D. Tamiya, K. Hayata, Y. Nakano, and T. Kobayashi. Multi-person conversation via multi-modal interface - a robot who communicate with multi-user -. In Eurospeech'1999, pages 1723–1726, Budapest, Hungary, 1999.

  51. T. May, S. van de Par, and A. Kohlrausch. Binaural localization and detection of speakers in complex acoustic scenes. In J. Blauert, editor, The Technology of Binaural Listening, chapter 15. Springer, Berlin-Heidelberg-New York NY, 2013.

  52. F. Michaud, C. Côté, D. Létourneau, Y. Brosseau, J.-M. Valin, E. Beaudry, C. Raïevsky, A. Ponchon, P. Moisan, P. Lepage, Y. Morin, F. Gagnon, P. Giguère, M.-A. Roux, S. Caron, P. Frenette, and F. Kabanza. Spartacus attending the 2005 AAAI conference. Autonomous Robots, 22:369–383, 2007.

  53. K. Nakadai, T. Lourens, H. Okuno, and H. Kitano. Active audition for humanoids. In Nat. Conf. Artificial Intelligence, AAAI-2000, pages 832–839, Austin, TX, 2000.

  54. K. Nakadai, T. Matsui, H. Okuno, and H. Kitano. Active audition system and humanoid exterior design. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2000, pages 1453–1461, Takamatsu, Japan, 2000.

  55. K. Nakadai, D. Matsuura, H. Okuno, and H. Kitano. Applying scattering theory to robot audition system: Robust sound source localization and extraction. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2003, pages 1147–1152, Las Vegas, NV, 2003.

  56. K. Nakadai, H. Okuno, and H. Kitano. Epipolar geometry based sound localization and extraction for humanoid audition. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2001, volume 3, pages 1395–1401, Maui, HI, 2001.

  57. K. Nakadai, H. Okuno, and H. Kitano. Auditory fovea based speech separation and its application to dialog system. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2002, volume 2, pages 1320–1325, Lausanne, Switzerland, 2002.

  58. K. Nakadai, H. Okuno, and H. Kitano. Robot recognizes three simultaneous speech by active audition. In IEEE Intl. Conf. Robotics and Automation, ICRA'2003, volume 1, pages 398–405, Taipei, Taiwan, 2003.

  59. H. Nakajima, K. Kikuchi, T. Daigo, Y. Kaneda, K. Nakadai, and Y. Hasegawa. Real-time sound source orientation estimation using a 96-channel microphone array. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2009, pages 676–683, Saint Louis, MO, 2009.

  60. H. Nakashima and T. Mukai. 3D sound source localization system based on learning of binaural hearing. In IEEE Intl. Conf. Systems, Man and Cybernetics, SMC'2005, pages 3534–3539, Nagoya, Japan, 2005.

  61. E. Nemer, R. Goubran, and S. Mahmoud. Robust voice activity detection using higher-order statistics in the LPC residual domain. IEEE Trans. Speech and Audio Processing, 9:217–231, 2001.

  62. Y. Nishimura, M. Nakano, K. Nakadai, H. Tsujino, and M. Ishizuka. Speech recognition for a robot under its motor noises by selective application of missing feature theory and MLLR. In ISCA Tutorial and Research Worksh. Statistical and Perceptual Audition, Pittsburgh, PA, 2006.

  63. H. Okuno, T. Ogata, K. Komatani, and K. Nakadai. Computational auditory scene analysis and its application to robot audition. In IEEE Intl. Conf. Informatics Res. for Development of Knowledge Society Infrastructure, ICKS'2004, pages 73–80, 2004.

  64. J. O'Regan. How to build a robot that is conscious and feels. Minds and Machines, pages 117–136, 2012.

  65. J. O'Regan and A. Noë. A sensorimotor account of vision and visual consciousness. Behavioral and Brain Sciences, 24:939–1031, 2001.

  66. D. Philipona and J. K. O'Regan. Is there something out there? Inferring space from sensorimotor dependencies. Neural Computation, 15:2029–2049, 2001.

  67. B. Pierce, T. Kuratate, A. Maejima, S. Morishima, Y. Matsusaka, M. Durkovic, K. Diepold, and G. Cheng. Development of an integrated multi-modal communication robotic face. In IEEE Worksh. Advanced Robotics and its Social Impacts, ARSO'2012, pages 101–102, Munich, Germany, 2012.

  68. H. Poincaré. L'espace et la géométrie. Revue de Métaphysique et de Morale, pages 631–646, 1895.

  69. A. Portello, P. Danès, and S. Argentieri. Active binaural localization of intermittent moving sources in the presence of false measurements. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2012, pages 3294–3299, Vilamoura, Portugal, 2012.

  70. R. Prasad, H. Saruwatari, and K. Shikano. Enhancement of speech signals separated from their convolutive mixture by FDICA algorithm. Digital Signal Processing, 19:127–133, 2009.

  71. L. Rabiner and M. Sambur. An algorithm for determining the endpoints of isolated utterances. The Bell System Techn. J., 54:297–315, 1975.

  72. B. Raj, R. Singh, and R. Stern. Inference of missing spectrographic features for robust speech recognition. In Intl. Conf. Spoken Language Processing, Sydney, Australia, 1998.

  73. B. Raj and R. M. Stern. Missing-feature approaches in speech recognition. IEEE Signal Processing Mag., 22:101–116, 2005.

  74. T. Rodemann. A study on distance estimation in binaural sound localization. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2010, pages 425–430, Taipei, Taiwan, 2010.

  75. D. Rosenthal and H. Okuno, editors. Computational Auditory Scene Analysis. Lawrence Erlbaum Associates, 1997.

  76. A. Saxena and A. Ng. Learning sound location from a single microphone. In IEEE Intl. Conf. Robotics and Automation, ICRA'2009, pages 1737–1742, Kobe, Japan, 2009.

  77. S. Schulz and T. Herfet. Humanoid separation of speech sources in reverberant environments. In Intl. Symp. Communications, Control and Signal Processing, ISCCSP'2008, pages 377–382, Brownsville, TX, 2008.

  78. M. L. Seltzer, B. Raj, and R. Stern. A Bayesian classifier for spectrographic mask estimation for missing feature speech recognition. Speech Comm., 43:379–393, 2004.

  79. A. Skaf and P. Danès. Optimal positioning of a binaural sensor on a humanoid head for sound source localization. In IEEE Intl. Conf. Humanoid Robots, Humanoids'2011, pages 165–170, Bled, Slovenia, 2011.

  80. D. Sodoyer, B. Rivet, L. Girin, C. Savariaux, J.-L. Schwartz, and C. Jutten. A study of lip movements during spontaneous dialog and its application to voice activity detection. J. Acoust. Soc. Am., 125:1184–1196, 2009.

  81. M. Stamm and M. Altinsoy. Employing binaural-proprioceptive interaction in human machine interfaces. In J. Blauert, editor, The Technology of Binaural Listening, chapter 17. Springer, Berlin-Heidelberg-New York NY, 2013.

  82. R. Takeda, S. Yamamoto, K. Komatani, T. Ogata, and H. Okuno. Missing-feature based speech recognition for two simultaneous speech signals separated by ICA with a pair of humanoid ears. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2006, pages 878–885, Beijing, China, 2006.

  83. K. Tanaka, M. Abe, and S. Ando. A novel mechanical cochlea "fishbone" with dual sensor/actuator characteristics. IEEE/ASME Trans. Mechatronics, 3:98–105, 1998.

  84. J. Valin, J. Rouat, and F. Michaud. Enhanced robot audition based on microphone array source separation with post-filter. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2004, pages 2123–2128, Sendai, Japan, 2004.

  85. H. Van Trees. Optimum Array Processing (Detection, Estimation, and Modulation Theory, Part IV). Wiley-Interscience, 2002.

  86. D. Ward, E. Lehmann, and R. Williamson. Particle filtering algorithms for tracking an acoustic source in a reverberant environment. IEEE Trans. Speech and Audio Processing, 11:826–836, 2003.

  87. E. Weinstein and A. Weiss. Fundamental limitations in passive time delay estimation - Part II: Wideband systems. IEEE Trans. Acoustics, Speech and Signal Processing, pages 1064–1078, 1984.

  88. A. Weiss and E. Weinstein. Fundamental limitations in passive time delay estimation - Part I: Narrowband systems. IEEE Trans. Acoustics, Speech and Signal Processing, pages 472–486, 1983.

  89. R. Weiss, M. Mandel, and D. Ellis. Combining localization cues and source model constraints for binaural source separation. Speech Comm., 53:606–621, 2011.

  90. R. Woodworth and H. Schlosberg. Experimental Psychology. Holt, Rinehart and Winston, 3rd edition, 1971.

  91. T. Yoshida and K. Nakadai. Two-layered audio-visual speech recognition for robots in noisy environments. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2010, pages 988–993, 2010.

  92. K. Youssef, S. Argentieri, and J. Zarader. From monaural to binaural speaker recognition for humanoid robots. In IEEE/RAS Intl. Conf. Humanoid Robots, Humanoids'2010, pages 580–586, Nashville, TN, 2010.

  93. K. Youssef, S. Argentieri, and J.-L. Zarader. A binaural sound source localization method using auditive cues and vision. In IEEE Intl. Conf. Acoustics, Speech and Signal Processing, ICASSP'2012, pages 217–220, Kyoto, Japan, 2012.

  94. K. Youssef, S. Argentieri, and J.-L. Zarader. Towards a systematic study of binaural cues. In IEEE/RSJ Intl. Conf. Intelligent Robots and Systems, IROS'2012, pages 1004–1009, Vilamoura, Portugal, 2012.

  95. K. Youssef, B. Breteau, S. Argentieri, J.-L. Zarader, and Z. Wang. Approaches for automatic speaker recognition in a binaural humanoid context. In Eur. Symp. Artificial Neural Networks, Computational Intelligence and Machine Learning, ESANN'2011, pages 411–416, Bruges, Belgium, 2011.

Acknowledgments

This work was conducted within the project Binaural Active Audition for Humanoid Robots (BINAAHR), funded under contract #ANR-09-BLAN-0370-02 by ANR, France, and JST, Japan. The authors would like to thank two anonymous reviewers for their valuable suggestions.

Author information


Correspondence to B. Gas.

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Argentieri, S., Portello, A., Bernard, M., Danès, P., Gas, B. (2013). Binaural Systems in Robotics. In: Blauert, J. (ed.) The Technology of Binaural Listening. Modern Acoustics and Signal Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37762-4_9

  • DOI: https://doi.org/10.1007/978-3-642-37762-4_9

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37761-7

  • Online ISBN: 978-3-642-37762-4

  • eBook Packages: Engineering (R0)
