Disambiguation in Unknown Object Detection by Integrating Image and Speech Recognition Confidences

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 7724)

Abstract

This paper presents a new method for detecting unknown objects and their unknown names during object manipulation through human-robot dialog. The method integrates information from object images and the user's speech. Its originality lies in using logistic regression to discriminate between unknown and known objects. Unknown object detection accuracy was 97% when there were about fifty known objects.
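
As a rough illustration of the idea in the abstract, the sketch below fits a logistic regression on two inputs, an image recognition confidence and a speech recognition confidence, and outputs the probability that an object is unknown. It is not the authors' implementation: the [0, 1] confidence scale, the synthetic training data, the assumption that low confidences indicate an unknown object, the plain gradient-descent fit, and all function names are hypothetical choices made for this example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(samples, labels, lr=0.5, epochs=2000):
    """Fit [bias, w_image, w_speech] by batch gradient descent.

    samples: list of (image_confidence, speech_confidence) pairs
    labels:  1 for "unknown object", 0 for "known object"
    """
    w = [0.0, 0.0, 0.0]
    n = len(samples)
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for (c_img, c_sp), y in zip(samples, labels):
            p = sigmoid(w[0] + w[1] * c_img + w[2] * c_sp)
            err = p - y  # gradient of the log-loss w.r.t. the linear score
            grad[0] += err
            grad[1] += err * c_img
            grad[2] += err * c_sp
        w = [wi - lr * g / n for wi, g in zip(w, grad)]
    return w


def prob_unknown(w, c_img, c_sp):
    # Probability that the object is unknown, given both confidences.
    return sigmoid(w[0] + w[1] * c_img + w[2] * c_sp)


# Synthetic data (an assumption for this sketch, not data from the paper):
# known objects get high confidences, unknown objects get low ones.
rng = random.Random(0)
samples, labels = [], []
for _ in range(100):
    samples.append((rng.uniform(0.6, 1.0), rng.uniform(0.6, 1.0)))
    labels.append(0)  # known
    samples.append((rng.uniform(0.0, 0.4), rng.uniform(0.0, 0.4)))
    labels.append(1)  # unknown

weights = fit_logistic(samples, labels)
is_unknown = prob_unknown(weights, 0.1, 0.2) > 0.5
```

Thresholding the output at 0.5 gives a known/unknown decision; in this toy setup the learned weights on both confidences come out negative, so an object is flagged as unknown only when the combined evidence from image and speech is weak.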

Copyright information

© 2013 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ozasa, Y., Ariki, Y., Nakano, M., Iwahashi, N. (2013). Disambiguation in Unknown Object Detection by Integrating Image and Speech Recognition Confidences. In: Lee, K.M., Matsushita, Y., Rehg, J.M., Hu, Z. (eds) Computer Vision – ACCV 2012. ACCV 2012. Lecture Notes in Computer Science, vol 7724. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-37331-2_7

  • DOI: https://doi.org/10.1007/978-3-642-37331-2_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-37330-5

  • Online ISBN: 978-3-642-37331-2

  • eBook Packages: Computer Science, Computer Science (R0)
