Simulation for AI

  • Tadahiro Taniguchi
Reference work entry


This chapter describes how multimodal categorization techniques can be applied to a humanoid robot. Multimodal categorization enables a robot to form an internal representation system from its own multimodal sensory-motor experience and to use that system for various purposes. Real-world environments such as homes and offices, where humanoid robots typically perform their tasks, are full of uncertainties. In these environments, robots have to interact with human users by using not only sensory-motor information but also linguistic information. Conventional rigid, hand-written internal representation systems are of limited use in such environments because they lack adaptability. Various methods have been developed for forming the internal representation system of a humanoid robot. This chapter focuses on an approach to multimodal categorization-based representation learning that uses Bayesian probabilistic generative models. In particular, it considers multimodal latent Dirichlet allocation (MLDA), an extension of latent Dirichlet allocation (LDA), which is a commonly used topic model. Topic models are popular machine learning techniques used in various areas, e.g., natural language processing, data mining, and image recognition. Several extensions and applications of MLDA for a humanoid robot are then described.
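
As a rough illustration of the kind of generative model referred to above, the following Python sketch samples synthetic data from an MLDA-style process: each object has one category mixture shared across all modalities, and each modality has its own per-category distribution over discretized features. This is a minimal sketch under stated assumptions, not the chapter's implementation; the function name `sample_mlda`, the hyperparameter values, and the modality names in the usage note are all hypothetical.

```python
import numpy as np

def sample_mlda(n_objects, n_categories, vocab_sizes, n_features,
                alpha=1.0, beta=0.5, seed=0):
    """Sample bag-of-features counts from an MLDA-style generative process.

    vocab_sizes and n_features are dicts keyed by modality name
    (hypothetical names; chosen by the caller).
    """
    rng = np.random.default_rng(seed)
    # phi[m][k]: distribution over modality m's feature vocabulary for category k
    phi = {m: rng.dirichlet(np.full(v, beta), size=n_categories)
           for m, v in vocab_sizes.items()}
    # theta[d]: category mixture of object d, shared by all modalities
    theta = rng.dirichlet(np.full(n_categories, alpha), size=n_objects)
    counts = {m: np.zeros((n_objects, v), dtype=int)
              for m, v in vocab_sizes.items()}
    for d in range(n_objects):
        for m, v in vocab_sizes.items():
            # one latent category assignment per observed feature
            z = rng.choice(n_categories, size=n_features[m], p=theta[d])
            for k in z:
                w = rng.choice(v, p=phi[m][k])  # feature index drawn from category k
                counts[m][d, w] += 1
    return theta, phi, counts
```

For example, `sample_mlda(5, 3, {"vision": 20, "word": 15}, {"vision": 50, "word": 10})` returns the category mixtures, the per-modality feature distributions, and one count matrix per modality. Sharing `theta` across modalities is what ties sensory features and words to the same latent categories in this sketch; learning would invert this process with an inference method such as Gibbs sampling or variational Bayes.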



Copyright information

© Springer Nature B.V. 2019

Authors and Affiliations

  1. Department of Information Science and Engineering, Ritsumeikan University, Kusatsu, Japan
