Sensorimotor self-learning model based on operant conditioning for two-wheeled robot

  • Xiaoping Zhang (张晓平)
  • Xiaogang Ruan (阮晓钢)
  • Yao Xiao (肖 尧)
  • Jing Huang (黄 静)


Traditional control methods for two-wheeled robots are usually model-based and require a precise mathematical model of the robot, which is hard to obtain. This paper presents a sensorimotor self-learning model, named SMM TWR, to address this problem. The model consists of seven elements: the discrete learning time set, the sensory state set, the motion set, the sensorimotor mapping, the state orientation unit, the learning mechanism and the model’s entropy. The learning mechanism of SMM TWR is designed on the basis of the theory of operant conditioning (OC); it adjusts the sensorimotor mapping at every learning step, which in turn guides the robot’s choice of motions. The learning direction of the mechanism is decided by the state orientation unit. Simulation results show that, with the designed sensorimotor model, the robot is endowed with the abilities of self-learning and self-organizing, and that it can learn the skill of keeping itself balanced through interaction with the environment.
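The abstract does not give the update rule itself, so the following is only a minimal sketch of how an operant-conditioning style adjustment of a sensorimotor mapping might look, not the authors' SMM TWR formulation. Everything in it is an assumption made for illustration: the mapping is taken to be a table of motion probabilities per discrete sensory state, the orientation value is taken to be the reduction in tilt magnitude, and the names `uniform_mapping`, `orientation`, `oc_update`, `entropy`, `choose_motion` and the parameter `rate` are hypothetical.

```python
import math
import random


def uniform_mapping(n_states, n_motions):
    """Sensorimotor mapping as a table: one probability row per sensory state.

    Assumption: every motion starts out equally likely in every state.
    """
    return [[1.0 / n_motions] * n_motions for _ in range(n_states)]


def orientation(angle_before, angle_after):
    """Hypothetical orientation value: positive when the tilt magnitude shrank."""
    return abs(angle_before) - abs(angle_after)


def oc_update(mapping, state, motion, orient, rate=0.2):
    """Operant-conditioning style update (illustrative, not the paper's rule).

    A motion followed by a favourable state change (orient > 0) becomes more
    probable in that state; an unfavourable one (orient < 0) becomes less
    probable. The row is renormalised so that it stays a distribution.
    """
    row = mapping[state]
    if orient > 0:
        row[motion] += rate * (1.0 - row[motion])
    elif orient < 0:
        row[motion] -= rate * row[motion]
    total = sum(row)
    mapping[state] = [p / total for p in row]


def entropy(row):
    """Shannon entropy (bits) of one state's motion distribution."""
    return -sum(p * math.log2(p) for p in row if p > 0)


def choose_motion(mapping, state, rng=random):
    """Sample a motion index according to the current mapping for this state."""
    row = mapping[state]
    return rng.choices(range(len(row)), weights=row)[0]
```

Under these assumptions, one learning step would call `choose_motion`, apply the motion to the robot or simulator, compute `orientation` from the tilt before and after, and pass the result to `oc_update`. A falling row entropy would then indicate that the mapping for that state is becoming more decisive, which is one plausible reading of the self-organizing behaviour described in the abstract.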

Key words

two-wheeled robot, sensorimotor model, self-learning, operant conditioning (OC)

CLC number

TP 181 

Document code





Part of this research was done at the Department of Psychology, Michigan State University. The authors would like to express their thanks to Professor LIU Taosheng and his lab for help.



Copyright information

© Shanghai Jiaotong University and Springer-Verlag Berlin Heidelberg 2017

Authors and Affiliations

  • Xiaoping Zhang (张晓平) (1, 2)
  • Xiaogang Ruan (阮晓钢) (1)
  • Yao Xiao (肖 尧) (1)
  • Jing Huang (黄 静) (1)
  1. College of Electronic Information and Control Engineering, Beijing University of Technology, Beijing, China
  2. Department of Psychology, Michigan State University, Michigan, USA
