On Designing Socially Acceptable Reward Shaping

Raza, Syed Ali; Clark, Jesse; Williams, Mary-Anne

doi:10.1007/978-3-319-47437-3_84

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 9979)

Included in the following conference series:

International Conference on Social Robotics

Abstract

For social robots, learning from an ordinary user should be a socially appealing experience. Unfortunately, machine learning demands an enormous amount of human data, and a prolonged interactive teaching session becomes anti-social. We address this problem in the context of reward shaping for reinforcement learning. Efficient reward shaping normally expects a continuous stream of rewards from the teacher. We present a simple framework that instead seeks rewards for only a small number of steps from each of a large number of human teachers, thereby simplifying the job of any individual teacher. The framework was tested with online crowd workers on a transport puzzle. We thoroughly analyzed the quality of the learned policies and the crowd’s teaching behavior. Our results show that nearly perfect policies can be learned using this framework, and that the crowd generally found the framework acceptable.
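The core idea in the abstract, asking each of many teachers for rewards over only a few steps, can be illustrated with a minimal sketch. This is not the authors' implementation: the chain environment, the simulated `teacher_reward` function, the hyperparameters, and all names below are assumptions chosen for illustration, and tabular Q-learning stands in for whatever learner the paper actually uses.

```python
import random

N_STATES = 6            # toy chain: states 0..5, state 5 is the goal (assumption)
ACTIONS = (-1, 1)       # move left or move right
STEPS_PER_TEACHER = 5   # each teacher rewards only a handful of steps


def teacher_reward(state, action):
    """Hypothetical simulated teacher: +1 for moving toward the goal, -1 otherwise."""
    return 1.0 if action == 1 else -1.0


def train(n_teachers=200, alpha=0.5, gamma=0.5, epsilon=0.2, seed=0):
    """Tabular Q-learning driven only by short bursts of teacher reward."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    state = 0
    for _ in range(n_teachers):              # a new teacher takes over...
        for _ in range(STEPS_PER_TEACHER):   # ...for a few steps only
            if rng.random() < epsilon:
                a = rng.randrange(2)         # explore
            else:
                a = max(range(2), key=lambda i: q[state][i])  # exploit
            nxt = min(max(state + ACTIONS[a], 0), N_STATES - 1)
            r = teacher_reward(state, ACTIONS[a])
            done = nxt == N_STATES - 1
            target = r if done else r + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = 0 if done else nxt       # restart the episode at the goal
    return q


def greedy_policy(q):
    """Greedy action for every non-terminal state."""
    return [ACTIONS[max(range(2), key=lambda i: row[i])] for row in q[:-1]]


if __name__ == "__main__":
    print(greedy_policy(train()))
```

In this sketch the learner's state carries over from one teacher to the next, so no single teacher has to supervise a whole episode; real crowd feedback would be noisier than the simulated teacher assumed here.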

Author information

Authors and Affiliations

Centre for Quantum Computation and Intelligent Systems, University of Technology, Sydney, Australia
Syed Ali Raza, Jesse Clark & Mary-Anne Williams

Corresponding author

Correspondence to Syed Ali Raza.

Editor information

Editors and Affiliations

Department of Electrical Engineering, The University of Kansas, Lawrence, Kansas, USA
Arvin Agah
Department of Mechanical Engineering, Qatar University, Doha, Qatar
John-John Cabibihan
School of Electrical and Computer Engineering, Georgia Institute of Technology, Atlanta, USA
Ayanna M. Howard
Department of Systems Engineering and Automation, University Carlos III de Madrid, Madrid, Spain
Miguel A. Salichs
Department of Mechanical, Aerospace and Biomedical Engineering, University of Tennessee, Knoxville, Tennessee, USA
Hongsheng He

About this paper

Cite this paper

Raza, S.A., Clark, J., Williams, MA. (2016). On Designing Socially Acceptable Reward Shaping. In: Agah, A., Cabibihan, JJ., Howard, A., Salichs, M., He, H. (eds) Social Robotics. ICSR 2016. Lecture Notes in Computer Science, vol 9979. Springer, Cham. https://doi.org/10.1007/978-3-319-47437-3_84

DOI: https://doi.org/10.1007/978-3-319-47437-3_84
Published: 07 October 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-47436-6
Online ISBN: 978-3-319-47437-3
eBook Packages: Computer Science, Computer Science (R0)
