
Reinforcement Learning for Scheduling of Maintenance

  • Michael Knowles
  • David Baglee
  • Stefan Wermter
Conference paper

Abstract

Improving maintenance scheduling has become an area of crucial importance in recent years. Condition-based maintenance (CBM) has started to move away from fixed-interval scheduled maintenance by providing an indication of the likelihood of failure. Determining the timing of maintenance from this information, so that high reliability is achieved without resorting to over-maintenance, remains a problem, however. In this paper we propose Reinforcement Learning (RL), which seeks to maximise long-term reward in a multistage decision process using feedback given either during or at the end of a sequence of actions, as a potential solution to this problem. Several indicative scenarios are presented, and simulated experiments illustrate the performance of RL in this application.
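
To make the idea concrete, the following is a minimal sketch (not the authors' implementation) of tabular Q-learning applied to a hypothetical machine-degradation model, showing how an agent can learn when maintenance becomes worthwhile. The state space, costs, and degradation probability below are invented purely for illustration.

    # Minimal illustrative sketch: tabular Q-learning on a toy maintenance
    # problem. All states, rewards, and probabilities are assumptions made
    # for this example, not values from the paper.
    import random

    N_STATES = 6          # condition levels: 0 = as new, 5 = failed (hypothetical)
    ACTIONS = (0, 1)      # 0 = keep operating, 1 = perform maintenance
    ALPHA, GAMMA, EPSILON = 0.1, 0.95, 0.1

    def step(state, action):
        """Return (next_state, reward) under the toy degradation model."""
        if action == 1:                    # maintain: pay a cost, restore condition
            return 0, -3.0
        if state == N_STATES - 1:          # failure: large penalty, forced repair
            return 0, -20.0
        # operating earns revenue, but the machine may degrade by one level
        next_state = min(state + 1, N_STATES - 1) if random.random() < 0.3 else state
        return next_state, 1.0

    Q = [[0.0, 0.0] for _ in range(N_STATES)]
    state = 0
    for _ in range(50_000):
        # epsilon-greedy action selection
        action = random.choice(ACTIONS) if random.random() < EPSILON \
            else max(ACTIONS, key=lambda a: Q[state][a])
        next_state, reward = step(state, action)
        # one-step Q-learning update (Watkins, 1989)
        Q[state][action] += ALPHA * (reward + GAMMA * max(Q[next_state]) - Q[state][action])
        state = next_state

    for s, q in enumerate(Q):
        best = "maintain" if q[1] > q[0] else "operate"
        print(f"condition {s}: operate={q[0]:6.2f}  maintain={q[1]:6.2f}  -> {best}")

Run as a plain Python script, this typically converges to a threshold policy: operate while the condition level is low and maintain once degradation makes failure likely, which is the kind of trade-off between over-maintenance and unreliability discussed above.
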

Keywords

Reinforcement Learning, Optimal Schedule, Digital Object Identifier, Reliability Function, Maintenance Schedule
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.



Copyright information

© Springer-Verlag London Limited 2011

Authors and Affiliations

  • Michael Knowles (1)
  • David Baglee (1)
  • Stefan Wermter (2)
  1. Institute for Automotive and Manufacturing Advanced Practice (AMAP), University of Sunderland, Sunderland, UK
  2. Knowledge Technology Group, Department of Informatics, University of Hamburg, Hamburg, Germany
