Abstract
We propose a new approach to verification of probabilistic processes for which the model may not be available. We use a technique from Reinforcement Learning to approximate how far apart two processes are by solving a Markov Decision Process. If two processes are equivalent, the algorithm will return zero, otherwise it will provide a number and a test that witness the non equivalence. We suggest a new family of equivalences, called K-moment, for which it is possible to do so. The weakest, 1-moment equivalence, is trace-equivalence. The others are weaker than bisimulation but stronger than trace-equivalence.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
References
Blute, R., Desharnais, J., Edalat, A., Panangaden, P.: Bisimulation for labelled Markov processes. In: Proc. of the Twelfth IEEE Symposium On Logic In Computer Science, Warsaw, Poland (1997)
van Breugel, F., Shalit, S., Worrell, J.B.: Testing labelled markov processes. In: Widmayer, P., Triguero, F., Morales, R., Hennessy, M., Eidenbenz, S., Conejo, R. (eds.) ICALP 2002. LNCS, vol. 2380, pp. 537–548. Springer, Heidelberg (2002)
Van Breugel, F., Worrell, J.: Approximating and computing behavioural distances in probabilistic transition systems. Theoretical Computer Science (2006)
Cover, T.M., Thomas, J.A.: Elements of Information Theory. Wiley, Chichester (1991)
Desharnais, J., Laviolette, F., Darsini Moturu, K., Zhioua, S.: Trace equivalence characterization through reinforcement learning. In: 19th Canadian Conference on Artificial Intelligence (2006) (accepted for publication)
Even-Dar, E., Mansour, Y.: Learning rates for Q-learning. In: Helmbold, D.P., Williamson, B. (eds.) COLT 2001 and EuroCOLT 2001. LNCS, vol. 2111, pp. 589–604. Springer, Heidelberg (2001)
Fiechter, C.N.: Design and Analysis of Efficient Reinforcement Learning Algorithms. PhD thesis, Univ. of Pittsburgh (1997)
Giacalone, A., Jou, C., Smolka, S.: Algebraic reasoning for probabilistic concurrent systems. In: Proceedings of the Working Conference on Programming Concepts and Methods. IFIP TC2 (1990)
Van Glabbeek, R.J.: The linear time - branching time spectrum ii. In: Best, E. (ed.) CONCUR 1993. LNCS, vol. 715, pp. 66–81. Springer, Heidelberg (1993)
Jou, C.-C., Smolka, S.A.: Equivalences, congruences, and complete axiomatizations for probabilistic processes. In: Baeten, J.C.M., Klop, J.W. (eds.) CONCUR 1990. LNCS, vol. 458. Springer, Heidelberg (1990)
Kaelbling, L.P., Littman, M.L., Moore, A.P.: Reinforcement learning: A survey. Journal of Artificial Intelligence Research 4, 237–285 (1996)
Kearns, M., Singh, S.: Finite-sample convergence rates for q-learning and indirect algorithms. In: Proc. of the 1998 conference on Advances in neural information processing systems II, pp. 996–1002. MIT Press, Cambridge (1999)
Larsen, K.G., Skou, A.: Bisimulation through probabilistic testing. Inf. Comput. 94(1), 1–28 (1991)
Lowe, G.: Representing Nondeterministic and Probabilistic Behaviour in Reactive Processes. Technical report, Progr. Res. Group, Oxford University (1993)
Sutton, R.S., Barto, A.G.: Introduction to Reinforcement Learning. MIT Press, Cambridge (1998)
Watkins, C.: Learning from Delayed Rewards. PhD thesis, Univ. of Cambridge (1989)
Watkins, C., Dayan, P.: Q-learning. Machine Learning 8, 279–292 (1992)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Desharnais, J., Laviolette, F., Zhioua, S. (2006). Testing Probabilistic Equivalence Through Reinforcement Learning. In: Arun-Kumar, S., Garg, N. (eds) FSTTCS 2006: Foundations of Software Technology and Theoretical Computer Science. FSTTCS 2006. Lecture Notes in Computer Science, vol 4337. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11944836_23
Download citation
DOI: https://doi.org/10.1007/11944836_23
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-49994-7
Online ISBN: 978-3-540-49995-4
eBook Packages: Computer ScienceComputer Science (R0)