Abstract
In the context of probabilistic verification, we introduce a new notion of trace-equivalence divergence between pairs of labelled Markov processes. This divergence corresponds to the optimal value of a derived Markov decision process and can therefore be estimated by reinforcement learning methods. Moreover, we provide PAC guarantees on this estimate.
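Since the divergence is cast as the optimal value of a derived Markov decision process, any value-based reinforcement learning method can in principle estimate it. Below is a minimal tabular Q-learning sketch in that spirit, run on a hypothetical toy MDP; the states, actions, and transition function here are placeholder assumptions for illustration, not the paper's derived construction.

```python
import random
from collections import defaultdict

# Hypothetical toy MDP standing in for the paper's derived MDP:
# these states, actions, and transitions are assumptions, not the
# authors' construction.
STATES = range(4)
ACTIONS = range(2)
TERMINAL = 3

def step(state, action):
    """Return (next_state, reward) under a toy stochastic transition."""
    if random.random() < 0.8:
        nxt = min(state + action + 1, TERMINAL)
    else:
        nxt = state  # transition fails, stay in place
    reward = 1.0 if nxt == TERMINAL else 0.0
    return nxt, reward

def q_learning(episodes=5000, alpha=0.1, gamma=0.95, epsilon=0.1):
    """Tabular Q-learning (Watkins, 1989); returns the learned Q-table."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = 0
        while state != TERMINAL:
            # epsilon-greedy action selection
            if random.random() < epsilon:
                action = random.choice(list(ACTIONS))
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])
            nxt, reward = step(state, action)
            best_next = max(q[(nxt, a)] for a in ACTIONS)
            # standard Q-learning temporal-difference update
            q[(state, action)] += alpha * (
                reward + gamma * best_next - q[(state, action)]
            )
            state = nxt
    return q

if __name__ == "__main__":
    q = q_learning()
    # The optimal value of the start state is the quantity a method
    # like the paper's would estimate (here, just the toy MDP's value).
    print(max(q[(0, a)] for a in ACTIONS))
```

In the paper's setting, the estimate's quality would further be controlled by PAC-style bounds on Q-learning's finite-sample behaviour; this sketch only shows the estimation mechanism itself.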
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Desharnais, J., Laviolette, F., Moturu, K.P.D., Zhioua, S. (2006). Trace Equivalence Characterization Through Reinforcement Learning. In: Lamontagne, L., Marchand, M. (eds.) Advances in Artificial Intelligence. Canadian AI 2006. Lecture Notes in Computer Science, vol. 4013. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11766247_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-34628-9
Online ISBN: 978-3-540-34630-2