The Big Picture: Toward a Synthesis of RL and Adaptive Tensor Factorization

Realtime Data Mining

Part of the book series: Applied and Numerical Harmonic Analysis (ANHA)

Abstract

We explore how to unite the control-theoretic and the factorization-based approaches to recommendation, arguing that tensor factorization can overcome the combinatorial complexity of more sophisticated MDP models that take a history of previous states, rather than a single state, into account. Specifically, we introduce a tensor representation of the transition probabilities of Markov-k processes and devise a Tucker-based approximation architecture that relies crucially on the notion of an aggregation basis described in Chap. 6. Because our method requires a partitioning of the set of state transition histories, we are left with the challenge of determining a suitable partitioning, for which we propose a genetic algorithm.
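
To make the idea of a history-indexed transition tensor concrete, the following is a minimal sketch and not the chapter's actual construction. It assumes a small finite state space, estimates the transition probabilities of an order-k process from a single observed sequence, and then compresses them with a hard partition of the histories. The function names `transition_tensor` and `aggregate`, the toy sequence, and the unweighted averaging within each group are all illustrative assumptions; the chapter's Tucker-based architecture and its aggregation basis are not reproduced here.

```python
from collections import defaultdict


def transition_tensor(sequence, n_states, k=2):
    """Estimate P(next state | last k states) from one observed sequence.

    Returns a dict mapping each observed history (a k-tuple of states)
    to a list of next-state probabilities. Hypothetical helper.
    """
    counts = defaultdict(lambda: [0] * n_states)
    for t in range(k, len(sequence)):
        history = tuple(sequence[t - k:t])
        counts[history][sequence[t]] += 1
    return {h: [c / sum(row) for c in row] for h, row in counts.items()}


def aggregate(tensor, group_of, n_states):
    """Average the next-state distributions of all histories in a group.

    `group_of` maps a history to a group label, i.e. it encodes a
    partition of the histories (an assumption made for illustration).
    """
    sums = defaultdict(lambda: [0.0] * n_states)
    sizes = defaultdict(int)
    for history, row in tensor.items():
        g = group_of(history)
        sizes[g] += 1
        for s, p in enumerate(row):
            sums[g][s] += p
    return {g: [v / sizes[g] for v in sums[g]] for g in sums}


# Toy data: three states, histories of length two.
sequence = [0, 1, 2, 0, 1, 2, 0, 1, 0]
tensor = transition_tensor(sequence, n_states=3, k=2)

# Illustrative partition: group histories by the parity of the latest state.
coarse = aggregate(tensor, lambda h: h[-1] % 2, n_states=3)
```

With n states and histories of length k, the full tensor can have up to n^k rows of n probabilities each, whereas the aggregated version has only one row per group. That reduction is the kind of saving the partitioning is meant to buy, and its quality depends entirely on how the partition is chosen.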

Copyright information

© 2013 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Paprotny, A., Thess, M. (2013). The Big Picture: Toward a Synthesis of RL and Adaptive Tensor Factorization. In: Realtime Data Mining. Applied and Numerical Harmonic Analysis. Birkhäuser, Cham. https://doi.org/10.1007/978-3-319-01321-3_10