The Big Picture: Toward a Synthesis of RL and Adaptive Tensor Factorization

  • Alexander Paprotny
  • Michael Thess
Part of the Applied and Numerical Harmonic Analysis book series (ANHA)


We explore how to unite the control-theoretic and the factorization-based approaches to recommendation, arguing that tensor factorization can be employed to overcome the combinatorial complexity of more sophisticated MDP models that take into account a history of previous states rather than a single state. Specifically, we introduce a tensor representation of the transition probabilities of Markov-k processes and devise a Tucker-based approximation architecture that relies crucially on the notion of an aggregation basis described in Chap. 6. Since our method requires a partitioning of the set of state transition histories, we face the challenge of determining a suitable partitioning, for which we propose a genetic algorithm.
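To make the tensor viewpoint concrete, the following is a minimal sketch (not the chapter's implementation) of the basic idea: for a Markov-2 process, the transition probabilities form a 3-way tensor P[i, j, k] = Pr(next = k | previous = i, current = j), which can be compressed by a Tucker approximation. The sketch uses a truncated higher-order SVD, one standard way to compute a Tucker decomposition; the toy chain, the function names, and the chosen ranks are all illustrative assumptions.

```python
# Illustrative sketch (assumed names, toy data): transition probabilities of a
# Markov-2 chain as a 3-way tensor, compressed via truncated HOSVD (Tucker).
import numpy as np

def unfold(T, mode):
    """Mode-n unfolding: move axis `mode` to the front and flatten the rest."""
    return np.moveaxis(T, mode, 0).reshape(T.shape[mode], -1)

def tucker_hosvd(T, ranks):
    """Truncated HOSVD: one factor matrix per mode from the SVD of each
    unfolding, then the core tensor as the projection of T onto the factors."""
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = np.linalg.svd(unfold(T, mode), full_matrices=False)
        factors.append(U[:, :r])
    core = T
    for mode, U in enumerate(factors):
        core = np.moveaxis(
            np.tensordot(U.T, np.moveaxis(core, mode, 0), axes=1), 0, mode)
    return core, factors

def tucker_reconstruct(core, factors):
    """Multiply the core tensor by each factor matrix along its mode."""
    T = core
    for mode, U in enumerate(factors):
        T = np.moveaxis(np.tensordot(U, np.moveaxis(T, mode, 0), axes=1), 0, mode)
    return T

rng = np.random.default_rng(0)
n = 6                                  # toy number of states
P = rng.random((n, n, n))
P /= P.sum(axis=2, keepdims=True)      # each (previous, current) row is a distribution

core, factors = tucker_hosvd(P, (3, 3, 3))
P_hat = tucker_reconstruct(core, factors)
print(core.shape)  # → (3, 3, 3)
```

The point of the compression is that the core tensor and factor matrices store O(r^3 + 3nr) numbers instead of the n^3 entries of the full transition tensor, which is what makes longer histories (larger k) tractable in principle.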


Keywords: Factorization-based approach · State value function · Core tensor · Prolonged accumulation · Generalized Markov property



Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Alexander Paprotny, Research and Development, prudsys AG, Berlin, Germany
  • Michael Thess, Research and Development, prudsys AG, Chemnitz, Germany
