Environment Systems and Decisions

Volume 33, Issue 3, pp 413–426

Action-based feature representation for reverse engineering trading strategies

  • Roy L. Hayes
  • Peter A. Beling
  • William T. Scherer


This paper considers the problem of reverse engineering strategies for trading in the financial markets. We investigate this problem in the context of a trading tournament in which student teams used delta hedging and other mechanisms to attempt to achieve benchmark performance in managing a hedge fund in a simulated market. Our hypothesis is that machine learning models can be trained to solve the apprenticeship learning problem; that is, these models can learn to trade like tournament participants. After reviewing classical return-matching approaches and recent work in inverse reinforcement learning, we propose a supervised learning methodology that makes use of recursive partitioning (RP). Our proposed RP approach is based on a feature representation for actions that, we argue, corresponds to the information structures readily available to tournament participants. RP achieves high accuracy in predicting the type and scale of participant trades and in tracking overall portfolio performance. Our results suggest that further research on our proposed approach is warranted and should include an expansion to testing on data from real markets.


Algorithm trading; Reverse engineering; Trading strategy



This work would not have been possible without the collaboration of dedicated researchers. We would like to thank Stefano Grazioli for providing the trading platforms and data that made this research possible. Additionally, we would like to thank Mark Paddrik, Andrew Todd, and Matt Burkett for their insights into difficult problems.



Copyright information

© Springer Science+Business Media New York 2013

Authors and Affiliations

  • Roy L. Hayes (1)
  • Peter A. Beling (1)
  • William T. Scherer (1)
  1. Department of Systems and Information Engineering, University of Virginia, Charlottesville, USA
