Skip to main content

Learning Optimal Q-Function Using Deep Boltzmann Machine for Reliable Trading of Cryptocurrency

  • Conference paper
  • First Online:
Book cover Intelligent Data Engineering and Automated Learning – IDEAL 2018 (IDEAL 2018)

Part of the book series: Lecture Notes in Computer Science ((LNISA,volume 11314))

Abstract

The explosive price volatility from the end of 2017 to January 2018 shows that bitcoin is a high risk asset. The deep reinforcement algorithm is straightforward idea for directly outputs the market management actions to achieve higher profit instead of higher price-prediction accuracy. However, existing deep reinforcement learning algorithms including Q-learning are also limited to problems caused by enormous searching space. We propose a combination of double Q-network and unsupervised pre-training using Deep Boltzmann Machine (DBM) to generate and enhance the optimal Q-function in cryptocurrency trading. We obtained the profit of 2,686% in simulation, whereas the best conventional model had that of 2,087% for the same period of test. In addition, our model records 24% of profit while market price significantly drops by −64%.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 84.99
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 109.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

References

  1. Nakamoto, S.: Bitcoin: a peer-to-peer electronic cash system (2008)

    Google Scholar 

  2. Agarwal, A., Hazan, E., Kale, S., Schapire, R.E.: Algorithms for portfolio management based on the Newton method. In: Proceedings of the 23rd International Conference on Machine Learning, pp. 9–16. ACM (2006)

    Google Scholar 

  3. Mnih, V., et al.: Human-level control through deep reinforcement learning. Nature 518, 529 (2015)

    Article  Google Scholar 

  4. Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double q-learning. In: AAAI, vol. 16, pp. 2094–2100 (2016)

    Google Scholar 

  5. Lee, H., Grosse, R., Ranganath, R., Ng, A.Y.: Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. In: Proceedings of the 26th Annual International Conference on Machine Learning, pp. 609–616 (2009)

    Google Scholar 

  6. Hinton, G.E., Osindero, S., Teh, Y.W.: A fast learning algorithm for deep belief nets. Neural Comput. 18, 1527–1554 (2006)

    Article  MathSciNet  Google Scholar 

  7. Huang, W., Nakamori, Y., Wang, S.Y.: Forecasting stock market movement direction with support vector machine. Comput. Oper. Res. 32, 2513–2522 (2005)

    Article  Google Scholar 

  8. Schumaker, R.P., Chen, H.: Textual analysis of stock market prediction using breaking financial news: the AZFin text system. ACM Trans. Inf. Syst. 27, 12 (2009)

    Article  Google Scholar 

  9. Patel, J., Shah, S., Thakkar, P., Kotecha, K.: Predicting stock and stock price index movement using trend deterministic data preparation and machine learning techniques. Expert Syst. Appl. 42, 259–268 (2015)

    Article  Google Scholar 

  10. McNally, S.: Predicting the Price of Bitcoin using Machine Learning. National College of Ireland (2016)

    Google Scholar 

  11. Amjad, M., Shah, D.: Trading bitcoin and online time series prediction. In: NIPS 2016 Time Series Workshop, pp. 1–15 (2017)

    Google Scholar 

  12. Jiang, Z., Liang, J.: Cryptocurrency portfolio management with deep reinforcement learning. In: Intelligent Systems Conference, pp. 905–913 (2017)

    Google Scholar 

  13. Bell, T.: Bitcoin Trading Agents. University of Southampton (2016)

    Google Scholar 

  14. Żbikowski, K.: Application of machine learning algorithms for bitcoin automated trading. In: Ryżko, D., Gawrysiak, P., Kryszkiewicz, M., Rybiński, H. (eds.) Machine Intelligence and Big Data in Industry. SBD, vol. 19, pp. 161–168. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-30315-4_14

    Chapter  Google Scholar 

  15. Tesauro, G.: Extending Q-learning to general adaptive multi-agent systems. In: Advances in Neural Information Processing Systems, pp. 871–878 (2004)

    Google Scholar 

  16. Pinheiro, P.H., Collobert, R.: Recurrent convolutional neural networks for scene labeling. In: International Conference on Machine Learning, pp. 82–90 (2014)

    Google Scholar 

  17. Sainath, T.N., Vinyals, O., Senior, A., Sak, H.: Convolutional, long short-term memory, fully connected deep neural networks. In: Acoustics, Speech and Signal Processing, pp. 4580–4584 (2015)

    Google Scholar 

  18. Ren, Y., Wu, Y.: Convolutional deep belief networks for feature extraction of EEG signal. In: International Joint Conference on Neural Networks, pp. 2850–2853 (2014)

    Google Scholar 

  19. Lample, G., Chaplot, D.S.: Playing FPS games with deep reinforcement learning. In: AAAI, pp. 2140–2146 (2017)

    Google Scholar 

  20. Donahue, J., et al.: Long-term recurrent convolutional networks for visual recognition and description. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2625–2634 (2015)

    Google Scholar 

  21. Bu, S.-J., Cho, S.-B.: A hybrid system of deep learning and learning classifier system for database intrusion detection. In: Martínez de Pisón, F.J., Urraca, R., Quintián, H., Corchado, E. (eds.) HAIS 2017. LNCS (LNAI), vol. 10334, pp. 615–625. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-59650-1_52

    Chapter  Google Scholar 

  22. Cover, T.M.: Universal portfolios. In: The Kelly Capital Growth Investment Criterion: Theory and Practice, pp. 181–209 (2011)

    Chapter  Google Scholar 

  23. Das, P., Banerjee, A.: Meta optimization and its applications to portfolio selection. In: Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1163–1171 (2011)

    Google Scholar 

  24. Li, B., Zhao, P., Hoi, S.C., Gopalkrishnan, V.: PAMR: passive aggressive mean reversion strategy for portfolio selection. Mach. Learn. 87, 221–258 (2012)

    Article  MathSciNet  Google Scholar 

  25. Maaten, L.V.D., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9, 2579–2606 (2008)

    MATH  Google Scholar 

Download references

Acknowledgements

This research was supported by Korea Electric Power Corporation. (Grant number:R18XA05).

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Sung-Bae Cho .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Bu, SJ., Cho, SB. (2018). Learning Optimal Q-Function Using Deep Boltzmann Machine for Reliable Trading of Cryptocurrency. In: Yin, H., Camacho, D., Novais, P., Tallón-Ballesteros, A. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2018. IDEAL 2018. Lecture Notes in Computer Science(), vol 11314. Springer, Cham. https://doi.org/10.1007/978-3-030-03493-1_49

Download citation

  • DOI: https://doi.org/10.1007/978-3-030-03493-1_49

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03492-4

  • Online ISBN: 978-3-030-03493-1

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics