Advanced Neural Networks

Abstract

This chapter presents various neural network models for financial time series analysis, providing examples of how they relate to well-known techniques in financial econometrics. Recurrent neural networks (RNNs) are presented as non-linear time series models that generalize classical linear time series models such as AR(p). They provide a powerful approach for prediction in financial time series and generalize to non-stationary data. The chapter also presents convolutional neural networks for filtering time series data and for exploiting different scales in the data. Finally, it demonstrates how autoencoders are used to compress information and generalize principal component analysis.
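
To make the connection between a plain RNN and an AR(p) model concrete, here is a minimal, illustrative sketch (not taken from the chapter's notebooks), assuming TensorFlow/Keras and a synthetic series; the window length p plays the role of the lag order, and the layer sizes are arbitrary choices.

```python
# Minimal sketch: a plain RNN as a non-linear AR(p) one-step-ahead forecaster.
import numpy as np
import tensorflow as tf

p = 5                                    # lag order / RNN sequence length
rng = np.random.default_rng(0)
x = rng.standard_normal(1000).cumsum()   # synthetic random-walk "price" series
x = np.diff(x)                           # first-difference to obtain "returns"

# Build lagged input windows of length p and one-step-ahead targets.
X = np.stack([x[i:i + p] for i in range(len(x) - p)])[..., None]
y = x[p:]

model = tf.keras.Sequential([
    tf.keras.Input(shape=(p, 1)),
    tf.keras.layers.SimpleRNN(8, activation="tanh"),  # non-linear hidden state update
    tf.keras.layers.Dense(1),                         # linear read-out, as in AR(p)
])
model.compile(optimizer="adam", loss="mse")
model.fit(X, y, epochs=10, batch_size=32, verbose=0)
```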

References

  • Baldi, P., & Hornik, K. (1989). Neural networks and principal component analysis: Learning from examples without local minima. Neural Networks, 2(1), 53–58.

  • Borovykh, A., Bohte, S., & Oosterlee, C. W. (2017). Conditional time series forecasting with convolutional neural networks. arXiv e-prints, arXiv:1703.04691.

  • Elman, J. L. (1991). Distributed representations, simple recurrent networks, and grammatical structure. Machine Learning, 7(2), 195–225.

  • Gers, F. A., Eck, D., & Schmidhuber, J. (2001). Applying LSTM to time series predictable through time-window approaches (pp. 669–676). Berlin, Heidelberg: Springer.

  • Graves, A. (2012). Supervised sequence labelling with recurrent neural networks. Studies in Computational Intelligence. Heidelberg, New York: Springer.

  • Heaton, J. B., Polson, N. G., & Witte, J. H. (2017). Deep learning for finance: Deep portfolios. Applied Stochastic Models in Business and Industry, 33(1), 3–12.

  • Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.

  • Krizhevsky, A., Sutskever, I., & Hinton, G. E. (2012). ImageNet classification with deep convolutional neural networks. In Advances in neural information processing systems (pp. 1097–1105).

  • Litterman, R. B., & Scheinkman, J. (1991). Common factors affecting bond returns. The Journal of Fixed Income, 1(1), 54–61.

  • Plaut, E. (2018). From principal subspaces to principal components with linear autoencoders. arXiv e-prints, arXiv:1804.10253.

  • van den Oord, A., Dieleman, S., Zen, H., Simonyan, K., Vinyals, O., Graves, A., et al. (2016). WaveNet: A generative model for raw audio. CoRR, abs/1609.03499.

  • Zheng, J., Xu, C., Zhang, Z., & Li, X. (2017). Electric load forecasting in smart grids using long-short-term-memory based recurrent neural network. In 2017 51st Annual Conference on Information Sciences and Systems (CISS) (pp. 1–6).

Appendix

1.1 Answers to Multiple Choice Questions

Question 1

Answer: 1,2,4,5. An augmented Dickey–Fuller test can be applied to time series to determine whether they are covariance stationary.
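
For instance, a minimal sketch of applying the augmented Dickey–Fuller test with statsmodels, on synthetic data rather than the chapter's datasets:

```python
# Sketch: ADF test on a price level (non-stationary) vs. its log-returns (stationary).
import numpy as np
from statsmodels.tsa.stattools import adfuller

rng = np.random.default_rng(0)
prices = 100 + rng.standard_normal(500).cumsum()   # random walk: unit root
returns = np.diff(np.log(prices))                  # log-returns: ~covariance stationary

for name, series in [("prices", prices), ("returns", returns)]:
    stat, pvalue, *_ = adfuller(series)
    print(f"{name}: ADF statistic = {stat:.2f}, p-value = {pvalue:.3f}")
# A small p-value rejects the unit-root null, supporting covariance stationarity.
```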

The estimated partial autocorrelation function (PACF) of a covariance-stationary time series can be used to choose the sequence length of a plain RNN, because the network has a fixed (time-invariant) partial autocorrelation structure.
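
The following sketch illustrates this rule of thumb on a simulated AR(3) process (coefficients chosen arbitrarily for illustration), where the PACF cuts off after lag 3 and the last significant lag suggests the sequence length:

```python
# Sketch: choose an RNN sequence length from the estimated PACF.
import numpy as np
from statsmodels.tsa.stattools import pacf

rng = np.random.default_rng(0)
x = np.zeros(2000)
for t in range(3, 2000):                           # simulate a stationary AR(3)
    x[t] = 0.5 * x[t-1] - 0.3 * x[t-2] + 0.2 * x[t-3] + rng.standard_normal()

values = pacf(x, nlags=20)
band = 1.96 / np.sqrt(len(x))                      # approximate 95% confidence band
significant = [lag for lag in range(1, 21) if abs(values[lag]) > band]
seq_len = max(significant) if significant else 1   # candidate RNN sequence length
print("significant PACF lags:", significant, "-> sequence length", seq_len)
```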

Plain recurrent neural networks are not guaranteed to be stable—the stability constraint restricts the choice of activation in the hidden state update.

Once the model is fitted, the Ljung–Box test is used to test whether the residual errors are autocorrelated. A well-specified model should exhibit white-noise errors both in- and out-of-sample.
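
A minimal sketch of this diagnostic with statsmodels, using simulated residuals as a stand-in for a fitted model's errors:

```python
# Sketch: Ljung-Box test for autocorrelation in a fitted model's residuals.
import numpy as np
from statsmodels.stats.diagnostic import acorr_ljungbox

rng = np.random.default_rng(0)
residuals = rng.standard_normal(500)   # stand-in for in- or out-of-sample errors

print(acorr_ljungbox(residuals, lags=[10, 20]))
# Large p-values: no evidence against the null of no autocorrelation (white-noise errors).
```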

The half-life of a lag-1 unit impulse is the number of lags it takes for the impulse's effect on the model output to decay to half its initial value.
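
As a worked example, assume (purely for illustration) an AR(1)-style memory with persistence phi, so the impulse response after k lags is phi**k and the half-life solves phi**k = 0.5:

```python
# Worked example: half-life of a lag-1 unit impulse under AR(1)-style decay.
import numpy as np

phi = 0.9                                   # assumed persistence, for illustration
half_life = np.log(0.5) / np.log(phi)       # solve phi**k = 0.5 for k
print(f"half-life ~ {half_life:.1f} lags")  # ~6.6 lags for phi = 0.9
```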

Question 2

Answer: 1,4. A gated recurrent unit uses dynamic exponential smoothing to propagate a hidden state with infinite memory. However, there is no requirement that the data be covariance stationary in order to fit a GRU or an LSTM, because these are dynamic models with a time-dependent partial autocorrelation structure.

Gated recurrent units are conditionally stable: the choice of activation in the hidden state update is especially important. For example, a tanh activation for the hidden state update satisfies the stability constraint. A GRU has only one memory, the hidden state, whereas an LSTM has an additional cellular memory (the cell state).
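
The dynamic exponential smoothing interpretation is visible in a single GRU step, sketched below with random weights (purely illustrative; the exact gate convention varies across references):

```python
# Sketch of one GRU update: the hidden state is an exponentially smoothed average
# whose smoothing weight z is itself data-dependent.
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

rng = np.random.default_rng(0)
n_h, n_x = 4, 3
Wz, Uz = rng.standard_normal((n_h, n_x)), rng.standard_normal((n_h, n_h))
Wr, Ur = rng.standard_normal((n_h, n_x)), rng.standard_normal((n_h, n_h))
Wh, Uh = rng.standard_normal((n_h, n_x)), rng.standard_normal((n_h, n_h))

def gru_step(h_prev, x):
    z = sigmoid(Wz @ x + Uz @ h_prev)              # update gate (smoothing weight)
    r = sigmoid(Wr @ x + Ur @ h_prev)              # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h_prev))  # candidate state; tanh keeps it bounded
    return (1.0 - z) * h_prev + z * h_tilde        # dynamic exponential smoothing

h = np.zeros(n_h)
for x in rng.standard_normal((10, n_x)):
    h = gru_step(h, x)
print(h)
```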

Question 3

Answer: 1,2,3.

CNNs apply a collection of different filters of equal width to the data. Each filter produces a unit in the CNN hidden layer, whose activation is then passed to a feedforward network for regression or classification. CNNs are sparse networks that exploit locality in the data to reduce the number of weights. CNNs are especially relevant for spatial, temporal, or even spatio-temporal datasets (e.g., implied volatility surfaces). A dilated CNN, such as the WaveNet architecture, is appropriate for multi-scale time series analysis: it captures a hierarchy of patterns at different resolutions (i.e., dependencies on past lags at different frequencies, e.g., days, weeks, months). The number of layers in a CNN is a hyperparameter that must be chosen manually; it is not learned during training.
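
A minimal, hedged sketch of such a dilated (WaveNet-style) stack in Keras follows; the filter counts and depth are arbitrary choices, not the architecture used in the chapter:

```python
# Sketch: stacked causal, dilated 1-D convolutions for multi-scale time series features.
import tensorflow as tf

seq_len, n_features = 64, 1
inputs = tf.keras.Input(shape=(seq_len, n_features))
x = inputs
for dilation in (1, 2, 4, 8):    # receptive field roughly doubles with each layer
    x = tf.keras.layers.Conv1D(filters=16, kernel_size=2, dilation_rate=dilation,
                               padding="causal", activation="relu")(x)
x = tf.keras.layers.GlobalAveragePooling1D()(x)
outputs = tf.keras.layers.Dense(1)(x)              # one-step-ahead forecast
model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam", loss="mse")
model.summary()
```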

Python Notebooks

The notebooks provided in the accompanying source code repository implement many of the techniques presented in this chapter, including RNNs, GRUs, LSTMs, CNNs, and autoencoders. Example datasets include 1-minute snapshots of Coinbase prices and an HFT dataset. Further details of the notebooks are included in the README.md file.
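
To give a flavor of the autoencoder material, here is an illustrative sketch (not taken from the repository) of a linear autoencoder on synthetic data; at the optimum its bottleneck spans the leading principal subspace (cf. Baldi & Hornik, 1989; Plaut, 2018):

```python
# Sketch: a linear autoencoder whose k-dimensional code approximates the first k PCs.
import numpy as np
import tensorflow as tf

rng = np.random.default_rng(0)
X = rng.standard_normal((500, 10)) @ rng.standard_normal((10, 10))  # correlated features
X = X - X.mean(axis=0)                                              # center, as for PCA

k = 3                                             # bottleneck dimension ~ number of PCs retained
inputs = tf.keras.Input(shape=(10,))
code = tf.keras.layers.Dense(k, use_bias=False)(inputs)    # linear encoder
outputs = tf.keras.layers.Dense(10, use_bias=False)(code)  # linear decoder
autoencoder = tf.keras.Model(inputs, outputs)
autoencoder.compile(optimizer="adam", loss="mse")
autoencoder.fit(X, X, epochs=50, batch_size=32, verbose=0)
# With linear activations and an MSE reconstruction loss, the learned code spans
# (approximately, after training) the same subspace as the first k principal components.
```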

Copyright information

© 2020 Springer Nature Switzerland AG

About this chapter

Cite this chapter

Dixon, M.F., Halperin, I., Bilokon, P. (2020). Advanced Neural Networks. In: Machine Learning in Finance. Springer, Cham. https://doi.org/10.1007/978-3-030-41068-1_8
