Abstract
In the past 10 years, neural networks have emerged as a powerful tool for predictive modeling with “big data.” This chapter discusses the potential role of neural networks as applied to economic forecasting. It begins with a brief discussion of the history of neural networks, their use in economics, and their value as universal function approximators. It proceeds to introduce the elemental structures of neural networks, taking the classic feed forward, fully connected type of neural network as its point of reference. A broad set of design decisions are discussed including regularization, activation functions, and model architecture. Following this, two additional types of neural network model are discussed: recurrent neural networks and encoder-decoder models. The chapter concludes with an empirical application of all three models to the task of forecasting unemployment.
The views expressed are those of the author and do not necessarily reflect the views of the Federal Reserve Bank of Kansas City or the Federal Reserve System.
Notes
1. Specifically, a subset of the ImageNet dataset. See Russakovsky et al. (2015).
2. We use log-loss because it is the convention, in both the machine learning and statistical literatures, for this type of classification problem. Other loss functions, including mean squared error, would likely work as well.
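As an illustrative sketch (hypothetical predictions, not the chapter's data), log-loss and mean squared error can be computed side by side for the same binary classification problem:

```python
import math

def log_loss(y_true, p_pred, eps=1e-12):
    """Average binary cross-entropy; probabilities are clipped to avoid log(0)."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1 - p))
    return total / len(y_true)

def mse(y_true, p_pred):
    """Mean squared error on the same predictions, for comparison."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_pred)) / len(y_true)

y = [1, 0, 1, 1]            # observed binary outcomes
p = [0.9, 0.2, 0.8, 0.6]    # predicted probabilities of the positive class
print(round(log_loss(y, p), 4))  # → 0.2656
print(round(mse(y, p), 4))       # → 0.0625
```

Both losses fall as predictions sharpen toward the observed outcomes; log-loss simply penalizes confident mistakes much more heavily.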
3.
4. In this setting, the goal of the model-fitting process would be to minimize this objective function.
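What minimizing an objective function looks like in practice can be sketched with a one-parameter least-squares toy (not the chapter's networks), fit by gradient descent:

```python
# Minimal gradient-descent sketch: fit y = w * x by minimizing
# the mean-squared-error objective in the single parameter w.
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]    # generated with true w = 2

w = 0.0                 # initial guess
lr = 0.05               # learning rate
for _ in range(200):
    # gradient of (1/n) * sum_i (w * x_i - y_i)^2 with respect to w
    grad = sum(2 * (w * x - y) * x for x, y in zip(xs, ys)) / len(xs)
    w -= lr * grad

print(round(w, 4))  # → 2.0, the minimizer of the objective
```

Fitting a neural network follows the same logic, only with millions of parameters and a non-convex surface, which is why the optimizer choices discussed in the chapter matter.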
5. An alternative form of the wavelet neural network uses wavelet functions as the activation functions of hidden nodes in the network. This form of wavelet network, however, is designed to improve optimization speed, create self-assembling networks, or achieve ends other than accommodating non-stationary data.
6. We focus here on a sequence of scalar values. All discussion in this section extends to sequences of multi-dimensional input (e.g., a sequence of vectors).
7. In other words, the output of the decoder LSTM prior to the final, fully connected layer.
8. This is to be contrasted with an iterative model, in which the next step ahead is forecast and iterative extrapolation is then used to generate a prediction for the desired forecast horizon.
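The direct-versus-iterated distinction can be sketched with a toy AR(1) series (assumed data, not the chapter's models). For a correctly specified linear model the two approaches coincide, which is why the contrast matters mainly under misspecification or nonlinearity:

```python
# Toy contrast of direct vs. iterated multistep forecasting on a
# deterministic AR(1) series: y_t = 0.6 * y_{t-1}.
ys = [1.0]
for _ in range(20):
    ys.append(0.6 * ys[-1])

h = 3                               # forecast horizon

def ols_slope(x, y):
    """Least-squares slope of y on x through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

# Iterated: estimate a one-step model, then extrapolate it h times.
phi1 = ols_slope(ys[:-1], ys[1:])
y_iter = ys[-1]
for _ in range(h):
    y_iter = phi1 * y_iter

# Direct: regress y_{t+h} on y_t and forecast the horizon in one step.
phi_h = ols_slope(ys[:-h], ys[h:])
y_direct = phi_h * ys[-1]

print(abs(y_iter - y_direct) < 1e-9)  # → True: identical here, by design
```

With a nonlinear data-generating process, iterating a misspecified one-step model compounds its error, while a direct model targets the h-step mapping itself; see Marcellino et al. (2006).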
References
Altman, E. I., Marco, G., & Varetto, F. (1994). Corporate distress diagnosis: Comparisons using linear discriminant analysis and neural networks (the Italian experience). Journal of Banking & Finance, 18(3), 505–529.
Bengio, Y., Lamblin, P., Popovici, D., & Larochelle, H. (2007). Greedy layer-wise training of deep networks. In Advances in Neural Information Processing Systems (pp. 153–160).
Bengio, Y., Simard, P., & Frasconi, P. (1994). Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2), 157–166. Retrieved from http://www.comp.hkbu.edu.hk/~markus/teaching/comp7650/tnn-94-gradient.pdf
Bland, R. (1998). Learning XOR: Exploring the space of a classic problem. Stirling: Department of Computing Science and Mathematics, University of Stirling.
Cho, K., Van Merriënboer, B., Gulcehre, C., Bahdanau, D., Bougares, F., Schwenk, H., & Bengio, Y. (2014). Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint, 1406.1078.
Cook, T. R., & Hall, A. S. (2017). Macroeconomic indicator forecasting with deep neural networks. Federal Reserve Bank of Kansas City Research Working Paper 17-11.
Cybenko, G. (1989). Approximations by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2, 183–192.
Dauphin, Y. N., Pascanu, R., Gulcehre, C., Cho, K., Ganguli, S., & Bengio, Y. (2014). Identifying and attacking the saddle point problem in high-dimensional non-convex optimization. In Advances in Neural Information Processing Systems (pp. 2933–2941).
Dijk, D. v., Teräsvirta, T., & Franses, P. H. (2002). Smooth transition autoregressive models—A survey of recent developments. Econometric Reviews, 21(1), 1–47.
Dixon, M., Klabjan, D., & Bang, J. H. (2017). Classification-based financial markets prediction using deep neural networks. Algorithmic Finance, 6(3–4), 67–77.
Duchi, J., Hazan, E., & Singer, Y. (2011). Adaptive subgradient methods for online learning and stochastic optimization. Journal of Machine Learning Research, 12(7), 2121–2159.
Gal, Y., & Ghahramani, Z. (2016). Dropout as a Bayesian approximation. In Proceedings of the 33rd International Conference on Machine Learning (Vol. 3, pp. 1661–1680).
Gers, F. A., Schmidhuber, J., & Cummins, F. (2000). Learning to forget: Continual prediction with LSTM. Neural Computation, 12(10), 2451–2471.
Glorot, X., & Bengio, Y. (2010). Understanding the difficulty of training deep feed-forward neural networks. In Proceedings of the Thirteenth International Conference on Artificial Intelligence and Statistics (pp. 249–256). Retrieved from http://proceedings.mlr.press/v9/glorot10a/glorot10a.pdf
Goodfellow, I., Bengio, Y., & Courville, A. (2016). Deep learning. Cambridge: MIT Press. http://www.deeplearningbook.org
Håstad, J. (1986). Almost optimal lower bounds for small depth circuits. In Proceedings of the Eighteenth Annual ACM Symposium on Theory of Computing (pp. 6–20).
Heaton, J., Polson, N. G., & Witte, J. H. (2016). Deep learning in finance. arXiv preprint, 1602.06561.
Hinton, G. E., Osindero, S., & Teh, Y.-W. (2006). A fast learning algorithm for deep belief nets. Neural Computation, 18(7), 1527–1554.
Hochreiter, S., & Schmidhuber, J. (1997). Long short-term memory. Neural Computation, 9(8), 1735–1780.
Hornik, K. (1991). Approximation capabilities of multilayer feedforward networks. Neural Networks, 4(2), 251–257.
Hornik, K., Stinchcombe, M., & White, H. (1989). Multilayer feedforward networks are universal approximators. Neural Networks, 2(5), 359–366.
Huang, G.-B. (2003). Learning capability and storage capacity of two-hidden-layer feedforward networks. IEEE Transactions on Neural Networks, 14(2), 274–281.
Huang, G.-B., & Babri, H. A. (1997). General approximation theorem on feedforward networks. In Proceedings of the 1997 International Conference on Information, Communications and Signal Processing (Vol. 2, pp. 698–702). Piscataway: IEEE.
Jothimani, D., Yadav, S. S., & Shankar, R. (2015). Discrete wavelet transform-based prediction of stock index: A study on national stock exchange fifty index. Journal of Financial Management and Analysis, 28(2), 35–42.
Karlik, B., & Olgac, A. V. (2011). Performance analysis of various activation functions in generalized MLP architectures of neural networks. International Journal of Artificial Intelligence and Expert Systems, 1(4), 111–122. Retrieved from https://www.researchgate.net/publication/228813985_Performance_Analysis_of_Various_Activation_Functions_in_Generalized_MLP_Architectures_of_Neural_Networks
Kingma, D. P., & Ba, J. (2014). Adam: A method for stochastic optimization. arXiv preprint, 1412.6980.
Kristjanpoller, W., & Minutolo, M. C. (2015). Gold price volatility: A forecasting approach using the artificial neural network–GARCH model. Expert Systems with Applications, 42(20), 7245–7251.
Krogh, A., & Hertz, J. A. (1992). A simple weight decay can improve generalization. In Advances in Neural Information Processing Systems (pp. 950–957).
Lineesh, M., Minu, K., & John, C. J. (2010). Analysis of nonstationary nonlinear economic time series of gold price: A comparative study. In International Mathematical Forum (Vol. 5, No. 34, pp. 1673–1683). Citeseer.
Maas, A. L., Hannun, A. Y., & Ng, A. Y. (2013). Rectifier nonlinearities improve neural network acoustic models. In Proceedings of the 30th International Conference on Machine Learning (Vol. 30, No. 1, p. 3). Retrieved from http://robotics.stanford.edu/~amaas/papers/relu_hybrid_icml2013_final.pdf
Marcellino, M., Stock, J. H., & Watson, M. W. (2006). A comparison of direct and iterated multistep AR methods for forecasting macroeconomic time series. Journal of Econometrics, 135, 499–526.
McCulloch, W. S., & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133.
McNelis, P. (2005). Neural networks in finance: Gaining predictive edge in the market. Amsterdam: Elsevier.
Minsky, M., & Papert, S. (1969). Perceptrons: An introduction to computational geometry (Vol. 200, pp. 355–368). Cambridge: MIT Press.
Minu, K., Lineesh, M., & John, C. J. (2010). Wavelet neural networks for nonlinear time series analysis. Applied Mathematical Sciences, 4(50), 2485–2495.
Nair, V., & Hinton, G. E. (2010). Rectified linear units improve restricted Boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (pp. 807–814). Retrieved from http://www.cs.toronto.edu/~fritz/absps/reluICML.pdf
Odom, M. D., & Sharda, R. (1990). A neural network model for bankruptcy prediction. In Proceedings of the 1990 International Joint Conference on Neural Networks (pp. 163–168). Piscataway: IEEE.
Ramachandran, P., Zoph, B., & Le, Q. V. (2017). Searching for activation functions. arXiv preprint, 1710.05941. Retrieved from https://arxiv.org/pdf/1710.05941
Ruder, S. (2016). An overview of gradient descent optimization algorithms. arXiv preprint, 1609.04747.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1985). Learning internal representations by error propagation. San Diego: California University, La Jolla Institute for Cognitive Science.
Rumelhart, D. E., Hinton, G. E., & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323(6088), 533–536. Retrieved from http://www.cs.toronto.edu/~hinton/absps/naturebp.pdf
Russakovsky, O., Deng, J., Su, H., Krause, J., Satheesh, S., Ma, S.,… Bernstein, M., et al. (2015). ImageNet large scale visual recognition challenge. International Journal of Computer Vision, 115(3), 211–252.
Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1), 1929–1958.
Stark, T. (2017). Error statistics for the survey of professional forecasters for unemployment rate. Philadelphia: Federal Reserve Bank of Philadelphia. Retrieved from https://www.philadelphiafed.org/-/media/research-and-data/real-time-center/survey-of-professional-forecasters/data-files/unemp/spf_error_statistics_unemp_1_aic.pdf?la=en
Sussillo, D., & Abbott, L. (2014). Random walk initialization for training very deep feedforward networks. arXiv preprint, 1412.6558.
Sutskever, I., Vinyals, O., & Le, Q. V. (2014). Sequence to sequence learning with neural networks. In Advances in Neural Information Processing Systems (pp. 3104–3112).
Szegedy, C., Liu, W., Jia, Y., Sermanet, P., Reed, S., Anguelov, D.,… Rabinovich, A. (2015). Going deeper with convolutions. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (pp. 1–9).
Tam, K. Y. (1991). Neural network models and the prediction of bank bankruptcy. Omega, 19(5), 429–445.
Telgarsky, M. (2016). Benefits of depth in neural networks. arXiv preprint, 1602.04485.
Terasvirta, T., & Anderson, H. M. (1992). Characterizing nonlinearities in business cycles using smooth transition autoregressive models. Journal of Applied Econometrics, 7(S1), S119–S136.
Werbos, P. J. (1990). Backpropagation through time: What it does and how to do it. Proceedings of the IEEE, 78(10), 1550–1560.
Zou, D., Cao, Y., Zhou, D., & Gu, Q. (2018). Stochastic gradient descent optimizes over-parameterized deep ReLU networks. arXiv preprint, 1811.08888.
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Cook, T.R. (2020). Neural Networks. In: Fuleky, P. (ed.) Macroeconomic Forecasting in the Era of Big Data. Advanced Studies in Theoretical and Applied Econometrics, vol 52. Springer, Cham. https://doi.org/10.1007/978-3-030-31150-6_6
Print ISBN: 978-3-030-31149-0
Online ISBN: 978-3-030-31150-6