Comparison of Deep Neural Networks and Deep Hierarchical Models for Spatio-Temporal Data

Wikle, Christopher K.

doi:10.1007/s13253-019-00361-7

Published: 28 March 2019

Volume 24, pages 175–203, (2019)

Journal of Agricultural, Biological and Environmental Statistics

Christopher K. Wikle ORCID: orcid.org/0000-0002-0655-2696¹

Abstract

Spatio-temporal data are ubiquitous in the agricultural, ecological, and environmental sciences, and their study is important for understanding and predicting a wide variety of processes. One of the difficulties with modeling spatial processes that change in time is the complexity of the dependence structures that must describe how such a process varies, and the presence of high-dimensional complex datasets and large prediction domains. It is particularly challenging to specify parameterizations for nonlinear dynamic spatio-temporal models (DSTMs) that are simultaneously useful scientifically and efficient computationally. Statisticians have developed multi-level (deep) hierarchical models that can accommodate process complexity as well as the uncertainties in the predictions and inference. However, these models can be expensive and are typically application specific. On the other hand, the machine learning community has developed alternative “deep learning” approaches for nonlinear spatio-temporal modeling. These models are flexible yet are typically not implemented in a probabilistic framework. The two paradigms have many things in common and suggest hybrid approaches that can benefit from elements of each framework. This overview paper presents a brief introduction to the multi-level (deep) hierarchical DSTM (H-DSTM) framework, and deep models in machine learning, culminating with the deep neural DSTM (DN-DSTM). Recent approaches that combine elements from H-DSTMs and echo state network DN-DSTMs are presented as illustrations. Supplementary materials accompanying this paper appear online.

Notes

https://iri.columbia.edu/our-expertise/climate/forecasts/enso/2017-July-quick-look/?enso_tab=enso-sst_table.

Acknowledgements

This work was partially supported by the US National Science Foundation (NSF) and the US Census Bureau under NSF Grant SES-1132031, funded through the NSF-Census Research Network (NCRN) program, and NSF Award DMS-1811745. The author would like to thank Brian Reich for encouraging the writing of this paper, Patrick McDermott for helpful discussions, Nathan Wikle for providing helpful comments on an early draft, and Jennifer Hoeting for encouraging and helpful review comments.

Author information

Authors and Affiliations

Department of Statistics, University of Missouri, Columbia, MO, 65211, USA
Christopher K. Wikle

Corresponding author

Correspondence to Christopher K. Wikle.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Wikle, C.K. Comparison of Deep Neural Networks and Deep Hierarchical Models for Spatio-Temporal Data. JABES 24, 175–203 (2019). https://doi.org/10.1007/s13253-019-00361-7

Received: 16 February 2019
Accepted: 18 March 2019
Published: 28 March 2019
Issue Date: 15 June 2019
DOI: https://doi.org/10.1007/s13253-019-00361-7
