Abstract
In the era of Big Data, selecting relevant variables from a potentially large pool of candidates has become a pressing concern in macroeconomic research, especially when the data are high-dimensional, i.e., the number of explanatory variables (p) exceeds the sample size (n). Common approaches include factor models, principal component analysis, and regularized regressions. However, these methods require additional assumptions that are hard to verify and/or introduce biases or aggregated factors that complicate the interpretation of the estimated output. This chapter reviews an alternative, namely Boosting, which estimates the variables of interest consistently under fairly general conditions given a large set of explanatory variables. Boosting is fast and easy to implement, which makes it one of the most popular machine learning algorithms in both academia and industry.
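The variable-selection idea sketched in the abstract can be illustrated with component-wise L2-boosting: at each step, only the single predictor that best fits the current residual receives a small coefficient update, so only relevant variables accumulate non-zero coefficients even when p > n. The sketch below is a minimal illustration under our own assumptions (the function name `l2_boost`, the step size `nu`, the number of steps, and the toy data are illustrative choices, not the chapter's implementation):

```python
import numpy as np

def l2_boost(X, y, n_steps=200, nu=0.1):
    """Component-wise L2-boosting: at each step, fit every column of X
    to the current residual by univariate least squares, then update
    only the best-fitting coefficient by nu times its fitted value."""
    n, p = X.shape
    beta = np.zeros(p)
    resid = y.copy()
    for _ in range(n_steps):
        num = X.T @ resid              # inner product of each column with residual
        den = (X ** 2).sum(axis=0)     # squared column norms
        coefs = num / den              # univariate least-squares coefficients
        j = np.argmax(num ** 2 / den)  # column giving the largest RSS reduction
        beta[j] += nu * coefs[j]       # shrunken update of the winning coefficient
        resid -= nu * coefs[j] * X[:, j]
    return beta

# High-dimensional toy example: p = 100 predictors but only n = 50
# observations; only the first three variables are truly relevant.
rng = np.random.default_rng(0)
n, p = 50, 100
X = rng.standard_normal((n, p))
y = 2 * X[:, 0] - 1.5 * X[:, 1] + X[:, 2] + 0.1 * rng.standard_normal(n)
beta = l2_boost(X, y)
selected = np.flatnonzero(np.abs(beta) > 0.1)  # illustrative cutoff
```

The step size `nu` and the number of boosting iterations jointly act as a regularizer: small steps with early stopping keep the coefficients of irrelevant predictors at (or near) zero, which is what makes the selected set interpretable, in contrast to aggregated factors.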
Copyright information
© 2020 Springer Nature Switzerland AG
Cite this chapter
Chu, J., Lee, TH., Ullah, A., Wang, R. (2020). Boosting. In: Fuleky, P. (eds) Macroeconomic Forecasting in the Era of Big Data. Advanced Studies in Theoretical and Applied Econometrics, vol 52. Springer, Cham. https://doi.org/10.1007/978-3-030-31150-6_14
Print ISBN: 978-3-030-31149-0
Online ISBN: 978-3-030-31150-6
eBook Packages: Economics and Finance (R0)