Selection of sparse vine copulas in high dimensions with the Lasso
We propose a novel structure selection method for high-dimensional (\(d > 100\)) sparse vine copulas. Current sequential greedy approaches to structure selection require computing spanning trees in hundreds of dimensions and fitting the pair copulas and their parameters iteratively throughout the selection process. Our method instead exploits a connection between vine copulas and structural equation models. The latter can be estimated very quickly with the Lasso, even in very high dimensions, to obtain sparse models. We thus obtain a structure estimate that is independent of the chosen pair copulas and their parameters. Additionally, we define the novel concept of regularization paths for R-vine matrices, which relates the sparsity of the vine copula model, in terms of independence copulas, to the penalization coefficient in the structural equation models. We illustrate our approach with numerical examples, including simulations and data applications in high dimensions, which show that it outperforms other existing methods.
Keywords: Dependence modeling, Vine copula, Lasso, Sparsity
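To make the role of the Lasso in a structural equation model concrete, the following is a minimal sketch and not the authors' algorithm. It assumes a fixed variable order, regresses each standardised variable on its predecessors with a Lasso penalty, and records which coefficients stay nonzero as the penalty grows. All names (`lasso_cd`, `sem_support`, the toy data) are hypothetical, and the step that maps zero coefficients to independence copulas in an R-vine matrix is described in the paper and is not reproduced here.

```python
import random


def standardise(col):
    """Centre a column and scale it to unit (population) variance."""
    n = len(col)
    m = sum(col) / n
    s = (sum((v - m) ** 2 for v in col) / n) ** 0.5
    return [(v - m) / s for v in col]


def soft_threshold(z, t):
    """Soft-thresholding operator used by Lasso coordinate descent."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def lasso_cd(X, y, lam, n_iter=100):
    """Minimise (1/2n)||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of columns (each a list of length n); no intercept is fitted.
    """
    n, p = len(y), len(X)
    beta = [0.0] * p
    resid = list(y)  # residual y - X beta, with beta = 0 initially
    col_sq = [sum(v * v for v in col) / n for col in X]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (beta_j removed).
            rho = sum(X[j][i] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam) / col_sq[j]
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= delta * X[j][i]
                beta[j] = new
    return beta


def sem_support(cols, lam):
    """Regress each column on its predecessors (assumed order) and return
    the (child, parent) pairs whose Lasso coefficient is nonzero."""
    edges = set()
    for j in range(1, len(cols)):
        beta = lasso_cd(cols[:j], cols[j], lam)
        edges |= {(j, k) for k, b in enumerate(beta) if b != 0.0}
    return edges


# Toy data (assumption): a chain x0 -> x1 -> x2 plus an unrelated x3.
random.seed(0)
n = 400
x0 = [random.gauss(0, 1) for _ in range(n)]
x1 = [0.8 * a + random.gauss(0, 0.6) for a in x0]
x2 = [0.8 * b + random.gauss(0, 0.6) for b in x1]
x3 = [random.gauss(0, 1) for _ in range(n)]
cols = [standardise(c) for c in (x0, x1, x2, x3)]

# A crude "regularization path": the selected support for each penalty level.
path = {lam: sem_support(cols, lam) for lam in (0.0, 0.1, 0.5, 2.0)}
for lam, edges in path.items():
    print(lam, len(edges), sorted(edges))
```

Because the columns are standardised, any penalty above 1 exceeds every sample correlation and removes all edges, while smaller penalties keep the strongest dependencies. Each regression depends only on the data and the penalty, which reflects the abstract's point that the structure estimate does not require fitting pair copulas first.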
The first author acknowledges financial support by a research stipend of the Technische Universität München. The second author is supported by the German Research Foundation (DFG Grant CZ 86/4-1). Numerical calculations were performed on a Linux cluster supported by DFG Grant INST 95/919-1 FUGG.