Supervised Component Generalized Linear Regression with Multiple Explanatory Blocks: THEME-SCGLR
We address component-based regularization of a multivariate Generalized Linear Model (GLM). A set of random responses Y is assumed to depend, through a GLM, on a set X of explanatory variables, as well as on a set T of additional covariates. X is partitioned into R conceptually homogeneous blocks X_1, …, X_R, viewed as explanatory themes. The variables in each X_r are assumed to be numerous and redundant, so generalized linear regression requires regularization with respect to each X_r. By contrast, the variables in T are assumed to have been selected so as to require no regularization. Regularization is performed by searching each X_r for an appropriate number of orthogonal components that both contribute to modelling Y and capture relevant structural information in X_r. We propose a very general criterion to measure the structural relevance (SR) of a component in a block, and show how to take SR into account within a Fisher-scoring-type algorithm in order to estimate the model. We also show how to deal with mixed-type explanatory variables. The method, named THEME-SCGLR, is tested on simulated data and then applied to rainforest data in order to model the abundance of tree species.
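To make the setting concrete, the following is a minimal sketch, not the THEME-SCGLR algorithm itself. It runs plain Fisher scoring for a Poisson GLM with log link, where the linear predictor combines one component f drawn from a single redundant block X with one covariate T. As a stand-in assumption, f is simply the first principal component of X; THEME-SCGLR would instead choose components by trading off goodness of fit against structural relevance. The simulated data, the function name `fisher_scoring_poisson`, and all parameter values are hypothetical.

```python
import numpy as np

def fisher_scoring_poisson(Z, y, n_iter=25, tol=1e-8):
    """Plain Fisher scoring (IRLS) for a Poisson GLM with log link."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = Z @ beta                # linear predictor
        mu = np.exp(eta)              # mean under the log link
        z = eta + (y - mu) / mu       # working response
        ZtW = Z.T * mu                # working weights equal mu here
        beta_new = np.linalg.solve(ZtW @ Z, ZtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Hypothetical simulated data: one block of 10 redundant variables
# driven by a single latent variable h, plus one covariate t.
rng = np.random.default_rng(0)
n = 200
h = rng.normal(size=n)
X = np.outer(h, rng.uniform(0.5, 1.5, size=10)) + 0.3 * rng.normal(size=(n, 10))
t = rng.normal(size=n)
y = rng.poisson(np.exp(0.5 + 0.3 * h + 0.2 * t))

# Stand-in component: first principal component of the centred block.
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
f = Xc @ Vt[0]
f = f / f.std()

# Design matrix: intercept, block component, unregularized covariate.
Z = np.column_stack([np.ones(n), f, t])
beta = fisher_scoring_poisson(Z, y)
print(beta)
```

Replacing the ten collinear columns of X by a single component keeps the weighted least-squares system in each scoring step well conditioned, which is the motivation for regularizing the blocks X_r while leaving T untouched. The sign of the coefficient on f is arbitrary, since a principal component is defined only up to sign.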
Keywords: Component-based regularization; Generalized linear model (GLM); Regularization
This research was supported by the CoForChange project (http://www.coforchange.eu/) funded by the ERA-Net BiodivERsA with the national funders ANR (France) and NERC (UK), part of the 2008 BiodivERsA call for research proposals involving 16 European, African and international partners including a number of timber companies (see the list on the website, http://www.coforchange.eu/partners), and by the CoForTips project funded by the ERA-Net BiodivERsA with the national funders FWF (Austria), BelSPO (Belgium) and ANR (France), part of the 2011–2012 BiodivERsA call for research proposals (http://www.biodiversa.org/519).
- Bry, X., Trottier, C., Verron, T., Mortier, F.: Supervised component generalized linear regression using a PLS-extension of the Fisher scoring algorithm. In: COMPSTAT 2012 Proceedings, Limassol (2012)
- Nelder, J.A., Wedderburn, R.W.M.: Generalized linear models. J. R. Stat. Soc.: Ser. A 135, 370–384 (1972)