# Supervised Component Generalized Linear Regression with Multiple Explanatory Blocks: THEME-SCGLR

## Abstract

We address component-based regularization of a multivariate Generalized Linear Model (GLM). A set of random responses *Y* is assumed to depend, through a GLM, on a set *X* of explanatory variables, as well as on a set *T* of additional covariates. *X* is partitioned into *R* conceptually homogeneous blocks *X*_{1}, …, *X*_{R}, viewed as explanatory *themes*. The variables in each *X*_{r} are assumed to be numerous and redundant, so generalized linear regression requires regularization with respect to each *X*_{r}. By contrast, the variables in *T* are assumed to have been selected so as to require no regularization. Regularization is performed by searching each *X*_{r} for an appropriate number of orthogonal components that both contribute to modelling *Y* and capture relevant structural information in *X*_{r}. We propose a very general criterion to measure the structural relevance (SR) of a component in a block, and show how to take SR into account within a Fisher-scoring-type algorithm in order to estimate the model. We also show how to deal with mixed-type explanatory variables. The method, named THEME-SCGLR, is tested on simulated data and then applied to rainforest data in order to model the abundance of tree species.
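To make the model structure concrete, the following is a minimal sketch, not the THEME-SCGLR algorithm itself. It assumes a single Poisson response with a log link, two thematic blocks, and one component per block. Each component is simply the first principal component of its block, an unsupervised stand-in: the supervised search and the SR criterion described in the abstract are not implemented here. The function names `block_component` and `poisson_fisher_scoring`, the simulated data, and all numerical settings are hypothetical choices made for illustration.

```python
import numpy as np

def block_component(X):
    """Standardized first principal component of a block (unsupervised stand-in)."""
    Xc = X - X.mean(axis=0)
    # Rows of vt are the principal directions of the centred block.
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    f = Xc @ vt[0]
    return f / f.std()

def poisson_fisher_scoring(Z, y, n_iter=50, tol=1e-8):
    """Fisher scoring (IRLS) for a Poisson GLM with log link and design matrix Z."""
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = Z @ beta
        mu = np.exp(eta)
        # Working response and weights for the canonical log link.
        z = eta + (y - mu) / mu
        w = mu
        # Weighted least-squares update of the coefficients.
        ZtW = Z.T * w
        beta_new = np.linalg.solve(ZtW @ Z, ZtW @ z)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (assumption): each block is a noisy copy of one latent variable,
# so its columns are many and redundant, as the abstract supposes.
rng = np.random.default_rng(0)
n = 500
h1, h2 = rng.normal(size=n), rng.normal(size=n)
X1 = h1[:, None] + 0.1 * rng.normal(size=(n, 10))
X2 = h2[:, None] + 0.1 * rng.normal(size=(n, 8))
T = rng.normal(size=(n, 1))  # additional covariate, entered without regularization
eta_true = 0.5 + 0.3 * h1 - 0.2 * h2 + 0.4 * T[:, 0]
y = rng.poisson(np.exp(eta_true))

# Linear predictor: intercept + one component per theme + the covariates in T.
Z = np.column_stack([np.ones(n), block_component(X1), block_component(X2), T])
beta = poisson_fisher_scoring(Z, y)
print(beta)
```

The sign of each component coefficient is arbitrary, because a principal direction is defined only up to sign. In the method described above, the component directions would instead be chosen to trade off goodness of fit against structural relevance, and would be updated within the scoring iterations rather than fixed beforehand.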

## Keywords

Component-based regularization, Generalized linear model (GLM), Regularization

## Notes

### Acknowledgements

This research was supported by the CoForChange project (http://www.coforchange.eu/) funded by the ERA-Net BiodivERsA with the national funders ANR (France) and NERC (UK), part of the 2008 BiodivERsA call for research proposals involving 16 European, African and international partners including a number of timber companies (see the list on the website, http://www.coforchange.eu/partners), and by the CoForTips project funded by the ERA-Net BiodivERsA with the national funders FWF (Austria), BelSPO (Belgium) and ANR (France), part of the 2011–2012 BiodivERsA call for research proposals (http://www.biodiversa.org/519).
