Sparse matrix transform based weight updating in partial least squares regression
Regression from high-dimensional observation vectors is particularly difficult when training data are limited. Partial least squares (PLS) partly solves the high-dimensional regression problem by projecting the data onto a space of latent variables. The key issue in PLS is the computation of the weight vector, which describes the covariance between the responses and the observations. For small-sample-size, high-dimensional regression problems, the covariance estimate is usually inaccurate, and correlated components in the predictors distort the PLS weight. In this paper, we propose a sparse matrix transform (SMT) based PLS (SMT-PLS) method for high-dimensional spectroscopy regression. In SMT-PLS, the observation data are first decorrelated by the SMT. Then, in the decorrelated data space, the PLS loading weight is computed by least squares regression. The SMT technique provides an accurate estimate of the data covariance, which mitigates the effect of small sample size and benefits both the PLS weight computation and the subsequent regression prediction. The proposed SMT-PLS method is compared, in terms of root mean square error of prediction, with PLS, Power PLS, and PLS with orthogonal scatter correction on four real spectroscopic data sets. Experimental results demonstrate the effectiveness of the proposed method.
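The two steps described above (decorrelate, then compute the weight by least squares in the decorrelated space) can be illustrated with a minimal sketch. It is one possible reading of the abstract, not the authors' exact algorithm: the function names `smt_decorrelate` and `smt_pls_weight`, the fixed rotation count `n_rotations`, and the regulariser `eps` are all assumptions introduced here. The decorrelation is approximated by a greedy sequence of Givens rotations, each one zeroing the most strongly correlated pair of coordinates.

```python
import numpy as np


def smt_decorrelate(X, n_rotations):
    """Greedy Givens-rotation decorrelation (an SMT-style sketch).

    X is (n_samples, n_features) and is assumed to be column-centered.
    Returns an orthonormal matrix E (a product of Givens rotations) and
    the variances of the rotated coordinates X @ E.
    """
    n, p = X.shape
    S = X.T @ X / n                      # sample covariance
    E = np.eye(p)
    for _ in range(n_rotations):
        d = np.diag(S)
        denom = np.outer(d, d)
        # Squared correlation of every coordinate pair (0 where undefined).
        with np.errstate(divide="ignore", invalid="ignore"):
            C = np.where(denom > 0, S**2 / denom, 0.0)
        np.fill_diagonal(C, 0.0)
        i, j = np.unravel_index(np.argmax(C), C.shape)
        if C[i, j] < 1e-12:              # already (numerically) decorrelated
            break
        # Jacobi angle that zeroes the (i, j) entry of G.T @ S @ G.
        theta = 0.5 * np.arctan2(-2.0 * S[i, j], S[i, i] - S[j, j])
        c, s = np.cos(theta), np.sin(theta)
        G = np.eye(p)
        G[i, i], G[j, j] = c, c
        G[i, j], G[j, i] = s, -s
        S = G.T @ S @ G
        E = E @ G
    return E, np.diag(S).copy()


def smt_pls_weight(X, y, n_rotations, eps=1e-12):
    """Unit-norm weight vector from least squares in the rotated space.

    Assumption: the rotated covariance is treated as diagonal, so the
    least squares coefficients reduce to (Z.T @ y / n) / variances.
    """
    n = X.shape[0]
    E, lam = smt_decorrelate(X, n_rotations)
    Z = X @ E                            # approximately decorrelated data
    b = (Z.T @ y / n) / (lam + eps)      # per-coordinate least squares
    w = E @ b                            # map back to the original space
    return w / np.linalg.norm(w)
```

In this sketch the number of rotations is passed in by hand; it controls how close the rotated covariance is to diagonal, and in practice it would need to be chosen by some model-selection procedure.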
Keywords: Partial least squares; Sparse matrix transform; Weight updating; High-dimensional small-sample; Spectroscopy regression
This work was supported by the National Natural Science Foundation of China under Grant No. 11071058.