Outlier Detection and Robust Variable Selection for Least Angle Regression
Selecting a parsimonious subset of variables from a large number of predictors is an important problem in regression modeling. When the data contain vertical outliers and/or leverage points, outlier detection and variable selection are inseparable problems, so a robust method that detects outliers and selects variables simultaneously is needed. An outlier detection and robust variable selection method is introduced that combines robust least angle regression with least trimmed squares regression on jackknife subsets. In a second stage, the detected outliers are removed and standard least angle regression is applied to the cleaned data to robustly sequence the predictor variables in order of importance. The performance of the method is evaluated on simulated data containing vertical outliers and high leverage points. The simulation results indicate that the method performs well in both outlier detection and robust variable selection.
Keywords: Outlier Detection, Robust Variable Selection, Least Angle Regression
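The two-stage idea in the abstract (flag outliers robustly, then run standard least angle regression on the cleaned data) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the first stage here uses a plain least trimmed squares fit with random starts and concentration steps as a simplified stand-in for robust least angle regression combined with least trimmed squares on jackknife subsets. The function names (lts_outliers, lars_sequence, detect_and_sequence), the trimming fraction of 0.75, the residual cutoff of 2.5, and the simulated data are all illustrative assumptions.

```python
import numpy as np


def lts_outliers(X, y, h_frac=0.75, n_starts=50, n_csteps=20, cutoff=2.5, seed=0):
    """Flag outliers using a basic least trimmed squares (LTS) fit.

    Simplified stand-in for the robust first stage; all defaults are assumptions.
    """
    rng = np.random.default_rng(seed)
    n, p = X.shape
    Z = np.column_stack([np.ones(n), X])          # design matrix with intercept
    h = int(h_frac * n)                           # number of observations retained
    best_obj, best_beta = np.inf, None
    for _ in range(n_starts):
        subset = rng.choice(n, size=p + 1, replace=False)   # small random start
        for _ in range(n_csteps):
            beta = np.linalg.lstsq(Z[subset], y[subset], rcond=None)[0]
            r2 = (y - Z @ beta) ** 2
            subset = np.argsort(r2)[:h]           # C-step: keep the h smallest residuals
        obj = np.sort(r2)[:h].sum()               # trimmed sum of squared residuals
        if obj < best_obj:
            best_obj, best_beta = obj, beta
    resid = y - Z @ best_beta
    scale = 1.4826 * np.median(np.abs(resid - np.median(resid)))  # MAD scale estimate
    return np.abs(resid) > cutoff * scale


def lars_sequence(X, y):
    """Return predictor indices in the order they enter a standard LARS path."""
    n, p = X.shape
    Xs = (X - X.mean(axis=0)) / X.std(axis=0)     # standardize predictors
    yc = y - y.mean()
    mu = np.zeros(n)                              # current fitted values
    c = Xs.T @ yc
    active = [int(np.argmax(np.abs(c)))]          # most correlated predictor enters first
    while len(active) < min(p, n - 1):
        c = Xs.T @ (yc - mu)                      # current correlations
        C = np.max(np.abs(c))
        XA = Xs[:, active] * np.sign(c[active])
        g = np.linalg.solve(XA.T @ XA, np.ones(len(active)))
        AA = 1.0 / np.sqrt(g.sum())
        u = XA @ (AA * g)                         # equiangular direction
        a = Xs.T @ u
        best_gamma, best_j = np.inf, None
        for j in range(p):
            if j in active:
                continue
            # smallest positive step at which predictor j ties with the active set
            for gam in ((C - c[j]) / (AA - a[j]), (C + c[j]) / (AA + a[j])):
                if 1e-12 < gam < best_gamma:
                    best_gamma, best_j = gam, j
        mu += best_gamma * u
        active.append(best_j)
    return active


def detect_and_sequence(X, y):
    """Stage 1: flag outliers. Stage 2: sequence predictors on the cleaned data."""
    flags = lts_outliers(X, y)
    order = lars_sequence(X[~flags], y[~flags])
    return flags, order


# Illustrative simulated data (assumed, not taken from the paper's study).
rng = np.random.default_rng(1)
n, p = 200, 5
X = rng.normal(size=(n, p))
y = 5 * X[:, 0] + 3 * X[:, 1] + 0.5 * rng.normal(size=n)
y[:5] += 20           # vertical outliers
X[5:10, 2] = 10       # high leverage points ...
y[5:10] = 50          # ... with responses far from the true model
flags, order = detect_and_sequence(X, y)
print("flagged observations:", np.flatnonzero(flags))
print("predictor sequence:", order)
```

In this sketch the contaminated observations are rows 0 to 9, and only predictors 0 and 1 carry signal, so a reasonable outcome is that those rows are flagged and those two predictors head the sequence. Running lars_sequence on the uncleaned data instead gives a point of comparison for how much the outliers distort the ordering.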