Abstract
In this paper, we consider the detection of multiple influential observations in high dimensional regression, where the number of covariates p is much larger than the sample size n. Detecting influential observations in this setting is challenging. For the case of a single influential observation, Zhao et al. (2013) developed a method called the High dimensional Influence Measure (HIM). However, the results for HIM do not apply to the case of multiple influential observations, where detection is considerably more complicated than in the single-observation case. We propose a new method based on multiple deletion to detect multiple influential observations.
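The abstract gives no formulas, so the following is only an illustrative sketch of the general idea of group deletion, not the authors' procedure. It assumes, purely for illustration, an influence score defined as the mean squared change in the marginal correlations between each covariate and the response when a candidate subset of observations is deleted. The function names `marginal_correlations` and `deletion_influence`, the simulated data, and the contamination scheme are all hypothetical.

```python
import numpy as np


def marginal_correlations(X, y):
    """Sample correlation between each column of X and the response y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    return (Xc.T @ yc) / denom


def deletion_influence(X, y, subset):
    """Mean squared change in marginal correlations after deleting `subset`.

    Assumed score for illustration only; not taken from the paper.
    """
    keep = np.ones(len(y), dtype=bool)
    keep[list(subset)] = False
    full = marginal_correlations(X, y)
    reduced = marginal_correlations(X[keep], y[keep])
    return float(np.mean((full - reduced) ** 2))


# Simulated high dimensional data: p much larger than n (assumed setup).
rng = np.random.default_rng(0)
n, p = 50, 200
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:5] = 1.0
y = X @ beta + rng.standard_normal(n)

# Contaminate the first three observations in both X and y.
X[:3] += 5.0
y[:3] += 20.0

# Compare deleting the contaminated group with deleting a clean group.
score_contaminated = deletion_influence(X, y, [0, 1, 2])
score_clean = deletion_influence(X, y, [10, 11, 12])
print(score_contaminated, score_clean)
```

Deleting the contaminated group jointly changes the marginal correlations far more than deleting a clean group of the same size. Scoring a group as a whole, rather than one observation at a time, is one way to limit masking among several influential points.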
References
Cook, R.D.: Detection of influential observation in linear regression. Technometrics 19, 15–18 (1977)
Behnken, D.W., Draper, N.R.: Residuals and their variance patterns. Technometrics 14, 101–111 (1972)
Belsley, D.A., Kuh, E., Welsch, R.E.: Regression Diagnostics: Identifying Influential Data and Sources of Collinearity. Wiley, New York (2005)
Chatterjee, S., Hadi, A.S.: Sensitivity Analysis in Linear Regression. Wiley, New York (1988)
Pena, D.: A new statistic for influence in linear regression. Technometrics 47(1), 1–12 (2005)
Pena, D.: Measures of Influence and Sensitivity in Linear Regression, pp. 523–536. Springer, London (2006). Springer Handbook of Engineering Statistics
Nurunnabi, A.A.M., Imon, A.H.M.R., Nasser, M.: A diagnostic measure for influential observations in linear regression. Commun. Stat. Theor. Methods 40, 1169–1183 (2011)
Nurunnabi, A.A.M., Hadi, A.S., Imon, A.H.M.R.: Procedures for the identification of multiple influential observations in linear regression. J. Appl. Stat. 41, 1315–1331 (2014)
Imon, A.R., Hadi, A.S.: Identification of multiple outliers in logistic regression. Commun. Stat. Theor. Methods 37, 1697–1709 (2008)
Imon, A.R., Hadi, A.S.: Identification of multiple high leverage points in logistic regression. J. Appl. Stat. 40, 2601–2616 (2013)
Zakaria, A., Howard, N.K., Nkansah, B.K.: On the detection of influential outliers in linear regression analysis. Am. J. Theor. Appl. Stat. 3, 100–106 (2014)
Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. Roy. Stat. Soc. B 58, 267–288 (1996)
Fan, J., Lv, J.: Sure independence screening for ultrahigh dimensional feature space. J. Roy. Stat. Soc. Ser. B (Stat. Methodol.) 70, 849–911 (2008)
Wang, H., Li, G., Jiang, G.: Robust regression shrinkage and consistent variable selection through the LAD-Lasso. J. Bus. Econ. Stat. 25, 347–355 (2007)
She, Y., Owen, A.B.: Outlier detection using nonconvex penalized regression. J. Am. Stat. Assoc. 106, 626–639 (2011)
Rahmatullah Imon, A.H.M.: Identifying multiple influential observations in linear regression. J. Appl. Stat. 32, 929–946 (2005)
Pan, J.X., Fung, W.K., Fang, K.T.: Multiple outlier detection in multivariate data using projection pursuit techniques. J. Stat. Plann. Infer. 83(1), 153–167 (2000)
Benjamini, Y., Hochberg, Y.: Controlling the false discovery rate: a practical and powerful approach to multiple testing. J. Roy. Stat. Soc.: Ser. B (Methodol.) 57, 289–300 (1995)
Acknowledgements
The research of Zhao was supported by National Science Foundation of China (No. 11471030, 11101022) and the Fundamental Research Funds for the Central Universities.
Copyright information
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Zhao, J., Zhang, Y., Niu, L. (2015). Detecting Multiple Influential Observations in High Dimensional Linear Regression. In: Huang, DS., Han, K. (eds) Advanced Intelligent Computing Theories and Applications. ICIC 2015. Lecture Notes in Computer Science, vol 9227. Springer, Cham. https://doi.org/10.1007/978-3-319-22053-6_6
DOI: https://doi.org/10.1007/978-3-319-22053-6_6
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-22052-9
Online ISBN: 978-3-319-22053-6
eBook Packages: Computer Science, Computer Science (R0)