Abstract
It is well-known (Subrahmanyam, Sankhya Ser B 34:355–356, 1972; Mayo and Gray, Am Stat 51:122–129, 1997) that the ordinary least squares estimate can be expressed as a weighted sum of so-called elemental estimates based on subsets of p observations, where p is the dimension of the parameter vector. The weights can be viewed as a probability distribution on subsets of size p of the predictors {x_i : i = 1, …, n}. In this contribution, we derive the lower-dimensional distributions of this p-dimensional distribution and define a measure of potential influence for subsets of observations analogous to the diagonal elements of the "hat" matrix for single observations. This theory is then applied to algorithmic leveraging, a method for approximating the ordinary least squares estimates using a particular form of biased subsampling.
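The elemental decomposition cited from Subrahmanyam (1972) can be checked numerically: each subset S of p observations yields an elemental estimate by solving the p×p system X_S β = y_S exactly, and the OLS estimate is the weighted average of these with weights proportional to det(X_S)² (a consequence of Cramer's rule and the Cauchy–Binet formula). The sketch below verifies this identity on simulated data; the variable names are illustrative, not from the paper.

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(0)
n, p = 8, 2
X = rng.standard_normal((n, p))
y = rng.standard_normal(n)

# Ordinary least squares estimate for reference.
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]

# Weighted sum of elemental estimates over all subsets of size p.
num = np.zeros(p)
den = 0.0
for S in combinations(range(n), p):
    Xs, ys = X[list(S)], y[list(S)]
    d = np.linalg.det(Xs)
    if abs(d) < 1e-12:
        continue  # singular subset carries zero weight
    beta_S = np.linalg.solve(Xs, ys)  # elemental estimate from p points
    w = d**2                          # weight proportional to det(X_S)^2
    num += w * beta_S
    den += w

beta_elemental = num / den
print(np.allclose(beta_elemental, beta_ols))  # True
```

The weights w_S = det(X_S)² / Σ_T det(X_T)² sum to one, which is the probability distribution on p-subsets referred to in the abstract.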
References
Andrews, D. F., & Pregibon, D. (1978). Finding the outliers that matter. Journal of the Royal Statistical Society. Series B, 40, 85–93.
Chatterjee, S., & Hadi, A. S. (1986). Influential observations, high leverage points, and outliers in linear regression. Statistical Science, 1, 379–393.
Cook, R. D. (1977). Detection of influential observations in linear regression. Technometrics, 19, 15–18.
Draper, N. R., & John, J. A. (1981). Influential observations and outliers in regression. Technometrics, 23, 21–26.
Drineas, P., Magdon-Ismail, M., Mahoney, M. W., & Woodruff, D. P. (2012). Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13, 3475–3506.
Drineas, P., Mahoney, M. W., Muthukrishnan, S., & Sarlós, T. (2011). Faster least squares approximation. Numerische Mathematik, 117, 219–249.
Gao, K. (2016). Statistical inference for algorithmic leveraging. Preprint, arXiv:1606.01473.
Golberg, M. A. (1972). The derivative of a determinant. The American Mathematical Monthly, 79, 1124–1126.
Hoaglin, D. C., & Welsch, R. E. (1978). The hat matrix in regression and ANOVA. The American Statistician, 32, 17–22.
Hoerl, A. E., & Kennard, R. W. (1980). M30. A note on least squares estimates. Communications in Statistics–Simulation and Computation, 9, 315–317.
Little, J. K. (1985). Influence and a quadratic form in the Andrews-Pregibon statistic. Technometrics, 27, 13–15.
Ma, P., Mahoney, M. W., & Yu, B. (2015). A statistical perspective on algorithmic leveraging. Journal of Machine Learning Research, 16, 861–911.
Ma, P., & Sun, X. (2015). Leveraging for big data regression. Wiley Interdisciplinary Reviews: Computational Statistics, 7, 70–76.
Mayo, M. S., & Gray, J. B. (1997). Elemental subsets: The building blocks of regression. The American Statistician, 51, 122–129.
Nurunnabi, A. A. M., Hadi, A. S., & Imon, A. H. M. R. (2014). Procedures for the identification of multiple influential observations in linear regression. Journal of Applied Statistics, 41, 1315–1331.
Subrahmanyam, M. (1972). A property of simple least squares estimates. Sankhya Series B, 34, 355–356.
Copyright information
© 2018 Springer Nature Switzerland AG
Cite this paper
Knight, K. (2018). Elemental Estimates, Influence, and Algorithmic Leveraging. In: Bertail, P., Blanke, D., Cornillon, PA., Matzner-Løber, E. (eds) Nonparametric Statistics. ISNPS 2016. Springer Proceedings in Mathematics & Statistics, vol 250. Springer, Cham. https://doi.org/10.1007/978-3-319-96941-1_15
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-96940-4
Online ISBN: 978-3-319-96941-1
eBook Packages: Mathematics and Statistics (R0)