
Elemental Estimates, Influence, and Algorithmic Leveraging

  • Conference paper
Nonparametric Statistics (ISNPS 2016)

Part of the book series: Springer Proceedings in Mathematics & Statistics (PROMS, volume 250)

Abstract

It is well known (Subrahmanyam, Sankhya Ser B 34:355–356, 1972; Mayo and Gray, Am Stat 51:122–129, 1997) that the ordinary least squares estimate can be expressed as a weighted sum of so-called elemental estimates based on subsets of p observations, where p is the dimension of the parameter vector. The weights can be viewed as a probability distribution on subsets of size p of the predictors {x_i : i = 1, …, n}. In this contribution, we derive the lower-dimensional distributions of this p-dimensional distribution and define a measure of potential influence for subsets of observations, analogous to the diagonal elements of the "hat" matrix for single observations. This theory is then applied to algorithmic leveraging, a method for approximating the ordinary least squares estimate using a particular form of biased subsampling.
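
As a concrete illustration of the two ideas in the abstract, the sketch below numerically checks the elemental-estimates representation of the ordinary least squares estimate on a small toy design and then draws a leverage-score-biased subsample in the spirit of algorithmic leveraging. It is not code from the paper; the design matrix, sample sizes, and the 1/π reweighting of the subsampled least squares problem are illustrative assumptions drawn from the cited literature (Subrahmanyam 1972; Mayo and Gray 1997; Ma, Mahoney, and Yu 2015).

```python
# Minimal sketch, assuming a toy regression problem; not code from the paper.
import itertools
import numpy as np

rng = np.random.default_rng(0)
n, p = 8, 2                       # keep n small so all C(n, p) elemental subsets can be enumerated
X = np.column_stack([np.ones(n), rng.normal(size=n)])          # intercept plus one predictor
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)

# Full-data ordinary least squares estimate.
beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)

# (1) Elemental representation (Subrahmanyam 1972; Mayo and Gray 1997):
# beta_ols = sum_S w_S * beta_S over all subsets S of size p, where beta_S solves
# the p-by-p system X_S beta = y_S and w_S is proportional to det(X_S)^2.
weighted_sum = np.zeros(p)
total_weight = 0.0
for S in itertools.combinations(range(n), p):
    X_S, y_S = X[list(S)], y[list(S)]
    w_S = np.linalg.det(X_S) ** 2
    if w_S > 1e-12:               # singular subsets get zero weight
        weighted_sum += w_S * np.linalg.solve(X_S, y_S)
        total_weight += w_S
print(np.allclose(weighted_sum / total_weight, beta_ols))      # True: the weighted average recovers OLS

# (2) Algorithmic-leveraging-style subsampling: sample rows with probabilities
# proportional to the leverage scores h_ii = x_i' (X'X)^{-1} x_i (the diagonal of
# the hat matrix) and solve a 1/pi-reweighted least squares problem on the subsample.
h = np.einsum('ij,ji->i', X, np.linalg.solve(X.T @ X, X.T))    # hat-matrix diagonal
pi = h / h.sum()                                               # sampling probabilities
idx = rng.choice(n, size=4, replace=True, p=pi)                # subsample size 4 is an arbitrary choice
sw = 1.0 / np.sqrt(pi[idx])                                    # square-root weights for reweighted LS
beta_lev, *_ = np.linalg.lstsq(sw[:, None] * X[idx], sw * y[idx], rcond=None)
print(beta_lev)                                                # a randomized approximation to beta_ols
```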

References

  1. Andrews, D. F., & Pregibon, D. (1978). Finding the outliers that matter. Journal of the Royal Statistical Society. Series B, 40, 85–93.

  2. Chatterjee, S., & Hadi, A. S. (1986). Influential observations, high leverage points, and outliers in linear regression. Statistical Science, 1, 379–393.

  3. Cook, R. D. (1977). Detection of influential observations in linear regression. Technometrics, 19, 15–18.

  4. Draper, N. R., & John, J. A. (1981). Influential observations and outliers in regression. Technometrics, 23, 21–26.

  5. Drineas, P., Magdon-Ismail, M., Mahoney, M. W., & Woodruff, D. P. (2012). Fast approximation of matrix coherence and statistical leverage. Journal of Machine Learning Research, 13, 3475–3506.

  6. Drineas, P., Mahoney, M. W., Muthukrishnan, S., & Sarlós, T. (2011). Faster least squares approximation. Numerische Mathematik, 117, 219–249.

  7. Gao, K. (2016). Statistical inference for algorithmic leveraging. Preprint, arXiv:1606.01473.

  8. Golberg, M. A. (1972). The derivative of a determinant. The American Mathematical Monthly, 79, 1124–1126.

  9. Hoaglin, D. C., & Welsch, R. E. (1978). The hat matrix in regression and ANOVA. The American Statistician, 32, 17–22.

  10. Hoerl, A. E., & Kennard, R. W. (1980). M30. A note on least squares estimates. Communications in Statistics–Simulation and Computation, 9, 315–317.

  11. Little, J. K. (1985). Influence and a quadratic form in the Andrews-Pregibon statistic. Technometrics, 27, 13–15.

  12. Ma, P., Mahoney, M. W., & Yu, B. (2015). A statistical perspective on algorithmic leveraging. Journal of Machine Learning Research, 16, 861–911.

  13. Ma, P., & Sun, X. (2015). Leveraging for big data regression. Wiley Interdisciplinary Reviews: Computational Statistics, 7, 70–76.

  14. Mayo, M. S., & Gray, J. B. (1997). Elemental subsets: The building blocks of regression. The American Statistician, 51, 122–129.

  15. Nurunnabi, A. A. M., Hadi, A. S., & Imon, A. H. M. R. (2014). Procedures for the identification of multiple influential observations in linear regression. Journal of Applied Statistics, 41, 1315–1331.

  16. Subrahmanyam, M. (1972). A property of simple least squares estimates. Sankhya Series B, 34, 355–356.

Author information

Corresponding author

Correspondence to K. Knight.

Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Knight, K. (2018). Elemental Estimates, Influence, and Algorithmic Leveraging. In: Bertail, P., Blanke, D., Cornillon, PA., Matzner-Løber, E. (eds) Nonparametric Statistics. ISNPS 2016. Springer Proceedings in Mathematics & Statistics, vol 250. Springer, Cham. https://doi.org/10.1007/978-3-319-96941-1_15
