Abstract
In this chapter, we define a general nonparametric estimator of a d-variate function-valued parameter \(\psi_0\), defined as a minimizer of the expectation of a loss function \(L(\psi)(O)\). This estimator is guaranteed to converge to the true \(\psi_0\) at a rate faster than \(n^{-1/4}\), for all dimensions d: \(\sqrt{d_0(\psi_n,\psi_0)} = O_P(n^{-1/4-\alpha(d)/8})\), where \(d_0(\psi,\psi_0) = P_0 L(\psi) - P_0 L(\psi_0)\) is the loss-based dissimilarity. This is a remarkable result because the rate does not depend on the underlying smoothness of \(\psi_0\): for example, \(\psi_0\) may be discontinuous at many points, or nondifferentiable. The only assumption we need is that \(\psi_0\) is right-continuous with left-hand limits and has a finite variation norm, so that \(\psi_0\) generates a measure (just as a cumulative distribution function generates a measure on the Euclidean space).
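The estimator described above, the highly adaptive lasso (HAL), represents a cadlag function of finite variation norm as a linear combination of zero-order (indicator) basis functions with knots at the observed data points, and fits the coefficients under an L1 penalty, which bounds the variation norm of the fit. A minimal one-dimensional sketch of this idea, assuming scikit-learn is available (the knot placement, penalty level, and variable names here are illustrative choices, not the chapter's exact implementation):

```python
import numpy as np
from sklearn.linear_model import Lasso

# Sketch of the HAL idea in one dimension: represent psi as a
# linear combination of indicator basis functions 1(x >= knot),
# with knots at the observed points, and fit the coefficients
# with an L1 (lasso) penalty. The penalty level alpha below is
# an illustrative assumption.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0, 1, n)
# A discontinuous truth: HAL's rate does not require smoothness.
psi0 = np.where(x < 0.5, 0.0, 1.0)
y = psi0 + rng.normal(0, 0.1, n)

# Design matrix of zero-order spline (indicator) basis functions,
# one column per knot at each observed x value.
knots = np.sort(x)
H = (x[:, None] >= knots[None, :]).astype(float)

fit = Lasso(alpha=0.01, fit_intercept=True, max_iter=10000).fit(H, y)
pred = fit.predict(H)
```

The L1 penalty forces most basis coefficients to zero, so the fitted step function concentrates its jumps where the data demand them; in practice the penalty level would be chosen by cross-validation rather than fixed in advance.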
© 2018 Springer International Publishing AG
Cite this chapter
van der Laan, M.J., Benkeser, D. (2018). Highly Adaptive Lasso (HAL). In: Targeted Learning in Data Science. Springer Series in Statistics. Springer, Cham. https://doi.org/10.1007/978-3-319-65304-4_6
Print ISBN: 978-3-319-65303-7
Online ISBN: 978-3-319-65304-4
eBook Packages: Mathematics and Statistics