Sparse Density Estimation with ℓ1 Penalties

  • Conference paper
Learning Theory (COLT 2007)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 4539)

Abstract

This paper studies oracle properties of ℓ1-penalized estimators of a probability density. We show that the penalized least squares estimator satisfies sparsity oracle inequalities, i.e., bounds in terms of the number of non-zero components of the oracle vector. The results are valid even when the dimension of the model is (much) larger than the sample size. They are applied to estimation in sparse high-dimensional mixture models, to nonparametric adaptive density estimation and to the problem of aggregation of density estimators.
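The estimator studied in the paper can be pictured with a small numerical sketch. The snippet below is a toy illustration, not the authors' implementation: for a dictionary of Gaussian densities φ_j it minimizes the penalized least squares criterion λᵀGλ − 2bᵀλ + 2ω‖λ‖₁, where G is the Gram matrix of the dictionary and b collects the empirical inner products (1/n)Σᵢ φ_j(Xᵢ), via cyclic coordinate descent with soft-thresholding. The dictionary, sample, and penalty weight ω are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy sample from a two-component Gaussian mixture (illustrative data).
n = 500
X = np.where(rng.random(n) < 0.5, rng.normal(-2, 1, n), rng.normal(2, 1, n))

# Dictionary of Gaussian densities phi_j with fixed means and common width sigma.
means = np.linspace(-4, 4, 9)
sigma = 1.0

def phi(x, mu):
    return np.exp(-(x - mu) ** 2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# Gram matrix G_jk = integral of phi_j * phi_k; for two Gaussians with equal
# width this is the N(0, 2*sigma^2) density evaluated at mu_j - mu_k.
diff = means[:, None] - means[None, :]
G = np.exp(-diff**2 / (4 * sigma**2)) / np.sqrt(4 * np.pi * sigma**2)

# Empirical inner products b_j = (1/n) sum_i phi_j(X_i).
b = phi(X[:, None], means[None, :]).mean(axis=0)

# Criterion lam'G lam - 2 b'lam + 2*omega*||lam||_1, minimised by cyclic
# coordinate descent; each coordinate update is a soft-thresholding step.
omega = 0.02  # illustrative penalty weight
lam = np.zeros_like(means)
for _ in range(200):
    for j in range(len(means)):
        r = b[j] - G[j] @ lam + G[j, j] * lam[j]  # partial residual for coord j
        lam[j] = np.sign(r) * max(abs(r) - omega, 0.0) / G[j, j]

def criterion(l):
    return l @ G @ l - 2 * b @ l + 2 * omega * np.abs(l).sum()

print("non-zero coefficients:", np.count_nonzero(lam))
```

With a larger ω more coefficients are thresholded to zero, which is the sparsity the oracle inequalities of the paper quantify: the bounds depend on the number of non-zero components of the oracle vector, not on the full dictionary size.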

Research of F. Bunea and M. Wegkamp is supported in part by NSF grant DMS 0406049.


References

  • Abramovich, F., Benjamini, Y., Donoho, D.L., Johnstone, I.M.: Adapting to unknown sparsity by controlling the False Discovery Rate. Annals of Statistics 34, 584–653 (2006)

  • Barron, A., Birgé, L., Massart, P.: Risk bounds for model selection via penalization. Probability Theory and Related Fields 113, 301–413 (1999)

  • Birgé, L., Massart, P.: From model selection to adaptive estimation. In: Pollard, D., Torgersen, E., Yang, G. (eds.) Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, pp. 55–87. Springer, New York (1997)

  • Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Aggregation for Gaussian regression. Preprint, Department of Statistics, Florida State University. Annals of Statistics, to appear (2005)

  • Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Aggregation and sparsity via ℓ1-penalized least squares. In: Lugosi, G., Simon, H.U. (eds.) COLT 2006. LNCS (LNAI), vol. 4005, pp. 379–391. Springer, Heidelberg (2006a)

  • Bunea, F., Tsybakov, A.B., Wegkamp, M.H.: Sparsity oracle inequalities for the Lasso. Submitted (2006b)

  • Chen, S., Donoho, D., Saunders, M.: Atomic decomposition by basis pursuit. SIAM Review 43, 129–159 (2001)

  • Devroye, L., Lugosi, G.: Combinatorial Methods in Density Estimation. Springer, Heidelberg (2000)

  • Donoho, D.L.: De-noising by soft-thresholding. IEEE Transactions on Information Theory 41, 613–627 (1995)

  • Donoho, D.L., Elad, M., Temlyakov, V.: Stable recovery of sparse overcomplete representations in the presence of noise. Manuscript (2004)

  • Donoho, D.L., Huo, X.: Uncertainty principles and ideal atomic decomposition. IEEE Transactions on Information Theory 47, 2845–2862 (2001)

  • Golubev, G.K.: Nonparametric estimation of smooth probability densities in L2. Problems of Information Transmission 28, 44–54 (1992)

  • Golubev, G.K.: Reconstruction of sparse vectors in white Gaussian noise. Problems of Information Transmission 38, 65–79 (2002)

  • Greenshtein, E., Ritov, Y.: Persistency in high dimensional linear predictor-selection and the virtue of over-parametrization. Bernoulli 10, 971–988 (2004)

  • Hall, P., Kerkyacharian, G., Picard, D.: Block threshold rules for curve estimation using kernel and wavelet methods. Annals of Statistics 26, 922–942 (1998)

  • Härdle, W., Kerkyacharian, G., Picard, D., Tsybakov, A.: Wavelets, Approximation and Statistical Applications. Lecture Notes in Statistics, vol. 129. Springer, New York (1998)

  • Kerkyacharian, G., Picard, D., Tribouley, K.: Lp adaptive density estimation. Bernoulli 2, 229–247 (1996)

  • Koltchinskii, V.: Model selection and aggregation in sparse classification problems. Mathematisches Forschungsinstitut Oberwolfach Reports 2, 2663–2667 (2005)

  • Koltchinskii, V.: Sparsity in penalized empirical risk minimization. Submitted (2006)

  • Loubes, J.-M., van de Geer, S.A.: Adaptive estimation in regression, using soft thresholding type penalties. Statistica Neerlandica 56, 453–478 (2002)

  • Meinshausen, N., Bühlmann, P.: High-dimensional graphs and variable selection with the Lasso. Annals of Statistics 34, 1436–1462 (2006)

  • Nemirovski, A.: Topics in non-parametric statistics. In: Bernard, P. (ed.) Ecole d'Eté de Probabilités de Saint-Flour XXVIII – 1998. Lecture Notes in Mathematics. Springer, New York (2000)

  • Rigollet, Ph.: Inégalités d'oracle, agrégation et adaptation. PhD thesis, University of Paris 6 (2006)

  • Rigollet, Ph., Tsybakov, A.B.: Linear and convex aggregation of density estimators (2004), https://hal.ccsd.cnrs.fr/ccsd-00068216

  • Rudemo, M.: Empirical choice of histograms and kernel density estimators. Scandinavian Journal of Statistics 9, 65–78 (1982)

  • Tibshirani, R.: Regression shrinkage and selection via the Lasso. Journal of the Royal Statistical Society, Series B 58, 267–288 (1996)

  • Tsybakov, A.B.: Optimal rates of aggregation. In: Schölkopf, B., Warmuth, M.K. (eds.) COLT/Kernel 2003. LNCS (LNAI), vol. 2777. Springer, Heidelberg (2003)

  • Vapnik, V.N.: Statistical Learning Theory. Wiley, New York (1998)

  • van de Geer, S.A.: High dimensional generalized linear models and the Lasso. Research Report No. 133, Seminar für Statistik, ETH Zürich (2006)

  • Wegkamp, M.H.: Quasi-universal bandwidth selection for kernel density estimators. Canadian Journal of Statistics 27, 409–420 (1999)

Editor information

Nader H. Bshouty, Claudio Gentile

Copyright information

© 2007 Springer Berlin Heidelberg

About this paper

Cite this paper

Bunea, F., Tsybakov, A.B., Wegkamp, M.H. (2007). Sparse Density Estimation with ℓ1 Penalties. In: Bshouty, N.H., Gentile, C. (eds) Learning Theory. COLT 2007. Lecture Notes in Computer Science (LNAI), vol 4539. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-72927-3_38

  • DOI: https://doi.org/10.1007/978-3-540-72927-3_38

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-72925-9

  • Online ISBN: 978-3-540-72927-3

  • eBook Packages: Computer Science, Computer Science (R0)
