Probability Theory and Related Fields, Volume 173, Issue 3–4, pp 1165–1196

Concentration of the empirical level sets of Tukey’s halfspace depth

  • Victor-Emmanuel Brunel

Abstract

Tukey’s halfspace depth has attracted much interest in data analysis, because it is a natural way of measuring the notion of depth relative to a cloud of points or, more generally, to a probability measure. Given an i.i.d. sample, we investigate the concentration of upper level sets of the Tukey depth relative to that sample around their population version. We show that under some mild assumptions on the underlying probability measure, concentration occurs at a parametric rate and we deduce moment inequalities at that same rate. From a computational perspective, we study the concentration of a discretized version of the empirical upper level sets.
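As a rough illustration of the objects named in the abstract, the sketch below computes an approximate empirical halfspace depth by taking the minimum over a finite set of directions rather than over the whole unit sphere, and uses it to test membership in an upper level set. It is a minimal sketch under stated assumptions, not the construction analysed in the article: the function names (random_directions, approx_tukey_depth, in_upper_level_set), the use of random directions, and the closed-halfspace convention are all choices made here for illustration.

```python
import math
import random


def random_directions(k, d, rng):
    """Draw k unit vectors in R^d by normalising standard Gaussian vectors."""
    dirs = []
    while len(dirs) < k:
        g = [rng.gauss(0.0, 1.0) for _ in range(d)]
        norm = math.sqrt(sum(c * c for c in g))
        if norm > 0.0:  # skip the (measure-zero) degenerate draw
            dirs.append([c / norm for c in g])
    return dirs


def approx_tukey_depth(x, sample, directions):
    """Minimum, over the given directions u, of the fraction of sample
    points X_i with <u, X_i> >= <u, x>.

    Restricting to finitely many directions can only overestimate the
    exact empirical depth, which is an infimum over all unit vectors.
    """
    n = len(sample)
    depth = 1.0
    for u in directions:
        t = sum(ui * xi for ui, xi in zip(u, x))
        count = sum(
            1 for p in sample
            if sum(ui * pi for ui, pi in zip(u, p)) >= t
        )
        depth = min(depth, count / n)
    return depth


def in_upper_level_set(x, sample, directions, alpha):
    """True if x has approximate depth at least alpha."""
    return approx_tukey_depth(x, sample, directions) >= alpha


if __name__ == "__main__":
    rng = random.Random(0)
    sample = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(500)]
    dirs = random_directions(200, 2, rng)
    print(approx_tukey_depth([0.0, 0.0], sample, dirs))
    print(in_upper_level_set([3.0, 3.0], sample, dirs, 0.1))
```

Because the minimum runs over fewer directions than the exact depth does, the discretized level set computed this way always contains the exact empirical one; how many directions are needed for the two to be close is the kind of question the last sentence of the abstract refers to.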

Keywords

Tukey depth; Level set; Multivariate quantiles; Support function; Semi-infinite linear programming

Mathematics Subject Classification

62H11 

Copyright information

© Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Mathematics, Massachusetts Institute of Technology, Cambridge, USA
