Multiscale inference for a multivariate density with applications to X-ray astronomy

  • Konstantin Eckle
  • Nicolai Bissantz
  • Holger Dette
  • Katharina Proksch
  • Sabrina Einecke

Abstract

In this paper, we propose methods for inference on the geometric features of a multivariate density. Our approach uses multiscale tests for the monotonicity of the density at arbitrary points in arbitrary directions. In particular, we construct a significance test for a mode at a specific point. Moreover, we develop multiscale methods for identifying regions of monotonicity and a general procedure for detecting the modes of a multivariate density. We show that the latter method localizes the modes at an effectively optimal rate. The theoretical results are illustrated by a simulation study and a data example. The new method is motivated by, and applied to, the determination and verification of the position of high-energy sources from X-ray observations by the Swift satellite, which is important for a multiwavelength analysis of objects such as Active Galactic Nuclei.
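
To give a rough feeling for what testing monotonicity "at a point, in a direction, over several scales" can look like, the following is a minimal sketch and not the test statistic of the paper. It assumes a deliberately crude stand-in: for each radius h, observations in the ball around x0 are split by the sign of their projection onto the direction e, and the two half-ball counts are compared with an exact one-sided binomial test. The scales are combined by a plain Bonferroni correction rather than a multiscale calibration. All function names (`binom_upper_tail`, `directional_counts`, `multiscale_increase_test`) are hypothetical.

```python
import math
import random


def binom_upper_tail(k, m):
    """Exact P(X >= k) for X ~ Binomial(m, 1/2)."""
    return sum(math.comb(m, j) for j in range(k, m + 1)) / 2 ** m


def directional_counts(points, x0, e, h):
    """Count points within distance h of x0, split by the sign of <x - x0, e>.

    Returns (plus, minus); points exactly on the separating hyperplane
    are ignored.
    """
    plus = minus = 0
    for x in points:
        diff = [xi - ci for xi, ci in zip(x, x0)]
        if math.sqrt(sum(d * d for d in diff)) > h:
            continue
        s = sum(d * ei for d, ei in zip(diff, e))
        if s > 0:
            plus += 1
        elif s < 0:
            minus += 1
    return plus, minus


def multiscale_increase_test(points, x0, e, scales, alpha=0.05):
    """Test 'density increases in direction e at x0' on every scale.

    Assumption of this sketch: each scale gets level alpha / len(scales)
    (Bonferroni). Returns one tuple (h, plus, minus, p_value, reject)
    per scale.
    """
    level = alpha / len(scales)
    results = []
    for h in scales:
        plus, minus = directional_counts(points, x0, e, h)
        m = plus + minus
        # Under a locally symmetric density, plus ~ Binomial(m, 1/2).
        p_value = binom_upper_tail(plus, m) if m > 0 else 1.0
        results.append((h, plus, minus, p_value, p_value <= level))
    return results


if __name__ == "__main__":
    random.seed(0)
    # Toy data: standard bivariate normal with its mode at the origin.
    sample = [(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(2000)]
    # At (-1, 0) the density increases in direction (1, 0), towards the mode.
    for row in multiscale_increase_test(sample, (-1.0, 0.0), (1.0, 0.0),
                                        scales=[0.25, 0.5, 0.75]):
        print(row)
```

Running such a test in several directions pointing towards a candidate location would be one naive way to probe for a mode there. Valid simultaneous inference over many points, directions and scales needs a proper multiscale calibration, which this sketch does not attempt.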

Keywords

Multiple tests, Modes, Multivariate density, X-ray astronomy

Notes

Acknowledgements

This research has made use of data obtained through the High Energy Astrophysics Science Archive Research Center Online Service, provided by the NASA/Goddard Space Flight Center. We are very grateful to a reviewer and an associate editor for their constructive comments on an earlier version of this paper. The authors would also like to thank Martina Stein, who typed parts of this manuscript with considerable technical expertise. This work has been supported in part by the Collaborative Research Center “Statistical modeling of nonlinear dynamic processes” (SFB 823, Teilprojekt C1, C4) of the German Research Foundation (DFG).

Copyright information

© The Institute of Statistical Mathematics, Tokyo 2017

Authors and Affiliations

  • Konstantin Eckle (1)
  • Nicolai Bissantz (1)
  • Holger Dette (1)
  • Katharina Proksch (2)
  • Sabrina Einecke (3)
  1. Fakultät für Mathematik, Ruhr-Universität Bochum, Bochum, Germany
  2. Institut für Mathematische Stochastik, Georg-August-Universität Göttingen, Göttingen, Germany
  3. Fakultät Physik, Technische Universität Dortmund, Dortmund, Germany
