Testing equality of a large number of densities under mixing conditions
In certain settings, such as microarray data, the sampling information is formed by a large number of possibly dependent small data sets. In special applications, for example in order to perform clustering, the researcher aims to verify whether all data sets have a common distribution. For this reason we propose a formal test for the null hypothesis that all data sets come from a single distribution. The asymptotic setting is that in which the number of small data sets goes to infinity, while the sample size remains fixed. The asymptotic null distribution of the proposed test is derived under mixing conditions on the sequence of small data sets, and the power properties of our test under two reasonable fixed alternatives are investigated. A simulation study is conducted, showing that the test respects the nominal level, and that it has a power which tends to 1 when the number of data sets tends to infinity. An illustration involving microarray data is provided.
Keywords: Dependent data, Kernel density estimation, k-Sample problem, Smooth tests, U-statistics
Mathematics Subject Classification: 62G10
This work received financial support from the Call 2015 Grants for Ph.D. contracts for training of doctors of the Ministry of Economy and Competitiveness, cofinanced by the European Social Fund (Ref. BES-2015-074958). We acknowledge support from the MTM2014-55966-P project, Ministry of Economy and Competitiveness, and the MTM2017-89422-P project, Ministry of Economy, Industry and Competitiveness, State Research Agency, and Regional Development Fund, EU. We also acknowledge the financial support provided by the SiDOR research group through the grant Competitive Reference Group, 2016–2019 (ED431C 2016/040), funded by the “Consellería de Cultura, Educación e Ordenación Universitaria. Xunta de Galicia.” Finally, the first author would like to thank the University of Vigo and its Escola Internacional de Doutoramento (EIDO) for the financial support provided through mobility doctorate grants. The authors also thank Professors Raymond J. Carroll and Robert Chapkin for allowing use of their data.