Inference on High-Dimensional Mean Vectors with Fewer Observations Than the Dimension

Yata, Kazuyoshi; Aoshima, Makoto

doi:10.1007/s11009-011-9233-z


Published: 21 June 2011

Methodology and Computing in Applied Probability, volume 14, pages 459–476 (2012)

Abstract

We focus on inference about high-dimensional mean vectors when the sample size is much smaller than the dimension. Such data arise in many areas of modern science, including genetic microarrays, medical imaging, text recognition, finance, and chemometrics. First, we give a given-radius confidence region for mean vectors; this inference can be used for variable selection in high-dimensional data. Next, we give a given-width confidence interval for the squared norm of mean vectors; this inference can be used in a classification procedure for high-dimensional data. To assure a prespecified coverage probability, we propose a two-stage estimation methodology and determine the required sample size for each inference. Finally, we demonstrate how the new methodologies perform using a microarray data set.
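
The abstract does not spell out the two-stage methodology, so the following is only a minimal sketch of the general idea behind two-stage estimation, not the authors' high-dimensional procedure. It uses the classical univariate, Stein-type rule: a pilot sample estimates the variance, the variance estimate determines the total sample size needed for an interval of a prespecified half-width, and the remaining observations are then drawn. The function names, the `draw` callback, and the use of a normal quantile in place of the Student t quantile are all assumptions made to keep the example short and dependent on the standard library only.

```python
import math
import random
import statistics


def two_stage_sample_size(pilot, half_width, confidence=0.95):
    """Total sample size for a fixed-width interval, given a pilot sample.

    Hypothetical helper. The normal quantile is a simplification; the
    classical Stein rule uses a Student t quantile with len(pilot) - 1
    degrees of freedom.
    """
    m = len(pilot)
    z = statistics.NormalDist().inv_cdf(0.5 + confidence / 2.0)
    s2 = statistics.variance(pilot)  # unbiased variance from the pilot stage
    n_required = math.floor(z * z * s2 / half_width ** 2) + 1
    return max(m, n_required)  # never fewer than the pilot size


def two_stage_interval(draw, pilot_size, half_width, confidence=0.95):
    """Run both stages and return (lower, upper, total sample size)."""
    # Stage 1: pilot sample, used only to fix the total sample size.
    pilot = [draw() for _ in range(pilot_size)]
    n = two_stage_sample_size(pilot, half_width, confidence)
    # Stage 2: draw the remaining n - pilot_size observations.
    sample = pilot + [draw() for _ in range(n - pilot_size)]
    center = statistics.fmean(sample)
    return center - half_width, center + half_width, n


# Illustrative run on simulated normal data (mean 1, standard deviation 2).
rng = random.Random(0)
low, high, n = two_stage_interval(lambda: rng.gauss(1.0, 2.0), 10, 0.5)
print(n, low, high)
```

The interval width is fixed in advance and the sample size is the random quantity, which is the reverse of the usual fixed-sample interval. The paper applies the same two-stage principle to a given-radius confidence region and a given-width interval for the squared norm when the dimension exceeds the sample size; those procedures need quantities that the abstract does not give.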

Author information

Authors and Affiliations

Institute of Mathematics, University of Tsukuba, Ibaraki, 305-8571, Japan
Kazuyoshi Yata & Makoto Aoshima

Corresponding author

Correspondence to Kazuyoshi Yata.

About this article

Cite this article

Yata, K., Aoshima, M. Inference on High-Dimensional Mean Vectors with Fewer Observations Than the Dimension. Methodol Comput Appl Probab 14, 459–476 (2012). https://doi.org/10.1007/s11009-011-9233-z

Received: 06 December 2010
Revised: 08 May 2011
Accepted: 30 May 2011
Published: 21 June 2011
Issue Date: September 2012
