Journal of Statistical Theory and Practice, Volume 2, Issue 3, pp 407–418

Estimating the Mean of High Valued Observations in High Dimensions

  • Eitan Greenshtein
  • Junyong Park
  • Ya’acov Ritov


Let $Y_i \sim N(\mu_i, 1)$, $i = 1, \ldots, n$, be independent random variables. We study the problem of estimating the quantity $S = \sum_{\{i \mid C < Y_i\}} \mu_i$. We emphasize the case where $n$ is large, the vector $(\mu_1, \ldots, \mu_n)$ is sparse, and the value of $C$ is large. Our approach is nonparametric empirical Bayes, where the $\mu_i$ are assumed i.i.d. from an unknown distribution $G$. The performance of our suggested estimator is studied both theoretically and through simulations. We also obtain some results related to the local false discovery rates corresponding to high valued points $Y_i$.
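The abstract does not spell out the estimator, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes Tweedie's formula, E[μ | Y = y] = y + f′(y)/f(y) for unit-variance normal noise, with the marginal density f replaced by a Gaussian kernel density estimate. The function names, the bandwidth `h`, and the simulated sparsity level and signal size are all hypothetical choices made for the example.

```python
import numpy as np


def tweedie_posterior_mean(y_eval, y_obs, h):
    """Estimate E[mu | Y = y] as y + f'(y)/f(y), with the marginal density f
    replaced by a Gaussian kernel density estimate of bandwidth h (assumed)."""
    y_eval = np.atleast_1d(np.asarray(y_eval, dtype=float))
    y_obs = np.asarray(y_obs, dtype=float)
    diff = y_obs[None, :] - y_eval[:, None]   # Y_j - y, shape (m, n)
    w = np.exp(-0.5 * (diff / h) ** 2)        # unnormalised kernel weights
    # For a Gaussian kernel, f'(y)/f(y) = sum_j w_j (Y_j - y) / (h^2 sum_j w_j).
    score = (w * diff).sum(axis=1) / (h ** 2 * w.sum(axis=1))
    return y_eval + score


def estimate_S(y, C, h):
    """Plug-in estimate of S: sum of estimated posterior means over {i : Y_i > C}."""
    y = np.asarray(y, dtype=float)
    selected = y[y > C]
    if selected.size == 0:
        return 0.0
    return float(tweedie_posterior_mean(selected, y, h).sum())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, C, h = 5000, 3.0, 0.5                  # illustrative values only
    # Sparse means: about 2% of the mu_i equal 4, the rest are 0 (assumed setup).
    mu = np.where(rng.random(n) < 0.02, 4.0, 0.0)
    y = mu + rng.standard_normal(n)

    true_S = mu[y > C].sum()                  # the target quantity S
    naive = y[y > C].sum()                    # ignores the selection effect
    print("true S:          ", true_S)
    print("naive sum of Y_i:", naive)
    print("empirical Bayes: ", estimate_S(y, C, h))
```

The naive sum of the selected Y_i is included for comparison because conditioning on Y_i > C inflates the selected observations relative to their means; the kernel-based correction is one simple way to shrink them, and its accuracy depends on the bandwidth choice.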

AMS Subject Classification

Primary 62G05; Secondary 62H12


Keywords

FDR, Sparse vector, Empirical Bayes



Copyright information

© Grace Scientific Publishing 2008

Authors and Affiliations

  • Eitan Greenshtein (1)
  • Junyong Park (2)
  • Ya’acov Ritov (3)
  1. SAMSI, Research Triangle Park, USA
  2. Department of Mathematics and Statistics, University of Maryland Baltimore County, USA
  3. Department of Statistics, Hebrew University of Jerusalem, Israel
