Abstract
Many biomedical applications involve selecting important predictors from a high-dimensional set of candidates; gene expression data are one example. Because the sample size in any single study is usually small, it is important to combine information from multiple studies. In this chapter, we introduce a Bayesian hierarchical modeling approach that models study-to-study heterogeneity explicitly in order to borrow strength across studies. Using a carefully formulated prior specification, we develop a fast approach to predictor selection and shrinkage estimation for high-dimensional predictors. The proposed approach, which is related to the relevance vector machine (RVM), relies on maximum a posteriori (MAP) estimation to rapidly obtain a sparse estimate. As with the typical RVM, there is an intrinsic thresholding property in which unimportant predictors tend to have their coefficients shrunk to zero. The method is illustrated with an application to selecting genes as predictors of time to an event.
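The thresholding property of MAP estimation mentioned in the abstract can be illustrated with a deliberately simplified sketch. It is not the chapter's hierarchical model: it assumes a single observation z ~ N(beta, sigma2) per coefficient and a Laplace prior p(beta) proportional to exp(-lam * |beta|), for which the MAP estimate has a closed form (soft thresholding). The function name `map_laplace` and the parameters `lam` and `sigma2` are hypothetical names chosen for this illustration.

```python
def map_laplace(z, lam, sigma2=1.0):
    """MAP estimate of beta under an assumed toy model.

    Model (an assumption for illustration, not the chapter's prior):
        z | beta ~ N(beta, sigma2),  p(beta) proportional to exp(-lam * |beta|)
    Minimising (z - beta)^2 / (2 * sigma2) + lam * |beta| gives soft thresholding.
    """
    threshold = lam * sigma2
    if z > threshold:
        return z - threshold   # large positive signal: shrunk toward zero
    if z < -threshold:
        return z + threshold   # large negative signal: shrunk toward zero
    return 0.0                 # weak signal: estimate is exactly zero


# Noisy coefficient estimates for five hypothetical predictors.
z_values = [3.0, -2.5, 0.4, -0.1, 0.9]
estimates = [map_laplace(z, lam=1.0) for z in z_values]
print(estimates)  # [2.0, -1.5, 0.0, 0.0, 0.0]
```

In this toy setting, the three predictors whose observed effect is smaller than the threshold receive a coefficient of exactly zero, while the two stronger ones are kept but shrunk.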
Acknowledgments
This research was supported in part by the Statistical and Applied Mathematical Sciences Institute (SAMSI) Summer 2008 research program on Meta-analysis: Synthesis and Appraisal of Multiple Sources of Empirical Evidence. The gene barcode data used in this paper were kindly provided by Dr. Rafael Irizarry and Dr. Michael Zilliox. Research of Fei Liu was partially supported by the University of Missouri-Columbia research board award.
Copyright information
© 2010 Humana Press, a part of Springer Science+Business Media, LLC
About this protocol
Cite this protocol
Liu, F. (2010). A Bayesian Hierarchical Model for High-Dimensional Meta-analysis. In: Bang, H., Zhou, X., van Epps, H., Mazumdar, M. (eds) Statistical Methods in Molecular Biology. Methods in Molecular Biology, vol 620. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-60761-580-4_20
DOI: https://doi.org/10.1007/978-1-60761-580-4_20
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-60761-578-1
Online ISBN: 978-1-60761-580-4
eBook Packages: Springer Protocols