Discovery Among Binary Biomarkers in Heterogeneous Populations
Biomarkers have great potential to improve disease diagnosis and treatment. Disease may arise via multiple pathways, however, each associated with distinct complex interactions among multiple biomarkers, and hence patients exhibit considerable heterogeneity in the biomarker-disease association despite sharing the same clinical diagnosis. Thus identification of clinically useful biomarker combinations requires statistical methods that accommodate population heterogeneity and enable discovery of possibly complex interactions among biomarkers that associate with disease. We address jointly modeling binary and continuous disease outcomes when the association between predictors and these outcomes exhibits heterogeneity. In the context of binary biomarkers, we use ideas from logic regression to find Boolean combinations of these biomarkers that predict the binary disease outcome. The associated continuous outcome is modeled as Gaussian. Heterogeneity is cast as unknown subgroups in the population, with the associations between the joint outcome and biomarkers and other covariates varying by subgroup. We adopt a mixture of finite mixtures (MFM) fully Bayesian formulation to simultaneously estimate the number of subgroups, the subgroup membership structure, and the subgroup-specific relationships between outcomes and predictors. We describe how our model incorporates the Boolean relations as parameters arising from the MFM model and our approach to the associated challenges of specifying the prior distribution and estimation using Markov chain Monte Carlo. We illustrate the performance of the methods using simulation and discuss application.
Keywords: Bayesian semiparametric model, Clustering, Joint modeling, Markov chain Monte Carlo, Product partition model
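To make the data structure described in the abstract concrete, the following is a minimal simulation sketch, not the authors' model or code. It generates binary biomarkers, assigns each subject to one of two latent subgroups, and lets a different Boolean combination of biomarkers drive a binary outcome and a Gaussian outcome in each subgroup. The function name `simulate_heterogeneous`, the two Boolean rules, the logistic link, and all coefficient values are assumptions chosen for illustration; the MFM prior and the Markov chain Monte Carlo estimation are not shown.

```python
import numpy as np


def simulate_heterogeneous(n=200, seed=0):
    """Simulate joint binary/continuous outcomes under two latent subgroups.

    All rules and coefficients here are hypothetical illustration values.
    """
    rng = np.random.default_rng(seed)

    # Four binary biomarkers per subject.
    X = rng.integers(0, 2, size=(n, 4)).astype(bool)

    # Latent subgroup label (unknown in practice; the number of subgroups
    # is fixed at two here only for illustration).
    z = rng.integers(0, 2, size=n)

    # Hypothetical subgroup-specific Boolean combinations of biomarkers.
    rule_0 = X[:, 0] & ~X[:, 1]   # subgroup 0: X1 AND (NOT X2)
    rule_1 = X[:, 2] | X[:, 3]    # subgroup 1: X3 OR X4
    L = np.where(z == 0, rule_0, rule_1)

    # Binary disease outcome: assumed logistic model in the Boolean term.
    eta = -1.5 + 3.0 * L
    prob = 1.0 / (1.0 + np.exp(-eta))
    y_bin = rng.random(n) < prob

    # Continuous outcome: Gaussian with an assumed subgroup-specific
    # intercept and a shared effect of the Boolean term.
    mu = np.where(z == 0, 0.0, 2.0) + 1.0 * L
    y_cont = rng.normal(mu, 1.0)

    return X, z, L, y_bin, y_cont


X, z, L, y_bin, y_cont = simulate_heterogeneous()
```

In data of this kind, a single Boolean rule fitted to the pooled sample would blur the two subgroup-specific rules together, which is the motivation the abstract gives for estimating the subgroup structure and the Boolean relations simultaneously.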
The authors were partially supported by grants R01MH104423, R01HD078410 and R01HD093055 from the National Institutes of Health. Portions of this work were revised while E. Slate was the Visiting Scholar in Honor of David C. Jordan at AbbVie, Inc. in North Chicago, IL and also a Research Fellow with the Statistical and Applied Mathematical Sciences Institute in Durham, NC. Additional support from the Graduate School and Department of Statistics at Florida State University is gratefully acknowledged. Figures 1 and 2 were adapted from a figure provided by Dr. Zhengwu Zhang, Univ. of Rochester. The authors thank the reviewers for comments that led to improvement of this manuscript.