Bayesian Posterior Integration for Classification of Mass Spectrometry Data

Statistical Analysis of Proteomics, Metabolomics, and Lipidomics Data Using Mass Spectrometry

Abstract

High-throughput technologies can now capture information at both global and targeted scales for the transcriptome, proteome, and metabolome, and can determine functional aspects of these biomolecules. The promise of data integration is that combining these disparate data streams yields a more accurate predictive model of the phenotype of interest by identifying the best subset of molecules associated with the outcome. However, in a space of tens of thousands of variables (e.g., genes, proteins), feature selection approaches often yield over-trained models with poor predictive power. Moreover, feature selection algorithms typically focus on a single source of data and do not evaluate the effect of the selected features on downstream statistical integration models. The integration of Bayesian statistical outputs has been shown to be an effective approach that optimizes the outcome of interest in the context of the integrated posterior probability. This chapter demonstrates that this approach can improve sensitivity and specificity over simple selection routines based on individual high-throughput datasets generated via mass spectrometry.
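
As a rough illustration of what integrating posterior probabilities across datasets can look like, the sketch below combines per-dataset class posteriors for one sample with a product rule. It assumes the datasets are conditionally independent given the class, which the abstract does not state. The function name `integrate_posteriors` and the example numbers are hypothetical and are not taken from the chapter.

```python
import math


def integrate_posteriors(posteriors, prior):
    """Combine class posteriors from several data sources for one sample.

    Assumption (not from the chapter): sources are conditionally
    independent given the class, so
        P(c | x_1..x_k)  is proportional to  P(c) * prod_i [P(c | x_i) / P(c)].

    posteriors: list of per-source lists, each holding P(c | x_i) per class
    prior:      list holding P(c) per class
    """
    eps = 1e-12  # guards against log(0) for zero-valued posteriors
    log_scores = []
    for c, p_c in enumerate(prior):
        score = math.log(p_c)
        for source in posteriors:
            score += math.log(source[c] + eps) - math.log(p_c)
        log_scores.append(score)

    # Subtract the maximum before exponentiating for numerical stability.
    top = max(log_scores)
    unnormalized = [math.exp(s - top) for s in log_scores]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical example: two sources (e.g., proteomics and metabolomics),
# two classes, equal priors.
combined = integrate_posteriors([[0.8, 0.2], [0.6, 0.4]], [0.5, 0.5])
predicted_class = combined.index(max(combined))
```

Under this assumption, two sources that each lean moderately toward the same class reinforce one another, so the integrated posterior is more confident than either source alone. A feature subset could then be scored by the sensitivity and specificity of the integrated prediction rather than of any single dataset.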

Acknowledgments

This work was funded by NIH NIDDK grant R33 DK070146. Significant portions of the work were performed at the Environmental Molecular Sciences Laboratory, a national scientific user facility sponsored by the Department of Energy's (DOE) Office of Biological and Environmental Research and located at Pacific Northwest National Laboratory (PNNL) in Richland, Washington. PNNL is a multi-program national laboratory operated by Battelle for the DOE under contract DE-AC05-76RL01830.

Author information

Corresponding author

Correspondence to Bobbie-Jo M. Webb-Robertson.

Copyright information

© 2017 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Webb-Robertson, BJ.M., Metz, T.O., Waters, K.M., Zhang, Q., Rewers, M. (2017). Bayesian Posterior Integration for Classification of Mass Spectrometry Data. In: Datta, S., Mertens, B. (eds) Statistical Analysis of Proteomics, Metabolomics, and Lipidomics Data Using Mass Spectrometry. Frontiers in Probability and the Statistical Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-45809-0_11
