Least Squares Estimators of Peptide Species Concentrations Based on Gaussian Mixture Decompositions of Protein Mass Spectra

  • Conference paper
Stochastic Models, Statistics and Their Applications

Part of the book series: Springer Proceedings in Mathematics & Statistics ((PROMS,volume 122))

Abstract

We propose using Gaussian mixture decompositions of protein mass spectra to construct least squares estimators of peptide species concentrations in proteomic samples, and then using these estimators as spectral features in classifiers that separate cancer spectra from normal spectra. On a real dataset, we compare the variances of the least squares estimators with the variances of analogous estimators based on spectral peaks. We also compare classifiers whose features are defined either by the least squares estimators or by spectral peaks, evaluating how well each differentiates between patterns specific to case and control samples of head and neck cancer patients. Classifiers based on spectral features defined by Gaussian components achieved lower average error rates than classifiers based on spectral peaks.
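To make the central idea concrete, the following is a minimal sketch of a least squares fit of component weights, not the authors' estimator, whose exact form is not given in this abstract. It assumes the Gaussian components (means and standard deviations) have already been obtained, for example by an EM fit, and that the spectrum is modeled as a weighted sum of those components, with each weight read as a proxy for the concentration of one peptide species. All function names, variable names, and numeric values are hypothetical.

```python
import math


def gaussian(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma, evaluated at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def ls_component_weights(mz, intensity, means, sigmas):
    """Weights w minimising sum_i (intensity_i - sum_k w_k * g_k(mz_i))^2.

    Assumption: the component shapes g_k are fixed; only the weights are fitted.
    """
    basis = [[gaussian(x, mu, s) for mu, s in zip(means, sigmas)] for x in mz]
    k = len(means)
    # Normal equations: (B^T B) w = B^T y
    gram = [[sum(row[i] * row[j] for row in basis) for j in range(k)] for i in range(k)]
    rhs = [sum(row[i] * y for row, y in zip(basis, intensity)) for i in range(k)]
    return solve_linear(gram, rhs)


# Synthetic, noise-free spectrum built from three overlapping components
# (illustrative values only, not taken from the paper).
mz = [1000.0 + 0.5 * i for i in range(201)]
means = [1030.0, 1045.0, 1070.0]
sigmas = [4.0, 6.0, 5.0]
true_weights = [50.0, 20.0, 80.0]
intensity = [
    sum(w * gaussian(x, mu, s) for w, mu, s in zip(true_weights, means, sigmas))
    for x in mz
]

weights = ls_component_weights(mz, intensity, means, sigmas)
```

Because the fit uses the whole signal under each component rather than a single peak height, overlapping components can still be separated, which is one plausible reason such estimators could show lower variance than peak-based ones. With noisy real spectra, a constrained solver (for example, non-negative least squares) may be a better choice than the plain normal equations used here.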

Acknowledgements

This work was financially supported by the Polish National Science Centre UMO-2011/01/B/ST6/06868 grant (A.P.), GeCONiI project number POIG.02.03.01-24-099/13 (M.M.) and internal grant from Silesian University of Technology BK/265/RAU-1/2014 t.10 (J.P.). All the calculations were carried out using GeCONiI infrastructure funded by project number POIG.02.03.01-24-099/13.

Corresponding author

Correspondence to Andrzej Polanski.

Copyright information

© 2015 Springer International Publishing Switzerland

About this paper

Cite this paper

Polanski, A., Marczyk, M., Pietrowska, M., Widlak, P., Polanska, J. (2015). Least Squares Estimators of Peptide Species Concentrations Based on Gaussian Mixture Decompositions of Protein Mass Spectra. In: Steland, A., Rafajłowicz, E., Szajowski, K. (eds) Stochastic Models, Statistics and Their Applications. Springer Proceedings in Mathematics & Statistics, vol 122. Springer, Cham. https://doi.org/10.1007/978-3-319-13881-7_47
