Statistical Analysis of Spectral Count Data Generated by Label-Free Tandem Mass Spectrometry-Based Proteomics

Pham, Thang V.; Jimenez, Connie R.

doi:10.1007/978-1-61779-111-6_21


Part of the book series: Neuromethods (NM, volume 57)


Abstract

Label-free strategies for quantitative proteomics provide a versatile and economical alternative to labeling-based proteomics strategies. We have shown for different types of biological samples that spectral counting-based label-free quantitation is a promising avenue for biomarker discovery. Analyzing spectral count data generated from these studies is, however, not straightforward, as commonly used techniques for genomics data analysis are not suitable. In this book chapter, we describe three methods to analyze spectral count data, namely, cluster analysis, significance analysis of independent samples, and significance analysis of paired samples. For cluster analysis, we devise a novel distance measure between samples based on the Jeffrey divergence. This measure prevents highly abundant proteins from dominating others in contribution to the total sample difference. We employ the beta-binomial distribution for significance analysis of independent samples, which integrates both within-sample variation and between-sample variation into a single statistical model. Finally, the Mantel–Haenszel test is used for significance analysis of paired samples. We provide detailed illustrations of the steps involved in the analyses.
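
As a rough illustration of two of the methods named in the abstract, the Python sketch below computes a Jeffrey divergence between two samples and a Mantel–Haenszel statistic for paired samples. It is not taken from the chapter. The function names, the input layout, the particular form of the divergence (each normalized count compared with the midpoint of the two samples), the arrangement of each pair into a 2x2 table (spectra of the protein of interest versus all other spectra), and the continuity correction are all assumptions made for this example, and the chapter's exact definitions may differ.

```python
import math


def jeffrey_divergence(counts_x, counts_y):
    """Jeffrey divergence between two samples of per-protein spectral counts.

    Assumed form: sum_i [x_i*log(x_i/m_i) + y_i*log(y_i/m_i)], m_i = (x_i + y_i)/2,
    where x_i and y_i are counts normalized by each sample's total and
    0*log(0) is taken as 0.
    """
    if len(counts_x) != len(counts_y):
        raise ValueError("both samples must cover the same proteins")
    total_x, total_y = sum(counts_x), sum(counts_y)
    if total_x == 0 or total_y == 0:
        raise ValueError("each sample needs at least one spectrum")
    divergence = 0.0
    for cx, cy in zip(counts_x, counts_y):
        x, y = cx / total_x, cy / total_y
        m = (x + y) / 2
        if x > 0:
            divergence += x * math.log(x / m)
        if y > 0:
            divergence += y * math.log(y / m)
    return divergence


def mantel_haenszel_test(pairs):
    """Mantel-Haenszel chi-square test for one protein across matched pairs.

    pairs: iterable of (count_a, total_a, count_b, total_b), one tuple per pair,
    where count_* is the protein's spectral count and total_* is the total
    spectral count of that sample (hypothetical layout for this sketch).
    Returns (statistic, p_value) using a chi-square distribution with 1 df.
    """
    sum_observed = sum_expected = sum_variance = 0.0
    for count_a, total_a, count_b, total_b in pairs:
        n = total_a + total_b      # all spectra in this pair
        m1 = count_a + count_b     # spectra assigned to the protein
        m2 = n - m1                # spectra assigned to other proteins
        if n < 2:
            continue               # stratum carries no information
        sum_observed += count_a
        sum_expected += total_a * m1 / n
        sum_variance += total_a * total_b * m1 * m2 / (n * n * (n - 1))
    if sum_variance == 0:
        return 0.0, 1.0
    delta = abs(sum_observed - sum_expected)
    correction = 0.5 if delta >= 0.5 else 0.0   # continuity correction (assumed)
    statistic = (delta - correction) ** 2 / sum_variance
    # Survival function of a chi-square with 1 df: P(Z^2 > s) = erfc(sqrt(s/2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value
```

With this form, the divergence is zero when two samples have identical protein proportions and reaches 2 log 2 when they share no proteins. The beta-binomial model for independent samples is left out of the sketch because fitting it requires numerical optimization.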



Acknowledgments

This work is supported by the VUmc Cancer Center, Amsterdam.

Author information

Authors and Affiliations

OncoProteomics Laboratory, Department of Medical Oncology, VU University Medical Center-Cancer Center Amsterdam, Amsterdam, The Netherlands
Thang V. Pham


Corresponding author

Correspondence to Thang V. Pham.

Editor information

Editors and Affiliations

Ctr Neurogenomics/Cognitive Research, Dept Molecular & Cellular Neurobiology, VU University, De Boelelaan 1085, Amsterdam, 1081 HV, Netherlands
Ka Wan Li


Cite this protocol

Pham, T.V., Jimenez, C.R. (2011). Statistical Analysis of Spectral Count Data Generated by Label-Free Tandem Mass Spectrometry-Based Proteomics. In: Li, K. (ed.) Neuroproteomics. Neuromethods, vol 57. Humana Press, Totowa, NJ. https://doi.org/10.1007/978-1-61779-111-6_21


DOI: https://doi.org/10.1007/978-1-61779-111-6_21
Published: 25 April 2011
Publisher Name: Humana Press, Totowa, NJ
Print ISBN: 978-1-61779-110-9
Online ISBN: 978-1-61779-111-6
eBook Packages: Springer Protocols
