Platforms and Pipelines for Proteomics Data Analysis and Management

Modern Proteomics – Sample Preparation, Analysis and Practical Applications

Part of the book series: Advances in Experimental Medicine and Biology (AEMB, volume 919)

Abstract

Since mass spectrometry was introduced as the core technology for large-scale analysis of the proteome, the speed of data acquisition, the dynamic range of measurements, and data quality have improved continuously. These improvements are driven by the regular launch of new methodologies and instruments.

Abbreviations

FDR:

False Discovery Rate

GO:

Gene Ontology

GUI:

Graphical User Interface

I/O:

Input/output

iTRAQ:

Isobaric tags for relative and absolute quantitation

M/Z:

mass-to-charge

PTM:

Post-Translational Modification

RT:

retention time

SILAC:

Stable isotope labeling by amino acids in cell culture

SRM:

Selected Reaction Monitoring

TB:

terabyte

TPP:

Trans-Proteomic Pipeline

Author information

Corresponding author

Correspondence to Sven Nahnsen.

Copyright information

© 2016 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Codrea, M.C., Nahnsen, S. (2016). Platforms and Pipelines for Proteomics Data Analysis and Management. In: Mirzaei, H., Carrasco, M. (eds) Modern Proteomics – Sample Preparation, Analysis and Practical Applications. Advances in Experimental Medicine and Biology, vol 919. Springer, Cham. https://doi.org/10.1007/978-3-319-41448-5_9
