Data Management and Data Integration in the HUPO Plasma Proteome Project

Data Mining in Proteomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 696)

Abstract

The Human Plasma Proteome Project (HPPP) is an international collaboration coordinated by the Human Proteome Organisation (HUPO). Its Pilot Phase generated the 2005 Proteomics special issue “Exploring the Human Plasma Proteome” (Omenn et al. Proteomics 5:3226–3245, 2005) and a book with the same title (Omenn GS (ed) (2006) Exploring the human plasma proteome. Wiley-Liss, Weinheim, pp 372). Data management for the Pilot Phase covered the collection, integration, analysis, and dissemination of findings from participating laboratories and data repositories. Many investigators face the same challenges in integrating data from complex, dynamic serum and plasma specimens. The HPPP workflow assembled a representative Core Dataset of 3,020 protein identifications, resolving ambiguity and redundancy in the heterogeneous contributed identifications as well as redundancy and updates in the protein sequence databases. The University of Michigan made the results available at alternative thresholds, which yield a range of numbers of protein identifications. Data were submitted to EBI/PRIDE and to ISB/PeptideAtlas. The current phase of the HPPP employs ProteomeXchange to link submission of well-annotated primary datasets to EBI/PRIDE, distributed file sharing by Tranche/ProteomeCommons.org, and reanalysis from the primary raw spectra at ISB/PeptideAtlas. Such human plasma proteome datasets are available for data mining comparisons with the proteomes of other organs and biofluids in health and disease.
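To make the integration problem concrete, the following is a minimal sketch, not the project's actual pipeline. It assumes a hypothetical record layout (laboratory, peptide sequence, candidate protein accessions) and an illustrative rule: each peptide is credited to a single representative accession, and proteins are then counted at several minimum-peptide thresholds. The records, accessions, function names, and thresholds are all invented for illustration.

```python
from collections import defaultdict

# Hypothetical records: (laboratory, peptide sequence, candidate accessions).
# A peptide that matches several database entries is an ambiguous identification.
IDENTIFICATIONS = [
    ("lab_A", "PEPTIDEA", ("P1", "P2")),
    ("lab_A", "PEPTIDEB", ("P1",)),
    ("lab_B", "PEPTIDEA", ("P1", "P2")),  # same peptide reported by a second lab
    ("lab_B", "PEPTIDEC", ("P3",)),
    ("lab_C", "PEPTIDED", ("P3", "P4")),
    ("lab_C", "PEPTIDEE", ("P5",)),
]


def peptides_per_protein(identifications):
    """Map each accession to the set of distinct peptides that match it."""
    support = defaultdict(set)
    for _lab, peptide, accessions in identifications:
        for accession in accessions:
            support[accession].add(peptide)
    return support


def representative_proteins(identifications):
    """Credit every peptide to exactly one representative accession.

    Accessions are ranked once by total peptide support (ties broken by
    accession so the result is deterministic); each accession then claims
    the peptides that no higher-ranked accession has already claimed.
    """
    support = peptides_per_protein(identifications)
    ranked = sorted(support, key=lambda acc: (-len(support[acc]), acc))
    remaining = {peptide for _lab, peptide, _accs in identifications}
    assigned = {}
    for accession in ranked:
        claimed = support[accession] & remaining
        if claimed:
            assigned[accession] = claimed
            remaining -= claimed
    return assigned


def count_at_thresholds(assigned, thresholds=(1, 2, 3)):
    """Count proteins supported by at least t distinct peptides, for each t."""
    return {
        t: sum(1 for peptides in assigned.values() if len(peptides) >= t)
        for t in thresholds
    }


if __name__ == "__main__":
    proteins = representative_proteins(IDENTIFICATIONS)
    print(count_at_thresholds(proteins))
```

In this toy input, P2 and P4 are supported only by peptides already credited to P1 and P3, so they drop out as redundant. Raising the minimum number of peptides then shrinks the list further, which is the sense in which alternative thresholds give a range of protein counts.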


References

  1. Omenn GS, States DJ, Adamski MR, Blackwell TW, Menon R, Hermjakob H et al (2005) Overview of the HUPO plasma proteome project: results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database. Proteomics 5:3226–3245

  2. Omenn GS (ed) (2006) Exploring the human plasma proteome. Wiley-Liss, Weinheim, p 372

  3. Adamski M, Blackwell T, Menon R, Martens L, Hermjakob H, Taylor C et al (2005) Data management and preliminary data analysis in the pilot phase of the HUPO Plasma Proteome Project. Proteomics 5:3246–3261

  4. Carr S, Aebersold R, Baldwin M, Burlingame A, Clauser K, Nesvizhskii A (2004) The need for guidelines in publication of peptide and protein identification data: Working Group on Publication Guidelines for Peptide and Protein Identification Data. Mol Cell Proteomics 3:531–533

  5. Nesvizhskii AI, Keller A, Kolker E, Aebersold R (2003) A statistical model for identifying proteins by tandem mass spectrometry. Anal Chem 75:4646–4658

  6. Kersey PJ, Duarte J, Williams A, Karavidopoulou Y, Birney E, Apweiler R (2004) The International Protein Index: an integrated database for proteomics experiments. Proteomics 4:1985–1988

  7. Hamacher M, Apweiler R, Arnold G, Becker A, Blüggel M, Carrette O et al (2006) HUPO brain proteome project: summary of the pilot phase and introduction of a comprehensive data reprocessing strategy. Proteomics 6:4890–4898

  8. States DJ, Omenn GS, Blackwell TW, Fermin D, Eng J, Speicher DW, Hanash SM (2006) Challenges in deriving high-confidence protein identifications from data gathered by HUPO plasma proteome collaborative study. Nat Biotechnol 24:333–338

  9. Kapp EA, Schütz F, Connolly LM, Chakel JA, Meza JE, Miller CA et al (2005) An evaluation, comparison and accurate benchmarking of several publicly-available MS/MS search algorithms: sensitivity and specificity analysis. Proteomics 5:3475–3490

  10. Haab BB, Geierstanger BH, Michailidis G, Vitzthum F, Forrester S, Okon R et al (2005) Immunoassay and antibody microarray analysis of the HUPO PPP reference specimens: systematic variation between sample types and calibration of mass spectrometry data. Proteomics 5:3278–3291

  11. Yan W, Apweiler R, Balgley BM, Boontheung P, Bundy JL, Cargile BJ et al (2009) Systematic comparison of the human saliva and plasma proteomes. Proteomics Clin Appl 3:116–134

  12. Deutsch EW (2010) The PeptideAtlas Project. Methods Mol Biol 604:285–296

  13. Deutsch EW, Lam H, Aebersold R (2008) PeptideAtlas: a resource for target selection for emerging targeted proteomics workflows. EMBO Rep 9:429–434

  14. Deutsch EW, Eng JK, Zhang H, King NL, Nesvizhskii AI, Lin B et al (2005) Human Plasma PeptideAtlas. Proteomics 5:3497–3500

  15. Lam H, Deutsch EW, Eddes JS, Eng JK, King N, Stein SE, Aebersold R (2007) Development and validation of a spectral library searching method for peptide identification from MS/MS. Proteomics 7:655–667

  16. Omenn GS, Aebersold R, Paik YK (2007) HUPO plasma proteome project 2007 workshop report. Mol Cell Proteomics 6:2252–2253

  17. Omenn GS, Menon R, Adamski M, Blackwell T, Haab BB, Gao W, States DJ (2007) The human plasma and serum proteome. In: Thongboonkerd V (ed) Proteomics of human body fluids: principles, methods, and applications. Humana Press, Totowa, NJ, pp 195–224

  18. Omenn GS, Aebersold R, Paik YK (2009) 7th HUPO world congress of proteomics: launching the second phase of the HUPO plasma proteome project (PPP-2) 16–20 August 2008, Amsterdam, The Netherlands. Proteomics 9:4–6

  19. HUPO – the Human Proteome Organisation (2010) A Gene-centric Human Proteome Project. Mol Cell Proteomics 9:427–429

  20. Gelfand C, Omenn GS (2010) Pre-analytical variables for plasma and serum proteome analyses. In: Ivanov A, Lazarev A (eds) Sample preparation in biological mass spectrometry. Springer, NY (in press)

  21. Rai AJ, Gelfand CA, Haywood BC, Warunek DJ, Yi J, Schuchard MD et al (2005) HUPO plasma proteome project specimen collection and handling: towards the standardization of parameters for plasma proteome samples. Proteomics 5:3262–3277

Acknowledgments

I thank all the investigators and core staff for the Pilot Phase of the HUPO HPPP (see (1)) and especially the bioinformatics team headquartered at the University of Michigan who were coauthors on the original description of the Data Management and Data Integration plan for this project: Marcin Adamski, Thomas Blackwell, Rajasree Menon, and David States of the University of Michigan and Lennart Martens, Chris Taylor, and Henning Hermjakob of the European Bioinformatics Institute (see (3)). I thank Eric Deutsch and Terry Farrah of the Institute for Systems Biology for the current data from the PeptideAtlas Human Plasma Proteome build and for review of the manuscript.

Author information

Corresponding author

Correspondence to Gilbert S. Omenn.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Omenn, G.S. (2011). Data Management and Data Integration in the HUPO Plasma Proteome Project. In: Hamacher, M., Eisenacher, M., Stephan, C. (eds) Data Mining in Proteomics. Methods in Molecular Biology, vol 696. Humana Press. https://doi.org/10.1007/978-1-60761-987-1_15

  • DOI: https://doi.org/10.1007/978-1-60761-987-1_15

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-986-4

  • Online ISBN: 978-1-60761-987-1

  • eBook Packages: Springer Protocols