A High-Throughput Bioinformatics Platform for Mass Spectrometry-Based Proteomics

  • Thodoros Topaloglou
  • Moyez Dharsee
  • Rob M. Ewing
  • Yury Bukhman
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4544)


The success of mass spectrometry-based proteomics in emerging applications such as biomarker discovery and clinical diagnostics is predicated substantially on its ability to meet growing demands for throughput. Support for high throughput implies sophisticated tracking of experiments and experimental steps, larger amounts of data to be organized and summarized, more complex algorithms for inferring and tracking protein expression across multiple experiments, statistical methods to assess data quality, and a streamlined proteomics-centric bioinformatics environment to establish the biological context and relevance of the experimental measurements. This paper presents a bioinformatics platform that was built for an industrial mass spectrometry-based proteomics laboratory focusing on biomarker discovery. The basis of the platform is a robust and scalable information management environment, supported by database and workflow management technology, that is employed to integrate heterogeneous data, applications and processes across the entire laboratory workflow. This paper focuses on selected features of the platform, which include: (a) a method for improving the accuracy of protein assignment, (b) novel software tools for protein expression analysis that combine differential MS quantitation with tandem MS for peptide identification, and (c) integration of methods to help assess the biological relevance and statistical significance of differentially expressed proteins.


Elution Time Protein Inference Mass Spectrum Spectrum Isotopic Cluster Monoisotopic Peak 
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.





Copyright information

© Springer Berlin Heidelberg 2007

Authors and Affiliations

  • Thodoros Topaloglou (1)
  • Moyez Dharsee (2)
  • Rob M. Ewing (2)
  • Yury Bukhman (3)

  1. Information Engineering, Dept. of Mechanical & Industrial Eng., University of Toronto
  2. Infochromics, MaRS Discovery District, Toronto
  3. Campbell Family Institute for Breast Cancer Research, University Health Network, Toronto
