A Bayesian Approach to Protein Inference Problem in Shotgun Proteomics

  • Yong Fuga Li
  • Randy J. Arnold
  • Yixue Li
  • Predrag Radivojac
  • Quanhu Sheng
  • Haixu Tang
Part of the Lecture Notes in Computer Science book series (LNCS, volume 4955)


The protein inference problem represents a major challenge in shotgun proteomics. Here we describe a novel Bayesian approach to this problem that incorporates predicted peptide detectabilities as the prior probabilities of peptide identification. Our model removes some unrealistic assumptions used in previous approaches and provides a rigorous probabilistic solution. We tested our method on a complex synthetic protein mixture and obtained promising results.


Keywords: Bayesian Model, Protein Inference, Tandem Mass Spectrum, Proteomics Experiment, Shotgun Proteomics
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.




Copyright information

© Springer-Verlag Berlin Heidelberg 2008

Authors and Affiliations

  • Yong Fuga Li (1)
  • Randy J. Arnold (2)
  • Yixue Li (3)
  • Predrag Radivojac (1)
  • Quanhu Sheng (1, 3)
  • Haixu Tang (1)
  1. School of Informatics, Indiana University, Bloomington, USA
  2. Department of Chemistry, Indiana University, Bloomington, USA
  3. Key Lab of Systems Biology, Shanghai Institutes for Biological Sciences, Chinese Academy of Sciences, Shanghai, China
