A Fragmentation Event Model for Peptide Identification by Mass Spectrometry
We present a novel fragmentation event model for peptide identification by tandem mass spectrometry. Most current peptide identification techniques suffer from inaccuracies in the predicted theoretical spectrum, which stem from an insufficient understanding of the ion generation process, especially the b/y ratio puzzle.
To overcome this difficulty, we propose a fragmentation event model based on the abundance of fragmentation events rather than on ion intensities. Experimental results demonstrate that this model improves database searching methods. On the LTQ data set, with the false-positive rate controlled at 5%, our fragmentation event model achieves a significantly higher true positive rate (0.83) than SEQUEST (0.73). Comparison with Mascot yields similar results, which indicates that our model can effectively identify the false-positive peptide-spectrum pairs reported by SEQUEST and Mascot.
This fragmentation event model can also be used to address the missing-peak problem encountered by de novo methods. To our knowledge, this is the first time the fragmentation preference of peptide bonds has been used to overcome the missing-peak difficulty.
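The comparison above reports a true positive rate at a fixed 5% false-positive rate. As a minimal sketch of how such a figure can be computed from scored peptide-spectrum pairs, the Python function below sweeps a score threshold and returns the best true positive rate whose false-positive rate stays within the limit. The function name `tpr_at_fpr`, the convention that higher scores are better, and the toy data are assumptions made for illustration; they are not taken from the paper, and the paper's own evaluation procedure may differ.

```python
def tpr_at_fpr(scores, labels, max_fpr=0.05):
    """Highest true positive rate achievable with FPR <= max_fpr.

    scores: one score per peptide-spectrum pair (assumed: higher is better).
    labels: True for a correct pair, False for an incorrect one.
    """
    n_pos = sum(1 for y in labels if y)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one correct and one incorrect pair")

    # Rank pairs from the highest score to the lowest.
    ranked = sorted(zip(scores, labels), key=lambda p: -p[0])

    tp = fp = 0
    best_tpr = 0.0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Accept every pair tied at this score, so a threshold never splits ties.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        if fp / n_neg > max_fpr:
            break  # lowering the threshold further only adds false positives
        best_tpr = tp / n_pos
    return best_tpr


# Toy data, invented for illustration only.
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
toy_labels = [True, True, False, True, False, False]
print(tpr_at_fpr(toy_scores, toy_labels, max_fpr=0.05))
```

Under this convention, two scoring functions can be compared on the same labelled pairs by calling the function once per set of scores with the same `max_fpr`.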
Keywords: True Positive Rate, Relative Entropy, Tandem Mass Spectrum, Theoretical Spectrum, Fragmentation Event