Abstract
Peptide mass fingerprinting is a technique for identifying a protein from the masses of its fragments, which are obtained by mass spectrometry after enzymatic digestion. Recently, much attention has been given to the question of how to evaluate the significance of identifications; results so far have been developed mostly from a combinatorial perspective. In particular, existing methods generally do not capture the fact that the same amino acid can have different masses because of, e.g., isotopic distributions or variable chemical modifications.
We offer several new contributions to the field. We introduce probabilistically weighted alphabets, in which each character can take different masses according to a probability distribution, and random weighted strings as a fundamental model for random proteins. We develop a general computational framework, Markov Additive Chains, for various statistics of cleavage fragments of random proteins, and obtain general formulas for these statistics. Special results are given for so-called standard cleavage schemes (e.g., trypsin). Computational results are provided, as well as a comparison to proteins from the SwissProt database.
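To make the notions of a probabilistically weighted alphabet, a random weighted string, and a cleavage scheme concrete, the following is a minimal simulation sketch. It is not the Markov Additive Chain framework of the paper, which derives such statistics analytically. The five-letter alphabet, the rounded masses, the character frequencies, and the names `WEIGHTED_ALPHABET`, `CHAR_FREQ`, `random_weighted_string`, and `fragments` are all assumptions made for illustration. The cleavage rule is a simplified trypsin-like scheme (cut after every K or R), and the mass of a fragment is taken to be the plain sum of its character masses.

```python
import random

# Hypothetical toy alphabet: character -> list of (mass, probability).
# Masses are rounded illustrative values, not measured data.
WEIGHTED_ALPHABET = {
    "G": [(57.0, 1.0)],
    "A": [(71.0, 1.0)],
    "K": [(128.0, 1.0)],
    "R": [(156.0, 1.0)],
    "M": [(131.0, 0.8), (147.0, 0.2)],  # e.g. unmodified vs. modified form
}

# Assumed character frequencies for the i.i.d. random string model.
CHAR_FREQ = {"G": 0.3, "A": 0.3, "K": 0.15, "R": 0.1, "M": 0.15}

# Simplified trypsin-like scheme: cleave after each of these characters.
CLEAVAGE_CHARS = {"K", "R"}


def random_weighted_string(length, rng):
    """Draw i.i.d. characters, then draw one mass per character."""
    chars = rng.choices(list(CHAR_FREQ), weights=list(CHAR_FREQ.values()), k=length)
    result = []
    for c in chars:
        masses, probs = zip(*WEIGHTED_ALPHABET[c])
        result.append((c, rng.choices(masses, weights=probs, k=1)[0]))
    return result


def fragments(weighted_string):
    """Cut after every cleavage character; return a list of (sequence, mass)."""
    out, seq, mass = [], [], 0.0
    for c, m in weighted_string:
        seq.append(c)
        mass += m
        if c in CLEAVAGE_CHARS:
            out.append(("".join(seq), mass))
            seq, mass = [], 0.0
    if seq:  # trailing fragment without a cleavage character at its end
        out.append(("".join(seq), mass))
    return out


if __name__ == "__main__":
    rng = random.Random(0)
    protein = random_weighted_string(200, rng)
    frags = fragments(protein)
    mean_length = sum(len(s) for s, _ in frags) / len(frags)
    print(f"{len(frags)} fragments, mean length {mean_length:.2f}")
```

Repeating the simulation over many random strings would give empirical estimates of fragment statistics such as the length and mass distributions, which is the kind of quantity the paper addresses with general formulas.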
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kaltenbach, HM., Böcker, S., Rahmann, S. (2007). Markov Additive Chains and Applications to Fragment Statistics for Peptide Mass Fingerprinting. In: Ideker, T., Bafna, V. (eds) Systems Biology and Computational Proteomics. RSB RCP 2006. Lecture Notes in Computer Science, vol 4532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73060-6_3
DOI: https://doi.org/10.1007/978-3-540-73060-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73059-0
Online ISBN: 978-3-540-73060-6
eBook Packages: Computer Science, Computer Science (R0)