Markov Additive Chains and Applications to Fragment Statistics for Peptide Mass Fingerprinting

Kaltenbach, Hans-Michael; Böcker, Sebastian; Rahmann, Sven

doi:10.1007/978-3-540-73060-6_3

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4532)

Abstract

Peptide mass fingerprinting is a technique to identify a protein from its fragment masses obtained by mass spectrometry after enzymatic digestion. Recently, much attention has been given to the question of how to evaluate the significance of identifications; results have been developed mostly from a combinatorial perspective. In particular, existing methods generally do not capture the fact that the same amino acid can have different masses because of, e.g., isotopic distributions or variable chemical modifications.

We offer several new contributions to the field: We introduce probabilistically weighted alphabets, where each character can have different masses according to a probability distribution, and random weighted strings as a fundamental model for random proteins. We develop a general computational framework, Markov Additive Chains, for various statistics of cleavage fragments of random proteins, and obtain general formulas for these statistics. Special results are given for so-called standard cleavage schemes (e.g., Trypsin). Computational results are provided, as well as a comparison to proteins from the SwissProt database.
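The idea of a probabilistically weighted alphabet can be illustrated with a minimal sketch. It is not the authors' Markov Additive Chain framework. The toy alphabet `WEIGHTS`, its integer masses and probabilities, and the helper names `cleave` and `mass_distribution` are all assumptions made for illustration. The cleavage rule (cut after K or R) is a simplified trypsin-like scheme.

```python
from collections import defaultdict

# Hypothetical toy alphabet: each character maps to {integer mass: probability}.
# The numbers are illustrative only, not real isotope or modification data.
WEIGHTS = {
    "G": {57: 1.0},
    "A": {71: 0.9, 72: 0.1},
    "K": {128: 0.8, 129: 0.2},
}


def cleave(sequence, cleave_after=("K", "R")):
    """Split a string after every cleavage character (simplified trypsin-like rule)."""
    fragments, current = [], ""
    for ch in sequence:
        current += ch
        if ch in cleave_after:
            fragments.append(current)
            current = ""
    if current:  # trailing fragment without a cleavage character
        fragments.append(current)
    return fragments


def mass_distribution(fragment, weights=WEIGHTS):
    """Exact distribution of a fragment's total mass, by repeated convolution."""
    dist = {0: 1.0}
    for ch in fragment:
        nxt = defaultdict(float)
        for mass, p in dist.items():
            for w, q in weights[ch].items():
                nxt[mass + w] += p * q
        dist = dict(nxt)
    return dist


fragments = cleave("GAKAGK")
dist = mass_distribution(fragments[0])
```

Here the single fragment `GAK` no longer has one mass but a distribution over several masses, which is the effect the abstract says purely combinatorial methods do not capture.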

Author information

Authors and Affiliations

Graduate School in Bioinformatics and Genome Research, Bielefeld University
Hans-Michael Kaltenbach & Sven Rahmann
Lehrstuhl für Bioinformatik, Friedrich-Schiller-University Jena, Ernst-Abbe-Platz 2, D-07743 Jena
Sebastian Böcker
Algorithms and Statistics for Systems Biology group, Faculty of Technology, Bielefeld University, D-33594 Bielefeld
Sven Rahmann

Editor information

Trey Ideker, Vineet Bafna

Cite this paper

Kaltenbach, H.-M., Böcker, S., Rahmann, S. (2007). Markov Additive Chains and Applications to Fragment Statistics for Peptide Mass Fingerprinting. In: Ideker, T., Bafna, V. (eds) Systems Biology and Computational Proteomics. RSB 2006, RCP 2006. Lecture Notes in Computer Science, vol 4532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73060-6_3

DOI: https://doi.org/10.1007/978-3-540-73060-6_3
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-73059-0
Online ISBN: 978-3-540-73060-6
eBook Packages: Computer Science, Computer Science (R0)
