Markov Additive Chains and Applications to Fragment Statistics for Peptide Mass Fingerprinting

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 4532)

Abstract

Peptide mass fingerprinting is a technique to identify a protein from its fragment masses obtained by mass spectrometry after enzymatic digestion. Recently, much attention has been given to the question of how to evaluate the significance of identifications; results have been developed mostly from a combinatorial perspective. In particular, existing methods generally do not capture the fact that the same amino acid can have different masses because of, e.g., isotopic distributions or variable chemical modifications.

We offer several new contributions to the field: We introduce probabilistically weighted alphabets, where each character can have different masses according to a probability distribution, and random weighted strings as a fundamental model for random proteins. We develop a general computational framework, Markov Additive Chains, for various statistics of cleavage fragments of random proteins, and obtain general formulas for these statistics. Special results are given for so-called standard cleavage schemes (e.g., Trypsin). Computational results are provided, as well as a comparison to proteins from the SwissProt database.
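The model described above can be illustrated with a small simulation. What follows is only a minimal sketch under stated assumptions, not the paper's Markov Additive Chain framework or its formulas: the alphabet, the character frequencies, the integer masses and their probabilities are all made up for illustration, and the cleavage rule ("cut after K or R") is a simplified stand-in for a trypsin-like scheme. The function names `random_weighted_string`, `cleave` and `fragment_mass` are hypothetical.

```python
import random

# Hypothetical toy alphabet: each character maps to (mass, probability) pairs.
# The masses are made-up integers, not real amino acid masses.
WEIGHTED_ALPHABET = {
    "A": [(71, 0.9), (72, 0.1)],
    "G": [(57, 1.0)],
    "K": [(128, 0.8), (129, 0.2)],
    "R": [(156, 1.0)],
}

# Assumed character frequencies for the random string (i.i.d. model).
CHAR_PROBS = {"A": 0.4, "G": 0.3, "K": 0.2, "R": 0.1}

# Simplified trypsin-like rule (an assumption): cleave after K or R.
CLEAVAGE_CHARS = {"K", "R"}


def random_weighted_string(length, rng):
    """Draw characters i.i.d., then draw each character's mass from its distribution."""
    chars = rng.choices(list(CHAR_PROBS), weights=list(CHAR_PROBS.values()), k=length)
    result = []
    for char in chars:
        masses, probs = zip(*WEIGHTED_ALPHABET[char])
        mass = rng.choices(masses, weights=probs, k=1)[0]
        result.append((char, mass))
    return result


def cleave(weighted_string):
    """Split the string after every cleavage character; keep a trailing remainder."""
    fragments, current = [], []
    for char, mass in weighted_string:
        current.append((char, mass))
        if char in CLEAVAGE_CHARS:
            fragments.append(current)
            current = []
    if current:
        fragments.append(current)
    return fragments


def fragment_mass(fragment):
    """Fragment mass is the sum of the masses of its characters."""
    return sum(mass for _, mass in fragment)


if __name__ == "__main__":
    rng = random.Random(0)
    protein = random_weighted_string(60, rng)
    fragments = cleave(protein)
    print("number of fragments:", len(fragments))
    print("fragment masses:", [fragment_mass(f) for f in fragments])
```

Repeating the simulation many times gives empirical estimates of fragment statistics such as the number of fragments or the distribution of fragment masses. The paper instead derives such quantities analytically.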


Editor information

Trey Ideker, Vineet Bafna

Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Kaltenbach, HM., Böcker, S., Rahmann, S. (2007). Markov Additive Chains and Applications to Fragment Statistics for Peptide Mass Fingerprinting. In: Ideker, T., Bafna, V. (eds) Systems Biology and Computational Proteomics. RSB RCP 2006. Lecture Notes in Computer Science, vol 4532. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-73060-6_3

  • DOI: https://doi.org/10.1007/978-3-540-73060-6_3

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-73059-0

  • Online ISBN: 978-3-540-73060-6

  • eBook Packages: Computer Science, Computer Science (R0)
