A Database Search Algorithm for Identification of Peptides with Multiple Charges Using Tandem Mass Spectrometry

  • Conference paper
Data Mining for Biomedical Applications (BioDM 2006)

Part of the book series: Lecture Notes in Computer Science (LNBI, volume 3916)

Abstract

Peptide sequencing using tandem mass spectrometry is the process of inferring the peptide sequence from a given mass spectrum. It is an important but challenging problem in bioinformatics. Advances in mass spectrometry instruments have yielded large amounts of high-quality spectral data, but the methods that analyze these spectra to obtain peptide sequences are still not accurate enough. There are two types of peptide sequencing methods: database search methods and de novo methods. Much progress has been made, but the accuracy and efficiency of these methods are not satisfactory, and improvements are urgently needed. In this paper, we introduce a database search algorithm for sequencing peptides using tandem mass spectrometry. This Peptide Sequence Pattern (PSP) algorithm first generates peptide sequence patterns (PSPs) by connecting strong tags with mass differences. A linear-time database search then finds candidate peptide sequences that match the PSPs, and the candidates are scored by shared peaks count. The PSP algorithm is designed for peptide sequencing from multiply charged spectra, but it is also applicable to singly charged spectra. Experiments show that our algorithm obtains better sequencing results than current database search algorithms for many multiply charged spectra, and comparable results for singly charged spectra.
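The final scoring step named in the abstract, shared peaks count, can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes only b- and y-ions, a fixed m/z tolerance, and unmodified residues, and the names `theoretical_peaks`, `shared_peaks_count`, `max_charge`, and `tol` are hypothetical. Fragment charges above 1 are included because the abstract targets multiply charged spectra.

```python
import bisect

# Monoisotopic residue masses in Da for a subset of amino acids (extend as needed).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "L": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "R": 156.10111,
}
PROTON = 1.00728   # mass of a proton, Da
WATER = 18.01056   # mass of H2O, Da


def theoretical_peaks(peptide, max_charge=1):
    """Return m/z values of b- and y-ions for each cleavage site and charge."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    peaks = []
    for i in range(1, len(masses)):
        b_neutral = sum(masses[:i])           # N-terminal fragment
        y_neutral = sum(masses[i:]) + WATER   # C-terminal fragment
        for z in range(1, max_charge + 1):
            peaks.append((b_neutral + z * PROTON) / z)
            peaks.append((y_neutral + z * PROTON) / z)
    return peaks


def shared_peaks_count(peptide, observed_mz, max_charge=1, tol=0.5):
    """Count theoretical peaks that lie within tol of some observed peak."""
    observed = sorted(observed_mz)
    count = 0
    for mz in theoretical_peaks(peptide, max_charge):
        # First observed peak at or above the lower edge of the tolerance window.
        idx = bisect.bisect_left(observed, mz - tol)
        if idx < len(observed) and observed[idx] <= mz + tol:
            count += 1
    return count


# Usage: score two candidates against a synthetic spectrum of "GASK".
spectrum = theoretical_peaks("GASK", max_charge=2)
for candidate in ("GASK", "LVTE"):
    print(candidate, shared_peaks_count(candidate, spectrum, max_charge=2))
```

In a database search of this kind, each candidate retrieved by the pattern match would be scored this way and the candidates ranked by count. A real scorer would also have to handle noise peaks, intensities, and other ion types, which this sketch ignores.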

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Ning, K., Chong, K.F., Leong, H.W. (2006). A Database Search Algorithm for Identification of Peptides with Multiple Charges Using Tandem Mass Spectrometry. In: Li, J., Yang, Q., Tan, AH. (eds) Data Mining for Biomedical Applications. BioDM 2006. Lecture Notes in Computer Science, vol 3916. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11691730_2

  • DOI: https://doi.org/10.1007/11691730_2

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-33104-9

  • Online ISBN: 978-3-540-33105-6

  • eBook Packages: Computer Science, Computer Science (R0)
