Protein Identification by Spectral Networks Analysis

  • Protocol
Bioinformatics for Comparative Proteomics

Part of the book series: Methods in Molecular Biology (MIMB, volume 694)

Abstract

While advances in tandem mass spectrometry (MS/MS) steadily increase the rate at which MS/MS spectra are generated, standard algorithmic approaches to peptide identification recently seemed to be reaching the limit of the information that can be extracted from these spectra. A closer look reveals a common limiting practice: each spectrum is analyzed in isolation, even though high-throughput mass spectrometry regularly generates many spectra from related peptides. Capitalizing on this redundancy, we show that, much like protein sequences, unidentified MS/MS spectra can be aligned to identify modified and unmodified variants of the same peptide. Moreover, this alignment procedure can be iterated to accurately group multiple modification variants of the same peptide. Furthermore, combining shotgun proteomics with the alignment of spectra from overlapping peptides led to the development of Shotgun Protein Sequencing. Similarly to the assembly of DNA reads into whole genomic sequences, we show that assembly of MS/MS spectra enables the highest ever de novo sequencing accuracy while recovering nearly complete protein sequences. We further show that shotgun protein sequencing has the potential to overcome the limitations of current protein sequencing approaches and thus catalyze otherwise impractical applications of proteomics methodologies in studies of unknown proteins.
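The idea of aligning the spectra of a modified and an unmodified variant of the same peptide can be illustrated with a toy sketch. It is not the chapter's algorithm. It assumes idealized spectra given as lists of nominal prefix masses, and the function name `shifted_match_count`, the tolerance `tol`, and the example masses are all hypothetical. Peaks before the modification site line up at offset 0, while peaks after it line up at an offset equal to the difference in parent masses.

```python
def shifted_match_count(spec_a, spec_b, parent_a, parent_b, tol=0.5):
    """Count peaks of spec_a that have a partner in spec_b either at the
    same mass or shifted by the parent mass difference (toy scoring only)."""
    delta = parent_b - parent_a  # assumed to equal the modification mass
    matched = 0
    for mass in spec_a:
        if any(
            abs(other - mass) <= tol or abs(other - (mass + delta)) <= tol
            for other in spec_b
        ):
            matched += 1
    return matched


# Hypothetical peptide with nominal residue masses 57, 71, 87, 97, 99 (sum 411).
unmodified = [57, 128, 215, 312]  # prefix masses
# Same peptide with a hypothetical +16 Da modification on the third residue.
modified = [57, 128, 231, 328]

with_shift = shifted_match_count(unmodified, modified, 411, 427)
without_shift = shifted_match_count(unmodified, modified, 411, 411)
print(with_shift, without_shift)  # prints "4 2"
```

Allowing the second offset recovers the peaks that a plain shared-peak count misses, which is what makes the two variants recognizable as a pair.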

Notes

  1. We remark that the term precursor mass is commonly used to denote the quantity \(\frac{M+18+Z}{Z}\), where M is a peptide’s parent mass and Z its parent charge.
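As a small worked illustration of the note above, the quantity can be computed directly. The function name `precursor_mz` and its `water` and `proton` parameters are assumptions introduced here. The defaults use the nominal values 18 and 1 from the note, and a caller could pass the monoisotopic values (about 18.010565 for water and 1.007276 for a proton) instead.

```python
def precursor_mz(parent_mass: float, charge: int,
                 water: float = 18.0, proton: float = 1.0) -> float:
    """Return (M + 18 + Z) / Z using the nominal constants from the note.

    parent_mass is M and charge is Z. The water and proton parameters are
    an assumption of this sketch so that more precise masses can be used.
    """
    if charge <= 0:
        raise ValueError("charge must be a positive integer")
    return (parent_mass + water + charge * proton) / charge


# M = 1000 and Z = 2 give (1000 + 18 + 2) / 2
print(precursor_mz(1000.0, 2))  # prints 510.0
```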

Author information

Corresponding author

Correspondence to Nuno Bandeira.

Copyright information

© 2011 Springer Science+Business Media, LLC

About this protocol

Cite this protocol

Bandeira, N. (2011). Protein Identification by Spectral Networks Analysis. In: Wu, C., Chen, C. (eds) Bioinformatics for Comparative Proteomics. Methods in Molecular Biology, vol 694. Humana Press. https://doi.org/10.1007/978-1-60761-977-2_11

  • DOI: https://doi.org/10.1007/978-1-60761-977-2_11

  • Publisher Name: Humana Press

  • Print ISBN: 978-1-60761-976-5

  • Online ISBN: 978-1-60761-977-2

  • eBook Packages: Springer Protocols
