PRIME: A Mass Spectrum Data Mining Tool for De Nova Sequencing and PTMs Identification

Yan, Bo; Qu, You-Xing; Mao, Feng-Lou; Olman, Victor N.; Xu, Ying

doi:10.1007/s11390-005-0483-5

Regular Paper
Published: July 2005

Volume 20, pages 483–490, (2005)
Journal of Computer Science and Technology

Abstract

De novo sequencing is one of the most promising proteomics techniques for identifying protein post-translational modifications (PTMs) in studies of protein regulation and function. We have developed a computational tool, PRIME, for identifying b and y ions in tandem mass spectra, a key challenge in de novo sequencing. PRIME exploits the observation that mass differences between ions of the same type and between ions of different types follow different distributions, and uses it to separate b ions from y ions. We formulate the separation as a graph partition problem and solve it rigorously and efficiently with an integer linear programming algorithm. The performance of PRIME has been demonstrated on a large number of simulated tandem mass spectra derived from the yeast genome, and its power to detect PTMs has been tested on 216 simulated phosphopeptides.
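
To make the mass-difference feature concrete, the following Python sketch builds the theoretical singly charged b- and y-ion ladders of a peptide and checks two standard relations: consecutive ions of the same type differ by one residue mass, while complementary ions of different types (b_i and y_{n-i}) sum to a constant. It is only an illustration under stated assumptions and is not PRIME's graph partition or integer programming algorithm. The peptide `GASPVLK` and the function names are hypothetical; the residue masses are standard monoisotopic values.

```python
# Standard monoisotopic residue masses in daltons (subset, for illustration).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "L": 113.08406, "K": 128.09496, "E": 129.04259,
}
WATER = 18.01056   # mass of H2O
PROTON = 1.00728   # mass of a proton (charge carrier)


def b_ions(peptide):
    """Singly charged b-ion m/z values b_1..b_{n-1} (N-terminal prefixes)."""
    masses, total = [], 0.0
    for aa in peptide[:-1]:
        total += RESIDUE_MASS[aa]
        masses.append(total + PROTON)
    return masses


def y_ions(peptide):
    """Singly charged y-ion m/z values y_1..y_{n-1} (C-terminal suffixes)."""
    masses, total = [], 0.0
    for aa in reversed(peptide[1:]):
        total += RESIDUE_MASS[aa]
        masses.append(total + WATER + PROTON)
    return masses


def neutral_mass(peptide):
    """Neutral monoisotopic mass of the intact peptide."""
    return sum(RESIDUE_MASS[aa] for aa in peptide) + WATER


peptide = "GASPVLK"  # hypothetical example peptide, not from the paper
b, y = b_ions(peptide), y_ions(peptide)
n = len(peptide)

# Same type: consecutive b ions differ by exactly one residue mass.
b_steps = [b[i + 1] - b[i] for i in range(len(b) - 1)]

# Different types: b_i + y_{n-i} equals the neutral mass plus two protons.
pair_sums = [b[i] + y[n - 2 - i] for i in range(n - 1)]

print(b_steps)
print(pair_sums)
```

In a real spectrum the ion types are unknown, peaks are missing, and noise peaks are present, which is why a global formulation such as the graph partition described in the abstract is needed rather than these exact identities.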


Author information

Authors and Affiliations

Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, University of Georgia, Athens, GA, 30602, U.S.A.
Bo Yan, You-Xing Qu, Feng-Lou Mao, Victor N. Olman & Ying Xu
Computational Biology Institute, Oak Ridge National Laboratory, Oak Ridge, TN, 37831, U.S.A.
Ying Xu

Corresponding author

Correspondence to Ying Xu.

Additional information

This research was supported in part by the U.S. National Science Foundation (Grant Nos. NSF/DBI-0354771 and NSF/ITR-IIS-0407204). It was also funded in part by the U.S. Department of Energy's Genomes to Life program (http://doegenomestolife.org/) under the project “Carbon Sequestration in Synechococcus sp.: From Molecular Machines to Hierarchical Modeling” (www.genomes2life.org).

Bo Yan received his Ph.D. degree in chemistry from Peking University. He is now working in the Computational Systems Biology Lab at the University of Georgia, USA. His research interests include Monte Carlo simulations, graph theory, computational biology/chemistry, and bioinformatics.

You-Xing Qu received his Ph.D. degree in biophysics from Peking University, China. Currently he is working in the Computational Systems Biology Lab at the University of Georgia, USA. His research interests include computational biology, protein folding, structural biology, and biophysics.

Feng-Lou Mao received his Ph.D. degree in computational chemistry from Peking University in 2001. He is now a postdoctoral researcher at the University of Georgia, USA. His current research interests include bioinformatics, systems biology, and computational biology.

Victor N. Olman is a Senior Research Scientist in the Biochemistry and Molecular Biology Department of UGA. He received his Ph.D. degree in mathematics from St. Petersburg University, Russia. His main interests are mathematical applications in bioinformatics, including methods of mathematical statistics, graph theory, and the simulation and modeling of dynamic systems. He is a member of the American Statistical Association.

Ying Xu is a chair professor of bioinformatics and computational biology in the Biochemistry and Molecular Biology Department and the director of the Institute of Bioinformatics, University of Georgia, USA. Before joining UGA in September 2003, he was a senior staff scientist and group leader at Oak Ridge National Laboratory, USA, where he still holds a joint position. He also holds guest or research professor positions at the University of Tennessee at Knoxville, USA, and at Jilin University and Zhejiang University, China, as well as an adjunct professor position in the Computer Science Department of UGA. Ying Xu received his undergraduate and graduate education in computer science from Jilin University and his Ph.D. degree in theoretical computer science from the University of Colorado at Boulder, USA, in 1991. He is interested in both bioinformatics tool development and the study of biological problems using in silico approaches. His current research interests include (a) computational inference and modeling of biological pathways and networks, (b) protein structure prediction and modeling, (c) large-scale biological data mining, and (d) microbial and cancer bioinformatics.

About this article

Cite this article

Yan, B., Qu, YX., Mao, FL. et al. PRIME: A Mass Spectrum Data Mining Tool for De Nova Sequencing and PTMs Identification. J Comput Sci Technol 20, 483–490 (2005). https://doi.org/10.1007/s11390-005-0483-5

Revised: 04 November 2004
Issue Date: July 2005
DOI: https://doi.org/10.1007/s11390-005-0483-5
