Proteome discovery pipeline for mass spectrometry-based proteomics
Keywords: Linear Discriminant Analysis, Statistical Significance Test, Canonical Discriminant Analysis, Dynamic Visualization, Molecular Correlation
XMass uses chemical noise filtering, charge state fitting, and de-isotoping to improve the analysis of complex peptide samples. Overlapping peptide signals in mass spectra are deconvoluted by correlating them with modeled peptide isotopic peak profiles. These profiles are generated in silico from a protein database to produce reference model distributions.
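As a rough illustration of charge state fitting, the sketch below infers a charge state from the spacing of isotopic peaks, which is approximately 1.003355/z in m/z units. It is not the XMass algorithm; the function names, the `max_charge` limit, and the example peak values are assumptions made for illustration.

```python
C13_C12_DELTA = 1.003355  # mass difference between 13C and 12C, in Da
PROTON_MASS = 1.007276    # mass of a proton, in Da


def estimate_charge(mz_peaks, max_charge=6):
    """Estimate the charge state from the mean spacing of isotopic peaks.

    Hypothetical helper: assumes the peaks belong to one isotopic cluster.
    """
    peaks = sorted(mz_peaks)
    if len(peaks) < 2:
        raise ValueError("need at least two isotopic peaks")
    spacings = [b - a for a, b in zip(peaks, peaks[1:])]
    mean_spacing = sum(spacings) / len(spacings)
    z = round(C13_C12_DELTA / mean_spacing)
    # Clamp to a plausible range of charge states.
    return max(1, min(max_charge, z))


def neutral_monoisotopic_mass(mono_mz, charge):
    """Convert a monoisotopic m/z and charge into a neutral mass (de-isotoped)."""
    return (mono_mz - PROTON_MASS) * charge


# Illustrative cluster with ~0.5 m/z spacing, i.e. a doubly charged peptide.
cluster = [500.0, 500.5017, 501.0034]
charge = estimate_charge(cluster)
mass = neutral_monoisotopic_mass(cluster[0], charge)
```

A real implementation would also have to separate overlapping clusters before fitting, which this sketch does not attempt.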
XAlign is a two-step alignment algorithm. The first step detects significant peaks that are common to all samples. The second step aligns all samples to the median sample using refined m/z and retention time variation values, applying pattern recognition as needed.
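The idea of aligning a sample to a reference can be sketched as follows, assuming peaks are `(mz, rt)` tuples and that a single constant retention time shift is adequate. This is a simplified stand-in, not XAlign itself: the function name, the `mz_tol` tolerance, and the use of a median offset are assumptions, and the selection of the median sample is left out.

```python
from statistics import median


def align_to_reference(sample, reference, mz_tol=0.01):
    """Shift the retention times of `sample` toward `reference`.

    Peaks are matched by m/z within `mz_tol`; the median retention time
    offset over all matched peaks is applied to every peak in the sample.
    """
    offsets = []
    for mz, rt in sample:
        # Collect reference peaks whose m/z lies within the tolerance.
        matches = [
            (abs(mz - ref_mz), ref_rt)
            for ref_mz, ref_rt in reference
            if abs(mz - ref_mz) <= mz_tol
        ]
        if matches:
            _, ref_rt = min(matches)  # closest reference peak in m/z
            offsets.append(ref_rt - rt)
    shift = median(offsets) if offsets else 0.0
    aligned = [(mz, rt + shift) for mz, rt in sample]
    return aligned, shift


# Illustrative values only.
reference = [(500.0, 10.0), (600.0, 20.0), (700.0, 30.0)]
sample = [(500.001, 10.5), (600.002, 20.5), (700.0, 30.4)]
aligned, shift = align_to_reference(sample, reference)
```

Using the median offset keeps a few mismatched peaks from dominating the estimated shift.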
Several normalization methods have been developed for proteomics, including auto-scaling, reference sample, log linear model, trimmed constant mean, and average intensity.
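Two of these methods can be sketched briefly, assuming common textbook definitions: auto-scaling as mean-centering followed by division by the standard deviation, and average intensity normalization as rescaling a sample so that its mean intensity equals a target value. The function names and the `target` parameter are hypothetical, and the pipeline's own definitions may differ in detail.

```python
from statistics import mean, pstdev


def auto_scale(values):
    """Mean-center a feature and scale it to unit (population) variance."""
    mu = mean(values)
    sd = pstdev(values)
    if sd == 0:
        # A constant feature carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sd for v in values]


def average_intensity_normalize(intensities, target=1.0):
    """Rescale a sample so that its mean intensity equals `target`."""
    avg = mean(intensities)
    if avg == 0:
        raise ValueError("mean intensity is zero; cannot normalize")
    return [v * target / avg for v in intensities]


scaled = auto_scale([1.0, 2.0, 3.0])
normalized = average_intensity_normalize([2.0, 4.0, 6.0])
```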
Statistical significance tests
Several test methods (two-tailed t-test, one-way ANOVA, the Kolmogorov-Smirnov test, and the Mann-Whitney test) can be used to identify data elements that make large contributions to the protein profile of a sample or that distinguish groups of samples from one another.
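As one example of such a test, the following is a minimal sketch of a two-sided Mann-Whitney test for a single data element measured in two groups. It assumes the normal approximation without tie or continuity correction, which is only reasonable for moderately large groups; the function name and the example intensities are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation.

    Returns (U, p). No tie or continuity correction is applied.
    """
    n1, n2 = len(x), len(y)
    # Count pairs in which the x value exceeds the y value; ties count half.
    u1 = sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)
    u2 = n1 * n2 - u1
    u = min(u1, u2)
    mu = n1 * n2 / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma  # z <= 0 because u is the smaller statistic
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, p


# Illustrative intensities of one peak in two sample groups.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [6.0, 7.0, 8.0, 9.0, 10.0]
u_stat, p_value = mann_whitney_u(group_a, group_b)
```

In a profiling setting the test would be repeated for every data element, so some correction for multiple testing would normally follow.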
We have implemented principal component analysis (PCA), linear discriminant analysis (LDA), canonical discriminant analysis (CDA), and clustering objects on subsets of attributes (COSA) as clustering methods.
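To make the PCA step concrete, the sketch below computes only the first principal component of a small samples-by-features matrix using power iteration on the covariance matrix. This is an assumption-laden illustration rather than the implementation used in the pipeline; the function name and the `iterations` parameter are hypothetical.

```python
from math import sqrt


def first_principal_component(rows, iterations=200):
    """Return (direction, scores) for the first principal component.

    `rows` is a list of samples, each a list of feature values.
    Limitation: the fixed start vector fails if it happens to be
    orthogonal to the true component.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    v = [1.0 / sqrt(d)] * d  # normalized start vector
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    # Project each centered sample onto the component.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores


# Illustrative data lying exactly on the line y = 2x.
data = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
direction, scores = first_principal_component(data)
```

Plotting the scores of the first two components is the usual way to look for sample groupings; further components would require deflation or a full eigendecomposition.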
The software package SysNet provides a dynamic visualization environment for molecular correlation of 'omics data. SysNet visualizes 'omics expression data as a two-dimensional network with a circular layout: molecular species are represented as nodes, and all nodes are located on circles. Intermolecular correlations are represented as links, or edges, between nodes.
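A correlation network with a circular layout can be sketched as follows. This is not SysNet code: it places all nodes on a single circle, uses Pearson correlation, and links nodes whose absolute correlation reaches a hypothetical `threshold`; the species names and expression profiles are made up for illustration.

```python
from math import cos, pi, sin, sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length profiles (0.0 if undefined)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / sqrt(var_a * var_b)


def circular_layout(names, radius=1.0):
    """Place nodes at equal angular steps on one circle."""
    step = 2 * pi / len(names)
    return {
        name: (radius * cos(i * step), radius * sin(i * step))
        for i, name in enumerate(names)
    }


def correlation_edges(profiles, threshold=0.8):
    """Return (node, node, r) for pairs with |r| >= threshold."""
    names = list(profiles)
    edges = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            r = pearson(profiles[a], profiles[b])
            if abs(r) >= threshold:
                edges.append((a, b, r))
    return edges


# Illustrative expression profiles across four samples.
profiles = {
    "A": [1.0, 2.0, 3.0, 4.0],
    "B": [2.0, 4.0, 6.0, 8.0],
    "C": [4.0, 3.0, 2.0, 1.0],
    "D": [1.0, 3.0, 2.0, 3.0],
}
positions = circular_layout(list(profiles))
edges = correlation_edges(profiles)
```

The sign of `r` is kept on each edge so that positive and negative correlations could be drawn differently.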
3. Friedman JH, Meulman JJ: Clustering objects on subsets of attributes. J R Statist Soc B 2004, 66(Part 4):1–25.
This article is published under license to BioMed Central Ltd.