Abstract
This chapter addresses the analysis of matrix-assisted laser desorption/ionization (MALDI) mass spectrometry (MS) data for disease biomarker discovery. We first present a general framework for MS data analysis and then focus on several key steps. We then show application examples using an ovarian cancer serum dataset. Finally, we discuss the limitations of current approaches and possible directions for future research.
Copyright information
© 2006 Humana Press Inc.
About this protocol
Cite this protocol
Yu, W. et al. (2006). MALDI-MS Data Analysis for Disease Biomarker Discovery. In: New and Emerging Proteomic Techniques. Methods in Molecular Biology™, vol 328. Humana Press. https://doi.org/10.1385/1-59745-026-X:199
DOI: https://doi.org/10.1385/1-59745-026-X:199
Publisher Name: Humana Press
Print ISBN: 978-1-58829-519-4
Online ISBN: 978-1-59745-026-3
eBook Packages: Springer Protocols