Mass Spectrometry Analysis Using MALDIquant

Part of the book series: Frontiers in Probability and the Statistical Sciences (FROPROSTAS)

Abstract

MALDIquant and associated R packages provide a versatile and completely free open-source platform for analyzing 2D mass spectrometry data as generated, for instance, by MALDI and SELDI instruments. We first describe the various methods and algorithms available in MALDIquant. Subsequently, we illustrate a typical analysis workflow using MALDIquant by investigating an experimental cancer data set, starting from raw mass spectrometry measurements and ending at multivariate classification.

Author information

Corresponding author

Correspondence to Korbinian Strimmer.

Copyright information

© 2017 Springer International Publishing Switzerland

About this chapter

Cite this chapter

Gibb, S., Strimmer, K. (2017). Mass Spectrometry Analysis Using MALDIquant. In: Datta, S., Mertens, B. (eds) Statistical Analysis of Proteomics, Metabolomics, and Lipidomics Data Using Mass Spectrometry. Frontiers in Probability and the Statistical Sciences. Springer, Cham. https://doi.org/10.1007/978-3-319-45809-0_6
