Classifying Mass Spectral Data Using SVM and Wavelet-Based Feature Extraction

  • Wong Liyen
  • Maybin K. Muyeba
  • John A. Keane
  • Zhiguo Gong
  • Valerie Edwards-Jones
Part of the Lecture Notes in Computer Science book series (LNCS, volume 8210)

Abstract

This paper investigates the use of support vector machines (SVM) for classifying Matrix-Assisted Laser Desorption Ionisation (MALDI) Time-Of-Flight (TOF) mass spectra. MALDI-TOF screening is a simple and useful technique for rapidly identifying microorganisms and classifying them into specific subtypes. MALDI-TOF data is challenging to analyse because of its complexity and inherent uncertainties. In addition, the spectra usually span large mass ranges, which may pose problems for classification. To deal with this problem, we use wavelets to select relevant and localized features. We then search for the optimal parameters, choose an SVM kernel, and apply the SVM classifier. We compare classification accuracy and dimensionality reduction between the plain SVM classifier and the SVM classifier with wavelet-based feature extraction. Results show that wavelet-based feature extraction improved classification accuracy by at least 10%, reduced the number of features by 76%, and reduced runtime by over 80%.
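As a rough illustration of how a wavelet transform can shrink a spectrum before classification, the sketch below applies a plain Haar transform and keeps only the approximation coefficients. This is an assumption made for illustration: the abstract does not say which wavelet family, decomposition depth, or coefficient selection rule the authors used, and the names `haar_step`, `haar_features`, and the toy `spectrum` are hypothetical. The SVM step is not shown.

```python
import math


def haar_step(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    if len(signal) % 2 != 0:
        raise ValueError("signal length must be even")
    s = math.sqrt(2.0)
    # Pairwise scaled sums capture the coarse shape of the signal.
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    # Pairwise scaled differences capture the local fine detail.
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(spectrum, levels):
    """Keep only the approximation coefficients after `levels` Haar steps."""
    coeffs = list(spectrum)
    for _ in range(levels):
        coeffs, _detail = haar_step(coeffs)  # detail coefficients are discarded
    return coeffs


# Hypothetical toy "spectrum": 16 intensity values with two peaks.
spectrum = [0, 0, 1, 5, 9, 5, 1, 0, 0, 0, 2, 7, 2, 0, 0, 0]
features = haar_features(spectrum, levels=2)
reduction = 1 - len(features) / len(spectrum)
print(len(features), f"{reduction:.0%}")
```

Each Haar level halves the number of coefficients, so two levels leave a quarter of the original length. The resulting feature vector could then be passed to any SVM implementation in place of the raw spectrum.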

Keywords

SVM, wavelets, MALDI-TOF, parameter search, feature reduction

Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Wong Liyen (1)
  • Maybin K. Muyeba (1)
  • John A. Keane (2)
  • Zhiguo Gong (3)
  • Valerie Edwards-Jones (4)

  1. School of Computing, Mathematics and Digital Technology, UK
  2. School of Computer Science, University of Manchester, UK
  3. Faculty of Science and Technology, University of Macau, China
  4. Institute for Biomedical Research into Human Movement and Health, Manchester Metropolitan University, UK