A Machine Learning Application for Classification of Chemical Spectra

  • Conference paper
Applications and Innovations in Intelligent Systems XVI (SGAI 2008)

Abstract

This paper presents a software package that allows chemists to analyze spectroscopy data using innovative machine learning (ML) techniques. The package is designed for use in conjunction with lab-based spectroscopic instruments and includes features to encourage its adoption by analytical chemists: an intuitive graphical user interface with a step-by-step ‘wizard’ for building new ML models, support for standard file types and data preprocessing, and well-known standard chemometric analysis techniques alongside new ML techniques for the analysis of spectra, so that users can compare their performance. The new ML techniques were designed around the defining characteristics of this problem domain. They combine high accuracy with visualization, giving users some insight into the basis for classification decisions.

Copyright information

© 2009 Springer-Verlag London Limited

About this paper

Cite this paper

Madden, M.G., Howley, T. (2009). A Machine Learning Application for Classification of Chemical Spectra. In: Allen, T., Ellis, R., Petridis, M. (eds) Applications and Innovations in Intelligent Systems XVI. SGAI 2008. Springer, London. https://doi.org/10.1007/978-1-84882-215-3_6

  • DOI: https://doi.org/10.1007/978-1-84882-215-3_6

  • Publisher Name: Springer, London

  • Print ISBN: 978-1-84882-214-6

  • Online ISBN: 978-1-84882-215-3

  • eBook Packages: Computer Science (R0)
