Analysing chromatographic data using data mining to monitor petroleum content in water

  • Conference paper
Information Technologies in Environmental Engineering

Part of the book series: Environmental Science and Engineering (ENVENG)

Abstract

Chromatography is an important analytical technique that has widespread use in environmental applications. A typical application is the monitoring of water samples to determine if they contain petroleum. These tests are mandated in many countries to enable environmental agencies to determine if tanks used to store petrol are leaking into local water systems.

Chromatographic techniques, typically gas or liquid chromatography coupled with mass spectrometry, allow an analyst to detect a vast array of compounds, potentially on the order of thousands. Accurate analysis relies heavily on the skills of a limited pool of experienced analysts who use semi-automatic techniques to analyse these datasets, which makes the outcomes subjective.

Current laboratory data analysis systems have focused on refining existing approaches. The work described here represents a paradigm shift: it applies data mining techniques to the problem. These techniques are compelling because the efficacy of preprocessing methods, which are essential in this application area, can be objectively evaluated. This paper presents preliminary results using a data mining framework to predict the concentrations of petroleum compounds in water samples. Experiments demonstrate that the framework can produce models of sufficient accuracy, measured in terms of root mean squared error and correlation coefficients, to offer the potential for significantly reducing the time analysts spend on this task.
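The two accuracy measures named in the abstract can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names `rmse` and `pearson_r` and the concentration values in the usage example are hypothetical, and the correlation coefficient is assumed here to be the Pearson coefficient between measured and predicted concentrations.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between measured and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(actual))


def pearson_r(actual, predicted):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    var_a = sum((a - mean_a) ** 2 for a in actual)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    if var_a == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_a * var_p)


if __name__ == "__main__":
    # Made-up concentrations for illustration only (not data from the paper).
    measured = [0.5, 1.2, 2.0, 3.1]
    predicted = [0.6, 1.0, 2.2, 3.0]
    print("RMSE:", rmse(measured, predicted))
    print("Correlation:", pearson_r(measured, predicted))
```

A low RMSE together with a correlation close to 1 would indicate that predicted concentrations track the analyst-derived values closely; the two are reported together because correlation alone does not reveal a constant offset in the predictions.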

Acknowledgements

The authors would like to thank the Environmental GC-MS team at R. J. Hill Laboratories for their support.

Copyright information

© 2009 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Holmes, G., Fletcher, D., Reutemann, P., Frank, E. (2009). Analysing chromatographic data using data mining to monitor petroleum content in water. In: Athanasiadis, I.N., Rizzoli, A.E., Mitkas, P.A., Gómez, J.M. (eds) Information Technologies in Environmental Engineering. Environmental Science and Engineering. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-88351-7_21
