Feature Extraction for LC–MS via Hierarchical Density Clustering

  • Original
Chromatographia

Abstract

Liquid chromatography coupled with mass spectrometry (LC–MS) is a popular analytical platform for metabolomic studies. Accurate and sensitive feature detection is a key step before further analysis, but it remains challenging because LC–MS data sets are large and highly complex. A pure ion chromatogram (PIC) consists of the ions produced by a metabolite, free of interferences. In this study, hierarchical density-based spatial clustering of applications with noise (HDBSCAN) was applied to extract PICs from LC–MS data sets. Because metabolites generate dense, continuous ions along both the m/z and elution time axes, HDBSCAN can cluster the ions of the same metabolite into one group without requiring a predefined m/z tolerance. Compared with centWave and PITracer, the proposed method achieved higher recall and comparable precision for feature detection on simulated, MM48, and Arabidopsis thaliana (L.) Heynh. data sets. The method was implemented in Python and is open source at http://www.github.com/zmzhang/HPIC.
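The recall and precision figures mentioned above depend on how detected features are paired with reference features. The following is a minimal sketch of one common way to compute them, not the evaluation protocol used in the article: the `Feature` class, the greedy matching rule, the tolerances `mz_tol` and `rt_tol`, and all numeric values are assumptions made up for illustration. The tolerances here apply only to scoring and are unrelated to the m/z tolerance that the extraction step avoids.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    mz: float  # mass-to-charge ratio
    rt: float  # elution (retention) time in seconds


def count_matches(detected, truth, mz_tol=0.01, rt_tol=5.0):
    """Greedily pair each reference feature with at most one detected feature.

    A detected feature is a candidate when it lies within both tolerances
    (hypothetical defaults); the candidate closest in m/z is consumed.
    """
    unused = list(detected)
    matched = 0
    for t in truth:
        candidates = [
            d for d in unused
            if abs(d.mz - t.mz) <= mz_tol and abs(d.rt - t.rt) <= rt_tol
        ]
        if candidates:
            best = min(candidates, key=lambda d: abs(d.mz - t.mz))
            unused.remove(best)  # each detection may be matched only once
            matched += 1
    return matched


def recall_precision(detected, truth, **tolerances):
    """Recall = TP / reference features; precision = TP / detected features."""
    tp = count_matches(detected, truth, **tolerances)
    recall = tp / len(truth) if truth else 0.0
    precision = tp / len(detected) if detected else 0.0
    return recall, precision


# Synthetic values for illustration only; they are not data from the article.
truth = [
    Feature(180.063, 62.0),
    Feature(255.232, 310.5),
    Feature(303.050, 120.0),
    Feature(132.102, 45.0),   # missed by the detector below
]
detected = [
    Feature(180.064, 61.2),
    Feature(255.233, 311.0),
    Feature(303.051, 119.0),
    Feature(410.200, 200.0),  # false positive
    Feature(500.300, 90.0),   # false positive
]

recall, precision = recall_precision(detected, truth)
print(f"recall={recall:.2f} precision={precision:.2f}")
```

With these synthetic lists, three of the four reference features are recovered and two of the five detections are spurious, so a method that finds more true features without adding false ones raises recall while keeping precision comparable.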

Acknowledgements

This work was financially supported by the National Natural Science Foundation of China (Grant Numbers 21305163, 21375151, 21675174, and 21873116) and the Yunnan Provincial Tobacco Monopoly Bureau, China (Grant Number 2019530000241019).

Author information

Corresponding authors

Correspondence to Zhi-Min Zhang or Hongmei Lu.

Ethics declarations

Conflict of interest

All authors declare that they have no conflict of interest.

Ethical approval

This article does not contain any studies with human participants or animals performed by any of the authors.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (DOCX 28 kb)

About this article

Cite this article

Zhu, H., Chen, Y., Liu, C. et al. Feature Extraction for LC–MS via Hierarchical Density Clustering. Chromatographia 82, 1449–1457 (2019). https://doi.org/10.1007/s10337-019-03766-1

  • DOI: https://doi.org/10.1007/s10337-019-03766-1
