Feature Selection and Analysis on Correlated Breath Data

Abstract

Feature selection is a useful step in the data analysis pipeline. In this chapter, we study the classical support vector machine recursive feature elimination (SVM-RFE) algorithm and improve it by incorporating a correlation bias reduction (CBR) strategy into the feature elimination procedure. Experiments are conducted on a synthetic dataset and two breath analysis datasets. Large and comprehensive sets of transient features are extracted from the sensor responses. The classification accuracy obtained after feature selection demonstrates the efficacy of the proposed SVM-RFE + CBR, which outperforms the original SVM-RFE and other typical algorithms. An ensemble method is further studied to improve the stability of the proposed method. Statistical analysis of the feature rankings yields insights that can guide the future design of e-noses and feature extraction algorithms.
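As a rough illustration of the baseline the chapter starts from, the following is a minimal sketch of plain SVM-RFE: train a linear SVM, discard the feature with the smallest squared weight, and repeat. It does not implement the CBR strategy or the ensemble method, whose details are not given in this abstract. The function names (`train_linear_svm`, `svm_rfe`, `make_toy_data`), the subgradient-descent trainer, its hyperparameters, and the toy data are all assumptions made for illustration, not the authors' implementation.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=200, lr=0.1):
    """Fit a linear SVM (hinge loss + L2 penalty) by batch subgradient descent.

    A simplified stand-in for a proper SVM solver; labels must be +1 or -1.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w = [lam * wj for wj in w]  # gradient of the L2 penalty
        grad_b = 0.0
        for xi, yi in zip(X, y):
            score = sum(wj * xj for wj, xj in zip(w, xi)) + b
            if yi * score < 1:  # sample violates the margin
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y):
    """Return feature indices ordered from first eliminated to last surviving."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while remaining:
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        # Ranking criterion of SVM-RFE with a linear kernel: squared weight.
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return eliminated


def make_toy_data(n=60, d=5, seed=0):
    """Toy data (hypothetical): feature 0 carries the label, the rest are noise."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n):
        label = 1 if i % 2 == 0 else -1
        row = [2.0 * label + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 1) for _ in range(d - 1)]
        X.append(row)
        y.append(label)
    return X, y


X, y = make_toy_data()
order = svm_rfe(X, y)
```

The last element of `order` is the feature retained longest, so reversing the list gives a ranking from most to least important. When several features are highly correlated, their weights tend to be shared among them and each can look individually unimportant, which is the bias the chapter's CBR strategy is meant to reduce.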

Author information

Corresponding author

Correspondence to David Zhang.

Copyright information

© 2017 Springer Nature Singapore Pte Ltd.

About this chapter

Cite this chapter

Zhang, D., Guo, D., Yan, K. (2017). Feature Selection and Analysis on Correlated Breath Data. In: Breath Analysis for Medical Applications. Springer, Singapore. https://doi.org/10.1007/978-981-10-4322-2_10

  • DOI: https://doi.org/10.1007/978-981-10-4322-2_10

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-10-4321-5

  • Online ISBN: 978-981-10-4322-2

  • eBook Packages: Computer Science, Computer Science (R0)
