
Feature Selection in Regression Tasks Using Conditional Mutual Information

  • Conference paper
Pattern Recognition and Image Analysis (IbPRIA 2011)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 6669)


Abstract

This paper presents a supervised feature selection method for regression problems. The method uses a dissimilarity matrix originally developed for classification problems, whose applicability is extended here to regression; the matrix is built from the conditional mutual information between features with respect to a continuous relevant variable that represents the regression function. Applying an agglomerative hierarchical clustering technique, the algorithm selects a subset of the original feature set. The proposed technique is compared with three other methods. Experiments on four datasets of different nature show the importance of the selected features in terms of the regression estimation error, measured as the Root Mean Squared Error (RMSE) of Support Vector Regression.
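The pipeline the abstract describes (a CMI-based feature dissimilarity matrix, agglomerative clustering to pick one feature per cluster, then SVR/RMSE evaluation) can be sketched as follows. This is a rough illustration, not the authors' implementation: the histogram CMI estimator, the dissimilarity D(i,j) = I(Xi; Y | Xj) + I(Xj; Y | Xi), the average linkage, the number of clusters k, and the per-cluster representative rule are all assumptions made for the sketch.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform
from sklearn.svm import SVR
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)

# Synthetic regression data: two pairs of redundant relevant features
# plus four irrelevant noise features.
n = 400
x1, x2 = rng.normal(size=n), rng.normal(size=n)
X = np.column_stack([
    x1, x1 + 0.05 * rng.normal(size=n),
    x2, x2 + 0.05 * rng.normal(size=n),
    rng.normal(size=(n, 4)),
])
y = x1 + 2.0 * x2 + 0.1 * rng.normal(size=n)

def cmi(a, b, c, bins=8):
    """Crude plug-in histogram estimate of I(A; B | C) in nats."""
    def disc(v):  # quantile binning of a continuous variable
        edges = np.quantile(v, np.linspace(0, 1, bins + 1)[1:-1])
        return np.digitize(v, edges)
    p, _ = np.histogramdd(np.column_stack([disc(a), disc(b), disc(c)]),
                          bins=bins)
    p /= p.sum()
    p_ac, p_bc, p_c = p.sum(axis=1), p.sum(axis=0), p.sum(axis=(0, 1))
    num = p * p_c[None, None, :]
    den = p_ac[:, None, :] * p_bc[None, :, :]
    m = p > 0  # wherever p>0, the marginals are also >0
    return float(np.sum(p[m] * np.log(num[m] / den[m])))

# Dissimilarity matrix: features that are mutually redundant with respect
# to y (each tells little about y once the other is known) end up close.
d = X.shape[1]
D = np.zeros((d, d))
for i in range(d):
    for j in range(i + 1, d):
        D[i, j] = D[j, i] = cmi(X[:, i], y, X[:, j]) + cmi(X[:, j], y, X[:, i])

# Agglomerative clustering of features on D, cut into at most k clusters;
# keep the most central feature (smallest summed dissimilarity) of each.
k = 3
labels = fcluster(linkage(squareform(D, checks=False), method="average"),
                  t=k, criterion="maxclust")
selected = []
for c in np.unique(labels):
    members = np.where(labels == c)[0]
    sub = D[np.ix_(members, members)]
    selected.append(int(members[np.argmin(sub.sum(axis=0))]))

# Evaluate the reduced feature set with Support Vector Regression (RMSE).
svr = SVR().fit(X[:, selected], y)
rmse = float(np.sqrt(mean_squared_error(y, svr.predict(X[:, selected]))))
print("selected features:", sorted(selected), "train RMSE:", round(rmse, 3))
```

On this toy data the clustering tends to merge each redundant pair into one cluster, so one feature per redundant group survives; the paper instead evaluates on four real datasets and compares against three competing selection methods.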

This work was supported by the Spanish Ministry of Science and Innovation under the projects Consolider Ingenio 2010 CSD2007-00018 and EODIX AYA2008-05965-C04-04/ESP, and by Fundació Caixa-Castelló through the project P1 1B2007-48.






Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Latorre Carmona, P., Sotoca, J.M., Pla, F., Phoa, F.K.H., Bioucas Dias, J. (2011). Feature Selection in Regression Tasks Using Conditional Mutual Information. In: Vitrià, J., Sanches, J.M., Hernández, M. (eds) Pattern Recognition and Image Analysis. IbPRIA 2011. Lecture Notes in Computer Science, vol 6669. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-21257-4_28


  • DOI: https://doi.org/10.1007/978-3-642-21257-4_28

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-21256-7

  • Online ISBN: 978-3-642-21257-4

  • eBook Packages: Computer Science, Computer Science (R0)
