Mutual Information Based Initialization of Forward-Backward Search for Feature Selection in Regression Problems

  • Alberto Guillén
  • Antti Sorjamaa
  • Gines Rubio
  • Amaury Lendasse
  • Ignacio Rojas
Part of the Lecture Notes in Computer Science book series (LNCS, volume 5768)

Abstract

Pure feature selection, where each variable is either included in or excluded from the training data set, remains an unsolved problem, especially when the dimensionality is high. Recently, a Forward-Backward Search algorithm that uses the Delta Test to evaluate candidate solutions was presented and showed good performance. However, because the search procedure is local, the starting point of the search is crucial for obtaining good results. This paper presents new heuristics for finding a more adequate starting point that could lead to a better solution. The approach sorts the variables using the Mutual Information criterion and then performs parallel local searches. These local searches provide the starting point for the actual parallel Forward-Backward algorithm.
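The pipeline outlined in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify how the sorted variables are turned into starting points, so the "top-k by Mutual Information" candidates below are an assumption, the histogram Mutual Information estimate is a simple stand-in for a proper estimator, the searches run sequentially rather than in parallel, and all function names are hypothetical.

```python
import numpy as np

def delta_test(X, y, mask):
    """Delta Test: half the mean squared difference between each output
    and the output of its nearest neighbour in the selected input subspace."""
    if not mask.any():
        return np.inf  # an empty variable subset cannot be evaluated
    Z = X[:, mask]
    # Pairwise squared distances; O(N^2) memory, acceptable for small N.
    d2 = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(d2, np.inf)  # a point is not its own neighbour
    nn = d2.argmin(axis=1)
    return 0.5 * np.mean((y - y[nn]) ** 2)

def mi_histogram(x, y, bins=8):
    """Plug-in histogram estimate of mutual information in nats
    (assumption: a simple stand-in, not the estimator used in the paper)."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    p = joint / joint.sum()
    px = p.sum(axis=1, keepdims=True)
    py = p.sum(axis=0, keepdims=True)
    nz = p > 0
    return float((p[nz] * np.log(p[nz] / (px @ py)[nz])).sum())

def forward_backward(X, y, mask):
    """Local search: flip the single variable whose inclusion or exclusion
    lowers the Delta Test the most; stop when no flip improves it."""
    mask = mask.copy()
    best = delta_test(X, y, mask)
    while True:
        scores = []
        for j in range(X.shape[1]):
            trial = mask.copy()
            trial[j] = not trial[j]
            scores.append(delta_test(X, y, trial))
        j = int(np.argmin(scores))
        if scores[j] >= best:
            return mask, best
        mask[j] = not mask[j]
        best = scores[j]

def mi_initialised_search(X, y, bins=8):
    """Rank variables by MI, pick the best 'top-k' subset as the starting
    point (assumed scheme), then refine it with forward-backward search."""
    d = X.shape[1]
    mi = np.array([mi_histogram(X[:, j], y, bins) for j in range(d)])
    order = np.argsort(-mi)  # most informative variable first
    starts = []
    for k in range(1, d + 1):
        m = np.zeros(d, dtype=bool)
        m[order[:k]] = True
        starts.append(m)
    start = min(starts, key=lambda m: delta_test(X, y, m))
    return forward_backward(X, y, start)
```

Calling `mi_initialised_search(X, y)` with an `(N, d)` input array and a length-`N` output vector returns a boolean mask of selected variables and its Delta Test value. Because the search is local, a different choice of starting candidates can end in a different subset, which is the sensitivity the abstract describes.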

Keywords

Feature Selection, Local Search, Mutual Information, Time Series Prediction, Sorting Scheme
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.

Copyright information

© Springer-Verlag Berlin Heidelberg 2009

Authors and Affiliations

  • Alberto Guillén (1)
  • Antti Sorjamaa (2)
  • Gines Rubio (3)
  • Amaury Lendasse (2)
  • Ignacio Rojas (3)
  1. Department of Informatics, University of Jaen, Spain
  2. Department of Information and Computer Science, Helsinki University of Technology, Finland
  3. Department of Computer Architecture and Technology, University of Granada, Spain
