Abstract
The relatively recent appearance of high-dimensional databases has made traditional search algorithms too expensive in terms of time and memory. Several modifications and enhancements to local search algorithms have therefore been proposed in the literature to deal with this problem. However, non-deterministic global search, which is expected to perform better than local search, still lacks appropriate adaptations or new developments for high-dimensional databases. We present a new non-deterministic iterative method that performs a global search, easily handles datasets with high cardinality, and outperforms a wide variety of local search algorithms.
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Bermejo, P., de La Ossa, L., Puerta, J.M. (2011). Global Feature Subset Selection on High-Dimensional Datasets Using Re-ranking-based EDAs. In: Lozano, J.A., Gámez, J.A., Moreno, J.A. (eds) Advances in Artificial Intelligence. CAEPIA 2011. Lecture Notes in Computer Science, vol 7023. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-25274-7_6
DOI: https://doi.org/10.1007/978-3-642-25274-7_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-25273-0
Online ISBN: 978-3-642-25274-7
eBook Packages: Computer Science (R0)