Abstract
Feature Selection methods in Data Mining and Data Analysis aim at selecting a subset of the variables, or features, that describe the data, in order to obtain a more essential and compact representation of the available information. The selected subset must be small and must retain the information that is most useful for the specific application. The role of Feature Selection is particularly important when computationally expensive Data Mining tools are used, or when the data collection process is difficult or costly. Feature Selection problems are typically solved in the literature by search techniques, where a candidate subset is evaluated either by an appropriate function (filter methods) or directly by the performance of a Data Mining tool (wrapper methods). In this work we show how the Feature Selection problem can be formulated as a subgraph selection problem derived from the lightest k-subgraph problem, and solved as an Integer Program. The proposed formulation is very flexible, as additional conditions on the solution can easily be added to it. Although such problems are hard to solve to optimality in the worst case, a large number of test instances have been solved efficiently by commercial solvers. Finally, an application to a database on urban mobility is presented, where the proposed method is integrated into the Data Mining tool Lsquare and compared with other approaches.
Triantaphyllou, E. and G. Felici (Eds.), Data Mining and Knowledge Discovery Approaches based on Rule Induction Techniques, Massive Computing Series, Springer, Heidelberg, Germany, pp. 227–252, 2006.
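The chapter's exact model is not reproduced on this page; as a rough illustration of the approach the abstract describes, a lightest k-subgraph formulation can be written as the 0-1 program below. Here V is the set of candidate features, E the set of feature pairs, and w_{ij} an assumed weight measuring the redundancy between features i and j (the chapter's actual weights and side constraints may differ):

$$
\begin{aligned}
\min \quad & \sum_{(i,j)\in E} w_{ij}\, y_{ij} \\
\text{s.t.} \quad & \sum_{i\in V} x_i = k, \\
& y_{ij} \ge x_i + x_j - 1, \qquad (i,j)\in E, \\
& x_i \in \{0,1\},\; y_{ij} \in \{0,1\},
\end{aligned}
$$

where x_i = 1 selects feature i and the linking constraints force y_{ij} = 1 whenever both endpoints are selected; since the objective is minimized, y_{ij} drops to 0 otherwise, so the program picks the k features whose induced subgraph is lightest, i.e., a mutually least-redundant subset. A minimal sketch of this model in Python with the PuLP library, using made-up weights (not the chapter's code or data), could look as follows:

```python
import pulp

# Hypothetical pairwise redundancy weights for n candidate features;
# in the chapter these would be computed from the data set.
n, k = 5, 3
w = {(i, j): abs(i - j) / n for i in range(n) for j in range(i + 1, n)}

prob = pulp.LpProblem("lightest_k_subgraph", pulp.LpMinimize)
x = pulp.LpVariable.dicts("x", range(n), cat="Binary")   # x[i] = 1: feature i selected
y = pulp.LpVariable.dicts("y", w.keys(), cat="Binary")   # y[i,j] = 1: both endpoints selected

# Objective: total weight of the edges induced by the selected subset.
prob += pulp.lpSum(w[e] * y[e] for e in w)

# Exactly k features are chosen.
prob += pulp.lpSum(x[i] for i in range(n)) == k

# Linking constraints: y[i,j] must be 1 when both x[i] and x[j] are 1;
# minimization drives it to 0 otherwise.
for (i, j) in w:
    prob += y[(i, j)] >= x[i] + x[j] - 1

prob.solve(pulp.PULP_CBC_CMD(msg=False))
print("selected features:", [i for i in range(n) if x[i].value() == 1])
```

The abstract's remark that additional conditions on the solution can be added corresponds to simply appending further linear constraints to the program, e.g. forcing a given feature in or out with x[i] == 1 or x[i] == 0.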
Copyright information
© 2006 Springer Science+Business Media, LLC
About this chapter
Cite this chapter
de Angelis, V., Felici, G., Mancinelli, G. (2006). Feature Selection for Data Mining. In: Triantaphyllou, E., Felici, G. (eds) Data Mining and Knowledge Discovery Approaches Based on Rule Induction Techniques. Massive Computing, vol 6. Springer, Boston, MA. https://doi.org/10.1007/0-387-34296-6_6
DOI: https://doi.org/10.1007/0-387-34296-6_6
Publisher Name: Springer, Boston, MA
Print ISBN: 978-0-387-34294-8
Online ISBN: 978-0-387-34296-2
eBook Packages: Computer Science (R0)