Abstract
Irrelevant and redundant features can slow down learning (due to high dimensionality) and reduce both the accuracy and the comprehensibility of the induced model. To cope with these problems, many methods have been proposed to select a subset of pertinent features. Two main approaches to evaluating such subsets are generally distinguished: (1) the filter approach, which considers only the data and is therefore algorithm-independent; (2) the wrapper approach, which takes into account both the data and a given learning algorithm and is therefore algorithm-dependent.
In this paper, we address the problem of subset selection using α-RST (a generalized rough set theory). We propose an algorithm to find a set of α-reducts, which are non-deterministic reducts. To select the best one among them, we also propose a hybrid approach that combines filter and wrapper evaluation to overcome the disadvantages of each. Our study shows that the highest-accuracy subset is generally not the best one with respect to the filter criteria. The new approach finds the highest-accuracy subset at minimum cost.
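The abstract does not give the details of the α-RST algorithm, but the filter/wrapper combination it describes can be illustrated with a minimal sketch (all names and criteria below are illustrative assumptions, not the paper's method): a cheap filter score ranks the features, and only subsets drawn from the top-ranked ones are passed to the expensive wrapper criterion, here leave-one-out accuracy of a 1-NN learner.

```python
# Hypothetical filter+wrapper hybrid sketch (illustrative only; the paper's
# actual algorithm uses alpha-reducts from alpha-RST, which is not shown here).
from itertools import combinations

def filter_score(X, y, j):
    """Filter criterion: absolute correlation of feature j with the class label."""
    col = [row[j] for row in X]
    n = len(col)
    mx, my = sum(col) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(col, y))
    vx = sum((a - mx) ** 2 for a in col) ** 0.5
    vy = sum((b - my) ** 2 for b in y) ** 0.5
    return abs(cov / (vx * vy)) if vx and vy else 0.0

def loo_1nn_accuracy(X, y, subset):
    """Wrapper criterion: leave-one-out accuracy of 1-NN restricted to `subset`."""
    correct = 0
    for i in range(len(X)):
        best, pred = float("inf"), None
        for k in range(len(X)):
            if k == i:
                continue
            d = sum((X[i][j] - X[k][j]) ** 2 for j in subset)
            if d < best:
                best, pred = d, y[k]
        correct += pred == y[i]
    return correct / len(X)

def hybrid_select(X, y, top_k=3, max_size=2):
    # Filter phase: keep only the top_k features by filter score.
    ranked = sorted(range(len(X[0])), key=lambda j: -filter_score(X, y, j))[:top_k]
    # Wrapper phase: evaluate candidate subsets of the surviving features.
    candidates = [s for r in range(1, max_size + 1) for s in combinations(ranked, r)]
    return max(candidates, key=lambda s: loo_1nn_accuracy(X, y, s))

# Toy data: feature 0 tracks the class, feature 1 is noise, feature 2 is redundant.
X = [[0, 5, 0], [1, 3, 1], [0, 4, 0], [1, 1, 1], [0, 2, 0], [1, 6, 1]]
y = [0, 1, 0, 1, 0, 1]
print(hybrid_select(X, y))  # → (0,)
```

Restricting the wrapper search to filter-ranked candidates is what keeps the cost low: the learner is run on a handful of subsets rather than on all 2^d of them.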
Copyright information
© 1998 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Boussouf, M. (1998). A hybrid approach to feature selection. In: Żytkow, J.M., Quafafou, M. (eds) Principles of Data Mining and Knowledge Discovery. PKDD 1998. Lecture Notes in Computer Science, vol 1510. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0094824
DOI: https://doi.org/10.1007/BFb0094824
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65068-3
Online ISBN: 978-3-540-49687-8
eBook Packages: Springer Book Archive