Feature Selection by Bayesian Networks

Hruschka, Estevam R.; Hruschka, Eduardo R.; Ebecken, Nelson F. F.

doi:10.1007/978-3-540-24840-8_26

Estevam R. Hruschka Jr.,
Eduardo R. Hruschka &
Nelson F. F. Ebecken

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3060)

Included in the following conference series:

Conference of the Canadian Society for Computational Studies of Intelligence

Abstract

This work describes and evaluates a Bayesian feature selection approach for classification problems. A Bayesian network is generated from a dataset, and the Markov Blanket of the class variable is then used as the selected feature subset. The proposed methodology is illustrated by simulations on three datasets that are benchmarks for data mining methods: Wisconsin Breast Cancer, Mushroom, and Congressional Voting Records. Three classifiers were employed to assess the efficacy of the proposed method. The average classification rates obtained on the datasets formed by all features are compared with those achieved on the datasets formed only by the features that belong to the Markov Blanket. The performed simulations lead to interesting results.
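
As a minimal sketch of the selection step described in the abstract, the Markov Blanket of a node in a directed acyclic graph can be read off as its parents, its children, and the other parents of its children. The code below assumes the network structure is already available; the `toy_dag` structure, the node names, and the `markov_blanket` helper are hypothetical illustrations and are not taken from the paper, which learns the network from data.

```python
from typing import Dict, List, Set


def markov_blanket(parents: Dict[str, List[str]], target: str) -> Set[str]:
    """Return the Markov blanket of `target` in a DAG given as node -> list of parents."""
    # Start with the parents of the target node.
    blanket: Set[str] = set(parents.get(target, []))
    for node, node_parents in parents.items():
        if target in node_parents:
            # `node` is a child of the target; add it and its other parents.
            blanket.add(node)
            blanket.update(node_parents)
    # The target itself was added as a parent of its children; drop it.
    blanket.discard(target)
    return blanket


# Hypothetical structure (an assumption for illustration, not a learned network):
# A -> Class, Class -> B, C -> B, C -> D, and E is isolated.
toy_dag = {
    "Class": ["A"],
    "A": [],
    "B": ["Class", "C"],
    "C": [],
    "D": ["C"],
    "E": [],
}

# Features kept for classification are those in the blanket of the class variable.
selected = sorted(markov_blanket(toy_dag, "Class"))
print(selected)
```

In this toy structure the features `D` and `E` fall outside the blanket and would be discarded, while `C` is kept only because it shares the child `B` with the class variable.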

Author information

Authors and Affiliations

COPPE / Universidade Federal do Rio de Janeiro, Bloco B, Sala 100, Caixa Postal 68506, CEP 21945-970, Rio de Janeiro, RJ, Brasil
Estevam R. Hruschka Jr. & Nelson F. F. Ebecken
Universidade Católica de Santos (UniSantos), Rua Carvalho de Mendonça, 144, CEP 11.070-906, Santos, SP, Brasil
Eduardo R. Hruschka

Editor information

Editors and Affiliations

University of Windsor, 401 Sunset Avenue, N9B 3P4, Windsor, Ontario, Canada
Ahmed Y. Tawfik
School of Computer Science, University of Windsor, Windsor, Ontario,
Scott D. Goodwin

Cite this paper

Hruschka, E.R., Hruschka, E.R., Ebecken, N.F.F. (2004). Feature Selection by Bayesian Networks. In: Tawfik, A.Y., Goodwin, S.D. (eds) Advances in Artificial Intelligence. Canadian AI 2004. Lecture Notes in Computer Science, vol 3060. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-24840-8_26

DOI: https://doi.org/10.1007/978-3-540-24840-8_26
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-22004-6
Online ISBN: 978-3-540-24840-8
eBook Packages: Springer Book Archive
