Abstract
Learning Markov blankets is important for classification and regression, causal discovery, and Bayesian network learning. We present an argument that ensemble masking measures can provide an approximate Markov blanket. Consequently, an ensemble feature selection method can be used to learn Markov blankets for either discrete or continuous networks (without linear, Gaussian assumptions). We use masking measures for redundancy and statistical inference for feature selection criteria. We compare our performance in the causal structure learning problem to a collection of common feature selection methods. We also compare to Bayesian local structure learning. These results can also be easily extended to other causal structure models such as undirected graphical models.
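The idea can be illustrated with a minimal sketch: rank the candidate features of a target variable with an ensemble importance measure (here a random forest), and keep those whose importance exceeds that of a known-irrelevant probe variable. This is only an illustration of the general approach, using a simulated network and a plain importance threshold; it is not the chapter's exact masking measures or statistical inference criteria.

```python
# Sketch: approximate the Markov blanket of y by ensemble feature ranking.
# x1, x2 are parents of y, x3 is a child of y (so all three are in the
# blanket); x4 and the probe are independent noise. Variable names and the
# probe-threshold rule are illustrative assumptions, not the chapter's method.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 2000

x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = x1 + 0.5 * x2 + 0.1 * rng.normal(size=n)
x3 = y + 0.5 * rng.normal(size=n)    # child of y: in the Markov blanket
x4 = rng.normal(size=n)              # independent noise: not in the blanket
probe = rng.normal(size=n)           # artificial probe used as a threshold

X = np.column_stack([x1, x2, x3, x4, probe])
names = ["x1", "x2", "x3", "x4", "probe"]

forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)
imp = dict(zip(names, forest.feature_importances_))

# Keep features that beat the known-irrelevant probe variable.
blanket = [nm for nm in names[:-1] if imp[nm] > imp["probe"]]
print(blanket)
```

Note that the child x3 is selected alongside the parents, which is exactly why an importance ranking approximates a Markov blanket rather than just the set of direct causes.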
Copyright information
© 2011 Springer-Verlag Berlin Heidelberg
Cite this chapter
Deng, H., Davila, S., Runger, G., Tuv, E. (2011). Learning Markov Blankets for Continuous or Discrete Networks via Feature Selection. In: Okun, O., Valentini, G., Re, M. (eds) Ensembles in Machine Learning Applications. Studies in Computational Intelligence, vol 373. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22910-7_7
DOI: https://doi.org/10.1007/978-3-642-22910-7_7
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-22909-1
Online ISBN: 978-3-642-22910-7
eBook Packages: Engineering (R0)