Learning Markov Blankets for Continuous or Discrete Networks via Feature Selection

Chapter in: Ensembles in Machine Learning Applications

Part of the book series: Studies in Computational Intelligence (SCI, volume 373)

Abstract

Learning Markov blankets is important for classification and regression, causal discovery, and Bayesian network learning. We argue that ensemble masking measures can provide an approximate Markov blanket. Consequently, an ensemble feature selection method can be used to learn Markov blankets for either discrete or continuous networks, without linearity or Gaussianity assumptions. We use masking measures to quantify redundancy and statistical inference as the feature selection criterion. On the causal structure learning problem, we compare our performance to that of a collection of common feature selection methods, as well as to Bayesian local structure learning. These results also extend readily to other causal structure models such as undirected graphical models.
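
To make the approach concrete, below is a minimal sketch of ensemble-based approximate Markov blanket discovery, in the spirit of importance scoring against artificial contrast variables. The details are assumptions for illustration, not the chapter's exact algorithm: scikit-learn's RandomForestRegressor stands in for the ensemble, permuted "shadow" copies of the features act as the artificial contrasts, and a simple majority-of-rounds rule stands in for the statistical inference step.

```python
# Sketch: approximate Markov blanket of a target y via ensemble feature
# selection. A hypothetical stand-in for the chapter's method, not its code.
import numpy as np
from sklearn.ensemble import RandomForestRegressor


def approximate_markov_blanket(X, y, n_rounds=20, seed=0):
    """Indices of features whose ensemble importance beats permuted shadows."""
    rng = np.random.default_rng(seed)
    n_samples, n_features = X.shape
    wins = np.zeros(n_features)
    for _ in range(n_rounds):
        # Shadow features: column-wise permutations that break any
        # dependence on y while preserving each marginal distribution.
        shadows = rng.permuted(X, axis=0)
        forest = RandomForestRegressor(n_estimators=200, random_state=seed)
        forest.fit(np.hstack([X, shadows]), y)
        imp = forest.feature_importances_
        # A real feature "wins" a round if it outscores every shadow.
        wins += imp[:n_features] > imp[n_features:].max()
    # Crude inference criterion: keep features that win most rounds.
    return np.flatnonzero(wins > n_rounds / 2)
```

For a discrete target, RandomForestClassifier could replace the regressor unchanged; the point of the sketch is that no linearity or Gaussianity assumption enters, only the ensemble's importance scores.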




Copyright information

© 2011 Springer-Verlag Berlin Heidelberg

About this chapter

Cite this chapter

Deng, H., Davila, S., Runger, G., Tuv, E. (2011). Learning Markov Blankets for Continuous or Discrete Networks via Feature Selection. In: Okun, O., Valentini, G., Re, M. (eds) Ensembles in Machine Learning Applications. Studies in Computational Intelligence, vol 373. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-22910-7_7

  • DOI: https://doi.org/10.1007/978-3-642-22910-7_7

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-642-22909-1

  • Online ISBN: 978-3-642-22910-7

  • eBook Packages: Engineering
