
Feature Importance in Causal Inference for Numerical and Categorical Variables

  • Bram Minnaert
Chapter
Part of the Springer Series on Challenges in Machine Learning book series (SSCML)

Abstract

Predicting whether A causes B (written A → B) or B causes A from samples (X, Y) is a challenging task. Several methods have already been proposed for the case where both A and B are numerical. However, when A and/or B are categorical, few studies have been performed.

This paper aims to learn the causal direction between two variables X and Y by fitting the regressions of X on Y and Y on X with a machine learning algorithm and giving preference to the direction that yields the better fit.
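The sketch below illustrates this fit-comparison idea on a toy numerical pair. It assumes scikit-learn's RandomForestRegressor and cross-validated R² as the goodness-of-fit score; the learner, score, and preprocessing actually used in the paper may differ.

```python
# Minimal sketch of fit-based direction inference, assuming a random forest
# regressor and cross-validated R^2 as the fit score (not necessarily the
# paper's exact choices).
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

def cv_fit(inputs, targets):
    """Mean cross-validated R^2 of regressing `targets` on `inputs`."""
    model = RandomForestRegressor(n_estimators=100, random_state=0)
    return cross_val_score(model, inputs.reshape(-1, 1), targets,
                           cv=5, scoring="r2").mean()

def infer_direction(x, y):
    """Prefer the direction whose regression yields the better fit."""
    return "X -> Y" if cv_fit(x, y) > cv_fit(y, x) else "Y -> X"

# Toy example: Y is a noisy, saturating function of X, so the forward
# regression fits better than the backward one.
rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 500)
y = np.tanh(3 * x) + 0.05 * rng.normal(size=500)
print(infer_direction(x, y))
```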

This paper also investigates which features are most important for each combination of numerical and categorical A and B. Via an ensemble method, it finds that the important features depend heavily on which of the two variables are numerical and which are categorical.
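A hypothetical sketch of reading feature importances off such an ensemble is given below; the pair-level feature names and the random training data are placeholders for illustration only, not the features or results reported in the paper.

```python
# Hypothetical sketch: ranking pair-level features by ensemble importance.
# Feature names and data are placeholders, not the paper's actual features.
import numpy as np
from sklearn.ensemble import RandomForestClassifier

feature_names = ["n_unique_A", "n_unique_B", "forward_fit", "backward_fit"]

rng = np.random.default_rng(0)
X_pairs = rng.random((200, len(feature_names)))  # one feature row per (A, B) pair
y_dir = rng.integers(0, 2, 200)                  # 1 if A -> B, else 0 (placeholder labels)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_pairs, y_dir)
for name, importance in sorted(zip(feature_names, clf.feature_importances_),
                               key=lambda pair: -pair[1]):
    print(f"{name}: {importance:.3f}")
```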

Keywords

Causal inference · Deterministic causal relations · Random forest regression · Graphical models · Feature selection

Notes

Acknowledgements

I would like to thank Kaggle and ChaLearn for stirring my interest in this topic [7], and I thank Isabelle Guyon and Mehreen Saeed for their assistance in making my source code portable.

References

  1. Leo Breiman. Random forests. Mach. Learn., 45(1):5–32, October 2001. ISSN 0885-6125. URL http://dx.doi.org/10.1023/A:1010933404324.
  2. Rich Caruana and Alexandru Niculescu-Mizil. An empirical comparison of supervised learning algorithms. In Proceedings of the 23rd International Conference on Machine Learning, ICML ’06, pages 161–168, New York, NY, USA, 2006. ACM. ISBN 1-59593-383-2. URL http://doi.acm.org/10.1145/1143844.1143865.
  3. Povilas Daniušis, Dominik Janzing, Joris M. Mooij, Jakob Zscheischler, Bastian Steudel, Kun Zhang, and Bernhard Schölkopf. Inferring deterministic causal relations. In Proceedings of the 26th Annual Conference on Uncertainty in Artificial Intelligence (UAI-10), 2010. URL http://event.cwi.nl/uai2010/papers/UAI2010_0121.pdf.
  4. Isabelle Guyon et al. Results and analysis of the 2013 ChaLearn cause-effect pair challenge. 2014.
  5. Jerome H. Friedman. Greedy function approximation: A gradient boosting machine. Annals of Statistics, 29:1189–1232, 2000.
  6. Jerome H. Friedman. Stochastic gradient boosting. Comput. Stat. Data Anal., 38(4):367–378, February 2002. ISSN 0167-9473. URL http://dx.doi.org/10.1016/S0167-9473(01)00065-2.
  7. Isabelle Guyon. Cause-effect pairs challenge, 2013. Organised by Isabelle Guyon (ChaLearn), Ben Hamner (Kaggle), Alexander Statnikov (NYU), Mikael Henaff (NYU), Vincent Lemaire (Orange), and Bernhard Schölkopf (MPI).
  8. Patrik O. Hoyer, Dominik Janzing, Joris M. Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In D. Koller, D. Schuurmans, Y. Bengio, and L. Bottou, editors, Advances in Neural Information Processing Systems 21 (NIPS*2008), pages 689–696, 2009.
  9. J. D. Hunter. Matplotlib: A 2D graphics environment. Computing in Science & Engineering, 9(3):90–95, 2007.
  10. Dominik Janzing, Joris Mooij, Kun Zhang, Jan Lemeire, Jakob Zscheischler, Povilas Daniušis, Bastian Steudel, and Bernhard Schölkopf. Information-geometric approach to inferring causal directions. Artif. Intell., 182–183:1–31, May 2012. ISSN 0004-3702. URL http://dx.doi.org/10.1016/j.artint.2012.01.002.
  11. Eric Jones, Travis Oliphant, Pearu Peterson, et al. SciPy: Open source scientific tools for Python, 2001–. URL http://www.scipy.org/.
  12. Joris M. Mooij, Oliver Stegle, Dominik Janzing, Kun Zhang, and Bernhard Schölkopf. Probabilistic latent variable models for distinguishing between cause and effect. In J. Lafferty, C. K. I. Williams, J. Shawe-Taylor, R. S. Zemel, and A. Culotta, editors, Advances in Neural Information Processing Systems 23 (NIPS*2010), pages 1687–1695, 2010. URL http://books.nips.cc/papers/files/nips23/NIPS2010_1270.pdf.
  13. Shohei Shimizu, Patrik O. Hoyer, Aapo Hyvärinen, and Antti Kerminen. A linear non-Gaussian acyclic model for causal discovery. J. Mach. Learn. Res., 7:2003–2030, December 2006. ISSN 1532-4435. URL http://dl.acm.org/citation.cfm?id=1248547.1248619.
  14. Xiaohai Sun, Dominik Janzing, and Bernhard Schölkopf. Causal inference by choosing graphs with most plausible Markov kernels. In ISAIM, 2006. URL http://dblp.uni-trier.de/db/conf/isaim/isaim2006.html#SunJS06.
  15. K. Zhang and A. Hyvärinen. Distinguishing causes from effects using nonlinear acyclic causal models. In I. Guyon, D. Janzing, and B. Schölkopf, editors, JMLR Workshop and Conference Proceedings, Volume 6, pages 157–164, Cambridge, MA, USA, 2010. MIT Press.

Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  • Bram Minnaert (1)
  1. ArcelorMittal, Ghent, Belgium
