Causal Inference on Multivariate and Mixed-Type Data
Abstract
How can we discover whether X causes Y or, vice versa, Y causes X, when we are given only a sample from their joint distribution? How can we do so when X and Y may each be univariate or multivariate, and may differ in cardinality? And how can we do so regardless of whether X and Y are of the same data type or of different types, be it discrete, numeric, or mixed? These are exactly the questions we answer. We take an information-theoretic approach based on the Minimum Description Length principle, from which it follows that first describing the data over the cause, and then the data over the effect given the cause, yields a shorter description than the reverse direction. Simply put, if Y can be explained more succinctly by a set of classification or regression trees conditioned on X than X can be by trees conditioned on Y, we conclude that X causes Y. Empirical evaluation on a wide range of data and settings shows that our method, Crack, reliably infers the correct causal direction with high accuracy, outperforming the state of the art by a wide margin. Code related to this paper is available at: http://eda.mmci.uni-saarland.de/crack.
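The decision rule described in the abstract can be illustrated with a minimal sketch for two discrete variables. This is not the Crack implementation and makes several simplifying assumptions: each variable is encoded with its empirical distribution, every free parameter costs 0.5 * log2(n) bits (a common BIC-style approximation), and the conditional model is limited to at most one threshold split rather than a set of full classification or regression trees. The names `marginal_bits`, `conditional_bits`, and `infer_direction` are hypothetical and chosen only for this illustration.

```python
import math
from collections import Counter


def marginal_bits(values, domain_size):
    """Two-part code length in bits: the data under its empirical
    distribution, plus 0.5 * log2(n) bits per free parameter (assumed cost)."""
    n = len(values)
    if n == 0:
        return 0.0
    data = -sum(c * math.log2(c / n) for c in Counter(values).values())
    model = (domain_size - 1) / 2 * math.log2(n)
    return data + model


def conditional_bits(target, domain_size, cond):
    """Code length of `target` given `cond`, using at most one threshold split."""
    # Option 1: a single leaf that ignores cond (1 bit signals "no split").
    best = 1.0 + marginal_bits(target, domain_size)
    thresholds = sorted(set(cond))[:-1]  # splitting at the maximum is a no-op
    for t in thresholds:
        left = [v for v, c in zip(target, cond) if c <= t]
        right = [v for v, c in zip(target, cond) if c > t]
        # 1 bit signals "split"; log2(#thresholds) bits identify the threshold.
        cost = (1.0 + math.log2(len(thresholds))
                + marginal_bits(left, domain_size)
                + marginal_bits(right, domain_size))
        best = min(best, cost)
    return best


def infer_direction(x, y):
    """Compare the total code length of both directions; shorter one wins."""
    kx, ky = len(set(x)), len(set(y))
    x_to_y = marginal_bits(x, kx) + conditional_bits(y, ky, x)
    y_to_x = marginal_bits(y, ky) + conditional_bits(x, kx, y)
    if x_to_y < y_to_x:
        verdict = "X -> Y"
    elif y_to_x < x_to_y:
        verdict = "Y -> X"
    else:
        verdict = "undecided"
    return verdict, x_to_y, y_to_x


# Reproducible toy data: X cycles through 0..7 and Y = [X >= 4], with 10% of
# the labels flipped. The flips are spread evenly over all values of X.
n = 800
x = [i % 8 for i in range(n)]
y = [int(xi >= 4) ^ int((i // 8) % 10 == 0) for i, xi in enumerate(x)]

verdict, x_to_y, y_to_x = infer_direction(x, y)
print(verdict, round(x_to_y, 1), round(y_to_x, 1))
```

On this toy data the cost of encoding the observations is the same in both directions, so the decision comes from the model cost alone: one split on X with two binary leaves is cheaper to describe than one split on Y with two eight-valued leaves, and the sketch therefore prefers X -> Y.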
Notes
Acknowledgements
The authors wish to thank Kailash Budhathoki for insightful discussions. Alexander Marx is supported by the International Max Planck Research School for Computer Science (IMPRS-CS). Both authors are supported by the Cluster of Excellence “Multimodal Computing and Interaction” within the Excellence Initiative of the German Federal Government.