Causal Inference on Multivariate and Mixed-Type Data

  • Alexander Marx
  • Jilles Vreeken
Conference paper
Part of the Lecture Notes in Computer Science book series (LNCS, volume 11052)


How can we discover whether X causes Y or, vice versa, Y causes X, when we are given only a sample from their joint distribution? How can we do this when X and Y may be univariate or multivariate, and of different cardinalities? And how can we do so regardless of whether X and Y are of the same or of different data types, be they discrete, numeric, or mixed? These are exactly the questions we answer. We take an information-theoretic approach based on the Minimum Description Length principle, from which it follows that first describing the data over the cause and then the data over the effect given the cause is shorter than describing them in the reverse direction. Simply put, if Y can be explained more succinctly by a set of classification or regression trees conditioned on X than X can be by trees conditioned on Y, we conclude that X causes Y. Empirical evaluation on a wide range of data shows that our method, Crack, infers the correct causal direction reliably and with high accuracy across many settings, outperforming the state of the art by a wide margin. Code related to this paper is available at:
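
The decision rule described in the abstract can be written compactly as a comparison of two-part description lengths. The notation below is an assumption made for illustration and is not taken from this page: L(X) denotes the number of bits needed to describe the data over X on its own, and L(Y | X) the number of bits needed to describe the data over Y using trees conditioned on X. How these lengths are actually computed by Crack is not specified here.

```latex
% Assumed notation: L(.) is a description length in bits.
\[
  L_{X \to Y} = L(X) + L(Y \mid X),
  \qquad
  L_{Y \to X} = L(Y) + L(X \mid Y)
\]
\[
  \text{infer } X \to Y \ \text{ if } L_{X \to Y} < L_{Y \to X},
  \qquad
  \text{infer } Y \to X \ \text{ if } L_{X \to Y} > L_{Y \to X}
\]
```

If the two totals are equal, this rule on its own gives no reason to prefer either direction.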



The authors wish to thank Kailash Budhathoki for insightful discussions. Alexander Marx is supported by the International Max Planck Research School for Computer Science (IMPRS-CS). Both authors are supported by the Cluster of Excellence “Multimodal Computing and Interaction” within the Excellence Initiative of the German Federal Government.

Supplementary material

478890_1_En_39_MOESM1_ESM.pdf (174 kb)
Supplementary material 1 (pdf 173 KB)



Copyright information

© Springer Nature Switzerland AG 2019

Authors and Affiliations

  1. Max Planck Institute for Informatics and Saarland University, Saarbrücken, Germany
