Abstract
The cause-effect pair problem, that is, finding the causal direction between two variables, has been addressed in the literature by comparing two alternative generative models, X → Y and Y → X. In this chapter, we first define what is meant by generative modeling and state the main assumptions usually invoked in the literature in this bivariate setting. We then present the theoretical identifiability problem that arises when considering causal graphs with only two variables. This leads us to the general ideas used in the literature to perform model selection based on a trade-off between complexity and fit. Three main families of methods can be identified: methods making restrictive assumptions on the class of admissible causal mechanisms, methods computing a smooth trade-off between fit and complexity, and methods exploiting independence between cause and mechanism.
Notes
- 1.
Example taken from https://en.wikipedia.org/wiki/Correlation_and_dependence.
- 2.
This database is composed of more than one hundred real cause-effect pairs with known ground truth and is available online at https://webdav.tuebingen.mpg.de/cause-effect/.
- 3.
The GaussianProcessRegressor algorithm from the Python library scikit-learn 0.19.1 [23] is used with default parameters.
- 4.
Let us note, however, that recent work [2] shows that comparing the mean squared errors obtained after fitting regression models in both directions can achieve good overall results when specific assumptions are satisfied, namely that the function ϕ representing the causal mechanism is monotonically increasing (or decreasing) and that a specific independence postulate between the variance of the noise and the derivative ϕ′ holds (see Sect. 3.5.3.4 for a description of this method). A wide comparative evaluation of all the methods, including this method, RECI, is also given in Sect. 3.6.
- 5.
However, as discussed in Sect. 3.3.2, there always exists a potential nonlinear model Y → X that holds (X := f_Y(Y, N_X)), but this model is assumed to be more complex and is rejected due to the prior assumption that linear mechanisms are simpler.
- 6.
The first four datasets are available at http://dx.doi.org/10.7910/DVN/3757KX. The Tuebingen cause-effect pairs dataset with real pairs is available at https://webdav.tuebingen.mpg.de/cause-effect/.
- 7.
Available online at https://github.com/Diviyan-Kalainathan/CausalDiscoveryToolbox.
- 8.
Computational times are measured on an Intel Xeon 2.7 GHz CPU or on an Nvidia GTX 1080Ti graphics card (GPU).
- 9.
- 10.
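The error comparison described in note 4 can be illustrated with a minimal sketch. It is not the RECI implementation of [2]: the polynomial regressor, its degree, the min-max rescaling helper, and the function names (`minmax`, `regression_mse`, `compare_regression_errors`) are assumptions chosen for illustration, and the synthetic data are generated with a monotonically increasing mechanism and small noise so that the assumptions stated in the note roughly hold.

```python
import numpy as np


def minmax(v):
    """Rescale a sample to [0, 1] so that both errors are on a comparable scale."""
    v = np.asarray(v, dtype=float)
    return (v - v.min()) / (v.max() - v.min())


def regression_mse(inputs, targets, degree=3):
    """Fit a polynomial of the given degree and return the mean squared residual."""
    coeffs = np.polyfit(inputs, targets, degree)
    residuals = targets - np.polyval(coeffs, inputs)
    return float(np.mean(residuals ** 2))


def compare_regression_errors(x, y, degree=3):
    """Prefer the direction whose regression leaves the smaller error."""
    xs, ys = minmax(x), minmax(y)
    mse_xy = regression_mse(xs, ys, degree)  # predict Y from X
    mse_yx = regression_mse(ys, xs, degree)  # predict X from Y
    direction = "X->Y" if mse_xy < mse_yx else "Y->X"
    return direction, mse_xy, mse_yx


# Synthetic pair (assumed for illustration): Y = X^3 + small additive noise.
rng = np.random.default_rng(0)
x = rng.uniform(0.0, 1.0, 500)
y = x ** 3 + rng.normal(0.0, 0.02, 500)

direction, mse_xy, mse_yx = compare_regression_errors(x, y)
print(direction, mse_xy, mse_yx)
```

On this synthetic pair the regression of Y on X should leave the smaller error, so the sketch prefers X → Y. When the assumptions of note 4 are violated, the same comparison can point in the wrong direction.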
References
Robert Axelrod and William Donald Hamilton. The evolution of cooperation. Science, 211(4489):1390–1396, 1981.
Patrick Bloebaum, Dominik Janzing, Takashi Washio, Shohei Shimizu, and Bernhard Schölkopf. Cause-effect inference by comparing regression errors. In International Conference on Artificial Intelligence and Statistics, pages 900–909, 2018.
David Maxwell Chickering. Optimal structure identification with greedy search. Journal of Machine Learning Research, 3(Nov):507–554, 2002.
Povilas Daniušis, Dominik Janzing, Joris Mooij, Jakob Zscheischler, Bastian Steudel, Kun Zhang, and Bernhard Schölkopf. Inferring deterministic causal relations. In Proceedings of the Twenty-Sixth Conference on Uncertainty in Artificial Intelligence, UAI’10, pages 143–150, Arlington, Virginia, United States, 2010. AUAI Press. ISBN 978-0-9749039-6-5. http://dl.acm.org/citation.cfm?id=3023549.3023566.
Bruce Edmonds and Scott Moss. From KISS to KIDS – an ‘anti-simplistic’ modelling approach. In International Workshop on Multi-Agent Systems and Agent-Based Simulation, pages 130–144. Springer, 2004.
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. In Neural Information Processing Systems (NIPS), pages 2672–2680, 2014.
Olivier Goudet, Diviyan Kalainathan, Philippe Caillou, Isabelle Guyon, David Lopez-Paz, and Michèle Sebag. Causal generative neural networks. arXiv preprint arXiv:1711.08936, 2017.
Clive WJ Granger. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, pages 424–438, 1969.
Arthur Gretton, Olivier Bousquet, Alex Smola, and Bernhard Schölkopf. Measuring statistical dependence with Hilbert-Schmidt norms. In International Conference on Algorithmic Learning Theory, pages 63–77. Springer, 2005.
Arthur Gretton, Karsten M Borgwardt, Malte Rasch, Bernhard Schölkopf, Alexander J Smola, et al. A kernel method for the two-sample-problem. Advances in Neural Information Processing Systems, 19:513, 2007.
Isabelle Guyon. Chalearn cause effect pairs challenge, 2013. http://www.causality.inf.ethz.ch/cause-effect.php.
Patrik O Hoyer, Dominik Janzing, Joris M Mooij, Jonas Peters, and Bernhard Schölkopf. Nonlinear causal discovery with additive noise models. In Neural Information Processing Systems (NIPS), pages 689–696, 2009.
Aapo Hyvärinen and Stephen M Smith. Pairwise likelihood ratios for estimation of non-gaussian structural equation models. Journal of Machine Learning Research, 14(Jan):111–152, 2013.
Dominik Janzing and Bernhard Schölkopf. Causal inference using the algorithmic Markov condition. IEEE Transactions on Information Theory, 56(10):5168–5194, 2010.
Dominik Janzing and Bernhard Schölkopf. Detecting confounding in multivariate linear models via spectral analysis. Journal of Causal Inference, 6(1), 2018.
David Lopez-Paz and Maxime Oquab. Revisiting classifier two-sample tests. arXiv preprint arXiv:1610.06545, 2016.
Alexander Marx and Jilles Vreeken. Causal inference on multivariate and mixed-type data. arXiv preprint arXiv:1702.06385, 2017.
Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
Jovana Mitrovic, Dino Sejdinovic, and Yee Whye Teh. Causal inference via kernel deviance measures. arXiv preprint arXiv:1804.04622, 2018.
Joris M Mooij, Jonas Peters, Dominik Janzing, Jakob Zscheischler, and Bernhard Schölkopf. Distinguishing cause from effect using observational data: methods and benchmarks. Journal of Machine Learning Research, 17(32):1–102, 2016.
Judea Pearl. Causality: models, reasoning and inference. Econometric Theory, 19(675–685):46, 2003.
Judea Pearl. Causality. Cambridge University Press, 2009.
F. Pedregosa, G. Varoquaux, A. Gramfort, V. Michel, B. Thirion, O. Grisel, M. Blondel, P. Prettenhofer, R. Weiss, V. Dubourg, J. Vanderplas, A. Passos, D. Cournapeau, M. Brucher, M. Perrot, and E. Duchesnay. Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12:2825–2830, 2011.
Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Causal inference on discrete data using additive noise models. IEEE Transactions on Pattern Analysis and Machine Intelligence, 33(12):2436–2450, 2011.
Jonas Peters, Joris M Mooij, Dominik Janzing, and Bernhard Schölkopf. Causal discovery with continuous additive noise models. The Journal of Machine Learning Research, 15(1):2009–2053, 2014.
Dominik Rothenhäusler, Christina Heinze, Jonas Peters, and Nicolai Meinshausen. Backshift: Learning causal cyclic graphs from unknown shift interventions. In Advances in Neural Information Processing Systems, pages 1513–1521, 2015.
Eleni Sgouritsa, Dominik Janzing, Philipp Hennig, and Bernhard Schölkopf. Inference of cause and effect with unsupervised inverse regression. In Artificial Intelligence and Statistics, pages 847–855, 2015.
Shohei Shimizu, Patrik O Hoyer, Aapo Hyvärinen, and Antti Kerminen. A linear non-gaussian acyclic model for causal discovery. Journal of Machine Learning Research, 7(Oct):2003–2030, 2006.
Galit Shmueli et al. To explain or to predict? Statistical Science, 25(3):289–310, 2010.
Peter Spirtes, Clark N Glymour, and Richard Scheines. Causation, prediction, and search. MIT Press, 2000.
Oliver Stegle, Dominik Janzing, Kun Zhang, Joris M Mooij, and Bernhard Schölkopf. Probabilistic latent variable models for distinguishing between cause and effect. In Neural Information Processing Systems (NIPS), pages 1687–1695, 2010.
Xiaohai Sun, Dominik Janzing, and Bernhard Schölkopf. Causal inference by choosing graphs with most plausible Markov kernels. In ISAIM, 2006.
Chris S Wallace and Peter R Freeman. Estimation and inference by compact coding. Journal of the Royal Statistical Society. Series B (Methodological), pages 240–265, 1987.
Kun Zhang and Aapo Hyvärinen. Distinguishing causes from effects using nonlinear acyclic causal models. In Proceedings of the 2008 International Conference on Causality: Objectives and Assessment, Volume 6, pages 157–164. JMLR.org, 2008.
Kun Zhang and Aapo Hyvärinen. On the identifiability of the post-nonlinear causal model. In Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, pages 647–655. AUAI Press, 2009.
Kun Zhang, Zhikun Wang, Jiji Zhang, and Bernhard Schölkopf. On estimation of functional causal models: general results and application to the post-nonlinear causal model. ACM Transactions on Intelligent Systems and Technology (TIST), 7(2):13, 2016.
Acknowledgements
The authors would like to thank Daniel Rolland for proofreading this document, as well as the reviewers for their constructive feedback.
Copyright information
© 2019 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Goudet, O., Kalainathan, D., Sebag, M., Guyon, I. (2019). Learning Bivariate Functional Causal Models. In: Guyon, I., Statnikov, A., Batu, B. (eds) Cause Effect Pairs in Machine Learning. The Springer Series on Challenges in Machine Learning. Springer, Cham. https://doi.org/10.1007/978-3-030-21810-2_3
DOI: https://doi.org/10.1007/978-3-030-21810-2_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-21809-6
Online ISBN: 978-3-030-21810-2
eBook Packages: Computer Science, Computer Science (R0)