Abstract
The notion of causality lies at the heart of science, but when statistics tries to address it, some profound questions remain unanswered. How is statistical inference in probabilistic terms linked with causality? Do modern causality models offer anything substantially different from traditional dependency models such as regression or decision trees and, if so, do they deliver on these promises? How are causality models related to statistical and machine learning techniques? What is the relationship between causality modeling, statistical inference, and machine learning on one side and operations research and optimization on the other? Or, more generally: if a causal picture of the world is a commonly accepted goal of any science, can non-causal statistical models be of any use? If yes, in what sense? If not, why are they so widely used? The insufficient level of detail in discussions of these and similar problems creates a lot of confusion, especially now, when lauded terms like Data Mining, Big Data, Deep Learning, and others appear even in the non-professional media. This paper inspects the underlying logic of different approaches related, directly or indirectly, to causality. It shows that even established methods are vulnerable to small deviations from the ideal setting and that the leading approaches to statistical causality, Structural Equation Modeling (SEM), Directed Acyclic Graphs (DAG), and Potential Outcomes (PO), do not provide a coherent theory of causality; it argues that such a theory is impossible on purely statistical grounds. It also discusses a new approach in which the concept of causality is replaced by the concept of dependent variable generation. The paper proposes separating the variables that generate the outcome from those merely correlated with it, which often also separates causal from non-causal variables.
Acknowledgements
The study of causality was supported by Telmar Inc., and some of the results were incorporated in its software. The author sincerely thanks I. Lipkovich and S. Lipovetsky for numerous fruitful discussions and B. Mirkin for very meaningful comments and suggestions.
Copyright information
© 2018 Springer Nature Switzerland AG
About this chapter
Cite this chapter
Mandel, I. (2018). Causality Modeling and Statistical Generative Mechanisms. In: Rozonoer, L., Mirkin, B., Muchnik, I. (eds) Braverman Readings in Machine Learning. Key Ideas from Inception to Current State. Lecture Notes in Computer Science, vol 11100. Springer, Cham. https://doi.org/10.1007/978-3-319-99492-5_7
DOI: https://doi.org/10.1007/978-3-319-99492-5_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-99491-8
Online ISBN: 978-3-319-99492-5
eBook Packages: Computer Science, Computer Science (R0)