Fast causal inference with non-random missingness by test-wise deletion

  • Eric V. Strobl
  • Shyam Visweswaran
  • Peter L. Spirtes
Regular Paper


Many real datasets contain values missing not at random (MNAR). In this scenario, investigators often perform list-wise deletion, or delete samples with any missing values, before applying causal discovery algorithms. List-wise deletion is a sound and general strategy when paired with algorithms such as FCI and RFCI, but the deletion procedure also eliminates otherwise good samples that contain only a few missing values. In this report, we show that we can more efficiently utilize the observed values with test-wise deletion while still maintaining algorithmic soundness. Here, test-wise deletion refers to the process of list-wise deleting samples only among the variables required for each conditional independence (CI) test used in constraint-based searches. Test-wise deletion therefore often saves more samples than list-wise deletion for each CI test, especially when we have a sparse underlying graph. Our theoretical results show that test-wise deletion is sound under the justifiable assumption that none of the missingness mechanisms causally affect each other in the underlying causal graph. We also find that FCI and RFCI with test-wise deletion outperform their list-wise deletion and imputation counterparts on average when MNAR holds in both synthetic and real data.


Causal inference · Missing values · Missing not at random · MNAR



Research reported in this publication was supported by Grant U54HG008540 awarded by the National Human Genome Research Institute through funds provided by the trans-NIH Big Data to Knowledge initiative. The research was also supported by the National Library of Medicine of the National Institutes of Health under award numbers T15LM007059 and R01LM012095. The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health.



Copyright information

© Springer International Publishing AG, part of Springer Nature 2018

Authors and Affiliations

  1. University of Pittsburgh, Pittsburgh, USA
  2. Carnegie Mellon University, Pittsburgh, USA
