The Sample Average Treatment Effect

  • Laura B. Balzer
  • Maya L. Petersen
  • Mark J. van der Laan
Part of the Springer Series in Statistics book series (SSS)


In cluster randomized trials (CRTs), the study units usually are not a simple random sample from some clearly defined target population. Instead, the target population tends to be hypothetical or ill-defined, and the selection of study units tends to be systematic, driven by logistical and practical considerations. As a result, the population average treatment effect (PATE) may be neither well defined nor easily interpretable. In contrast, the sample average treatment effect (SATE) is the mean difference in the counterfactual outcomes for the study units. The sample parameter is easily interpretable and arguably the most relevant when the study units are not sampled from some specific super-population of interest. Furthermore, in most settings the sample parameter will be estimated more efficiently than the population parameter.
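
The contrast between the two parameters can be written out explicitly. The notation below is an assumption for illustration and does not appear in the abstract: $n$ is the number of study units, and $Y_i(1)$ and $Y_i(0)$ are the counterfactual outcomes of unit $i$ under the intervention and the control, respectively.

```latex
% Assumed notation (not from the abstract): n study units,
% Y_i(a) = counterfactual outcome of unit i under treatment level a.
\[
\text{SATE} \;=\; \frac{1}{n}\sum_{i=1}^{n}\bigl[\,Y_i(1) - Y_i(0)\,\bigr],
\qquad
\text{PATE} \;=\; \mathbb{E}\bigl[\,Y(1) - Y(0)\,\bigr].
\]
```

Under this sketch, the SATE averages over the units actually enrolled in the trial, whereas the expectation defining the PATE is taken over a target population, which is why the latter needs that population to be well defined.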



Research reported in this chapter was supported by Division of AIDS, NIAID of the National Institutes of Health under award numbers R01-AI074345, R37-AI051164, UM1AI069502 and U01AI099959. The content is solely the responsibility of the authors and does not necessarily represent the official views of the NIH.



Copyright information

© Springer International Publishing AG 2018

Authors and Affiliations

  • Laura B. Balzer (1)
  • Maya L. Petersen (2)
  • Mark J. van der Laan (3)

  1. Department of Biostatistics and Epidemiology, School of Public Health and Health Sciences, University of Massachusetts - Amherst, Amherst, USA
  2. Division of Epidemiology and Division of Biostatistics, University of California, Berkeley, Berkeley, USA
  3. Division of Biostatistics and Department of Statistics, University of California, Berkeley, Berkeley, USA
