D-Optimal Design for Network A/B Testing

  • Victoria Pokhilko
  • Qiong Zhang
  • Lulu Kang
  • D’arcy P. Mays
Original Article
Part of the following topical collections:
  1. Algorithms, Analysis and Advanced Methodologies in the Design of Experiments

Abstract

A/B testing is the statistical procedure of designing and analyzing experiments that compare two treatments, A and B, applied to different test subjects. Technology companies such as Facebook, LinkedIn, and Netflix use it widely to compare algorithms, web designs, and other online products and services. The subjects in these online experiments are users connected through social networks of various scales. Two connected subjects are similar in their social behaviors, education and financial background, and other demographic aspects. It is therefore natural to assume that their reactions to online products and services are related to their network adjacency. In this paper, we propose using the conditional auto-regressive model to represent the network structure and to include network effects in the estimation and inference of the treatment effect. We develop a D-optimal design criterion based on the proposed model, together with mixed integer programming formulations for obtaining the D-optimal designs. Numerical results on synthetic networks and real social networks demonstrate the effectiveness of the proposed method.
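
To make the idea of a D-optimal allocation under network-correlated errors concrete, the following is a minimal sketch and not the formulation from the paper. It assumes a linear model y = mu + tau * x + error with x in {-1, +1}, a proper CAR precision matrix of the common form D - rho * W (W the adjacency matrix, D the diagonal degree matrix), and a known rho. It also replaces the mixed integer programming step with a brute-force search that is only feasible for very small networks. All function names are hypothetical.

```python
import itertools
import numpy as np

def car_precision(W, rho):
    """Assumed CAR precision matrix D - rho * W (up to a variance scale)."""
    D = np.diag(W.sum(axis=1))
    return D - rho * W

def d_criterion(x, Q):
    """Determinant of the information matrix for (mu, tau) under precision Q."""
    X = np.column_stack([np.ones(len(x)), x])
    return np.linalg.det(X.T @ Q @ X)

def best_allocation(W, rho):
    """Exhaustive search over all +/-1 allocations (stand-in for a MIP solver)."""
    Q = car_precision(W, rho)
    n = W.shape[0]
    best_x, best_val = None, -np.inf
    for signs in itertools.product([-1.0, 1.0], repeat=n):
        x = np.array(signs)
        val = d_criterion(x, Q)
        if val > best_val + 1e-12:  # keep the first maximizer found
            best_x, best_val = x, val
    return best_x, best_val

# Toy network: a path graph on four users, 0 - 1 - 2 - 3.
W = np.zeros((4, 4))
for i in range(3):
    W[i, i + 1] = W[i + 1, i] = 1.0

x_opt, val = best_allocation(W, rho=0.5)
print(x_opt, val)
```

On this toy path graph with a positive rho, the search returns an allocation in which neighboring users receive opposite treatments, while assigning every user the same treatment gives a zero determinant because the treatment column is then collinear with the intercept. Whether the same pattern holds under the paper's model and constraints is not something this sketch establishes.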

Keywords

  • A/B testing
  • Conditional auto-regressive model
  • D-optimal design
  • Mixed integer programming
  • Social network

Notes

Acknowledgements

We sincerely thank the editor, the handling editor, and all the referees for their insightful comments, which helped us improve the paper. This research was supported by US National Science Foundation Grants CMMI-1435902 and DMS-1916467.

Compliance with Ethical Standards

Conflict of interest

On behalf of all authors, the corresponding author states that there is no conflict of interest.

Copyright information

© Grace Scientific Publishing 2019

Authors and Affiliations

  1. Department of Statistical Sciences and Operations Research, Virginia Commonwealth University, Richmond, USA
  2. School of Mathematical and Statistical Sciences, Clemson University, Clemson, USA
  3. Department of Applied Mathematics, Illinois Institute of Technology, Chicago, USA
