An Activity and Metric Model for Online Controlled Experiments

  • Conference paper

Part of the book series: Lecture Notes in Computer Science (LNPSE, volume 11271)

Abstract

Accurate prioritization of development efforts in products and services is critical to the success of every company. Online controlled experiments, also known as A/B tests, enable software companies to establish causal relationships between changes to their systems and movements in their metrics. By experimenting, product development can be directed towards identifying and delivering value. Previous research stresses the need for data-driven development and experimentation. However, existing models describe the experimentation process at a level of granularity that is neither sufficiently detailed nor scalable to running a growing number and variety of experiments in an online setting. Based on a case study of multiple products running online controlled experiments at Microsoft, we provide an experimentation framework composed of two detailed models covering the two main aspects of experimentation: the experimentation activities and the experimentation metrics. This work intends to provide guidelines to companies and practitioners on how to set up and organize experimentation activities for running trustworthy online controlled experiments.
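To make the abstract's notion of a causal link between a system change and a metric movement concrete, the following is a minimal sketch of how one metric from a two-variant experiment might be analyzed. This is an illustrative assumption on our part, not the framework the paper proposes; the metric values and the 0.05 threshold are hypothetical.

```python
# A minimal, illustrative sketch: analyzing one metric from a two-variant
# online controlled experiment with Welch's t-test. The data and the 0.05
# threshold are hypothetical; this is not the framework proposed in the paper.
from scipy import stats

# Hypothetical per-user metric values (e.g., sessions per user) for the
# control (A) and treatment (B) variants.
control = [3.1, 2.8, 3.4, 2.9, 3.0, 3.2, 2.7, 3.3]
treatment = [3.4, 3.1, 3.6, 3.0, 3.5, 3.3, 3.2, 3.7]

# Welch's t-test: is the metric movement larger than chance would explain?
t_stat, p_value = stats.ttest_ind(treatment, control, equal_var=False)

# Observed difference in means between treatment and control.
delta = sum(treatment) / len(treatment) - sum(control) / len(control)
print(f"observed delta: {delta:.3f}, p-value: {p_value:.3f}")

# Because users are randomly assigned to A or B, a small p-value supports
# a causal link between the change and the metric movement.
if p_value < 0.05:
    print("statistically significant metric movement")
```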



Acknowledgments

This work was partially supported by the Wallenberg Artificial Intelligence, Autonomous Systems and Software Program (WASP), funded by the Knut and Alice Wallenberg Foundation. The authors would like to thank Microsoft’s Analysis and Experimentation team for the opportunity to conduct this study with them.

Author information

Correspondence to David Issa Mattos.


Copyright information

© 2018 Springer Nature Switzerland AG

About this paper

Cite this paper

Issa Mattos, D., Dmitriev, P., Fabijan, A., Bosch, J., Holmström Olsson, H. (2018). An Activity and Metric Model for Online Controlled Experiments. In: Kuhrmann, M., et al. (eds.) Product-Focused Software Process Improvement. PROFES 2018. Lecture Notes in Computer Science, vol. 11271. Springer, Cham. https://doi.org/10.1007/978-3-030-03673-7_14

  • DOI: https://doi.org/10.1007/978-3-030-03673-7_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-03672-0

  • Online ISBN: 978-3-030-03673-7

  • eBook Packages: Computer Science, Computer Science (R0)
