What Cannot Be Measured Cannot Be Controlled: Gauging Success with A/B Tests

  • Alexander Paprotny
  • Michael Thess
Part of the Applied and Numerical Harmonic Analysis book series (ANHA)


Robust measurement of the efficiency of recommendation algorithms is an extremely important factor in the development of recommendation engines (REs). Although this topic is not directly connected to the problem of adaptive learning, this chapter provides some useful methodological remarks on it. We further propose a straightforward algorithm for calculating confidence intervals for REs. Finally, we discuss Simpson’s paradox, which illustrates the importance of constant environmental conditions for testing.


Recommendation Algorithm, Recommendation Group, Calculate Confidence Interval, Recommendation Engine, Straightforward Algorithm
These keywords were added by machine and not by the authors. This process is experimental and the keywords may be updated as the learning algorithm improves.


  1. [Gea30]
    Geary, R.C.: The frequency distribution of the quotient of two normal variates. J. R. Stat. Soc. 93(3), 442–446 (1930)
  2. [Sem81]
    Semjonow, N.: Wissenschaft und Gesellschaft (in Russian). Nauka (1981)
  3. [SV10]
    Sieber, H., Volkmer, T.: Ein Konfidenzintervall für den Mehrumsatz bei einem A-B-Test (in German). Documentation, prudsys AG (2010)

Copyright information

© Springer International Publishing Switzerland 2013

Authors and Affiliations

  • Alexander Paprotny (1)
  • Michael Thess (2)
  1. Research and Development, prudsys AG, Berlin, Germany
  2. Research and Development, prudsys AG, Chemnitz, Germany
