Artificial intelligence and statistics

  • Bin Yu
  • Karl Kumbier


Artificial intelligence (AI) is intrinsically data-driven. It calls for the application of statistical concepts through human-machine collaboration during the generation of data, the development of algorithms, and the evaluation of results. This paper discusses how such human-machine collaboration can be approached through the statistical concepts of population, question of interest, representativeness of training data, and scrutiny of results (PQRS). The PQRS workflow provides a conceptual framework for integrating statistical ideas with human input into AI products and research. These ideas include the experimental design principles of randomization and local control, as well as the principle of stability, which is used to gain reproducibility and interpretability of algorithms and data results. We discuss the use of these principles in the contexts of self-driving cars, automated medical diagnoses, and examples from the authors’ collaborative research.


Artificial intelligence; Statistics; Human-machine collaboration

CLC number

TP391; C8





The authors thank Bryan Liu and Rebecca Barter for their helpful comments.



Copyright information

© Zhejiang University and Springer-Verlag GmbH Germany, part of Springer Nature 2018

Authors and Affiliations

  1. Department of Statistics, University of California, Berkeley, USA
  2. Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, USA
