CLC number: TP391; C8

On-line Access: 2018-03-10

Received: 2017-12-07

Revision Accepted: 2018-01-10

Crosschecked: 2018-01-28


Frontiers of Information Technology & Electronic Engineering  2018 Vol.19 No.1 P.6-9


Artificial intelligence and statistics

Author(s):  Bin Yu, Karl Kumbier

Affiliation(s):  Department of Statistics, University of California, Berkeley, CA 94720, USA

Corresponding email(s):   binyu@stat.berkeley.edu

Key Words:  Artificial intelligence, Statistics, Human-machine collaboration

Bin Yu, Karl Kumbier. Artificial intelligence and statistics[J]. Frontiers of Information Technology & Electronic Engineering, 2018, 19(1): 6-9.

@article{YuKumbier2018,
title="Artificial intelligence and statistics",
author="Bin Yu and Karl Kumbier",
journal="Frontiers of Information Technology \& Electronic Engineering",
volume="19", number="1", pages="6-9", year="2018",
publisher="Zhejiang University Press \& Springer",
doi="10.1631/FITEE.1700813"
}

%0 Journal Article
%T Artificial intelligence and statistics
%A Bin Yu
%A Karl Kumbier
%J Frontiers of Information Technology & Electronic Engineering
%V 19
%N 1
%P 6-9
%@ 2095-9184
%D 2018
%I Zhejiang University Press & Springer
%DOI 10.1631/FITEE.1700813

Abstract: Artificial intelligence (AI) is intrinsically data-driven. It calls for the application of statistical concepts through human-machine collaboration during the generation of data, the development of algorithms, and the evaluation of results. This paper discusses how such human-machine collaboration can be approached through the statistical concepts of population, question of interest, representativeness of training data, and scrutiny of results (PQRS). The PQRS workflow provides a conceptual framework for integrating statistical ideas with human input into AI products and research. These ideas include the experimental design principles of randomization and local control, as well as the principle of stability, which is used to gain reproducibility and interpretability of algorithms and data results. We discuss the use of these principles in the contexts of self-driving cars, automated medical diagnoses, and examples from the authors’ collaborative research.


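Summary: Artificial intelligence (AI) is inherently data-driven. Many statistical concepts need to be applied as it carries out data generation, algorithm development, and evaluation of results through human-machine collaboration. This paper discusses how to address human-machine collaboration through the stages of population, question of interest, representativeness of training data, and scrutiny of results (PQRS). The PQRS workflow provides a conceptual framework for combining statistical ideas with human input. These statistical ideas include the principles of randomization, local control, and stability, which are used to obtain reproducibility and interpretability of algorithms and results. We discuss the application of these principles in self-driving cars, automated medical care, and the authors' other collaborative research.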
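The principles named in the abstract (randomization, local control, and stability) can be made concrete with a small sketch. The code below is an illustration under stated assumptions, not the authors' method: `blocked_randomization`, `selection_stability`, and `top_feature` are hypothetical helper names, the synthetic data are invented for the example, and bootstrap resampling is only one of several ways one might perturb data to probe stability.

```python
import random
from collections import Counter


def blocked_randomization(units, block_of, treatments, seed=0):
    """Randomize treatments within each block (randomization + local control)."""
    rng = random.Random(seed)
    blocks = {}
    for u in units:
        blocks.setdefault(block_of(u), []).append(u)
    assignment = {}
    for members in blocks.values():
        shuffled = members[:]
        rng.shuffle(shuffled)
        # Cycle through treatments so each block is as balanced as possible.
        for i, u in enumerate(shuffled):
            assignment[u] = treatments[i % len(treatments)]
    return assignment


def selection_stability(data, select, n_boot=200, seed=0):
    """Fraction of bootstrap resamples in which each item is selected."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_boot):
        resample = [rng.choice(data) for _ in data]
        counts.update(set(select(resample)))
    return {k: v / n_boot for k, v in counts.items()}


def top_feature(rows):
    """Hypothetical selector: index of the feature with the largest
    absolute covariance with the response (last column of each row)."""
    n = len(rows)
    y_mean = sum(r[-1] for r in rows) / n
    best, best_cov = None, -1.0
    for j in range(len(rows[0]) - 1):
        x_mean = sum(r[j] for r in rows) / n
        cov = abs(sum((r[j] - x_mean) * (r[-1] - y_mean) for r in rows) / n)
        if cov > best_cov:
            best, best_cov = j, cov
    return [best]


# Eight units in two blocks of four; each block receives both treatments.
units = list(range(8))
assignment = blocked_randomization(
    units, block_of=lambda u: u // 4, treatments=["control", "treated"]
)

# Synthetic data (an assumption for illustration): x0 drives y, x1 is noise.
rng = random.Random(1)
rows = []
for _ in range(50):
    x0, x1 = rng.gauss(0, 1), rng.gauss(0, 1)
    rows.append((x0, x1, 2 * x0 + rng.gauss(0, 0.5)))

stability = selection_stability(rows, top_feature)
print(assignment)
print(stability)
```

Randomizing within blocks keeps treatment groups comparable on the blocking variable. The stability score then indicates how often a data-driven conclusion (here, which feature is selected) survives a reasonable perturbation of the data. A conclusion that appears in only a small fraction of resamples would warrant the kind of scrutiny the PQRS workflow calls for.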

