Designing experiments on social, healthcare, and information networks
Prof. Edoardo Airoldi
Millard E. Gladfelter Professor of Statistics and Data Science
Professor of Finance
Fox School of Business
Temple University
ABSTRACT
Designing experiments that can estimate the causal effects of an intervention when the units of analysis are connected through a network is a primary interest, and a major challenge, in many modern endeavors at the nexus of science, technology, and society. Examples include HIV testing and awareness campaigns on mobile phones, improving healthcare in rural populations using social interventions, promoting standard-of-care practices among US oncologists on dedicated social media platforms, gaining a mechanistic understanding of cellular regulatory dynamics, and evaluating the impact of technological innovations that enable multi-sided market platforms. A salient technical feature of all these problems is that the response measured on any one unit likely also depends on the interventions given to other units, a situation referred to as “interference” in the parlance of statistics and machine learning. Importantly, the causal effect of interference itself is often among the inferential targets of interest. Classical approaches to causal inference, however, largely rely on the assumption of “lack of interference”, and/or on designing experiments that limit the role of interference, treating it as a nuisance. Classical approaches also rely on additional simplifying assumptions, including the absence of strategic behavior, that are untenable in many modern endeavors. In the technical portion of this talk, we will formalize, within the potential outcomes framework, the issues that arise in estimating causal effects when interference can be attributed to a network among the units of analysis. We will introduce and discuss several strategies for experimental design in this context, centered on a useful role for statistical and machine learning models. In particular, we would like certain finite-sample properties of the estimator to hold even if the model fails catastrophically, while gaining efficiency when certain aspects of the model are correct. We will then contrast design-based, model-based, and model-assisted approaches to experimental design from a decision-theoretic perspective.
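
As a minimal sketch of the notation involved, using standard potential-outcomes conventions rather than the talk’s own formalization: with N units and a binary assignment vector \(\mathbf{z} = (z_1, \dots, z_N) \in \{0,1\}^N\), the classical no-interference assumption lets each potential outcome be written as a function of the unit’s own assignment only, \(Y_i(\mathbf{z}) = Y_i(z_i)\); under network interference the outcome may depend on the entire assignment vector, so that, for example, a unit-level total treatment effect takes the form
\[
  \tau_i \;=\; Y_i(\mathbf{z} = \mathbf{1}) \;-\; Y_i(\mathbf{z} = \mathbf{0}),
\]
and estimating averages of such contrasts from a single observed assignment is what makes experimental design under interference difficult.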