April 10, 2025
Randomization is often infeasible or unethical
Public policies are expensive…
Relying on spatial geographies
Observed data: A tuple of \(n\)-vectors, \(O_1, \ldots, O_n\), where \[O = (L, A, Y) \sim \P_0 \in \M\]
Network \(\mathbf{F}\): a known adjacency matrix encoding each unit’s neighbors.
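For concreteness, a hypothetical numpy layout of these objects (an illustration only, not tied to any particular software):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # number of units (toy example)

# Observed data O = (L, A, Y): each component stored as an n-vector
L = rng.normal(size=n)  # covariates
A = rng.normal(size=n)  # exposures
Y = rng.normal(size=n)  # outcomes

# Known network F: a symmetric 0/1 adjacency matrix with no self-loops
F = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 1, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 1, 0]])
```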
Per Hudgens and Halloran (2008), interference occurs when the potential outcome of one unit is affected by the exposure of another unit: \[Y_i(a_i, a_j) \neq Y_i(a_i, a_j') \text{ if } a_j \neq a_j'\]
This violates consistency and SUTVA.
Network interference: Potential outcomes depend on the exposures of neighboring units given by the adjacency matrix \(\mathbf{F}\) (van der Laan 2014).
Under interference, consider the following structural equation: \[Y_i = f(s_A(\{A_j : j \in \mathbf{F}_i\}), s_L(\{L_j : j \in \mathbf{F}_i\}))\]
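For example, with the sum summary used in the running example below, both summaries reduce to matrix-vector products with the adjacency matrix (a minimal numpy sketch):

```python
import numpy as np

def summarize(F, x):
    """Apply the sum summary over each unit's neighborhood: s(x)_i = sum_{j in F_i} x_j."""
    return F @ x

# Toy network, exposures, and a single covariate
F = np.array([[0, 1, 1],
              [1, 0, 0],
              [1, 0, 0]])
A = np.array([0.5, 1.0, -0.2])
L = np.array([1.0, 2.0, 3.0])

A_s = summarize(F, A)  # s_A({A_j : j in F_i}) for each unit i
L_s = summarize(F, L)  # s_L({L_j : j in F_i}) for each unit i
```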
Measuring the effect of a population-level intervention requires that the intervention be possible within the network.
Summary \(s(A)\) yields a new potential outcome \(Y_i(s(a))\)
…but it may be impossible to set \(s(A) = s(a)\) for all units
Example: setting \(A = 1\) for every unit with
\[s(A)_i = \sum_{j \in \mathbf{F}_i} A_j\]
yields \(s(A)_i = |\mathbf{F}_i|\), which differs across units unless the network is regular (see the sketch below).
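A toy numerical check of this point (numpy sketch; the network below is arbitrary):

```python
import numpy as np

F = np.array([[0, 1, 1, 0],
              [1, 0, 0, 0],
              [1, 0, 0, 1],
              [0, 0, 1, 0]])

A = np.ones(F.shape[0])  # set A = 1 for every unit
s_A = F @ A              # s(A)_i = sum of neighbors' exposures = degree |F_i|
print(s_A)               # [2. 1. 2. 1]: the summary differs across units, so no single
                         # value s(a) can be imposed on all of them
```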
A modified treatment policy (MTP) is a user-specified function \(d(A, L; \delta)\) that maps the observed exposure \(A\) to a post-intervention value \(A^+\).
\[d(A, L;\delta) = \begin{cases}A + \delta \cdot L & A \in \mathcal{A}(L) \\ A & \text{otherwise}\end{cases}\]
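A small numpy sketch of such a policy; the feasible-set rule below is a made-up placeholder for \(\mathcal{A}(L)\):

```python
import numpy as np

def mtp(A, L, delta, feasible):
    """Modified treatment policy d(A, L; delta): shift the exposure by delta * L for units
    whose exposure lies in the feasible set A(L); leave all other units unchanged."""
    return np.where(feasible(A, L), A + delta * L, A)

# Hypothetical feasible set A(L): shift only units whose exposure is below 2 * L
feasible = lambda A, L: A < 2.0 * L

A = np.array([0.5, 3.0, 1.0])
L = np.array([1.0, 1.0, 2.0])
A_plus = mtp(A, L, delta=0.1, feasible=feasible)  # post-intervention exposure A+
```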
Induced MTP: Function \(h\) satisfying \(h \circ s = s \circ d\)
The population intervention effect of an induced MTP is
\[\begin{align*} \theta_n &= \E\Big(\frac{1}{n}\sum_{i=1}^n Y_i\big(s(d(A, L; \delta))_i\big)\Big) \\ &= \E\Big(\frac{1}{n}\sum_{i=1}^n Y_i\big(h(A_{s,i})\big)\Big) \end{align*}\]
This is equivalent to identifying the MTP effect under intervention \(h\)
If \(d(A)_i = A_i + \delta\), and \(s(A)_i = \sum_{j \in \mathbf{F}_i} A_j\), then
\[h(s(A))_i = s(A)_i + \delta\cdot |\mathbf{F}_i|\]
And the causal estimand \(\theta_n\) is
\[\E\Big(\frac{1}{n}\sum_{i=1}^n Y_i(s(A)_i + \delta\cdot|\mathbf{F}_i|)\Big)\]
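A quick numerical check of this identity (a toy numpy sketch under the additive shift and sum summary; the network and values are arbitrary):

```python
import numpy as np

F = np.array([[0, 1, 1, 0],
              [1, 0, 0, 1],
              [1, 0, 0, 0],
              [0, 1, 0, 0]])
A = np.array([0.2, -0.5, 1.0, 0.3])
delta = 0.5

s = lambda A: F @ A                          # summary: s(A)_i = sum_{j in F_i} A_j
d = lambda A: A + delta                      # MTP: d(A)_i = A_i + delta
h = lambda A_s: A_s + delta * F.sum(axis=1)  # induced MTP: h(s(A))_i = s(A)_i + delta * |F_i|

# The induced MTP commutes with the summary: h(s(A)) = s(d(A))
assert np.allclose(h(s(A)), s(d(A)))
```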
\[\psi_n = \E\Big(\frac{1}{n}\sum_{i=1}^n \E(Y_i \mid A_{s,i} = h(A_{s,i}), L_{s,i})\Big)\]
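For intuition, a naive substitution-style sketch of \(\psi_n\) (not the efficient estimator developed below; the linear outcome regression is a placeholder assumption):

```python
import numpy as np

def fit_linear_m(Y, A_s, L_s):
    """Fit a (placeholder) linear outcome regression m(A_s, L_s) for E(Y | A_s, L_s)."""
    X = np.column_stack([np.ones_like(A_s), A_s, L_s])
    beta, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return lambda a_s, l_s: np.column_stack([np.ones_like(a_s), a_s, l_s]) @ beta

def plug_in_psi(m_hat, A_s, L_s, h):
    """Substitution estimate: evaluate m at the shifted summary exposure h(A_s) and average."""
    return float(np.mean(m_hat(h(A_s), L_s)))
```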
Two sets of assumptions are needed to compute this from data:
Identification: when does causal \(\theta_n\) equal a statistical functional \(\psi_n\)?
Estimation: when is \(\psi_n\) efficiently estimable via semiparametric theory?
A0 (SCM). Data are generated from a structural causal model:
\[\begin{align*} L_i &= f_L(\varepsilon_{L_i}) \\ A_i &= f_A(L_{s,i}, \varepsilon_{A_i}) \\ Y_i &= f_Y(A_{s,i}, L_{s,i}, \varepsilon_{Y_i}) \ , \end{align*}\] with the error vectors of any two units independent whenever those units do not share a neighbor.
A1 (Positivity). \((h(a_s), l_s) \in \text{supp}(A_s, L_s)\) if \((a_s, l_s) \in \text{supp}(A_s, L_s)\).
A2 (No unmeasured confounding). \(Y(a_s) \indep A_s \mid L_s\)
A3 (Piecewise smooth invertibility).
\[ h(a_s, l_s) = \sum_{k=1}^K h_k(a_s, l_s) \cdot \I(a_s \in \mathcal{A}_k(l_s)) \] such that \(h^{-1}_k\) as a function of \(a_s\) is (piecewise) differentiable for all \(k\).
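As a concrete instance (the additive-shift running example, with \(K = 1\) and the unit’s degree \(|\mathbf{F}_i|\) treated as fixed and known), A3 holds immediately:
\[
h(a_s, l_s) = a_s + \delta\,|\mathbf{F}_i|, \qquad
h_1^{-1}(a_s, l_s) = a_s - \delta\,|\mathbf{F}_i|, \qquad
\frac{\partial}{\partial a_s}\, h_1^{-1}(a_s, l_s) = 1 .
\]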
Theorem 1: If \(h\) is piecewise differentiable, then \(s\) must be piecewise linear for A3 to hold for any \(\mathbf{F}\).
Theorem 2: If A3 holds and \(s\) is piecewise linear, then \[ d(A_i, L_i; \delta) = \alpha(\delta) A_i + \beta_i(\delta, L_i) \I(A_i \in \mathcal{A}) \]
Consequences:
Construct an efficient estimator based on the efficient influence function
The efficient influence function of \(\psi_n\), a special case of the EIF for the counterfactual mean of a stochastic intervention (Ogburn et al. 2022), is
\[\begin{align*} \bar{\phi}(O) &= \frac{1}{n}\sum_{i=1}^n \Big\{w(A_{s,i}, L_{s,i})\big(Y_i - m(A_{s,i}, L_{s,i})\big)\\ &\qquad + \E\big(m(h(A_{s,i}, L_{s,i}; \delta), L_{s,i}) \mid L = l\big)\Big\} - \psi_n \ , \end{align*}\] where the weight \(w(A_{s,i}, L_{s,i})\) is the product of a ratio of conditional densities and \(h^{'(-1)}(A_{s,i})\), and \(m(A_{s,i}, L_{s,i})\) is the outcome regression
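In code, an estimator that solves the empirical EIF equation takes roughly this form (a hedged sketch: fitting the weight \(w\) and the outcome regression \(m\) is assumed to happen elsewhere, and the conditional expectation in the second term is approximated by plugging in the observed covariate summaries):

```python
import numpy as np

def eif_estimate(Y, A_s, L_s, m_hat, w_hat, h):
    """One-step-style estimate of psi_n that sets the empirical mean of the EIF to zero:
    average w * (Y - m(A_s, L_s)) + m(h(A_s), L_s) over all units."""
    residual_term = w_hat(A_s, L_s) * (Y - m_hat(A_s, L_s))
    shifted_term = m_hat(h(A_s), L_s)
    return float(np.mean(residual_term + shifted_term))
```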
Ogburn et al. (2022)’s CLT: If \(\hat{\psi}_n\) is constructed to solve \(\bar{\phi} \approx 0\) and \(K_{\text{max}}^2 / n \rightarrow 0\), then, under mild regularity conditions, \[\sqrt{C_n}(\hat{\psi}_n - \psi_n) \rightarrow \text{N}(0, \sigma^2) \ ,\] where \(K_{\text{max}}\) is the network’s maximum degree.
The estimator \(\hat{\psi}_n\) is asymptotically normal, but the appropriate scaling \(C_n\) satisfies \(n/K_{\text{max}}^2 < C_n < n\).
| Method | Learner | % Bias | Coverage | MSE |
|---|---|---|---|---|
| Network-TMLE | Correct GLM | 0.45 | 95.2% | 0.013 |
| Network-TMLE | Super Learner | -6.58 | 95.0% | 0.013 |
| IID-TMLE | Correct GLM | -103.39 | 26.0% | 0.049 |
| Linear Regression | — | -103.52 | 54.2% | 0.057 |
Serious challenges remain:
Future work may benefit from moving away from standard efficiency theory in the network interference setting
Funded by NIEHS T32 ES007142
and NSF DGE 2140743
The EIF was given in the form \(\frac{1}{n}\sum_{i=1}^n \phi_P(O_i)\), but for variance estimation each term must be centered at the mean over units with the same number of neighbors, \(N(|\mathbf{F}_i|)\):
\[\varphi_i = \phi_{\hat{P}_n}(O_i) - \frac{1}{|N(|\mathbf{F}_i|)|} \sum_{j \in N(|\mathbf{F}_i|)} \phi_{\hat{P}_n}(O_j)\]
Then, \(\hat{\sigma}^2 = \frac{1}{n^2}\sum_{i,j} \mathbf{F}_{ij} \varphi_i\varphi_j \overset{P}{\rightarrow} \sigma^2\)
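Putting the two displays together in code (a numpy sketch with a hypothetical helper name; `phi` holds the uncentered per-unit EIF values \(\phi_{\hat{P}_n}(O_i)\) and `F` is the adjacency matrix):

```python
import numpy as np

def network_variance(phi, F):
    """Center each EIF value at the mean of units with the same degree, then combine
    pairs of units linked in F: sigma2_hat = (1/n^2) * sum_{i,j} F_ij * phi_i * phi_j."""
    deg = F.sum(axis=1)
    centered = np.empty_like(phi, dtype=float)
    for k in np.unique(deg):
        mask = deg == k
        centered[mask] = phi[mask] - phi[mask].mean()  # center within N(|F_i|)
    n = len(phi)
    return float(centered @ F @ centered) / n**2
```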
Draw 200 simulation iterations and estimate the effect of the MTP based on
\[\begin{align*} L_1 &\sim \text{Beta}(3,2); \quad L_2 \sim \text{Poisson}(100);\\ L_3 &\sim \text{Gamma}(2,4); \quad L_4 \sim \text{Bernoulli}(0.6) \\ A &\sim \text{Normal}(0.1 m_L, 1.0) \,\, \text{and} \,\, A_s = \Big[\sum_{j \in \mathbf{F}_i} A_j\Big]_{i = 1}^n \\ Y &\sim \text{Normal}(0.2A + A_s + 0.2 m_L, 0.1) \end{align*}\]
\[\begin{align*} m_L = & (L_2 > 50) + (L_2 > 100) + (L_2 > 200) + (L_3 > 0.1) \\ & + (L_3 > 0.5) + (L_3 > 4) + (L_3 > 10) + L_4 \\ & + L_4 \cdot \Big((L_1 > 0.4) + (L_1 > 0.6) + (L_1 > 0.8)\Big) \end{align*}\]
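A rough numpy sketch of one draw from this design (assuming a given adjacency matrix `F`; Gamma(2, 4) is read as shape 2 and scale 4, and the second Normal parameter as the standard deviation):

```python
import numpy as np

def simulate(F, rng):
    """Generate one dataset from the simulation design described above."""
    n = F.shape[0]
    L1 = rng.beta(3, 2, n)
    L2 = rng.poisson(100, n)
    L3 = rng.gamma(2, 4, n)
    L4 = rng.binomial(1, 0.6, n)
    m_L = ((L2 > 50) + (L2 > 100) + (L2 > 200)
           + (L3 > 0.1) + (L3 > 0.5) + (L3 > 4) + (L3 > 10)
           + L4 + L4 * ((L1 > 0.4) + (L1 > 0.6) + (L1 > 0.8)))
    A = rng.normal(0.1 * m_L, 1.0)
    A_s = F @ A  # summary exposure: sum of neighbors' exposures
    Y = rng.normal(0.2 * A + A_s + 0.2 * m_L, 0.1)
    return L1, L2, L3, L4, A, A_s, Y

rng = np.random.default_rng(2025)
```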
European Causal Inference Meeting (Ghent, Belgium)