March 17, 2026
Environmental health is a major area of concern. How to quantify the health effects of…
All involve continuous exposures


Observed data: A tuple of \(n\)-vectors, \(O_1, \ldots, O_n\), sampled i.i.d., where \[\Ob = (\Lb, \Ab, \Yb) \sim \Pf \in \Pm\]
Question: How much would \(Y\) have changed had we intervened upon \(A\)?
Let \(Y(a)\) denote potential outcome, value of \(Y\) had \(A = a\) been set.
Typically, interest lies in the counterfactual mean \(\E[Y(a)]\), the average value of \(Y\) had \(A\) been set to \(a\).
What goes wrong when \(A\) is continuous…
A user-specified function \(d(A, L; \delta)\) that maps the observed exposure \(A\) to a post-intervention value \(A^d\) (Haneuse and Rotnitzky 2013). Examples:
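For concreteness, a minimal numpy sketch of two standard MTPs (an additive shift and a threshold policy); the function names are illustrative, not from any package:

```python
import numpy as np

def d_shift(a, l, delta):
    """Additive-shift MTP: d(a, l; delta) = a + delta for every unit."""
    return a + delta

def d_threshold(a, l, delta):
    """Threshold MTP: exposures exceeding delta are capped at delta."""
    return np.minimum(a, delta)

A = np.array([1.0, 3.0, 5.0])
print(d_shift(A, None, -1.0))     # [0. 2. 4.]
print(d_threshold(A, None, 4.0))  # [1. 3. 4.]
```

Both map the observed exposure to a post-intervention value \(A^d\) that depends on the natural value of \(A\) (and possibly \(L\)), rather than setting a fixed level.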

MTPs coincide with stochastic interventions (Díaz and van der Laan 2012).
The counterfactual mean is \[ \E_{\Pf}\Big[Y(d(A, L; \delta))\Big] = \E_{\Pf}\Big[Y(A^d)\Big] \ , \] and the population intervention effect (PIE) is \(\E[Y(A^d)] - \E[Y]\).
Question: What is the impact of zero-emissions vehicles (ZEV) on NO2 air pollution in California?
How to identify and estimate causal effects of MTPs in spatial data?
Must be…
Hudgens and Halloran (2008): Interference occurs when potential outcome of unit \(i\) depends on exposures of other units \(j \neq i\)
\[Y_i(a_i, a_j) \neq Y_i(a_i, a_j') \text{ if } a_j \neq a_j'\]
Network interference: Potential outcomes depend only on neighbors in a known adjacency matrix \(\Fb\) (van der Laan 2014).
Observed data: A tuple of \(n\)-vectors, \(O_1, \ldots, O_n\), where \[\Ob = (\Lb, \Ab, \Yb)\]
Network \(\Fb\): An adjacency matrix of each unit’s neighbors (known).
Under interference, consider the following structural equation: \[Y_i = f\Big(s_A(A_j : j \in \Fb_i), s_L(L_j : j \in \Fb_i)\Big)\]
Treating \(s(A)\) as the exposure instead of \(A\) restores SUTVA (Aronow and Samii 2017); just use \(Y(s(a))\) instead of \(Y(a)\).
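A minimal numpy sketch of one possible summary \(s\), the neighbor mean; other choices (sums, distance-weighted averages) fit the same template:

```python
import numpy as np

# Toy 4-unit network: adjacency matrix F (1 = neighbors), assumed symmetric.
F = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
A = np.array([2.0, 4.0, 6.0, 8.0])

def s_mean(A, F):
    """Summary s(A)_i = mean exposure among unit i's neighbors in F."""
    deg = F.sum(axis=1)
    return (F @ A) / np.maximum(deg, 1)  # guard against isolated units

print(s_mean(A, F))  # [4. 4. 6. 6.]
```

Treating \(s_{\text{mean}}(A)\) as the exposure for unit \(i\) encodes exactly the dependence on neighbors' exposures that breaks SUTVA on the original scale.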
But what happens if we apply the MTP and then summarize?
\[ A \overset{d}{\longrightarrow} A^d \overset{s}{\longrightarrow} A^{s \circ d} \]
We term the function \(s \circ d\) the induced MTP.
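The induced MTP is literally function composition: intervene on each unit's exposure, then re-summarize over the network. A toy sketch, reusing an additive shift and a neighbor-mean summary (both illustrative):

```python
import numpy as np

F = np.array([[0, 1], [1, 0]])         # two mutual neighbors
A = np.array([3.0, 5.0])

d = lambda a, delta: a + delta         # MTP on the natural exposure scale
s = lambda a: (F @ a) / F.sum(axis=1)  # summary: neighbor mean

A_sd = s(d(A, -1.0))                   # induced MTP: (s o d)(A)
print(A_sd)  # [4. 2.]
```

Note the order: the intervention acts on \(\Ab\) first, and the summary is recomputed afterward, so a shift to one unit propagates into its neighbors' summaries.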
Population intervention effect (PIE) of an induced MTP: \[ \Psi_n(\Pf) = \E_{\Pf} \Big[\frac{1}{n}\sum_{i=1}^n Y_i(s(d(\Ab, \Lb; \delta))_i)\Big] - \E_{\Pf}\Big[Y\Big] \]
\(\Psi_n(\Pf)\) is a data-adaptive parameter (we only observe a single network)
A0 (SCM). Data are generated from a structural causal model: \[ L_i = f_L(\varepsilon_{L_i}); A_i = f_A(L_i^s, \varepsilon_{A_i}); Y_i = f_Y(A_i^s, L_i^s, \varepsilon_{Y_i}) \ , \] with error vectors independent of each other, with identically distributed entries, and with \(\varepsilon_{i} \indep \varepsilon_{j}\) provided \(i, j\) are not neighbors in \(\Fb\)
A1 (Summary positivity). If \(s(a), s(l) \in \text{supp}(A^s, L^s)\) then also \(s(a^d), s(l) \in \text{supp}(A^s, L^s)\)
A2 (No unmeasured confounding). \(Y(a^s) \indep A^s \mid L\)
A3 (Piecewise smooth invertibility). The MTP \(d\) has a differentiable inverse on a countable partition of \(\text{supp}(A)\).
A4 (Summary coarea). \(s\) has Jacobian \(Js\) satisfying \[ \sqrt{\det J s(a) J s(a)^\top} > 0 \] (adapted from measure-theoretic calculus to use \(A^s\) instead of \(\Ab\))
Statistical estimand factorizes in terms of \(A^s\): \[ \psi_n = \frac{1}{n}\sum_{i=1}^n \E_{\Pf}[\textcolor{teal}{m(A_i^s, L_i^s)} \cdot \textcolor{crimson}{r(A_i^s, A_i^{s\circ d}, L_i^s)} \cdot \textcolor{maroon}{w(\Ab, \Lb)_i}] \ , \] with nuisance parameters \(m\) and \(r\), and deterministic weights \(w\): \[\begin{align*} & \textcolor{teal}{m(a^s, l^s) = \E_{\Pf}[Y \mid A_i^s = a^s, L_i^s = l^s]}\\ & \textcolor{crimson}{r(a^s, a^{s \circ d^{-1}}, l^s) = \frac{p(a^{s \circ d^{-1}} \mid l^s)} {p(a^s \mid l^s)}}\\ & \textcolor{maroon}{w(\ab, \lb) = \sqrt{\frac{\det J (s \circ d^{-1})(\ab)J (s \circ d^{-1})(\ab)^\top}{\det J s(\ab)J s(\ab)^\top}}} \end{align*}\]
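To make the density-ratio nuisance \(r\) concrete, a hypothetical one-dimensional illustration: if \(A^s \mid L^s \sim \text{N}(\mu, \sigma^2)\) and the induced MTP is an additive shift by \(\delta\) (so the inverse image of \(a\) is \(a - \delta\)), the ratio has a closed form. All names below are illustrative:

```python
import numpy as np

def normal_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def r_shift(a, mu, sigma, delta):
    """Density ratio p(a - delta | l) / p(a | l) for Gaussian A^s | L^s
    under an additive shift by delta (an illustrative special case)."""
    return normal_pdf(a - delta, mu, sigma) / normal_pdf(a, mu, sigma)

# At the conditional mean, the pre-image of the shift is less likely:
print(r_shift(a=0.0, mu=0.0, sigma=1.0, delta=0.5))  # exp(-0.125)
```

In this special case the weight \(w\) also simplifies: an additive shift composed with a linear summary has a constant Jacobian, so the two determinants cancel and \(w \equiv 1\).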
Construct an asymptotically linear, efficient estimator based on the efficient influence function \(\phi(\Pf)\)
\[\frac{1}{n}\sum_{i=1}^n \phi(\Pf_{\hat{\eta}})(O_i) \ ,\]
where \(\hat{\eta}\) is a set of nuisance estimators whose product converges at \(o_{\Pf}(n^{-1/2})\) (i.e., only need \(o_{\Pf}(n^{-1/4})\), typical in statistical learning)
The efficient influence function of \(\psi_n\), a special case of the EIF for the counterfactual mean of a stochastic intervention (Ogburn et al. 2022), is
\[\begin{align*} \bar{\phi}(\Pf)(\Ob) = \frac{1}{n}\sum_{i=1}^n \Big[ & w(\Ab, \Lb)_i \cdot r(A_i^s, A_i^{s \circ d^{-1}}, L_i^s) \cdot \big(Y_i - m(A_i^s, L_i^s)\big)\\ & + \E\big[m(A_i^{s\circ d}, L_i^s) \mid \Lb = \lb\big] \Big] - \psi_n \ . \end{align*}\]
Ogburn et al. (2022)’s CLT: If \(\hat{\psi}_n\) is constructed to approximately solve \(\bar{\phi} \approx 0\) and \(K_{\text{max}}^2 / n \rightarrow 0\), then, under mild regularity conditions, \[\sqrt{C_n}(\hat{\psi}_n - \psi_n) \rightarrow \text{N}(0, \sigma^2) \ ,\] where \(K_{\text{max}}\) is the network’s maximum degree.
The estimator \(\hat{\psi}_n\) is asymptotically normal, but the rate depends on a factor \(n/K_{\text{max}}^2 < C_n < n\) (“automatically” contained within \(\hat{\sigma}^2\))
| Method | Learner | % Bias | Variance | Coverage |
|---|---|---|---|---|
| Network-TMLE | Correct GLM | 0.11 | 1.56 | 96.2% |
| Network-TMLE | Super Learner | 1.03 | 1.56 | 94.0% |
| IID-TMLE | Correct GLM | 20.42 | 2.11 | 54.8% |
| Linear Regression | — | 20.62 | 2.12 | 55.0% |
Challenges remain:
Difficult to estimate conditional density ratio nuisance \(r\)
If summaries \(s\) unknown, can we learn them automatically?
The theory of longitudinal MTPs (Díaz et al. 2021) should extend to summarized (induced) exposures; useful for the time-varying setting
stat.berkeley.edu/~nhejazi/present/2026_enar_mtpnet/
Funded by NIEHS T32 ES007142 and NSF DGE 2140743

The EIF was given in the form \(\frac{1}{n}\sum_{i=1}^n \phi_{\Pf}(O_i)\), but each term must be centered at the mean over units with the same number of neighbors, \(N(\lvert\mathbf{F}_i\rvert)\):
\[ \varphi_i = \phi_{\hat{\Pf}_n}(O_i) - \frac{1}{\lvert N(\lvert\mathbf{F}_i\rvert) \rvert} \sum_{j \in N(\lvert\mathbf{F}_i\rvert)} \phi_{\hat{\Pf}_n}(O_j) \]
Then, \(\hat{\sigma}^2 = \frac{1}{n^2}\sum_{i,j} \mathbf{F}_{ij} \varphi_i\varphi_j \pto \sigma^2\)
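A minimal numpy sketch of this variance estimator, with a toy 4-unit dependency matrix; here \(\mathbf{F}\) is assumed to encode all dependent pairs, including self-loops on the diagonal so each unit's own squared term enters the sum (an assumption of this sketch):

```python
import numpy as np

def sigma2_hat(phi, F, degrees):
    """Network-aware variance sketch: center EIF values within degree
    classes, then sum products over dependent pairs encoded in F."""
    phi = np.asarray(phi, dtype=float)
    centered = np.empty_like(phi)
    for k in np.unique(degrees):
        mask = degrees == k
        centered[mask] = phi[mask] - phi[mask].mean()  # degree-matched centering
    n = len(phi)
    return (centered @ F @ centered) / n**2            # (1/n^2) sum_ij F_ij phi_i phi_j

# Toy dependency matrix with self-loops on the diagonal.
F = np.array([[1, 1, 0, 0],
              [1, 1, 1, 0],
              [0, 1, 1, 1],
              [0, 0, 1, 1]])
phi = np.array([0.5, -0.2, 0.1, -0.4])
deg = F.sum(axis=1) - 1  # number of neighbors, excluding self
print(sigma2_hat(phi, F, deg))
```

Only pairs that are dependent under \(\mathbf{F}\) contribute, which is what keeps the estimator consistent despite the network dependence.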
Main idea: cross-fitting eliminates the empirical process term
\[ \Pf_n \phi_{\hat{\eta}} = \underbrace{\Pf_n \phi_{\eta_0}}_{\text{CLT}} + \underbrace{\Pf(\phi_{\hat{\eta}} - \phi_{\eta_0})}_{\text{Nuisance product}} + \underbrace{(\Pf_n - \Pf)(\phi_{\hat{\eta}} - \phi_{\eta_0})}_{\text{Empirical process}} \]
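A minimal sketch of the cross-fitting device, in the i.i.d. caricature with two folds (under network dependence the folds must additionally respect the dependency structure, which this sketch ignores). The toy target is the plug-in part of a shift-MTP mean, \(\E[m(A + \delta, L)]\), with a linear outcome regression as the nuisance; all names are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
L = rng.normal(size=n)
A = L + rng.normal(size=n)
Y = 1.5 * A + L + rng.normal(size=n)
delta = 1.0

# Each fold's predictions use a nuisance fit on the OTHER fold, so the
# estimated eta is independent of the data it is evaluated on -- this is
# what eliminates the empirical-process term in the expansion above.
folds = [np.arange(n // 2), np.arange(n // 2, n)]
preds = []
for k, idx in enumerate(folds):
    train = folds[1 - k]
    X = np.column_stack([np.ones(len(train)), A[train], L[train]])
    beta = np.linalg.lstsq(X, Y[train], rcond=None)[0]   # off-fold fit of m
    Xd = np.column_stack([np.ones(len(idx)), A[idx] + delta, L[idx]])
    preds.append(Xd @ beta)                              # m(A + delta, L) on-fold
psi_hat = np.concatenate(preds).mean()
print(psi_hat)  # close to the true value 1.5 in this toy model
```

The same device applies fold-by-fold to each nuisance in \(\hat{\eta}\); only the first-order CLT term and the second-order nuisance product remain.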
Run 400 simulation iterations; estimate the effect of the MTP based on…

ENAR 2026 Annual Meeting