October 22, 2025
Environmental health is a major area of concern. How do we quantify the health effects of…
Common issue: continuous exposures


Observed data: A tuple of \(n\)-vectors, \(O_1, \ldots, O_n\), sampled iid, where \[\Ob = (\Lb, \Ab, \Yb) \sim \Pf \in \Pm\]
Question: How much would \(Y\) have changed if we had intervened upon \(A\)?
Let \(Y(a)\) denote potential outcome, value of \(Y\) had \(A = a\) been set.
Typically, interest lies in counterfactual mean \(\E[Y(a)]\), the average value of \(Y\) had \(A\) been set according to \(A = a\)
What goes wrong when \(A\) is continuous…
Solution: Consider modifying observed exposure…
A user-specified function \(d(A, L; \delta)\) that maps the observed exposure \(A\) to a post-intervention value \(A^d\) (Haneuse and Rotnitzky 2013). Examples:

The counterfactual mean is \[\E_{\Pf}\Big[Y(d(A, L; \delta))\Big] = \E_{\Pf}\Big[Y(A^d)\Big]\]
and the population intervention effect (PIE) is \(\E[Y(A^d)] - \E[Y]\)
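As a concrete, hypothetical illustration, one common MTP is an additive shift that reduces each observed exposure by \(\delta\) wherever the shifted value stays within the observed support; the Python sketch below is only a minimal example of such a policy (the function name, the unused `l` argument, and the support handling are assumptions, not the specific interventions considered here).

```python
import numpy as np

def shift_mtp(a, l, delta, support_min=0.0):
    """Hypothetical additive-shift MTP: d(A, L; delta) = A - delta,
    applied only where the shifted exposure stays within the support.
    `l` is unused in this simple shift but kept to mirror d(A, L; delta)."""
    a = np.asarray(a, dtype=float)
    shifted = a - delta
    # Leave the exposure unchanged where the shift would exit the support.
    return np.where(shifted >= support_min, shifted, a)

# Example: exposures (arbitrary units) under a 0.5-unit reduction.
rng = np.random.default_rng(0)
A = rng.gamma(shape=2.0, scale=1.0, size=5)
A_d = shift_mtp(A, l=None, delta=0.5)
```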
Question: What is the impact of zero-emissions vehicles (ZEV) on NO2 air pollution in California?
How to identify and estimate causal effects of MTPs in spatial data?
Must be…
Hudgens and Halloran (2008): Interference occurs when potential outcome of unit \(i\) depends on exposures of other units
\[Y_i(a_i, a_j) \neq Y_i(a_i, a_j') \text{ if } a_j \neq a_j'\]
Network interference: Potential outcomes only depend on neighbors in a known adjacency matrix \(\Fb\) (van der Laan 2014).
Observed data: A tuple of \(n\)-vectors, \(O_1, \ldots, O_n\), where \[\Ob = (\Lb, \Ab, \Yb)\]
Network \(\Fb\): An adjacency matrix of each unit’s neighbors (known).
Under interference, consider the following structural equation: \[Y_i = f\Big(s_A(A_j : j \in \Fb_i), s_L(L_j : j \in \Fb_i)\Big)\]
Treating \(s(A)\) as the exposure instead of \(A\) restores SUTVA (Aronow and Samii 2017); just use \(Y(s(a))\) instead of \(Y(a)\)
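For instance (a hypothetical choice of summary), \(s\) could map the exposure vector to each unit's own exposure together with the mean exposure of its neighbors in \(\Fb\); a minimal numpy sketch, with all names assumed for illustration:

```python
import numpy as np

def neighbor_summary(A, F):
    """Hypothetical summary s_A: each unit's own exposure and the mean
    exposure of its neighbors in the adjacency matrix F (0/1, no self-loops)."""
    A = np.asarray(A, dtype=float)
    F = np.asarray(F, dtype=float)
    deg = F.sum(axis=1)
    # Mean neighbor exposure; isolated units (no neighbors) get 0.
    neigh_mean = np.divide(F @ A, deg, out=np.zeros_like(A), where=deg > 0)
    return np.column_stack([A, neigh_mean])  # A^s, one row per unit

# Example on a 4-unit path network.
F = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]])
A = np.array([1.0, 2.0, 3.0, 4.0])
A_s = neighbor_summary(A, F)
```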
But what happens if we apply the MTP and then summarize?
\[ A \overset{d}{\longrightarrow} A^d \overset{s}{\longrightarrow} A^{s \circ d} \]
We term the function \(s \circ d\) the induced MTP.
Population intervention effect (PIE) of an induced MTP: \[ \Psi_n(\Pf) = \E_{\Pf} \Big[\frac{1}{n}\sum_{i=1}^n Y_i(s(d(\Ab, \Lb; \delta))_i)\Big] - \E_{\Pf}\Big[Y\Big] \]
Data-adaptive parameter, since we only observe a single network
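Continuing the hypothetical sketches above (reusing `shift_mtp` and `neighbor_summary`), the induced MTP is just the composition: shift the exposure vector first, then re-summarize over the network.

```python
# Induced MTP s o d: intervene on the exposure vector, then summarize.
A_d = shift_mtp(A, l=None, delta=0.5)    # A -> A^d
A_sd = neighbor_summary(A_d, F)          # A^d -> A^{s o d}
A_s = neighbor_summary(A, F)             # observed ("natural") summary, for contrast
```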
A0 (SCM). Data are generated from a structural causal model: \[ L_i = f_L(\varepsilon_{L_i}); A_i = f_A(L_i^s, \varepsilon_{A_i}); Y_i = f_Y(A_i^s, L_i^s, \varepsilon_{Y_i}) \ , \] with error vectors independent of each other, with identically distributed entries, and with \(\varepsilon_{i} \indep \varepsilon_{j}\) provided \(i, j\) are not neighbors in \(\Fb\)
A1 (Summary positivity). If \(s(a), s(l) \in \text{supp}(A^s, L^s)\) then \(s(a^d), s(l) \in \text{supp}(A^s, L^s)\)
A2 (No unmeasured confounding). \(Y(A^s) \indep A^s \mid L\)
A3 (Piecewise smooth invertibility). The MTP \(d\) has a differentiable inverse on a countable partition of \(\text{supp}(A)\).
A4 (Summary coarea). \(s\) has Jacobian \(Js\) satisfying \[ \sqrt{\det J s(a) J s(a)^\top} > 0 \] (adapted from measure-theoretic calculus to use \(A^s\) instead of \(\Ab\))
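As a hypothetical special case to fix ideas: for an additive shift \(d(a, l; \delta) = a - \delta\) and a linear summary \(s(\ab) = W\ab\) (e.g., row-normalized neighbor averaging), A3 holds because \(d^{-1}(a) = a + \delta\) is differentiable everywhere, A4 reduces to \(W\) having full row rank, and the Jacobian-based weight in the estimand below is identically one: \[ J s(\ab) = J(s \circ d^{-1})(\ab) = W \;\Longrightarrow\; \sqrt{\frac{\det W W^\top}{\det W W^\top}} = 1 \ . \]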
Statistical estimand factorizes in terms of \(A^s\): \[ \psi_n = \frac{1}{n}\sum_{i=1}^n \E_{\Pf}[\textcolor{teal}{m(A_i^s, L_i^s)} \cdot \textcolor{crimson}{r(A_i^s, A_i^{s\circ d}, L_i^s)} \cdot \textcolor{maroon}{w(\Ab, \Lb)_i}] \] with nuisance parameters \(m\) and \(r\), and deterministic weights \(w\): \[\begin{align*} & \textcolor{teal}{m(a^s, l^s) = \E_{\Pf}[Y \mid A_i^s = a^s, L_i^s = l^s]}\\ & \textcolor{crimson}{r(a^s, a^{s \circ d^{-1}}, l^s) = \frac{p(a^{s \circ d^{-1}} \mid l^s)} {p(a^s \mid l^s)}}\\ & \textcolor{maroon}{w(\ab, \lb) = \sqrt{\frac{\det J (s \circ d^{-1})(\ab)J (s \circ d^{-1})(\ab)^\top}{\det J s(\ab)J s(\ab)^\top}}} \end{align*}\]
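Assuming fitted nuisance estimates and precomputed Jacobian weights are available (the callables `m_hat`, `r_hat` and the array `w` below are hypothetical), the factorization suggests a simple weighted plug-in sketch:

```python
import numpy as np

def plug_in_estimate(A_s, A_sd, L_s, w, m_hat, r_hat):
    """Weighted plug-in sketch for psi_n = (1/n) sum_i E[m * r * w],
    replacing the expectation with the empirical distribution.
    m_hat(a_s, l_s) and r_hat(a_s, a_sd, l_s) are hypothetical fitted nuisances."""
    m = m_hat(A_s, L_s)            # outcome regression at observed summaries
    r = r_hat(A_s, A_sd, L_s)      # conditional density ratio
    return float(np.mean(w * r * m))
```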
Construct an asymptotically linear, efficient estimator based on the efficient influence function \(\phi(\Pf)\)
\[\frac{1}{n}\sum_{i=1}^n \phi(\Pf_{\hat{\eta}})(O_i) \ ,\]
where \(\hat{\eta}\) is a set of nuisance estimators whose product of convergence rates is \(o_{\Pf}(n^{-1/2})\) (i.e., each nuisance only needs to converge at \(o_{\Pf}(n^{-1/4})\), a rate typical in statistical learning)
The efficient influence function of \(\psi_n\), a special case of the EIF for the counterfactual mean of a stochastic intervention (Ogburn et al. 2022), is
\[\begin{align*} \bar{\phi}(\Pf) =& \frac{1}{n}\sum_{i=1}^n w(\Ab, \Lb)_i \cdot r(A_i^s, A_i^{s\circ d}, L_i^s)\,(Y_i - m(A_i^s, L_i^s))\\ &+ \E\big[m(A_i^{s\circ d}, L_i^s) \mid \Lb = \lb\big] - \psi_n \ . \end{align*}\]
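With the same hypothetical fitted nuisances as above, a one-step-style sketch averages the uncentered EIF contributions, approximating the conditional-expectation term by plugging the post-intervention summary \(A^{s\circ d}\) into \(\hat m\):

```python
import numpy as np

def one_step_estimate(Y, A_s, A_sd, L_s, w, m_hat, r_hat):
    """Sketch of an EIF-based one-step estimator: average of
    w * r * (Y - m(A^s, L^s)) + m(A^{s o d}, L^s).
    m_hat and r_hat are hypothetical fitted nuisance callables."""
    residual = Y - m_hat(A_s, L_s)
    correction = w * r_hat(A_s, A_sd, L_s) * residual
    plug_in = m_hat(A_sd, L_s)
    phi = correction + plug_in           # uncentered per-unit contributions
    return float(np.mean(phi)), phi      # point estimate and contributions
```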
Ogburn et al. (2022)’s CLT: If \(\hat{\psi}_n\) is constructed to approximately solve \(\bar{\phi} \approx 0\) and \(K_{\text{max}}^2 / n \rightarrow 0\), then, under mild regularity conditions, \[\sqrt{C_n}(\hat{\psi}_n - \psi_n) \rightarrow \text{N}(0, \sigma^2) \ ,\] where \(K_{\text{max}}\) is the network’s maximum degree.
The estimator \(\hat{\psi}_n\) is asymptotically normal, but the rate depends on a factor \(n/K_{\text{max}}^2 < C_n < n\) (“automatically” contained within \(\hat{\sigma}^2\))
| Method | Learner | % Bias | Variance | Coverage |
|---|---|---|---|---|
| Network-TMLE | Correct GLM | 0.11 | 1.56 | 96.2% |
| Network-TMLE | Super Learner | 1.03 | 1.56 | 94.0% |
| IID-TMLE | Correct GLM | 20.42 | 2.11 | 54.8% |
| Linear Regression | — | 20.62 | 2.12 | 55.0% |
Challenges remain:
Difficult to estimate conditional density ratio nuisance \(r\)
If summaries \(s\) unknown, can we learn them automatically?
The theory for longitudinal MTPs (Díaz et al. 2021) should extend under analogous summarization, which would be useful in time-varying settings

Funded by NIEHS T32 ES007142 and NSF DGE 2140743
The EIF was given in the form \(\frac{1}{n}\sum_{i=1}^n \phi_P(O_i)\), but for variance estimation each contribution must be centered at the mean over units with the same number of neighbors, \(N(|\Fb_i|)\):
\[\varphi_i = \phi_{\hat{P}_n}(O_i) - \frac{1}{|N(|\Fb_i|)|} \sum_{j \in N(|\Fb_i|)} \phi_{\hat{P}_n}(O_j)\]
Then, \(\hat{\sigma}^2 = \frac{1}{n^2}\sum_{i,j} \Fb_{ij} \varphi_i\varphi_j \overset{P}{\rightarrow} \sigma^2\)
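A direct numpy translation of this variance estimator, sketched under the assumption that the per-unit EIF values `phi` are already in hand (e.g., from the one-step sketch above) and that `F` encodes pairwise dependence with whatever diagonal convention the source adopts:

```python
import numpy as np

def network_variance(phi, F):
    """Dependency-aware variance sketch: center phi within groups of units
    sharing the same degree, then contract against the dependency matrix F."""
    phi = np.asarray(phi, dtype=float)
    F = np.asarray(F, dtype=float)
    n = phi.shape[0]
    degree = F.sum(axis=1).astype(int)
    centered = np.empty_like(phi)
    for k in np.unique(degree):
        group = degree == k
        centered[group] = phi[group] - phi[group].mean()
    return float(centered @ F @ centered / n**2)
```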
Main idea: cross-fitting eliminates the “empirical process term”
\[ \Pf_n \phi_{\hat{\eta}} = \underbrace{\Pf_n \phi_{\eta_0}}_{\text{CLT}} + \underbrace{\Pf(\phi_{\hat{\eta}} - \phi_{\eta_0})}_{\text{Nuisance product}} + \underbrace{(\Pf_n - \Pf)(\phi_{\hat{\eta}} - \phi_{\eta_0})}_{\text{Empirical process}} \]
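A minimal cross-fitting sketch of this idea (the `fit_nuisances` and `eif_contributions` callables are hypothetical, and the fold split below ignores the extra care needed when splitting dependent network data):

```python
import numpy as np

def cross_fit(data, fit_nuisances, eif_contributions, n_folds=5, seed=0):
    """Cross-fitting sketch: nuisances are fit out-of-fold and the EIF is
    evaluated on the held-out fold, so the empirical process term vanishes
    asymptotically. `data` is a dict of aligned per-unit arrays."""
    rng = np.random.default_rng(seed)
    n = len(next(iter(data.values())))
    fold = rng.permutation(n) % n_folds
    phi = np.empty(n)
    for k in range(n_folds):
        held_out = fold == k
        train = {key: val[~held_out] for key, val in data.items()}
        test = {key: val[held_out] for key, val in data.items()}
        eta_hat = fit_nuisances(train)                  # hypothetical fitting step
        phi[held_out] = eif_contributions(test, eta_hat)
    return float(np.mean(phi)), phi
```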
Run 400 simulation iterations and estimate the effect of the MTP based on

Biostatistics Seminar, Department of Epidemiology, Biostatistics, and Occupational Health, McGill University