Causal mediation analysis with mediators subject to censoring by assay limits of detection

Nima Hejazi

Harvard Biostatistics

Cong Jiang

Harvard Biostatistics

May 29, 2025

The ACTIV-2/ACTG A5401 clinical trial

Objective: Evaluate how monoclonal antibody (mAb) agents reduce risk of hospitalization due to COVID-19 via their role in viral clearance.

  • ACTIV-2: phase 2/3 adaptive platform trial of investigational mAb agents to treat non-hospitalized COVID-19 patients (Evering et al. 2023; Giganti et al. 2025).
  • The action mechanism of mAb therapies is understood to revolve around their role in clearing viral (SARS-CoV-2, in this case) RNA.
    • Viral RNA was measured in several ways: anterior nasal (AN; daily) and nasopharyngeal (NP; every 3 days) swabs, and blood plasma (twice)
    • Viral RNA, as part of the putative action mechanism of mAbs, acts as a mediator of the effect of mAb agents.
    • Viral RNA measurements are subject to a significant degree of missingness.
  • Clinical endpoint: Hospitalization or death through Day 28 (end of study).

Causal mediation analysis in the ACTIV-2 trial

Objective: Decompose the total effect of mAb agents on hospitalization or death into an indirect effect (through viral RNA) and a direct effect (the remainder).

  • Causal mediation analysis methods operate under the assumption that mediators are not subject to (severe) censoring.
  • Problem: Mediator (viral RNA) is left-censored by assay limit of detection.
    • Approximately 75% of plasma samples at baseline are reported as falling below the assay limits of detection (LoD).
    • Mishandling censored mediators (e.g., “naive” imputation by LoD/2) risks introducing (possibly severe) bias into the estimation process.
  • Left-censoring of mediators reflects technical limitations and biologically meaningful clearance of virus. How to disentangle these?
  • Recent work (e.g., Chernofsky, Bosch, and Lok 2024) proposes extrapolation based on a parametric form or numerical integration of an observed-data likelihood.

Causal mediation analysis without censored data

  • Data: \(n\) study units, sampled iid from \(\P \in \M\), with data on the \(i\)-th unit \(O_i = (L_i, A_i, M_i, Y_i)\), where
    • \(L\) are baseline covariates, \(A \in \{0, 1\}\) is the treatment.
    • \(M\) is a continuous mediator, \(Y\) is the clinical endpoint of interest.
  • Total effect decomposition: \[\small{ \begin{aligned} \underbrace{\E\left[Y(A = a) - Y(A = a^{\prime})\right]}_{\text{Total effect (TE)}} &= \E\left[Y(A = a, M = M(a)) - Y(A = a^{\prime}, M = M(a^{\prime}))\right] \\ &= \underbrace{\E\left[Y(A = a, M = M(a)) - Y(A = a, M = M(a^{\prime}))\right]}_{\text{Natural indirect effect (NIE)}} \\ &\quad + \underbrace{\E\left[Y(A = a, M = M(a^{\prime})) - Y(A = a^{\prime}, M = M(a^{\prime}))\right]}_{\text{Natural direct effect (NDE)}} \end{aligned}} \]

Standard identification assumptions

  1. Consistency:
    1. Outcome: \((A, M) = (a, m) \implies Y = Y^{a,m}\)
    2. Mediator: \(A = a \implies M = M^a\)
  2. Positivity:
    1. Treatment: \(\Pr(A = a \mid L) > 0\) and \(\Pr(A = a^{\prime} \mid L) > 0\), with probability one
    2. Mediator: \(\Pr(M = m \mid A = a^{\prime}, L) > 0\) with probability one only whenever \(\Pr(M = m \mid A = a, L) > 0\)
  3. Exchangeability:
    1. Treatment: \(M(a) \indep A \mid L\)
    2. Treatment–mediator: \(Y(a,m) \indep (A, M) \mid L\)
    3. Cross-world counterfactuals: \(Y(a^{\prime}, m) \indep M^a \mid L\) for all \(a \neq a^{\prime}\) and \(m\)
  4. No unmeasured mediator–outcome confounder affected by exposure

Identification result: Statistical estimand

  • Recall that the NDE and NIE each contain a term that is trivially identified, i.e., \(\E[Y(A = a, M = M(a))]\) and \(\E[Y(A = a^{\prime}, M = M(a^{\prime}))]\)
  • But the decomposition strategy hinges on \(\E[Y(A = a, M = M(a^{\prime}))]\), which is identified by the “mediation formula” as follows \[\small{ \begin{aligned} \mathbb{E}&\left[Y(A = a, M = M(a^{\prime}))\right] = \\ &\int_{\mathcal{L}} \int_{\mathcal{M}} \E[Y \mid A = a, M = m, L = l] f_{M \mid A, L}(m \mid A = a', L = l) f_{L}(l) \,\mathrm{d}m\,\mathrm{d}l \end{aligned}} \]
  • From this identification result, we can construct plug-in or re-weighted estimators (Imai, Keele, and Tingley 2010; Imai, Keele, and Yamamoto 2010).
  • Construct asymptotically efficient estimators (Tchetgen Tchetgen and Shpitser 2012; Zheng and van der Laan 2012) by using the efficient influence function (EIF).
  • Can we extend the use of these results to the case with left-censored mediators?
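The mediation formula above can be evaluated by plug-in Monte Carlo once nuisance fits are in hand: average the outcome regression at \(A = a\) over draws of \(M \mid A = a^{\prime}, L\), then over the empirical distribution of \(L\). The sketch below uses made-up parametric stand-ins for the fitted outcome regression and mediator density; the functional forms and coefficients are assumptions for illustration, not those of any real analysis.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical fitted nuisances (illustrative assumptions):
# a logistic outcome regression and a Gaussian mediator density.
def outcome_mean(a, m, l):
    return 1 / (1 + np.exp(-(-1 + 1.0 * a + 0.5 * m - 0.75 * l)))

def sample_mediator(a, l, size):
    return rng.normal(loc=-0.5 + 1.0 * a + 0.5 * l, scale=1.0, size=size)

def mediation_formula(a, a_prime, L, n_mc=2000):
    """Monte Carlo evaluation of E[Y(a, M(a'))]: integrate the outcome
    regression at A = a over draws of M | A = a', L, then average over
    the empirical distribution of L."""
    out = np.empty(len(L))
    for i, l in enumerate(L):
        m_draws = sample_mediator(a_prime, l, n_mc)  # M ~ f(. | a', l)
        out[i] = outcome_mean(a, m_draws, l).mean()
    return out.mean()

L = rng.binomial(1, 0.5, size=500)  # empirical covariate sample
psi = mediation_formula(a=1, a_prime=0, L=L)
```

Replacing the inner Monte Carlo draws with a weighted average over fractionally imputed values is the bridge to the censored-mediator setting developed below.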

Identification under informatively censored mediators

  • The data unit is now \(O_i = (L_i, A_i, C_i M_i, C_i, Y_i)\), where the assay limit of detection \(\lambda_{\text{LOD}}\) results in censoring of \(M_i\), that is, \(C_i := \I(M_i > \lambda_{\text{LOD}})\).
  • Censoring of the mediator complicates identification; key targets are
    1. \(\Pr(Y = y \mid A = a, M = m, L = l)\), and
    2. \(f(M = m \mid A = a, L = l)\)
  • Under the assumption \(C \perp Y \mid (M, A, L)\), the conditional outcome distribution can be estimated from complete cases only, that is, \[ \Pr(Y \mid M, A, L) = \Pr(Y \mid M, A, L, C = 1) \ . \]
  • However, \(f(M \mid A, L)\) (or \(\Pr(Y, M \mid A, L)\)) is not identified non-parametrically due to informative censoring when \(M\) is continuous and \(Y\) is binary (Zuo et al. 2024).
  • Identifiability in this setting requires a completeness assumption (Newey and Powell 2003; Hu and Shiu 2018): that the “dimension” of \(Y\) be at least as large as that of \(M\).

Addressing identifiability: Parametric framework

  • Non-parametric identification fails due to violation of the completeness assumption.
  • To work around this, impose parametric distributional constraints to achieve identification; for example, structural assumptions on \(f(M = m \mid A = a, L = l)\), such as a log-normal form.
  • Identifiability then relies on the uniqueness of the solution to score equations in standard maximum likelihood estimation.
  • With this framing, the complete-data log-likelihood (conditional on \(A\) and \(L\)) is \[\small \begin{aligned} \ell_{\text{com}}(\theta) = \sum_{i=1}^n \Big[ \log \Pr(Y_i \mid M_i, A_i, L_i; \alpha) + \log f(M_i \mid A_i, L_i; \beta) + \log \Pr(C_i \mid M_i, A_i, L_i) \Big] \ , \end{aligned} \] for \(\theta = (\alpha, \beta)\), and where \(\alpha\) are parameters of the conditional outcome mean and \(\beta\) are those for the conditional mediator density.

From complete- to observed-data log-likelihoods

Given the limit of detection \(\lambda_{\text{LOD}}\) and known form of censoring, \(C_i = \I(M_i > \lambda_{\text{LOD}})\), the censoring mechanism is fully determined by \(M\).

Due to this, the censoring probability appearing in the likelihood is degenerate, \[ \Pr(C_i = c \mid M_i, A_i, L_i) = \I\big(c = \I(M_i > \lambda_{\text{LOD}})\big) \ , \] equal to one at the observed value of \(C_i\), so this term drops from the log-likelihood.

Then, we can express the observed-data log-likelihood as follows

Observed-data log-likelihood (under left-censoring by the LoD): \[ \begin{aligned} \ell_{\text{obs}}(\theta) = \sum_{i : C_i = 1} &\left[\log \Pr(Y_i \mid M_i, A_i, L_i; \alpha) + \log f(M_i \mid A_i, L_i; \beta) \right] \\ + &\sum_{i : C_i = 0} \log \int_0^{\lambda_{\text{LOD}}} \Pr(Y_i \mid M_i = m, A_i, L_i; \alpha) \, f(m \mid A_i, L_i; \beta) \, \mathrm{d}m \end{aligned} \]
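A direct numerical sketch of this observed-data log-likelihood, assuming (as a working model) a logistic outcome model and a log-normal mediator density; all names and coefficients below are illustrative, not those used in the analysis.

```python
import numpy as np

def p_y(y, m, a, l, alpha):
    # assumed logistic outcome model Pr(Y = 1 | M, A, L; alpha)
    p1 = 1 / (1 + np.exp(-(alpha[0] + alpha[1] * a + alpha[2] * m + alpha[3] * l)))
    return p1 if y == 1 else 1 - p1

def f_m(m, a, l, beta):
    # assumed log-normal mediator density f(M | A, L; beta)
    mu, sd = beta[0] + beta[1] * a + beta[2] * l, beta[3]
    return np.exp(-(np.log(m) - mu) ** 2 / (2 * sd ** 2)) / (m * sd * np.sqrt(2 * np.pi))

def obs_loglik(data, alpha, beta, lod, n_grid=400):
    """Complete cases (C = 1) contribute log p(y|m,a,l) + log f(m|a,l);
    censored cases (C = 0) contribute the log-integral of the joint
    over (0, LoD), evaluated here by the trapezoidal rule."""
    grid = np.linspace(lod / n_grid, lod, n_grid)
    ll = 0.0
    for y, m, c, a, l in data:
        if c == 1:
            ll += np.log(p_y(y, m, a, l, alpha)) + np.log(f_m(m, a, l, beta))
        else:
            joint = p_y(y, grid, a, l, alpha) * f_m(grid, a, l, beta)
            ll += np.log(np.sum((joint[:-1] + joint[1:]) / 2) * (lod / n_grid))
    return ll

alpha, beta = (-1.0, 1.5, 0.8, -0.5), (-1.0, 1.0, 0.5, 0.7)
data = [(1, 0.9, 1, 1, 0), (0, 0.0, 0, 0, 1)]  # (y, m, c, a, l); m is a placeholder when c = 0
ll = obs_loglik(data, alpha, beta, lod=0.5)
```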

Problem summary: Challenges and proposed solution

  • Started with left-censored mediators and non-parametric definitions of causal mediation estimands.
  • Reduced this to identification via parameter estimation in parametric (working) models with informatively censored mediators.
  • Core challenge: Censoring mechanism depends on unobserved values; thus, it falls outside the missing-at-random (MAR) paradigm.
  • Solution: Use a combination of fractional imputation (FI) and the expectation-maximization (EM) algorithm to handle left-censoring.
    1. FI: Generate multiple candidate values of the mediator for units subject to left-censoring, assigning importance weights to each.
    2. EM: Update importance weights iteratively to solve score equations.

Fractional imputation (FI; Kim 2011)

  • For each study unit with a censored mediator, replace the unobserved mediator value with multiple plausible values drawn from a proposal distribution.
  • Use importance sampling to assign fractional weights to each of the candidate imputed mediator values.
  • Key steps:
    1. Generate \(S\) imputed values \(m^{\star(j)}_{i}\), \(j = 1, \ldots, S\), by rejection sampling from a proposal distribution \(f(M \mid A, L)\), retaining only draws below \(\lambda_{\text{LOD}}\).
    2. Compute importance-based fractional weights.
    3. Update model parameters iteratively.

Importance weighting

Target density: \(f(m_i^{\star(j)} \mid Y_i, C_i = 0, a_i, l_i)\)—direct estimation is expensive!

Importance weights: \[ \frac{\text{Target}}{\text{Proposal}} = \frac{ f(m_i^{\star(j)} \mid Y_i, C_i = 0, a_i, l_i)}{ f(m_i^{\star(j)} \mid a_i, l_i)} \]

Avoid explicit evaluation of the joint density via the identity: \[ \begin{aligned} &\frac{\Pr(Y_i,\ M_i = m_i^{\star(j)}, C_i = 0 \mid a_i, l_i) / f(m_i^{\star(j)} \mid a_i, l_i)}{ \sum_{k=1}^S \Pr(Y_i,\ M_i = m_i^{\star(k)},\ C_i = 0 \mid a_i, l_i) / f(m_i^{\star(k)} \mid a_i, l_i)} \\[1.2em] = &\frac{ f(m_i^{\star(j)} \mid Y_i,\ C_i = 0, a_i, l_i) / f(m_i^{\star(j)} \mid a_i, l_i)}{ \sum_{k=1}^S f(m_i^{\star(k)} \mid Y_i,\ C_i = 0,\ a_i, l_i) / f(m_i^{\star(k)} \mid a_i, l_i)} \end{aligned} \]

Importance weighting

Normalized Importance Weights

\[ w_{ij} = \frac{\tilde{w}_{ij}}{\sum_{k=1}^S \tilde{w}_{ik}}, \quad \tilde{w}_{ij} \propto \frac{ \Pr(Y_i, M_i = m_i^{\star(j)}, C_i = 0 \mid a_i, l_i)}{ f(m_i^{\star(j)} \mid a_i, l_i) } \]

The numerator is the joint density of the observed data, which factorizes into

  • Conditional mediator density: \(f(m_i \mid a_i, l_i)\)
  • Conditional censoring probability: \(\Pr(C_i = 0 \mid M_i, a_i, l_i)\)
  • Conditional outcome probability: \(\Pr(Y_i = 1 \mid M_i, a_i, l_i)\)

We will treat these as nuisance parameters, estimating each and updating the fractional weights assigned to imputation replicates via EM.
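The weight computation for a single censored unit can be sketched as follows; the nuisance models below are illustrative stand-ins, and the proposal is taken equal to the current mediator-density estimate, so that factor cancels in the importance ratio.

```python
import numpy as np

rng = np.random.default_rng(1)

LOD = 0.5  # assay limit of detection (illustrative value)

def p_y(y, m, a, l):
    # assumed logistic outcome model Pr(Y = 1 | M, A, L)
    p1 = 1 / (1 + np.exp(-(-0.5 + 1.0 * a + 0.8 * m - 0.4 * l)))
    return p1 if y == 1 else 1 - p1

def fractional_weights(y, a, l, S=100):
    """Draw S candidate mediator values below the LoD by rejection sampling
    from the proposal f(M | A, L) (a Gaussian here), then self-normalize the
    importance weights. With proposal = mediator-density estimate, the ratio
    reduces to p(y | m, a, l) * Pr(C = 0 | m), and Pr(C = 0 | m) = 1 on the
    retained region m < LoD."""
    draws = []
    while len(draws) < S:
        m = rng.normal(0.2 + 0.5 * a + 0.3 * l, 1.0)
        if m < LOD:                       # keep only draws consistent with C = 0
            draws.append(m)
    m_star = np.array(draws)
    w_tilde = p_y(y, m_star, a, l)        # mediator density cancels with proposal
    return m_star, w_tilde / w_tilde.sum()

m_star, w = fractional_weights(y=1, a=1, l=0)
```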

EM algorithm for informatively censored mediators

E-Step:

  • Imputation step: Generate \(S\) values from a proposal distribution.
  • Weighting step: Compute normalized importance weights: \[\small w_{ij} \propto \frac{ \Pr(Y_i,\ M_i = m^{\star(j)}_i,\ C_i = 0 \mid a_i, l_i) }{ f(m^{\star(j)}_i \mid a_i, l_i) }, \quad \text{subject to } \sum_{j=1}^S w_{ij} = 1 \]

M-Step:

  • Solve the weighted score equations to update parameters: \[\small \sum_{i=1}^n \sum_{j=1}^S w_{ij} \, S(\theta;\ Y_i,\ m^{\star(j)}_i, a_i, l_i) = 0 \ , \quad \text{where} \quad S(\theta;\ Y_i,\ m^{\star(j)}_i, a_i, l_i) = \frac{\partial \ell_{\text{com}}(\theta; M_i = m^{\star(j)}_i)}{\partial \theta} \]
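To make the E/M alternation concrete, here is a minimal FI-EM loop for a toy problem: estimating the mean of a Gaussian mediator with known unit variance under left-censoring, ignoring \(A\), \(L\), and \(Y\) for brevity. As in the fractional-imputation scheme, the imputed values are drawn once and only the fractional weights change across iterations; all numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(2)

LOD, S = 0.0, 200
m_full = rng.normal(0.5, 1.0, size=500)     # latent mediator values
obs = m_full[m_full >= LOD]                 # complete cases (C = 1)
n_cens = int((m_full < LOD).sum())          # number of left-censored units

# Draw S candidate values below the LoD once, by rejection sampling from a
# standard-normal proposal; shared by all censored units for simplicity.
prop = rng.normal(0.0, 1.0, size=5000)
m_star = prop[prop < LOD][:S]

def gaussian(x, mu):
    return np.exp(-(x - mu) ** 2 / 2) / np.sqrt(2 * np.pi)

mu = obs.mean()                             # crude starting value
for _ in range(50):
    # E-step: fractional weights = target density / proposal, self-normalized
    w = gaussian(m_star, mu) / gaussian(m_star, 0.0)
    w /= w.sum()
    # M-step: the weighted score for a Gaussian mean has a closed form --
    # the overall mean, with censored units replaced by weighted imputations
    mu = (obs.sum() + n_cens * (w * m_star).sum()) / len(m_full)
```

The loop converges to (a Monte Carlo approximation of) the censored-data MLE of the mean, here near the true value 0.5.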

EM-FI versus MCEM: Efficiency and convergence

Monte Carlo EM (MCEM)

  • Resamples imputed values each iteration, leading to a higher computational cost
  • Stochastic fluctuations may delay convergence

EM with Fractional Imputation

  • Fixed imputations across iterations, only updating weights \(w_{ij}\) at each step \(t\)
  • Computational savings: Importance sampling performed once

Theoretical Convergence

Under suitable regularity conditions, for a sufficiently large number of iterations \(t\) in the EM algorithm, the estimated parameter \(\hat{\theta}^{(t)}\) converges to its asymptotic limit \(\hat{\theta}^{\star}_{S}\), a stationary point of \(Q^{\star}\) for fixed \(S\); that is, \(\hat{\theta}^{(t)} \to \hat{\theta}^{\star}_{S}\,\, \text{as}\,\, t \to \infty\). Then, for a sufficiently large number of imputations \(S\), we have \(\hat{\theta}^{\star}_{S} \to \hat{\theta}_{\text{MLE}}\).

  • Stability: Fixed imputation scheme reduces Monte Carlo variability.
  • Reliability: Convergence to the MLE as the number of imputations \(S\) grows.
  • Efficiency: Much faster than MCEM in numerical experiments (and data analysis).

Inference: Adaptive \(m\)-out-of-\(n\) bootstrap

Key idea

Overcome failure of the standard bootstrap for non-smooth/irregular estimators (e.g., NDE/NIE functional with fractionally imputed mediator) by adapting the size of the resamples \(m\).

  • Select \(m = \lfloor n^{f(p)} \rfloor\), where \(f(p)\):
    • Monotone decreasing in degree of non-regularity wrt \(p\), with \(f(0) = 1\)
    • Continuous, with bounded derivative
  • This ensures \(m \to \infty\) and \(m = o(n)\) for consistency.
  • Data-driven selection of \(m\): \[ m = \left\lfloor n^{c_i} \right\rfloor, \quad \text{where} \quad c_i = \frac{1 + \gamma_i \exp(-k\, p_{\text{cens}})}{1 + \gamma_i} \ , \] where \(p_{\text{cens}}\) is the censoring rate and \(\gamma_i\) and \(k\) are hyperparameters. If \(p_{\text{cens}} = 0\), then \(m = n\) and the procedure reverts to the standard bootstrap.
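The selection rule takes only a few lines; `gamma` and `k` are tuning hyperparameters, with the default values below purely illustrative.

```python
import math

def resample_size(n, p_cens, gamma=1.0, k=5.0):
    """m = floor(n^c) with c = (1 + gamma * exp(-k * p_cens)) / (1 + gamma).
    When p_cens = 0, c = 1 and m = n (standard bootstrap); as the censoring
    rate grows, c shrinks toward 1 / (1 + gamma) and m toward o(n)."""
    c = (1 + gamma * math.exp(-k * p_cens)) / (1 + gamma)
    return math.floor(n ** c)
```

With heavier censoring the resample size contracts, trading efficiency for validity under non-regularity.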

Numerical experiment with censored mediator

Simulate data from an observational study with a mediator left-censored by an assay limit of detection (LoD): \[\small \begin{aligned} L_1 & \sim \text{Bern}(0.7), \quad L_2 \sim \text{Bern}(0.5), \quad L_3 \sim \text{Bern}(0.25) \\\\ A \mid \boldsymbol{L} & \sim \text{Bern}\left( \expit(-1 + 0.5 L_1 + 1.25 L_2 + 0.75 L_3 - 1.25 L_1 L_3) \right) \\\\ \log M \mid A, \boldsymbol{L} & \sim \mathcal{N}(-3 + 1.5 A + 1.75 L_1 + 1.5 L_2 - 0.25 L_3,\; 0.25^2) \\\\ Y \mid M, A, \boldsymbol{L} & \sim \text{Bern}\left(\expit(-1 + 2.5 A + 1.75 M + 0.5 A M - 2.25 L_1 - 1.75 L_2 - 1.5 L_3) \right) \end{aligned} \]
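This data-generating process translates directly into code; a minimal simulator following the display above, with the assay LoD set at a fixed quantile of \(M\) (the outcome is generated from the true, uncensored mediator):

```python
import numpy as np

def simulate(n, cens_q=0.30, seed=0):
    """Simulate (L, A, M, C, Y) per the displayed DGP, then left-censor M at
    the cens_q quantile (the assay LoD)."""
    rng = np.random.default_rng(seed)
    expit = lambda x: 1 / (1 + np.exp(-x))
    L1 = rng.binomial(1, 0.7, n)
    L2 = rng.binomial(1, 0.5, n)
    L3 = rng.binomial(1, 0.25, n)
    A = rng.binomial(1, expit(-1 + 0.5*L1 + 1.25*L2 + 0.75*L3 - 1.25*L1*L3))
    M = np.exp(rng.normal(-3 + 1.5*A + 1.75*L1 + 1.5*L2 - 0.25*L3, 0.25))
    Y = rng.binomial(1, expit(-1 + 2.5*A + 1.75*M + 0.5*A*M - 2.25*L1 - 1.75*L2 - 1.5*L3))
    lod = np.quantile(M, cens_q)          # LoD at, e.g., the 30% quantile of M
    C = (M > lod).astype(int)             # C = 1 iff M is observed above the LoD
    M_obs = np.where(C == 1, M, lod)      # censored values reported at the LoD
    return dict(L1=L1, L2=L2, L3=L3, A=A, M=M_obs, C=C, Y=Y, lod=lod)

d = simulate(2000)
```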

  • Assay LoD defined by a fixed quantile (e.g., 30%) below which \(M\) is left-censored; used to generate several experimental scenarios.
  • Mediator is left-censored: Observed \(M\) is reported as \(\text{LoD}\) when \(M < \text{LoD}\).
  • Outcome: Binary endpoint (e.g., hospitalization, death—as in ACTIV-2).

Imputation strategies and estimation approach

Objective: Mitigate bias by applying the proposed imputation approaches with standard (e.g., plug-in) estimators of NDE/NIE statistical estimands.

Implemented five imputation strategies, including a conventional LoD/2 method, while keeping the nuisance estimators used for NDE/NIE estimation correctly specified.

  • Compared performance across the five strategies and against an oracle EM strategy that used all correctly specified nuisances.
  • Investigated performance in terms of
    • standard metrics: bias, variance, MSE, CI coverage (based on both standard and m-out-of-n bootstrap); and
    • severity of left-censoring of \(M\): 25%, 50%, 75% of observations.
  • Our findings indicate that flexible, semi-parametric estimation strategies approach performance of the oracle strategy, even with heavy left-censoring (75%).

Simulation results: Natural direct effect

Details on imputation strategies

  1. EM: true nuisances (Best case!)
    ✅ Correct log-normal density for \(M\)
    ✅ Correct models for \(\E(M \mid A, L)\) and \(\E(Y \mid M, A, L)\)
  2. LOD/2 imputation (Worst case!)
    ⚠️ Imputes censored \(M\) values by LoD/2
  3. EM: oracle dens., misspec. GLMs
    ✅ Correct log-normal density for \(M\)
    ❌ Misspecified mediator model \(\E(M \mid A, L)\) (omits covariates)
    ❌ Misspecified outcome model (omits covariates)
  4. EM: misspec. dens., misspec. GLMs
    ❌ Incorrect log-normal density for \(M\)
    ❌ Misspecified mediator model \(\E(M \mid A, L)\) (omits covariates)
    ❌ Misspecified outcome model (omits covariates)
  5. EM: heterosced. CDE, HAL
    🔹 Semiparametric conditional density estimation (CDE) with heteroscedastic errors
    🔹 Estimates conditional mean and variance of \(M\) using the highly adaptive lasso (HAL)
  6. EM: homosced. CDE, HAL
    🔹 Semiparametric conditional density estimation (CDE) with homoscedastic errors
    🔹 Estimates conditional mean of \(M\) using the highly adaptive lasso (HAL)

Simulation results: Natural indirect effect

Work-in-progress

  • Verify that fractional imputation-based approach is compatible with efficient estimators of the NDE/NIE functional.
  • Apply methods to re-analysis of the ACTIV-2/ACTG A5401 trial data, informing prior evidence on the role of viral RNA clearance in COVID-19 disease (Li et al. 2022).

References

Chernofsky, Ariel, Ronald J Bosch, and Judith J Lok. 2024. “Causal Mediation Analysis with Mediator Values Below an Assay Limit.” Statistics in Medicine 43 (12): 2299–2313.
Evering, Teresa H, Kara W Chew, Mark J Giganti, Carlee Moser, Mauricio Pinilla, David Alain Wohl, Judith S Currier, et al. 2023. “Safety and Efficacy of Combination SARS-CoV-2 Neutralizing Monoclonal Antibodies Amubarvimab Plus Romlusevimab in Nonhospitalized Patients with COVID-19.” Annals of Internal Medicine 176 (5): 658–66.
Giganti, Mark J, Kara W Chew, Carlee Moser, Joseph J Eron, Mauricio Pinilla, Jonathan Z Li, Justin Ritz, et al. 2025. “Implementation of a Seamless Phase 2/3 Study Design in the Setting of an Emergent Infectious Disease Pandemic: Lessons Learned from the ACTIV-2 Platform COVID-19 Treatment Trial.” Contemporary Clinical Trials 153: 107887.
Hu, Yingyao, and Ji-Liang Shiu. 2018. “Nonparametric Identification Using Instrumental Variables: Sufficient Conditions for Completeness.” Econometric Theory 34 (3): 659–93.
Imai, Kosuke, Luke Keele, and Dustin Tingley. 2010. “A General Approach to Causal Mediation Analysis.” Psychological Methods 15 (4): 309–34. https://doi.org/10.1037/a0020761.
Imai, Kosuke, Luke Keele, and Teppei Yamamoto. 2010. “Identification, Inference and Sensitivity Analysis for Causal Mediation Effects.” Statistical Science 25 (1): 51–71. https://doi.org/10.1214/10-STS321.
Kim, Jae Kwang. 2011. “Parametric Fractional Imputation for Missing Data Analysis.” Biometrika 98 (1): 119–32.
Li, Yijia, Linda J Harrison, Kara W Chew, Judy S Currier, David A Wohl, Eric S Daar, Teresa H Evering, et al. 2022. “Nasal and Plasma SARS-CoV-2 RNA Levels Are Associated with Timing of Symptom Resolution in the ACTIV-2 Trial of Non-Hospitalized Adults with COVID-19.” Clinical Infectious Diseases, ciac818.
Newey, Whitney K, and James L Powell. 2003. “Instrumental Variable Estimation of Nonparametric Models.” Econometrica 71 (5): 1565–78.
Tchetgen Tchetgen, Eric J, and Ilya Shpitser. 2012. “Semiparametric Theory for Causal Mediation Analysis: Efficiency Bounds, Multiple Robustness, and Sensitivity Analysis.” The Annals of Statistics 40 (3): 1816–45. https://doi.org/10.1214/12-AOS990.
Zheng, Wenjing, and Mark J van der Laan. 2012. “Targeted Maximum Likelihood Estimation of Natural Direct Effects.” The International Journal of Biostatistics 8 (1): 1–40. https://doi.org/10.2202/1557-4679.1361.
Zuo, Shuozhi, Debashis Ghosh, Peng Ding, and Fan Yang. 2024. “Mediation Analysis with the Mediator and Outcome Missing Not at Random.” Journal of the American Statistical Association, 1–11.

Appendix

Estimation of NDE and NIE

Natural direct effect (NDE) functional: \[ \Psi_{\text{NDE}}(\Pr) = \E \left[\E \left\{ \E(Y \mid A=1, M, L) - \E(Y \mid A=0, M, L) \mid A=0, L \right\} \right] \]

Plug-in (g-computation) estimator:

  • Fit \(\hat{\E}(Y \mid A=1, M=m, L = l)\)
  • Fit \(\hat{\E}(Y \mid A=0, M=m, L = l)\)
  • Fit \(\hat{f}(M=m \mid A=0, L = l)\)
  • Estimate \(\Pr(L = l)\) empirically (from the sample)

Evaluate plug-in estimator (g-computation): \[\small \Psi_{\text{NDE}}(\hat{\Pr}) = \hat{\E} \left[ \hat{\E} \left\{ \hat{\E}(Y \mid A=1, M, L) - \hat{\E}(Y \mid A=0, M, L) \mid A=0, L \right\} \right] \] The bootstrap may be used to obtain inference based on this plug-in estimator.
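The recipe above, sketched with assumed (purely illustrative) stand-ins for the fitted nuisances; the mediator draws play the role of \(\hat{f}(M \mid A = 0, L)\).

```python
import numpy as np

rng = np.random.default_rng(3)

def yhat(a, m, l):
    # stand-in for the fitted outcome regression E(Y | A, M, L)
    return 1 / (1 + np.exp(-(-1 + 1.2 * a + 0.6 * m - 0.8 * l)))

def draw_m_given_a0(l, size):
    # stand-in for draws from the fitted mediator density f(M | A = 0, L)
    return rng.normal(0.1 + 0.4 * l, 1.0, size)

def nde_plugin(L, n_mc=2000):
    """Plug-in NDE: E_L E[ E(Y|A=1,M,L) - E(Y|A=0,M,L) | A=0, L ],
    with the inner expectation evaluated by Monte Carlo."""
    vals = []
    for l in L:
        m = draw_m_given_a0(l, n_mc)
        vals.append((yhat(1, m, l) - yhat(0, m, l)).mean())
    return float(np.mean(vals))

L = rng.binomial(1, 0.5, 300)  # empirical covariate sample
nde = nde_plugin(L)
```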

Double bootstrap algorithm

  1. Initialize a grid of \(\gamma\) values and specify \(k\) (e.g., \(k = 5\) for \(p_{\text{cens}} = 0.5\)); draw \(B_1\) samples of size \(n\).
  2. For each sample \(b_1\):
    • Run \(m\)-out-of-\(n\) bootstrap (\(B_2\) iterations)
    • Construct confidence interval (CI) based on the \(m\)-out-of-\(n\) samples
  3. Evaluate coverage:
    • Calculate proportion of CIs that cover original estimate
    • Adjust \(\gamma\) until target nominal coverage is achieved
  4. Final step: Compute final CIs using the selected \(\hat{m} = \lfloor n^{\hat{c}} \rfloor\)
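The inner \(m\)-out-of-\(n\) step of this algorithm might look like the following percentile-CI sketch; the sample mean stands in for the NDE/NIE estimator, and the exponent is an illustrative placeholder for the selected \(\hat{c}\).

```python
import numpy as np

rng = np.random.default_rng(4)

def m_out_of_n_ci(x, m, B=2000, alpha=0.05):
    """Percentile CI from B resamples of size m < n, drawn with replacement."""
    stats = np.empty(B)
    for b in range(B):
        idx = rng.integers(0, len(x), size=m)  # resample m of n with replacement
        stats[b] = x[idx].mean()               # placeholder statistic
    lo, hi = np.quantile(stats, [alpha / 2, 1 - alpha / 2])
    return lo, hi

x = rng.normal(1.0, 1.0, size=500)
m = int(np.floor(len(x) ** 0.7))               # e.g., m = floor(n^c) with c = 0.7
ci = m_out_of_n_ci(x, m)
```

The outer double-bootstrap loop then repeats this over \(B_1\) first-level samples to calibrate \(\gamma\) until the empirical coverage matches the nominal level.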

Advantages

  • Robust to non-regularity via adaptive \(m\)
  • Seamless transition to standard bootstrap when \(p_{\text{cens}} = 0\)