cvDCY {cvDSA}        R Documentation

Selecting/Fitting Causal Difference Models

Description

'cvDCY' is used to select/fit causal difference models using G-computation, IPTW (inverse probability of treatment weighted) or DR (double robust) estimators.

Usage

cvDCY(y, a, v, w, data=NULL, yfamily='gaussian', afamily='binomial', 
        model.dcy=NULL, model.aw=NULL, model.yaw=NULL, 
        model.yw0=NULL, model.yw1=NULL, wt.censor=NULL, all3=F,
        ncv=5, ncv.nuisance=5, mapping='DR', detail=F, printout=T, 
        rep.ID=0, ID=NULL)

Arguments

y response variable: vector of length 'n'.
a treatment variable: vector of length 'n'.
v adjustment variable: vector/matrix.
w baseline covariates: vector/matrix.
data an optional data frame containing the variables in 'y', 'a', 'v' and 'w'.
yfamily a description of the error distribution and link function to be used in the 'y'-related models E[Y|A,W]. Available choices are 'gaussian' and 'binomial'.
afamily a description of the error distribution and link function to be used in the 'a'-related models g(A|W) and g(A|V). The only available choice is 'binomial'.
model.dcy a list description of the MSM (here, the causal difference model), e.g., model.dcy=list(Model=, Size=, Order=, Int= ). See Note.
model.aw a list description of g(A|W). See 'model.dcy'.
model.yaw a list description of E(Y|A,W). See 'model.dcy'.
wt.censor an optional vector of censoring weights to be used in the selecting/fitting process.
ncv an integer giving the number of folds in the V-fold cross-validation used to select the MSM.
ncv.nuisance an integer giving the number of folds in the V-fold cross-validation used to select the nuisance parameter models, e.g., g(A|W), g(A|V), E(Y|A,W) and E(Y^2|A,W).
mapping the method to be used in calculating the loss function: one of 'IPTW', 'Gcomp' or 'DR'.
all3 a logical value indicating whether to select/fit all three MSMs or only the one specified by the user.
detail if TRUE, the details of the D/S/A sets are printed.
printout if TRUE, intermediate results are printed.
rep.ID a logical value indicating whether the observations have repeated IDs (not independent observations).
ID a vector which identifies the clusters.
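
The 'ncv', 'rep.ID' and 'ID' arguments all concern how observations are split for V-fold cross-validation. The following base R sketch shows one common way to assign folds so that all observations from the same cluster land in the same fold. The helper 'make_folds' is hypothetical and not part of cvDSA; whether 'cvDCY' splits clustered data this way internally is not stated in this page.

```r
# Hypothetical helper (not a cvDSA function): assign each observation to one
# of `ncv` folds. When `id` is supplied, every cluster is kept within one fold.
make_folds <- function(n, ncv = 5, id = NULL) {
  if (is.null(id)) id <- seq_len(n)      # independent observations
  clusters <- unique(id)
  # Balanced fold labels, randomly permuted across clusters
  fold_of_cluster <- sample(rep(seq_len(ncv), length.out = length(clusters)))
  # Map each observation to the fold of its cluster
  fold_of_cluster[match(id, clusters)]
}

set.seed(1)
folds <- make_folds(n = 10, ncv = 5)     # independent case
table(folds)

id <- rep(1:4, each = 3)                 # four clusters of three observations
folds_clustered <- make_folds(n = length(id), ncv = 2, id = id)
# Number of distinct folds per cluster (1 means the cluster was not split)
folds_per_cluster <- tapply(folds_clustered, id, function(f) length(unique(f)))
folds_per_cluster
```

Splitting by cluster rather than by row avoids evaluating a model on observations that are correlated with those used to fit it.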

Value

If the causal difference model is not provided by the user, 'cvDCY' will return the best fit. 'cvDCY' also returns the cross-validation risk matrix.

Note

All the 'models' should be defined as a list with any of the following four components: Model, Size, Order, Int. $Model is a string formula giving the linear combination of terms in the model; $Size gives the maximum number of terms in the model; $Order is a vector with the same length as the number of covariates, giving the maximal power allowed for each covariate; $Int gives the maximal order of interactions allowed in the model.
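
As an illustration of the list format described above, the two specifications below are taken from the Examples section and built with base R only. The interpretation in the comments (a fixed formula when only $Model is given, and $Int=2 read as "up to two-way interactions") is an assumption based on this page, not a statement about the package internals.

```r
# Fixed model: only $Model is given, so no search bounds are specified.
model.fixed <- list(Model = "w1+w2+w1:w2")

# Search specification: at most 5 terms, powers up to 2 for the first
# covariate and up to 1 for the second, interactions up to order 2.
model.search <- list(Size = 5, Order = c(2, 1), Int = 2)

# $Order must have one entry per covariate in the matrix being modelled;
# here that matrix is v = cbind(w1, w2), as in the Examples.
v <- cbind(w1 = runif(10), w2 = runif(10))
order_ok <- length(model.search$Order) == ncol(v)
order_ok
```

A quick length check like the last line catches a mismatch between $Order and the covariate matrix before a long cross-validation run starts.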

See Also

cvMSM, cvGLM, cv.predict, check.ETA, create.obs.data

Examples

# Example 1.
#Let W={W1, W2, W3}
n <- 2000
w1 <- runif(n, 0, 1); w2 <- runif(n, 0, 1); w3<- runif(n,0,1)
w <- cbind(w1=w1, w2=w2, w3=w3);
# g(A|W) = logit^(-1)(1 - W1 + W2 + W3)
model.aw <- list(formula="w1+w2+w3", coef=c(1,-1,1,1));
# E(Y|A,W) = 1 + 2*A + 1.5*W1*A + A*W2 - W1*W2 + W3
model.yaw <- list(formula="a+a:w1+a:w2+w1:w2+w3", coef=c(1, 2, 1.5, 1, -1, 1));

obs.data <- create.obs.data(w, afamily='binomial', yfamily='gaussian', 
            model.yaw=model.yaw, model.aw=model.aw);

attach(obs.data)

model.dcy.iptw.given <- cvDCY(y,a,v=cbind(w1,w2),w, 
        yfamily='gaussian', afamily='binomial', 
        model.dcy=list(Model="w1+w2+w1:w2"), 
        model.aw=list(Model="w1+w2+w3"), mapping='IPTW',
        printout=T, rep.ID=0, ID=NULL)

model.dcy.iptw <- cvDCY(y,a,v=cbind(w1,w2),w, ncv=5, 
        yfamily='gaussian', afamily='binomial', 
        model.dcy=list(Size=5,Order=c(2,1),Int=2), 
        model.aw=list(Model="w1+w2+w3"), mapping='IPTW',
        printout=T, rep.ID=0, ID=NULL)

model.dcy.dr <- cvDCY(y,a,v=cbind(w1,w2),w, ncv=2, 
        yfamily='gaussian', afamily='binomial', 
        model.dcy=list(Size=5,Order=c(2,1),Int=2), 
        model.aw=list(Model="w1+w2+w3"), 
        model.yaw=list(Model="a+w1:a+a:w2+w1:w2+w3"), 
        mapping='DR', printout=T, rep.ID=0, ID=NULL)

model.dcy.dr2 <- cvDCY(y,a,v=cbind(w1,w2),w, ncv=2, 
        yfamily='gaussian', afamily='binomial', 
        model.dcy=list(Size=5,Order=c(2,1),Int=2), 
        model.aw=list(Model="w1+w2+w3"), 
        model.yaw=list(Model="a+w1:a+a:w2+w1:w2+w3"), 
        model.yw0=y~w1:w2+w3, model.yw1=y~w1+w2+w1:w2+w3, 
        mapping='DR', printout=T, rep.ID=0, ID=NULL)

model.dcy.g<- cvDCY(y,a,v=cbind(w1,w2),w, ncv=2, 
        yfamily='gaussian', afamily='binomial', 
        model.dcy=list(Size=5,Order=c(2,1),Int=2), 
        model.yaw=list(Model="a+w1:a+a:w2+w1:w2+w3"), 
        mapping='Gcomp', printout=T, rep.ID=0, ID=NULL)

model.dcy.g2 <- cvDCY(y,a,v=cbind(w1,w2),w, ncv=2, 
        yfamily='gaussian', afamily='binomial', 
        model.dcy=list(Size=5,Order=c(2,1),Int=2), 
        model.yw0=y~w1:w2+w3, model.yw1=y~w1+w2+w1:w2+w3, 
        mapping='Gcomp', printout=T, rep.ID=0, ID=NULL)
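
For reference, the data-generating model in Example 1 implies a known causal difference: since E(Y|A,W) = 1 + 2A + 1.5*W1*A + A*W2 - W1*W2 + W3, the difference E(Y|A=1,W) - E(Y|A=0,W) equals 2 + 1.5*W1 + W2, which depends only on V = (W1, W2). The base R sketch below simulates data of this form without 'create.obs.data' and recovers that difference with a simple plug-in (G-computation style) fit. It assumes standard normal noise on Y, which this page does not specify, and it does not call or reproduce 'cvDCY'.

```r
set.seed(42)
n  <- 2000
w1 <- runif(n); w2 <- runif(n); w3 <- runif(n)

# Treatment: g(A=1|W) = logit^(-1)(1 - W1 + W2 + W3)
a <- rbinom(n, 1, plogis(1 - w1 + w2 + w3))

# Outcome: E(Y|A,W) as in Example 1; N(0,1) noise is an assumption
y <- 1 + 2*a + 1.5*w1*a + a*w2 - w1*w2 + w3 + rnorm(n)
d <- data.frame(y, a, w1, w2, w3)

# Fit the (correctly specified) outcome regression
fit <- lm(y ~ a + a:w1 + a:w2 + w1:w2 + w3, data = d)

# Plug-in causal difference: predicted Y under a=1 minus predicted Y under a=0
est_diff  <- predict(fit, transform(d, a = 1)) - predict(fit, transform(d, a = 0))
true_diff <- 2 + 1.5*d$w1 + d$w2

# Average absolute gap between the estimated and true differences
mean(abs(est_diff - true_diff))
```

Comparing a selected 'model.dcy' against 2 + 1.5*W1 + W2 gives a simple sanity check on the simulated example.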

[Package cvDSA version 0.5-3 Index]