IPD meta-analysis

Michael Seo

2022-01-25

Fitting IPD meta-analysis model

We demonstrate how to run an individual participant data (IPD) meta-analysis using the bipd package. First, let’s generate sample IPD for illustration.

# devtools::install_github('MikeJSeo/bipd')
library(bipd)

## generate example data
ds <- generate_ipdma_example(type = "continuous")
ds2 <- generate_ipdma_example(type = "binary")
head(ds2)
#>   studyid treat         w1         w2 y
#> 1       1     0  0.3052599  0.1889069 0
#> 2       1     1 -1.6158489 -1.7098747 0
#> 3       1     0 -1.2548097 -0.5531431 1
#> 4       1     0  1.7173395  1.2318429 1
#> 5       1     1 -0.7718067  0.1187175 0
#> 6       1     1  1.0925264 -0.8590769 1

The main function for setting up a one-stage IPD meta-analysis is ipdma.model.onestage; see help(ipdma.model.onestage) for details. Briefly, “y” is the outcome; “study” is a vector indicating which study each patient belongs to, coded as consecutive integers (i.e. 1, 2, 3, etc.); “treat” is a vector indicating which treatment each patient was assigned to (i.e. 1 for treatment, 0 for placebo); “X” is a matrix of covariates with one row per patient; and “response” is the outcome type, either “normal” or “binomial”.
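As a rough illustration of the input format described above, the following base-R sketch builds a small toy data set and recodes arbitrary study labels into the consecutive integers that “study” expects. The data frame toy and its column names are hypothetical and are not produced by bipd.

```r
# Hypothetical toy data: study labels are arbitrary strings here
toy <- data.frame(
  trial = c("A", "A", "C", "C", "B", "B"),
  arm   = c("placebo", "active", "active", "placebo", "placebo", "active"),
  age   = c(54, 61, 47, 58, 66, 50),
  bmi   = c(27.1, 31.4, 24.8, 29.0, 26.3, 33.2),
  y     = c(1.2, -0.8, -1.5, 0.9, 1.1, -0.3)
)

# Recode study labels as consecutive integers 1, 2, 3, ...
study <- as.integer(factor(toy$trial))

# Recode treatment as 1 (treatment) / 0 (placebo)
treat <- as.integer(toy$arm == "active")

# Covariates as a numeric matrix, one row per patient
X <- as.matrix(toy[, c("age", "bmi")])

# Inspect the pieces that would be passed as y, study, treat and X
str(list(y = toy$y, study = study, treat = treat, X = X))
```

Vectors prepared this way could then be supplied as the y, study, treat, and X arguments of ipdma.model.onestage.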

Another important argument is “shrinkage”, which controls whether the effect-modifier coefficients are shrunk. To specify an IPD meta-analysis without shrinkage, we set shrinkage = “none”.

# continuous outcome
ipd <- with(ds, ipdma.model.onestage(y = y, study = studyid,
    treat = treat, X = cbind(z1, z2), response = "normal", shrinkage = "none"))

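To view the JAGS code that will be used to fit the model, we can run the following command. In that code, “alpha” is the study-specific intercept, “beta” holds the coefficients for the main effects of the covariates, “gamma” holds the coefficients for the effect modifiers (covariate-by-treatment interactions), and “delta” is the average treatment effect. Note that JAGS parameterizes dnorm by precision rather than variance, so “sigma” and “tau” in the code are precisions.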

cat(ipd$code)
#> model {
#> 
#> ########## IPD-MA model
#> for (i in 1:Np) {
#>  y[i] ~ dnorm(mu[i], sigma)
#>  mu[i] <- alpha[studyid[i]] + inprod(beta[], X[i,]) +
#>      (1 - equals(treat[i],1)) * inprod(gamma[], X[i,]) + d[studyid[i],treat[i]]
#> }
#> sigma ~ dgamma(0.001, 0.001)
#> 
#> #####treatment effect
#> for(j in 1:Nstudies){
#>  d[j,1] <- 0
#>  d[j,2] ~ dnorm(delta[2], tau)
#> }
#> sd ~ dnorm(0, 1)T(0,)
#> tau <- pow(sd, -2)
#> 
#> ## prior distribution for the average treatment effect
#> delta[1] <- 0
#> delta[2] ~ dnorm(0, 0.001)
#> 
#> 
#> ## prior distribution for the study intercept
#> for (j in 1:Nstudies){
#>  alpha[j] ~ dnorm(0, 0.001)
#> }
#> 
#> ## prior distribution for the main effect of the covariates
#> for(k in 1:Ncovariate){
#>  beta[k] ~ dnorm(0, 0.001)
#> }
#> ## prior distribution for the effect modifiers under no shrinkage
#> for(k in 1:Ncovariate){
#>  gamma[k] ~ dnorm(0, 0.001) 
#> }
#> }
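
Read as a statistical model, the JAGS code above corresponds roughly to the following. Here \(s_i\) denotes the study of patient \(i\) and \(t_i \in \{1, 2\}\) the treatment index; this assumes the 0/1 treatment variable is recoded internally to 1 for control and 2 for treatment, which the indexing d[studyid[i], treat[i]] suggests. The notation is ours, and the second argument of each normal distribution is written as a variance, i.e. the inverse of the precision that appears in the code.

\[
\begin{aligned}
y_i &\sim N(\mu_i, \sigma^{-1}), \\
\mu_i &= \alpha_{s_i} + \beta^\top x_i + \mathbb{1}(t_i = 2)\, \gamma^\top x_i + d_{s_i, t_i}, \\
d_{j,1} &= 0, \qquad d_{j,2} \sim N(\delta_2, \mathrm{sd}^2), \qquad \mathrm{sd} \sim N(0, 1) \text{ truncated to } (0, \infty), \\
\sigma &\sim \mathrm{Gamma}(0.001, 0.001), \qquad \delta_2,\ \alpha_j,\ \beta_k,\ \gamma_k \sim N(0, 1000).
\end{aligned}
\]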

Once the model is set up with ipdma.model.onestage, we run it with the ipd.run function. See help(ipd.run) for the arguments that can be specified.

samples <- ipd.run(ipd, pars.save = c("beta", "gamma", "delta"),
    n.chains = 3, n.burnin = 500, n.iter = 5000)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 600
#>    Unobserved stochastic nodes: 19
#>    Total graph size: 6034
#> 
#> Initializing model

samples <- samples[, -3]  # drop the third column, delta[1], which is fixed at 0
summary(samples)
#> 
#> Iterations = 1501:6500
#> Thinning interval = 1 
#> Number of chains = 3 
#> Sample size per chain = 5000 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>             Mean      SD  Naive SE Time-series SE
#> beta[1]   0.1507 0.02123 0.0001734      0.0003095
#> beta[2]   0.2832 0.02290 0.0001870      0.0003545
#> delta[2] -2.9448 0.04553 0.0003718      0.0009685
#> gamma[1] -0.4807 0.03083 0.0002518      0.0004440
#> gamma[2]  0.5671 0.03091 0.0002524      0.0004773
#> 
#> 2. Quantiles for each variable:
#> 
#>             2.5%     25%     50%     75%   97.5%
#> beta[1]   0.1085  0.1365  0.1506  0.1650  0.1921
#> beta[2]   0.2387  0.2676  0.2831  0.2986  0.3283
#> delta[2] -3.0346 -2.9714 -2.9448 -2.9183 -2.8558
#> gamma[1] -0.5411 -0.5012 -0.4806 -0.4600 -0.4196
#> gamma[2]  0.5070  0.5463  0.5671  0.5878  0.6275
# plot(samples) #traceplot and posterior of parameters
# coda::gelman.plot(samples) #gelman diagnostic plot

We can compute a patient-specific treatment effect with the treatment.effect function. To do this, we supply the covariate values of the patient of interest through the newpatient argument. The output gives the 2.5%, 50%, and 97.5% quantiles of that patient's treatment effect.

treatment.effect(ipd, samples, newpatient = c(1, 0.5))
#>     0.025       0.5     0.975 
#> -3.307351 -3.197554 -3.084559
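
Conceptually, in the model above a patient-specific treatment effect is the average treatment effect plus the effect-modifier terms evaluated at the patient's covariates, \(\delta_2 + \gamma^\top x\), summarized over the posterior draws. The sketch below shows that calculation in base R on simulated stand-in draws; the draws object and its values are made up for illustration and are not output from ipd.run. It also assumes the covariates are on the scale on which the model was fitted. If covariates are standardized before fitting, the new patient's values would need the same transformation first.

```r
set.seed(1)

# Simulated stand-in for posterior draws (NOT real model output)
n_draws <- 4000
draws <- cbind(
  "delta[2]" = rnorm(n_draws, mean = -3.0, sd = 0.05),
  "gamma[1]" = rnorm(n_draws, mean = -0.5, sd = 0.03),
  "gamma[2]" = rnorm(n_draws, mean =  0.6, sd = 0.03)
)

# Covariate values of the patient of interest
newpatient <- c(1, 0.5)

# Treatment effect per draw: delta[2] + gamma[1] * x1 + gamma[2] * x2
effect <- as.vector(draws %*% c(1, newpatient))

# Posterior median and 95% credible interval
effect_q <- quantile(effect, probs = c(0.025, 0.5, 0.975))
effect_q
```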

Incorporating shrinkage and variable selection

For the second example, let’s use the same data but apply shrinkage (i.e. the Bayesian LASSO) to the effect modifiers. We can specify the Bayesian LASSO by setting shrinkage = “laplace”. Lambda is the shrinkage parameter, and we can set its prior using the lambda.prior argument. The default lambda prior for the Bayesian LASSO is \(\lambda^{-1} \sim \mathrm{dunif}(0,5)\).

ipd <- with(ds, ipdma.model.onestage(y = y, study = studyid,
    treat = treat, X = cbind(z1, z2), response = "normal", shrinkage = "laplace"))
samples <- ipd.run(ipd, pars.save = c("lambda", "beta", "gamma",
    "delta"), n.chains = 3, n.burnin = 500, n.iter = 5000)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 600
#>    Unobserved stochastic nodes: 20
#>    Total graph size: 6039
#> 
#> Initializing model
summary(samples)
#> 
#> Iterations = 1501:6500
#> Thinning interval = 1 
#> Number of chains = 3 
#> Sample size per chain = 5000 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>             Mean      SD  Naive SE Time-series SE
#> beta[1]   0.1500 0.02089 0.0001706      0.0003115
#> beta[2]   0.2845 0.02262 0.0001847      0.0003821
#> delta[1]  0.0000 0.00000 0.0000000      0.0000000
#> delta[2] -2.9437 0.04741 0.0003871      0.0009756
#> gamma[1] -0.4791 0.03014 0.0002461      0.0005012
#> gamma[2]  0.5646 0.03092 0.0002524      0.0005617
#> lambda    0.3298 0.13190 0.0010770      0.0014469
#> 
#> 2. Quantiles for each variable:
#> 
#>             2.5%     25%     50%     75%   97.5%
#> beta[1]   0.1094  0.1357  0.1500  0.1642  0.1903
#> beta[2]   0.2407  0.2691  0.2844  0.2996  0.3291
#> delta[1]  0.0000  0.0000  0.0000  0.0000  0.0000
#> delta[2] -3.0385 -2.9703 -2.9440 -2.9173 -2.8502
#> gamma[1] -0.5384 -0.4994 -0.4793 -0.4590 -0.4202
#> gamma[2]  0.5038  0.5441  0.5646  0.5853  0.6251
#> lambda    0.2033  0.2371  0.2884  0.3806  0.6829
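
For reference, the Bayesian LASSO places a Laplace (double-exponential) prior on each effect-modifier coefficient. A sketch of the density follows, assuming \(\lambda\) enters as the rate of the Laplace distribution; the exact parameterization used by the package can be checked with cat(ipd$code).

\[
p(\gamma_k \mid \lambda) = \frac{\lambda}{2} \exp\left(-\lambda\, |\gamma_k|\right), \qquad \lambda^{-1} \sim \mathrm{Uniform}(0, 5).
\]

Under this parameterization, larger values of \(\lambda\) concentrate the prior more tightly around zero and therefore shrink the effect modifiers more strongly.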

We can also use stochastic search variable selection (SSVS) by setting shrinkage = “SSVS”. This time let’s use the binary-outcome dataset. “Ind” is the indicator that a coefficient is assigned the slab prior rather than the spike prior, i.e. the indicator for including a covariate as an effect modifier. “eta” is the standard deviation of the slab prior.

ipd <- with(ds2, ipdma.model.onestage(y = y, study = studyid,
    treat = treat, X = cbind(w1, w2), response = "binomial",
    shrinkage = "SSVS"))
samples <- ipd.run(ipd, pars.save = c("beta", "gamma", "delta",
    "Ind", "eta"), n.chains = 3, n.burnin = 500, n.iter = 5000)
#> Compiling model graph
#>    Resolving undeclared variables
#>    Allocating nodes
#> Graph information:
#>    Observed stochastic nodes: 600
#>    Unobserved stochastic nodes: 21
#>    Total graph size: 6649
#> 
#> Initializing model
summary(samples)
#> 
#> Iterations = 1501:6500
#> Thinning interval = 1 
#> Number of chains = 3 
#> Sample size per chain = 5000 
#> 
#> 1. Empirical mean and standard deviation for each variable,
#>    plus standard error of the mean:
#> 
#>             Mean     SD Naive SE Time-series SE
#> Ind[1]    0.8781 0.3272 0.002672       0.011714
#> Ind[2]    0.9704 0.1695 0.001384       0.006527
#> beta[1]   0.3061 0.1458 0.001191       0.003675
#> beta[2]   0.2988 0.1429 0.001167       0.003067
#> delta[1]  0.0000 0.0000 0.000000       0.000000
#> delta[2] -1.1044 0.2339 0.001910       0.009036
#> eta       1.7048 1.2295 0.010039       0.027660
#> gamma[1] -0.4910 0.2227 0.001818       0.006197
#> gamma[2]  0.6981 0.2256 0.001842       0.005423
#> 
#> 2. Quantiles for each variable:
#> 
#>              2.5%     25%     50%     75%    97.5%
#> Ind[1]    0.00000  1.0000  1.0000  1.0000  1.00000
#> Ind[2]    0.00000  1.0000  1.0000  1.0000  1.00000
#> beta[1]   0.01956  0.2087  0.3068  0.4054  0.58582
#> beta[2]   0.02905  0.2025  0.2951  0.3886  0.59989
#> delta[1]  0.00000  0.0000  0.0000  0.0000  0.00000
#> delta[2] -1.57508 -1.2512 -1.1084 -0.9589 -0.61426
#> eta       0.34711  0.7568  1.2619  2.3703  4.72465
#> gamma[1] -0.89866 -0.6455 -0.5065 -0.3514 -0.01922
#> gamma[2]  0.19777  0.5594  0.7044  0.8490  1.11930
treatment.effect(ipd, samples, newpatient = c(1, 0.5))  # binary outcome reports odds ratio
#>     0.025       0.5     0.975 
#> 0.1586131 0.3061650 0.6102359
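
Because “Ind” takes only the values 0 and 1, its posterior mean can be read as a posterior inclusion probability; in the summary above, for example, the mean of Ind[1] is 0.8781. For a binary outcome, an effect on the log-odds scale is exponentiated to give an odds ratio. The base-R sketch below illustrates both steps on simulated stand-in values; the ind_draws and log_or objects are hypothetical and are not output from ipd.run or treatment.effect.

```r
set.seed(2)

# Simulated stand-in for draws of an inclusion indicator (NOT real model output)
ind_draws <- rbinom(5000, size = 1, prob = 0.9)

# Posterior inclusion probability = share of draws with the indicator equal to 1
inclusion_prob <- mean(ind_draws)
inclusion_prob

# Simulated stand-in for draws of a treatment effect on the log-odds scale
log_or <- rnorm(5000, mean = -1.2, sd = 0.35)

# Exponentiate the log-odds quantiles to obtain odds ratios
or_q <- exp(quantile(log_or, probs = c(0.025, 0.5, 0.975)))
or_q
```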