This tutorial uses the Treatment Selection package (version 1.2.0) to analyze the example data provided in the package.

First, you need to download and install the package:

install.packages("TreatmentSelection")

If you would like the current version of the package directly from github, use the devtools package:

if (!require("devtools")) install.packages("devtools")
devtools::install_github("TreatmentSelection", "mdbrown")

First, load the data called tsdata. Four markers are included in the data example, a ‘’weak’‘and a’‘strong’’ marker (\(Y1\) and \(Y2\) respectively), along with a weak/strong discrete markers.

library(TreatmentSelection)
set.seed(12321)

data(tsdata)

tsdata[1:5, ]
##   trt event      Y1      Y2 Y1_disc Y2_disc
## 1   1     1 39.9120 -0.8535       1       0
## 2   1     0  6.6820  0.2905       0       1
## 3   1     0  6.5820  0.0800       0       1
## 4   0     0  1.3581  1.1925       0       1
## 5   0     0  7.6820 -0.2070       0       0

Create TrtSel objects

Once we have the package and our data loaded into R, we need to create a treatment selection R object using the function trtsel. This function takes as inputs a data.frame of treatment indicators, adverse event status, marker values, and other optional information. Once we have created this object, we can then use it to plot risk/treatment effect curves, estimate summary measures, and check model calibration.

First let’s create a trtsel object using the weak marker data Y1, and take a look at it’s contents:

trtsel.Y1 <- trtsel(event = "event", 
                    trt = "trt", 
                    marker = "Y1", 
                    data = tsdata, 
                    study.design = "randomized cohort",
                    link = "logit", 
                    default.trt = "trt all")

trtsel.Y1
## Study design: randomized cohort 
## 
## Model Fit:
## 
##  Link function: logit 
## 
##  Coefficients: 
##                Estimate  Std. Error    z value     Pr(>|z|)
## (Intercept) -2.51814383 0.235642511 -10.686288 1.179991e-26
## trt          0.48938620 0.311762857   1.569739 1.164759e-01
## marker       0.04760056 0.006453791   7.375597 1.636104e-13
## trt:marker  -0.02318881 0.008324063  -2.785756 5.340300e-03
## 
## 
## Derived Data: (first ten rows)
## 
##    trt event  marker fittedrisk.t0 fittedrisk.t1    trt.effect marker.neg
## 1    1     1 39.9120    0.35016583     0.2583742  0.0917916549          0
## 2    1     0  6.6820    0.09974358     0.1340472 -0.0343036269          1
## 3    1     0  6.5820    0.09931697     0.1337641 -0.0344471266          1
## 4    0     0  1.3581    0.07918316     0.1196652 -0.0404820847          1
## 5    0     0  7.6820    0.10410005     0.1369063 -0.0328062456          1
## 6    0     0 41.1720    0.36393311     0.2643117  0.0996213622          0
## 7    1     0 19.4920    0.16933976     0.1746644 -0.0053246137          1
## 8    1     1 20.8220    0.17843231     0.1793943 -0.0009620341          1
## 9    0     0  6.9620    0.10094678     0.1348426 -0.0338958439          1
## 10   0     0  2.5020    0.08324538     0.1226384 -0.0393929781          1

As we see above, the object contains information about the study design, model fit, fitted risks given treatment, and estimated treatment effect for each individual.

Now create a trtsel object using a discrete marker.

# Y2_disc = as.numeric(Y2>0)
trtsel.Y2_disc <- trtsel(event = "event", 
                         trt = "trt", 
                         marker = "Y2_disc", 
                         data = tsdata, 
                         study.design = "randomized cohort", 
                         link = "logit")

See ?trtsel for more information. Now that we have created trtsel objects, we can plot, evaluate, calibrate and compare them.

Use the plot function

Plot risk curves:

plot(trtsel.Y1, 
     main = "Y1: Oncotype-DX-like marker", 
     plot.type = "risk", 
     ci = "horizontal", 
     conf.bands = TRUE, 
     bootstraps = 50,       #more bootstraps should be run than this in practice!
     trt.names = c("chemo.", "no chemo."), 
     show.marker.axis = FALSE)

For a binary marker, we calculate vertical confidence bands:

tmp <- plot(trtsel.Y2_disc,
                   main = "Discrete version of Y2", 
                   plot.type = "risk", 
                   ci = "vertical", 
                   conf.bands = TRUE, 
                   offset = 0.01, 
                   bootstraps = 50, 
                   trt.names = c("chemo.", "no chemo."))

tmp is now a list with elements plot that holds the ggplot output, and ci.bounds which holds the information regarding the confidence bounds.

tmp$ci.bounds
##         risk trt marker      lower      upper
## 1 0.35984848   1      0 0.32422707 0.39520792
## 2 0.06198347   1      1 0.03510314 0.09270948
## 4 0.32061069   0      1 0.26155639 0.37440180
## 5 0.17241379   0      0 0.12972671 0.20252033

Treatment effect curves

We can also plot the distribution of treatment effects.

plot(trtsel.Y1, 
     plot.type = "treatment effect",
     ci = "horizontal", 
     conf.bands = TRUE, 
     bootstraps = 50)

plot(trtsel.Y2_disc, 
     plot.type = "treatment effect", 
     conf.bands = TRUE, 
     bootstraps = 50)

#### Selection impact plot With v1.1.2, an option has been added to plot ‘selection impact’ curves. Which show the estimated event rate if different proportions of observations where treated based off the marker of interest.

plot(trtsel.Y1, 
     plot.type = "selection impact", 
     ci = "vertical", 
     conf.bands = TRUE, 
     bootstraps = 50)

Evaluate marker performance

Calculate summary measures of marker performance along with bootstrap confidence intervals.

tmp <- eval.trtsel(trtsel.Y1, bootstraps = 50)
tmp
## 
## 
##   Hypothesis test:
##  ------------------
##   H0: No marker-by-treatment interaction
##                                        P value = 0.00534
##                                        Z statistic = -2.786
## 
##   Summary Measure Estimates (with 95% confidence intervals) 
##  -----------------------------------------------------------
##   Decrease in event rate under marker-based treatment (Theta)
##     Empirical:    0.013 (-0.005,0.033) 
##     Model Based:  0.01 (0,0.034) 
## 
##   Proportion marker negative:
##    0.461 (0,0.702) 
##   Proportion marker positive:
##    0.539 (0.298,1) 
## 
##   Average benefit of no treatment among marker-negatives (B.neg)
##     Empirical:    0.029 (-0.026,0.061) 
##     Model Based:  0.023 (0,0.052) 
## 
##   Average benefit of treatment among marker-positives (B.pos)
##     Empirical:    0.089 (0.038,0.139) 
##     Model Based:  0.098 (0.044,0.151) 
## 
## 
##   Variance in estimated treatment effect: 
##     0.007 (0.001,0.017) 
##   Total Gain: 
##     0.066 (0.021,0.1) 
## 
##   Marker positivity threshold:  21.082
## 
##   Event Rates:
##  --------------------
##              Treat all       Treat None    Marker-based Treatment
##  Empirical:     0.217           0.251          0.204    
##             (0.184,0.249)   (0.219,0.285)   (0.165,0.235) 
##  Model Based:   0.214           0.257          0.204    
##             (0.184,0.245)   (0.222,0.291)   (0.173,0.231)
# access the estimates
tmp$estimates
##   p.neg p.pos  B.neg.emp  B.neg.mod  B.pos.emp  B.pos.mod  Theta.emp
## 1 0.461 0.539 0.02864779 0.02252989 0.08866547 0.09846275 0.01320663
##    Theta.mod   Var.Delta         TG ER.trt0.emp ER.trt0.mod ER.trt1.emp
## 1 0.01038628 0.007416402 0.06579359   0.2510121   0.2567964   0.2173913
##   ER.trt1.mod ER.mkrbased.emp ER.mkrbased.mod Marker.Thresh
## 1   0.2141113       0.2041847        0.203725        21.082
# discrete marker
eval.trtsel(trtsel.Y2_disc, bootstraps = 50)
## 
## 
##   Hypothesis test:
##  ------------------
##   H0: No marker-by-treatment interaction
##                                        P value = 0
##                                        Z statistic = -8.045
## 
##   Summary Measure Estimates (with 95% confidence intervals) 
##  -----------------------------------------------------------
##   Decrease in event rate under marker-based treatment (Theta)
##     Empirical:    0.093 (0.065,0.126) 
##     Model Based:  0.093 (0.065,0.126) 
## 
##   Proportion marker negative:
##    0.496 (0.46,0.522) 
##   Proportion marker positive:
##    0.504 (0.478,0.54) 
## 
##   Average benefit of no treatment among marker-negatives (B.neg)
##     Empirical:    0.187 (0.128,0.25) 
##     Model Based:  0.187 (0.128,0.25) 
## 
##   Average benefit of treatment among marker-positives (B.pos)
##     Empirical:    0.259 (0.2,0.345) 
##     Model Based:  0.259 (0.2,0.345) 
## 
## 
##   Event Rates:
##  --------------------
##              Treat all       Treat None    Marker-based Treatment
##  Empirical:     0.217           0.251          0.124    
##             (0.192,0.267)   (0.218,0.293)   (0.099,0.158) 
##  Model Based:   0.210           0.247          0.117    
##             (0.181,0.250)   (0.216,0.290)   (0.085,0.151)

Assess model calibration

Currently, model calibration is only available for continuous markers.

calibrate.trtsel(trtsel.Y1, 
                 groups = 10, 
                 plot = "calibration", 
                 trt.names = c("chemo.", "no chemo."))

## 
##   Hosmer - Lemeshow test for model calibration
##  ----------------------------------------------
## 
##    Number of Groups: 10 
## 
##    No Treatment (trt = 0):
##     Test Statistic = 4.496,   DF = 8,   p value = 0.8098813
## 
##    Treated (trt = 1):
##     Test Statistic = 4.986,   DF = 8,   p value = 0.7591213

See ?calibrate.trtsel for more plot options.

Compare markers

To compare markers, the trt and event labels must be identical for the two markers. Plots can not be generated if comparing a discrete marker with a continuous marker.

# trtsel object for the stronger marker 2
trtsel.Y2 <- trtsel(event = "event", 
                    trt = "trt", 
                    marker = "Y2", 
                    data = tsdata, 
                    default.trt = "trt all")

# Compare the markers based on summary measures
mycompare <- compare.trtsel(trtsel1 = trtsel.Y1, 
                            trtsel2 = trtsel.Y2, 
                            marker.names = c("Weak", "Strong"), 
                            bootstraps = 50, 
                            plot = TRUE, 
                            ci = "vertical", 
                            offset = 0.01, 
                            conf.bands = TRUE)

mycompare
##                       Summary Measure Estimates 
##                     (with  95 % confidence intervals) 
## 
##                marker 1    |    marker 2    |   difference    (p-value)
##  ------------------------------------------------------------------------
## 
## Decrease in event rate under marker-based treatment (Theta)
##  Empirical:     0.013      |     0.090     |     -0.076         (< 0.02)
##             (-0.004,0.171) | (0.073,0.244) | (-0.107,-0.057) 
##  Model Based:   0.010      |     0.099      |     -0.088         (< 0.02)
##             (0.000,0.043)  | (0.074,0.107)  | (-0.106,-0.041) 
## 
## Proportion marker negative:
##                 0.461      |     0.377      |     0.084         (0.64)
##             (0.002,0.694)  | (0.291,0.461)  | (-0.324,0.243) 
## Proportion marker positive:
##                 0.539      |     0.623      |     -0.084         (0.62)
##             (0.306,0.998)  | (0.539,0.709)  | (-0.243,0.324) 
## 
## Average benefit of no treatment among marker-negatives (B.neg)
##  Empirical:     0.029      |     0.238     |     -0.209         (< 0.02)
##             (-0.050,0.077) | (0.202,0.294) | (-0.337,-0.134) 
##  Model Based:   0.023      |     0.262      |     -0.239         (< 0.02)
##             (0.000,0.051)  | (0.220,0.304)  | (-0.287,-0.190) 
## 
## Average benefit of treatment among marker-positives (B.pos)
##  Empirical:     0.089      |     0.203     |     -0.114         (< 0.02)
##             (0.019,0.181) | (0.152,0.248) | (-0.183,-0.036) 
##  Model Based:   0.098      |     0.211      |     -0.113         (< 0.02)
##             (0.041,0.171)  | (0.176,0.244)  | (-0.161,-0.057) 
## 
## 
## Variance in estimated treatment effect : 
##                 0.007      |     0.080      |     -0.073         (< 0.02)
##             (0.001,0.023)  | (0.059,0.099)  | (-0.090,-0.049) 
## 
## Total Gain: 
##                 0.066      |     0.224      |     -0.158         (< 0.02)
##             (0.018,0.121)  | (0.189,0.250)  | (-0.209,-0.092)
## Compare two discrete markers Y1_disc = as.numeric(Y1>mean(Y1))
trtsel.Y1_disc <- trtsel(event = "event",
                         trt = "trt", 
                         marker = "Y1_disc", 
                         data = tsdata, 
                         study.design = "randomized cohort", 
                         link = "logit")


compare.trtsel(trtsel1 = trtsel.Y1_disc, 
               trtsel2 = trtsel.Y2_disc, 
               ci = "vertical", 
               offset = 0.2, 
               bootstraps = 50, 
               plot = TRUE, 
               conf.bands = TRUE, 
               annotate.plot = FALSE)

##                       Summary Measure Estimates 
##                     (with  95 % confidence intervals) 
## 
##                marker 1    |    marker 2    |   difference    (p-value)
##  ------------------------------------------------------------------------
## 
## Decrease in event rate under marker-based treatment (Theta)
##  Empirical:     0.011      |     0.093     |     -0.082         (< 0.02)
##             (-0.015,0.196) | (0.064,0.313) | (-0.108,-0.064) 
##  Model Based:   0.011      |     0.093      |     -0.082         (< 0.02)
##             (-0.015,0.057)  | (0.064,0.126)  | (-0.108,-0.046) 
## 
## Proportion marker negative:
##                 0.570      |     0.496      |     0.074         (< 0.02)
##             (0.541,0.597)  | (0.465,0.525)  | (0.043,0.108) 
## Proportion marker positive:
##                 0.430      |     0.504      |     -0.074         (< 0.02)
##             (0.403,0.459)  | (0.475,0.535)  | (-0.108,-0.043) 
## 
## Average benefit of no treatment among marker-negatives (B.neg)
##  Empirical:     0.019      |     0.187     |     -0.168         (< 0.02)
##             (-0.026,0.099) | (0.129,0.253) | (-0.217,-0.099) 
##  Model Based:   0.019      |     0.187      |     -0.168         (< 0.02)
##             (-0.026,0.099)  | (0.129,0.253)  | (-0.217,-0.099) 
## 
## Average benefit of treatment among marker-positives (B.pos)
##  Empirical:     0.106      |     0.259     |     -0.153         (< 0.02)
##             (0.019,0.196) | (0.203,0.313) | (-0.218,-0.064) 
##  Model Based:   0.106      |     0.259      |     -0.153         (< 0.02)
##             (0.019,0.196)  | (0.203,0.313)  | (-0.218,-0.064) 
## 
## 
## Variance in estimated treatment effect : 
##                 0.004      |     0.050      |     -0.046         (< 0.02)
##             (0.000,0.013)  | (0.032,0.064)  | (-0.058,-0.027) 
## 
## Total Gain: 
##                 0.061      |     0.223      |     -0.162         (< 0.02)
##             (0.013,0.113)  | (0.178,0.252)  | (-0.212,-0.087)

See ?compare.trtsel for more options.

Including fitted risks (new option with version 1.1.0)

Alternative to including a marker and fitting a logistic model, the user can specify fitted risks for trt = 0 and trt = 1. In this case, no model fitting will be implemented and all bootstrap confidence intervals will be conditional on the provided fitted model.

#calculate model fit
mymod <- glm(event~trt*Y2, data= tsdata, family = binomial("logit"))

tsdata$fitted.t0 <- predict(mymod, newdata=data.frame(trt = 0, Y2 = tsdata$Y2), type = "response")
tsdata$fitted.t1 <- predict(mymod, newdata=data.frame(trt = 1, Y2 = tsdata$Y2), type = "response")


myfitted.trtsel <- trtsel( event ="event", trt = "trt",  
                         data = tsdata,
                         fittedrisk.t0 = "fitted.t0",
                         fittedrisk.t1 = "fitted.t1",
                         study.design = "randomized cohort", 
                         default.trt = "trt all")

We can now use this trtsel object just as before, but confidence intervals will be smaller because we do not account for the variation due to model fitting.

plot(myfitted.trtsel, bootstraps = 50, plot.type = "risk",
            ci = "horizontal", show.marker.axis = FALSE)

References

Janes H, Brown MD, Pepe MS, Huang Y. Statistical methods for evaluating and comparing biomarkers for patient treatment selection. International Journal of Biostatistics (under review).