A Few Notes on Labels

Ethan Heinzen

08 December, 2017

Introduction

The arsenal package relies fairly heavily on variable labels to make output more “pretty”. A label here is understood to be a single character string with “pretty” text (i.e., not an “ugly” variable name). Three of the main arsenal functions (tableby(), modelsum(), and freqlist()) use labels in their summary() output. There are several ways to set these labels.

We’ll use the mockstudy dataset for all examples here:

library(arsenal)
data(mockstudy)
library(magrittr)

# for 'freqlist' examples
tab.ex <- table(mockstudy[, c("arm", "sex", "mdquality.s")], useNA="ifany")

Examples

Set labels in the function call

The summary() methods for tableby(), modelsum(), and freqlist() objects accept a labelTranslations argument for specifying labels in the function call. Note that freqlist() matches labels by position (in the order of the table's variables), whereas the other two match labels by name. The labels can be supplied as a list or as a character vector.

summary(freqlist(tab.ex),
        labelTranslations = c("Treatment Arm", "Gender", "LASA QOL"))

Treatment Arm  Gender  LASA QOL  Freq  cumFreq  freqPercent  cumPercent
A: IFL         Male    0           29       29         1.93        1.93
                       1          214      243        14.28       16.21
                       NA          34      277         2.27       18.48
               Female  0           12      289         0.80       19.28
                       1          118      407         7.87       27.15
                       NA          21      428         1.40       28.55
F: FOLFOX      Male    0           31      459         2.07       30.62
                       1          285      744        19.01       49.63
                       NA          95      839         6.34       55.97
               Female  0           21      860         1.40       57.37
                       1          198     1058        13.21       70.58
                       NA          61     1119         4.07       74.65
G: IROX        Male    0           17     1136         1.13       75.78
                       1          187     1323        12.47       88.26
                       NA          24     1347         1.60       89.86
               Female  0           14     1361         0.93       90.79
                       1          121     1482         8.07       98.87
                       NA          17     1499         1.13      100.00

summary(tableby(arm ~ sex + age, data = mockstudy),
        labelTranslations = c(sex = "SEX", age = "Age, yrs"))

               A: IFL (N=428)  F: FOLFOX (N=691)  G: IROX (N=380)  Total (N=1499)  p value
SEX                                                                                0.190
    Male       277 (64.7%)     411 (59.5%)        228 (60%)        916 (61.1%)
    Female     151 (35.3%)     280 (40.5%)        152 (40%)        583 (38.9%)
Age, yrs                                                                           0.614
    Mean (SD)  59.7 (11.4)     60.3 (11.6)        59.8 (11.5)      60 (11.5)
    Q1, Q3     53, 68          52, 69             52, 68           52, 68
    Range      27 - 88         19 - 88            26 - 85          19 - 88

summary(modelsum(bmi ~ age, adjust = ~sex, data = mockstudy),
        labelTranslations = list(sexFemale = "Female", age = "Age, yrs"))

             estimate  std.error  p.value  adj.r.squared
(Intercept)  26.8      0.766      <0.001   0.004
Age, yrs     0.012     0.012      0.348    .
sex Female   -0.718    0.291      0.014    .

Modify labels after the fact

Another option is to add labels after you have created the object. To do this, you can use the form labels(x) <- value or use the pipe-able version, set_labels().

# the non-pipe version; somewhat clunky
tmp <- freqlist(tab.ex)
labels(tmp) <- c("Treatment Arm", "Gender", "LASA QOL")
summary(tmp)

Treatment Arm  Gender  LASA QOL  Freq  cumFreq  freqPercent  cumPercent
A: IFL         Male    0           29       29         1.93        1.93
                       1          214      243        14.28       16.21
                       NA          34      277         2.27       18.48
               Female  0           12      289         0.80       19.28
                       1          118      407         7.87       27.15
                       NA          21      428         1.40       28.55
F: FOLFOX      Male    0           31      459         2.07       30.62
                       1          285      744        19.01       49.63
                       NA          95      839         6.34       55.97
               Female  0           21      860         1.40       57.37
                       1          198     1058        13.21       70.58
                       NA          61     1119         4.07       74.65
G: IROX        Male    0           17     1136         1.13       75.78
                       1          187     1323        12.47       88.26
                       NA          24     1347         1.60       89.86
               Female  0           14     1361         0.93       90.79
                       1          121     1482         8.07       98.87
                       NA          17     1499         1.13      100.00

# piped--much cleaner
mockstudy %>% 
  tableby(arm ~ sex + age, data = .) %>% 
  set_labels(c(sex = "SEX", age = "Age, yrs")) %>% 
  summary()

               A: IFL (N=428)  F: FOLFOX (N=691)  G: IROX (N=380)  Total (N=1499)  p value
SEX                                                                                0.190
    Male       277 (64.7%)     411 (59.5%)        228 (60%)        916 (61.1%)
    Female     151 (35.3%)     280 (40.5%)        152 (40%)        583 (38.9%)
Age, yrs                                                                           0.614
    Mean (SD)  59.7 (11.4)     60.3 (11.6)        59.8 (11.5)      60 (11.5)
    Q1, Q3     53, 68          52, 69             52, 68           52, 68
    Range      27 - 88         19 - 88            26 - 85          19 - 88

mockstudy %>% 
  modelsum(bmi ~ age, adjust = ~ sex, data = .) %>% 
  set_labels(list(sexFemale = "Female", age = "Age, yrs")) %>% 
  summary()

             estimate  std.error  p.value  adj.r.squared
(Intercept)  26.8      0.766      <0.001   0.004
Age, yrs     0.012     0.012      0.348    .
Female       -0.718    0.291      0.014    .

Add labels to a data.frame

tableby() and modelsum() also pick up "label" attributes stored on the columns of the data. Note that by default R drops these attributes when a column is subset, hence the call to keep.labels().

mockstudy.lab <- keep.labels(mockstudy)
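
To see why this matters, here is a minimal base-R sketch of a "label" attribute disappearing after row subsetting. It uses no arsenal functions; toy and toy.sub are made-up names for illustration and are not part of mockstudy.

```r
# Base R only; 'toy' is a hypothetical data.frame used for illustration
toy <- data.frame(age = c(34, 51, 47), bmi = c(22.1, 27.4, 25.0))
attr(toy$age, "label") <- "Age, yrs"

# The label is present on the full column
attr(toy$age, "label")

# Row subsetting rebuilds each column with `[`, which drops custom attributes
toy.sub <- toy[toy$bmi > 23, ]
attr(toy.sub$age, "label")   # NULL: the label did not survive
```

Selecting whole columns without touching rows (for example, toy["age"]) leaves the column vectors intact, so the loss shows up mainly when rows are filtered.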

You can set attributes one at a time:

attr(mockstudy.lab$sex, "label") <- "Sex"
attr(mockstudy.lab$age, "label") <- "Age, yrs"

…or all at once:

labels(mockstudy.lab) <- list(sex = "Sex", age = "Age, yrs")
summary(tableby(arm ~ sex + age, data = mockstudy.lab))

               A: IFL (N=428)  F: FOLFOX (N=691)  G: IROX (N=380)  Total (N=1499)  p value
Sex                                                                                0.190
    Male       277 (64.7%)     411 (59.5%)        228 (60%)        916 (61.1%)
    Female     151 (35.3%)     280 (40.5%)        152 (40%)        583 (38.9%)
Age, yrs                                                                           0.614
    Mean (SD)  59.7 (11.4)     60.3 (11.6)        59.8 (11.5)      60 (11.5)
    Q1, Q3     53, 68          52, 69             52, 68           52, 68
    Range      27 - 88         19 - 88            26 - 85          19 - 88
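
For intuition, setting labels "all at once" has roughly the same effect as looping over a named list and attaching each string as a "label" attribute. The base-R sketch below does this on a made-up data.frame; toy and new.labels are illustrative names, and this is not a claim about how arsenal implements the assignment internally.

```r
# Base R only; 'toy' and 'new.labels' are hypothetical names for illustration
toy <- data.frame(sex = factor(c("Male", "Female", "Male")),
                  age = c(34, 51, 47),
                  bmi = c(22.1, 27.4, 25.0))
new.labels <- list(sex = "Sex", age = "Age, yrs")

# Attach each label to the column of the same name
for (nm in names(new.labels)) {
  attr(toy[[nm]], "label") <- new.labels[[nm]]
}

attr(toy$sex, "label")
attr(toy$bmi, "label")   # NULL: columns not named in the list are left alone
```

Matching by name means the order of the list does not matter, and columns that are not mentioned keep whatever label (or lack of one) they already had.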

You can pipe this, too.

mockstudy %>% 
  set_labels(list(sex = "SEX", age = "Age, yrs")) %>% 
  modelsum(bmi ~ age, adjust = ~ sex, data = .) %>% 
  summary()

             estimate  std.error  p.value  adj.r.squared
(Intercept)  26.8      0.766      <0.001   0.004
Age, yrs     0.012     0.012      0.348    .
SEX Female   -0.718    0.291      0.014    .

To extract labels from a data.frame, simply use the labels() function:

labels(mockstudy.lab)
## $case
## NULL
## 
## $age
## [1] "Age, yrs"
## 
## $arm
## [1] "Treatment Arm"
## 
## $sex
## [1] "Sex"
## 
## $race
## [1] "Race"
## 
## $fu.time
## NULL
## 
## $fu.stat
## NULL
## 
## $ps
## NULL
## 
## $hgb
## NULL
## 
## $bmi
## [1] "Body Mass Index (kg/m^2)"
## 
## $alk.phos
## NULL
## 
## $ast
## NULL
## 
## $mdquality.s
## NULL
## 
## $age.ord
## NULL
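
The output above has one entry per column, with NULL for unlabeled columns. A rough base-R approximation, shown here on a made-up data.frame (toy, all.labs, and labelled are illustrative names), collects the "label" attribute of every column and then keeps only the ones that are set.

```r
# Base R only; 'toy' is a hypothetical data.frame used for illustration
toy <- data.frame(age = c(34, 51, 47), bmi = c(22.1, 27.4, 25.0))
attr(toy$age, "label") <- "Age, yrs"

# One entry per column; unlabeled columns yield NULL
all.labs <- lapply(toy, function(col) attr(col, "label"))

# unlist() discards the NULL entries, leaving a named character vector
labelled <- unlist(all.labs)
labelled
```

The named character vector is a convenient way to check at a glance which variables still need a label before building a table.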