sum_up prints detailed summary statistics (it corresponds to Stata's summarize command).
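library(dplyr)
library(statar)

N <- 100
df <- tibble(
  id = 1:N,
  v1 = sample(5, N, TRUE),
  v2 = sample(1e6, N, TRUE)
)
# summarize every column
sum_up(df)
# detailed statistics for the columns whose names start with "v"
df %>% sum_up(starts_with("v"), d = TRUE)
# summarize within each group of v1
df %>% group_by(v1) %>% sum_up()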
tab prints distinct rows with their count. Compared to the dplyr function count, it adds a frequency column and a cumulative frequency column.
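N <- 1e2
df <- tibble(
  id = sample(5, N, TRUE),
  v1 = sample(5, N, TRUE)
)
# tabulate each distinct (id, v1) combination
tab(df, id, v1)
# same table, omitting missing values
tab(df, id, v1, na.rm = TRUE)
# tabulate v1 within each group of id
df %>% group_by(id) %>% tab(v1)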
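To make the comparison with count concrete, here is a minimal sketch that builds a similar table with plain dplyr. The column names Percent and Cum are illustrative assumptions and need not match the labels that tab prints.

```r
library(dplyr)

set.seed(1)  # reproducible example data
N <- 100
df <- tibble(
  id = sample(5, N, TRUE),
  v1 = sample(5, N, TRUE)
)

# count() returns one row per distinct (id, v1) pair with its count n
freq <- df %>%
  count(id, v1) %>%
  # add the share of each row (in percent) and its running total;
  # Percent and Cum are hypothetical names chosen for this sketch
  mutate(
    Percent = 100 * n / sum(n),
    Cum = cumsum(Percent)
  )

print(freq)
```

The counts in n add up to the number of rows in df, and the last value of Cum is 100.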
join is a wrapper for the dplyr join functions, with three added options:
The option check checks that there are no duplicates in the master or using datasets (as in Stata).
# Stata equivalent: merge m:1 v1
join(x, y, kind = "full", check = m~1)
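As a rough illustration of what an m~1 check guards against, the sketch below tests key uniqueness by hand in base R. The data frames x and y and the key v1 are hypothetical, and this is not a description of how join implements the check internally.

```r
# Hypothetical master (x) and using (y) datasets that share the key v1
x <- data.frame(v1 = c(1, 1, 2, 3), a = c(10, 11, 12, 13))
y <- data.frame(v1 = c(1, 2, 4), b = c("p", "q", "r"))

key <- "v1"

# anyDuplicated() returns 0 when no row of the key columns is repeated
master_is_unique <- anyDuplicated(x[, key, drop = FALSE]) == 0
using_is_unique <- anyDuplicated(y[, key, drop = FALSE]) == 0

# In an m:1 merge the key may repeat in the master dataset
# but must identify at most one row in the using dataset
if (!using_is_unique) {
  stop("the key does not uniquely identify rows in the using dataset")
}
```

Here the key repeats in x but not in y, so an m:1 merge is admissible while a 1:1 merge would not be.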
The option gen specifies the name of a new variable that identifies matched and non-matched rows (as in Stata).
# Stata equivalent: merge m:1 v1, gen(_merge)
join(x, y, kind = "full", gen = "_merge")
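The idea behind such an indicator variable can be sketched with plain dplyr. The data, the helper columns in_x and in_y, and the labels in merge_status are all assumptions made for illustration. The values that join actually stores may differ (Stata's _merge, for instance, uses the codes 1, 2 and 3 for master only, using only and matched).

```r
library(dplyr)

# Hypothetical master (x) and using (y) datasets
x <- tibble(v1 = c(1, 2, 3), a = c("a1", "a2", "a3"))
y <- tibble(v1 = c(2, 3, 4), b = c("b2", "b3", "b4"))

merged <- full_join(
  mutate(x, in_x = TRUE),  # flag rows that come from the master
  mutate(y, in_y = TRUE),  # flag rows that come from the using dataset
  by = "v1"
) %>%
  mutate(
    # a flag is NA when the row had no counterpart on that side
    merge_status = case_when(
      !is.na(in_x) & !is.na(in_y) ~ "both",
      !is.na(in_x) ~ "master only",
      TRUE ~ "using only"
    )
  ) %>%
  select(-in_x, -in_y)

print(merged)
```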
The option update replaces missing values in the master dataset with the corresponding values from the using dataset.
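A minimal sketch of this behaviour, using only dplyr, might look as follows. The data and the suffixes .master and .using are hypothetical, and the sketch shows the intended result rather than how join produces it.

```r
library(dplyr)

# Hypothetical datasets: both carry a column named value
x <- tibble(v1 = c(1, 2, 3), value = c(10, NA, 30))  # master
y <- tibble(v1 = c(1, 2, 3), value = c(99, 20, NA))  # using

updated <- left_join(x, y, by = "v1", suffix = c(".master", ".using")) %>%
  # keep the master value and fall back to the using value only where it is missing
  mutate(value = coalesce(value.master, value.using)) %>%
  select(v1, value)

print(updated)
```

Only the missing value in the second row is filled in. The non-missing master values 10 and 30 are left untouched even though the using dataset holds different values for those keys.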