Runner examples

The most fundamental function in the runner package is runner. With runner::runner, one can apply any R function on running windows. This tutorial presents a set of examples showing how to tackle common tasks. Some of the examples refer to the original question on Stack Overflow.

Number of unique elements in a 7-day window

library(runner)
x <- sample(letters, 20, replace = TRUE)
date <- as.Date(cumsum(sample(1:5, 20, replace = TRUE)), origin = Sys.Date()) # unequally spaced time series

runner(x, k = 7, idx = date, f = function(x) length(unique(x)))
##  [1] 1 2 3 4 2 1 2 2 3 3 3 3 2 3 3 4 3 2 2 3
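To make the date-based window concrete, here is a minimal base-R sketch of the same idea. It assumes the window covers the k days ending at each observation's date, which is consistent with the outputs shown in this tutorial. The helper window_apply and the toy data are hypothetical and are not part of the runner package or its implementation.

```r
# Hypothetical helper (not from the runner package): apply f to the
# observations dated within the k days ending at each idx[i].
window_apply <- function(x, idx, k, f) {
  sapply(seq_along(x), function(i) {
    # assumed window: strictly after idx[i] - k, up to and including idx[i]
    in_window <- idx > idx[i] - k & idx <= idx[i]
    f(x[in_window])
  })
}

# Made-up, unequally spaced series: Jan 1, 3, 6, 10 and 11
x <- c("a", "b", "a", "c", "b")
date <- as.Date("2024-01-01") + c(0, 2, 5, 9, 10)

n_unique <- window_apply(x, date, k = 7, f = function(v) length(unique(v)))
n_unique
## [1] 1 2 2 2 3
```

For the fourth element (Jan 10) the observation from Jan 3 has already dropped out of the window, so only "a" and "c" are counted.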

Weekly trimmed mean

x <- cumsum(rnorm(20))
date <- as.Date(cumsum(sample(1:5, 20, replace = TRUE)), origin = Sys.Date()) # unequally spaced time series

runner(x, k = 7, idx = date, f = function(x) mean(x, trim = 0.05))
##  [1] -0.57523208 -0.33100734 -0.50892869 -0.87644137 -0.83139519 -0.72358078
##  [7] -0.59744785 -0.20342014  0.15743290 -0.04685322 -1.69637922 -2.93054915
## [13] -3.12727833 -3.08163537 -3.05675158 -2.40861889  0.49986025  0.82282423
## [19]  1.08703920  0.91782981
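A note on the trim argument may help when reading the numbers above. Base R's mean drops floor(n * trim) observations from each tail, so with trim = 0.05 nothing is dropped until a window holds at least 20 values. The sketch below uses made-up numbers to show the difference.

```r
# Made-up values with one obvious outlier
v <- c(1, 2, 3, 4, 100)

plain   <- mean(v)               # 110 / 5 = 22
trim_05 <- mean(v, trim = 0.05)  # floor(5 * 0.05) = 0 values dropped per tail
trim_25 <- mean(v, trim = 0.25)  # floor(5 * 0.25) = 1 value dropped per tail

c(plain = plain, trim_05 = trim_05, trim_25 = trim_25)
##   plain trim_05 trim_25 
##      22      22       3
```

With short weekly windows, a larger trim value is needed before the trimmed mean differs from the ordinary mean.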

Two-week regression

x <- cumsum(rnorm(20))
y <- 3 * x + rnorm(20)
date <- as.Date(cumsum(sample(1:3, 20, replace = TRUE)), origin = Sys.Date()) # unequally spaced time series
data <- data.frame(date, y, x)

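running_regression <- function(idx) {
  # fit the model only on the rows that fall inside the current window
  fit <- lm(y ~ x, data = data[idx, ])
  # return the fitted value for the most recent observation in the window
  unname(tail(fitted(fit), 1))
}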

data$pred <- runner(seq_along(x), k = 14, idx = date, f = running_regression)

plot(data$date, data$y, type = "l", col = "red")
lines(data$date, data$pred, col = "blue")
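To show what a single window contributes to a running regression, the sketch below fits the model on one window of rows and reads off the slope and the last fitted value. The data frame toy and the index vector window_rows are made up for illustration and do not come from the example above.

```r
# Made-up data with an exact linear relation y = 2 * x + 1
toy <- data.frame(x = 1:6, y = 2 * (1:6) + 1)

# Hypothetical window: rows 2 to 5 only
window_rows <- 2:5
fit <- lm(y ~ x, data = toy[window_rows, ])

slope <- coef(fit)[["x"]]                    # slope estimated from 4 rows
last_fitted <- unname(tail(fitted(fit), 1))  # fitted value for row 5

c(slope = slope, last_fitted = last_fitted)
##       slope last_fitted 
##           2          11
```

Passing seq_along(x) to runner, as in the example above, is what delivers such a vector of row indices to the function for every window.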

Rolling sums for groups with uneven time gaps

This example is based on a Stack Overflow discussion.

library(dplyr)
set.seed(3737)
df <- data.frame(
  user_id = c(rep(27, 7), rep(11, 7)),
  date = as.Date(rep(c('2016-01-01', '2016-01-03', '2016-01-05', '2016-01-07', '2016-01-10', '2016-01-14', '2016-01-16'), 2)),
  value = round(rnorm(14, 15, 5), 1))

df %>%
  group_by(user_id) %>%
  mutate(
    v_minus7  = sum_run(value, 7, idx = date),
    v_minus14 = sum_run(value, 14, idx = date))
## # A tibble: 14 x 5
## # Groups:   user_id [2]
##    user_id date       value v_minus7 v_minus14
##      <dbl> <date>     <dbl>    <dbl>     <dbl>
##  1      27 2016-01-01  15       15        15  
##  2      27 2016-01-03  22.4     37.4      37.4
##  3      27 2016-01-05  13.3     50.7      50.7
##  4      27 2016-01-07  21.9     72.6      72.6
##  5      27 2016-01-10  20.6     55.8      93.2
##  6      27 2016-01-14  18.6     39.2     112. 
##  7      27 2016-01-16  16.4     55.6     113. 
##  8      11 2016-01-01   6.8      6.8       6.8
##  9      11 2016-01-03  21.3     28.1      28.1
## 10      11 2016-01-05  19.8     47.9      47.9
## 11      11 2016-01-07  22       69.9      69.9
## 12      11 2016-01-10  19.4     61.2      89.3
## 13      11 2016-01-14  17.5     36.9     107. 
## 14      11 2016-01-16  19.3     56.2     119.
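The v_minus7 column for user 27 can be checked by hand with base R. The sketch below copies the dates and values from the table above and assumes a window covering the 7 days ending at each date. The helper window_sum is hypothetical and only mimics the result; it is not how sum_run is implemented.

```r
# Hypothetical helper (not from the runner package): sum of the values
# dated within the k days ending at each idx[i].
window_sum <- function(value, idx, k) {
  sapply(seq_along(value), function(i) {
    sum(value[idx > idx[i] - k & idx <= idx[i]])
  })
}

# Dates and values for user 27, copied from the output above
date <- as.Date(c("2016-01-01", "2016-01-03", "2016-01-05", "2016-01-07",
                  "2016-01-10", "2016-01-14", "2016-01-16"))
value <- c(15, 22.4, 13.3, 21.9, 20.6, 18.6, 16.4)

v7 <- window_sum(value, date, k = 7)
v7
## [1] 15.0 37.4 50.7 72.6 55.8 39.2 55.6
```

On 2016-01-10 the observations from 2016-01-01 and 2016-01-03 are more than 6 days old, so the sum falls from 72.6 to 55.8.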