The underlying assumptions of traditional autoregressive models are well known. The resulting complexity of these models leads to observations such as, "We have found that choosing the wrong model or parameters can often yield poor results, and it is unlikely that even experienced analysts can choose the correct model and parameters efficiently given this array of choices." Source
NNS simplifies the forecasting process. Below are some examples demonstrating NNS.ARMA and its assumption-free, minimal-parameter forecasting method.
NNS.ARMA has the ability to fit a linear regression to the relevant component series, yielding very fast results. For our running example we will use the AirPassengers dataset loaded in base R.
We will forecast 44 periods (h = 44) of AirPassengers using the first 100 observations (training.set = 100), returning estimates of the final 44 observations. We will then test this against our validation set of tail(AirPassengers, 44).
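The split described above can be reproduced in base R without NNS. This is only a small sketch; the names train and validation are illustrative and are not part of the package.

```r
# AirPassengers ships with base R (datasets package): 144 monthly observations.
train <- head(AirPassengers, 100)       # first 100 observations, used for fitting
validation <- tail(AirPassengers, 44)   # final 44 observations, held out

# The two pieces together cover the whole series.
length(train) + length(validation) == length(AirPassengers)
```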
Since this is monthly data, we will try seasonal.factor = 12.
Below is the linear fit and associated root mean squared error (RMSE) using method = "lin".
nns = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "lin", plot = TRUE, seasonal.factor = 12, seasonal.plot = FALSE)
sqrt(mean((nns - tail(AirPassengers, 44)) ^ 2))
## [1] 35.39965
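The RMSE expression above is reused throughout this walkthrough. As a convenience it could be wrapped in a small helper. Note that rmse is a hypothetical name introduced here for illustration, not an NNS function.

```r
# Hypothetical helper (not part of NNS): root mean squared error between
# a vector of forecasts and the corresponding observed values.
rmse <- function(estimate, actual) {
  stopifnot(length(estimate) == length(actual))
  sqrt(mean((as.numeric(estimate) - as.numeric(actual)) ^ 2))
}

# Toy check: errors of 3 and 4 give sqrt((9 + 16) / 2)
rmse(c(3, 4), c(0, 0))
```

With a forecast stored in nns as above, rmse(nns, tail(AirPassengers, 44)) would compute the same quantity as the inline expression.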
Now we can try using a nonlinear regression on the relevant component series using method = "nonlin".
nns = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "nonlin", plot = TRUE, seasonal.factor = 12, seasonal.plot = FALSE)
sqrt(mean((nns - tail(AirPassengers, 44)) ^ 2))
## [1] 24.34268
We can test a series of seasonal.factor values and select the best one to fit. The largest period to consider would be 0.25 * length(variable), in our case 25. Remember, we are testing the first 100 observations of AirPassengers, not the full 144 observations.
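The upper bound on the candidate periods follows directly from the training length. The arithmetic can be sketched as follows; the variable names are illustrative, and the use of floor() for lengths that are not a multiple of four is an assumption rather than documented NNS behavior.

```r
n_train <- 100                              # observations used for fitting
max_period <- floor(0.25 * n_train)         # largest seasonal period to test
candidate_periods <- seq_len(max_period)    # 1, 2, ..., max_period

max_period
```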
# RMSE of the linear forecast for each candidate seasonal period 1..25
seas = t(sapply(1 : 25, function(i) {
  est = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "lin", seasonal.factor = i, plot = FALSE)
  c(i, sqrt(mean((est - tail(AirPassengers, 44)) ^ 2)))
}))
colnames(seas) = c("Period", "RMSE")
seas
##       Period      RMSE
##  [1,]      1  75.67783
##  [2,]      2  75.71250
##  [3,]      3  75.87604
##  [4,]      4  75.16563
##  [5,]      5  76.07418
##  [6,]      6  70.43185
##  [7,]      7  77.98493
##  [8,]      8  75.48997
##  [9,]      9  79.16378
## [10,]     10  81.47260
## [11,]     11 106.56886
## [12,]     12  35.39965
## [13,]     13  90.98265
## [14,]     14  95.64979
## [15,]     15  82.05345
## [16,]     16  74.63052
## [17,]     17  87.54036
## [18,]     18  74.90881
## [19,]     19  96.96011
## [20,]     20  88.75015
## [21,]     21 100.21346
## [22,]     22 108.68674
## [23,]     23  85.06430
## [24,]     24  35.49018
## [25,]     25  75.16192
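Now that we know seasonal.factor = 12 is our best fit, we can see if there’s any benefit from using a nonlinear regression. Alternatively, we can define our best fit as the entry in the "Period" column of seas that corresponds to the minimum value in its "RMSE" column. Because seas is a matrix, its columns are selected by position or name (for example seas[ , "RMSE"]) rather than with $.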
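The selection rule can be tried in isolation on a few rows copied from the RMSE table above. This is a minimal sketch; seas_subset, best_row and best_period are illustrative names.

```r
# A few rows copied from the RMSE table above (periods 6, 12 and 24)
seas_subset <- cbind(Period = c(6, 12, 24),
                     RMSE = c(70.43185, 35.39965, 35.49018))

# seas_subset is a matrix, so columns are selected with [ , "name"], not $
best_row <- which.min(seas_subset[ , "RMSE"])
best_period <- unname(seas_subset[best_row, "Period"])
best_period
```

Here best_period is 12, the period with the smallest RMSE among the three rows.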
You may experience instances with monthly data that report a seasonal.factor close to multiples of 3, 4, 6 or 12. For instance, if the reported seasonal.factor = {37, 47, 71, 73}, use seasonal.factor = c(36, 48, 72). The same suggestion holds for daily data and multiples of 7, or any other time series with logically inferred cyclical patterns.
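One way to apply this suggestion mechanically is to snap each reported period to the nearest multiple of the natural cycle. The helper below is a hypothetical sketch, not an NNS function, and the choice of base cycle is left to the analyst.

```r
# Hypothetical helper (not part of NNS): snap each reported period to the
# nearest multiple of a base cycle and drop duplicates.
snap_periods <- function(periods, base = 12) {
  snapped <- round(periods / base) * base
  sort(unique(snapped[snapped > 0]))
}

snap_periods(c(37, 47, 71, 73))            # monthly data, base cycle of 12
snap_periods(c(6, 15, 20), base = 7)       # daily data, base cycle of 7
```

For the monthly example this yields 36, 48 and 72, matching the vector suggested above.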
a = seas[which.min(seas[ , 2]), 1]
Below you will notice that using seasonal.factor = a generates the same output.
nns = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "nonlin", seasonal.factor = a, plot = TRUE, seasonal.plot = FALSE)
sqrt(mean((nns - tail(AirPassengers, 44)) ^ 2))
## [1] 24.34268
There is a benefit to using a nonlinear regression, as our RMSE has been lowered. We can also test whether combining the linear and nonlinear estimates (method = "both") results in a lower RMSE.
nns = NNS.ARMA(AirPassengers, h = 44, training.set = 100, method = "both", seasonal.factor = a, plot = TRUE, seasonal.plot = FALSE)
sqrt(mean((nns - tail(AirPassengers, 44)) ^ 2))
## [1] 26.31436
Using method = "both" did not lower our RMSE. There are far fewer parameters to test using NNS than with traditional methods, and the relative simplicity of the method contributes to its robustness.
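The three method comparisons above can also be collected in a single loop. The sketch below reuses only the NNS.ARMA arguments already shown in this section and skips the comparison when NNS is not installed. The name holdout_rmse is hypothetical, and results may differ across NNS versions.

```r
# Hypothetical helper: RMSE on the 44 held-out observations for one method.
holdout_rmse <- function(method, period = 12) {
  est <- NNS::NNS.ARMA(AirPassengers, h = 44, training.set = 100,
                       method = method, seasonal.factor = period,
                       plot = FALSE)
  sqrt(mean((est - tail(AirPassengers, 44)) ^ 2))
}

methods <- c("lin", "nonlin", "both")

# Only run the comparison when the NNS package is available.
if (requireNamespace("NNS", quietly = TRUE)) {
  results <- sapply(methods, holdout_rmse)   # named by method
  print(results)
  print(names(which.min(results)))           # method with the lowest RMSE
}
```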
NNS also offers a wrapper function, NNS.ARMA.optim(), that tests a given vector of seasonal.factor values and returns the lowest SSE and the optimal periods, as well as the NNS.ARMA regression method used.
Given our monthly dataset, we will try multiple years by setting seasonal.factor = seq(12, 48, 12).
nns.optimal = NNS.ARMA.optim(AirPassengers, training.set = 100, seasonal.factor = seq(12, 48, 12), method = "comb")
nns.optimal
## $periods
## [1] 12
##
## $SSE
## [1] 26072.9
##
## $method
## [1] "nonlin"
Using our new parameters yields the same results:
nns = NNS.ARMA(AirPassengers, training.set = 100, h = 44, seasonal.factor = nns.optimal$periods, method = nns.optimal$method, plot = TRUE, seasonal.plot = FALSE)
sqrt(mean((nns - tail(AirPassengers, 44)) ^ 2))
## [1] 24.34268
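The reported SSE and the RMSE above describe the same fit. Assuming the SSE is the sum of squared errors over the 44 held-out observations (an assumption the arithmetic is consistent with, though it is not stated in the output), the two are related by RMSE = sqrt(SSE / h):

```r
sse <- 26072.9     # $SSE reported by NNS.ARMA.optim above
h <- 44            # number of held-out observations
sqrt(sse / h)      # approximately 24.34, in line with the RMSE above
```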
Use of a seasonal.factor that is too large will result in the following error. Please reduce the maximum seasonal.factor.
NNS.ARMA.optim(AirPassengers, training.set = 132, seasonal.factor = seq(12, 72, 12))
Error in NNS.ARMA.optim(AirPassengers, training.set = 132, seasonal.factor = seq(12, :
Please set maximum [seasonal.factor] to less than 72
Using our cross-validated parameters (seasonal.factor and method), we can forecast another 50 periods out-of-range (h = 50) by dropping the training.set parameter.
NNS.ARMA(AirPassengers, h = 50, seasonal.factor = nns.optimal$periods, method = nns.optimal$method, plot = TRUE, seasonal.plot = FALSE)
seasonal.factor = c(1, 2, ...)
We included the ability to use any number of specified seasonal periods simultaneously, weighted by their strength of seasonality. This is computationally expensive when used with nonlinear regressions and large numbers of relevant periods.
seasonal.factor = FALSE
We also included the ability to use all detected seasonal periods simultaneously, weighted by their strength of seasonality. This is computationally expensive when used with nonlinear regressions and large numbers of relevant periods.
best.periods
This parameter restricts the number of detected seasonal periods to use, again weighted by their strength. It is to be used in conjunction with seasonal.factor = FALSE.
dynamic = TRUE
This setting generates new seasonal period(s) using the estimated values as continuations of the variable, either with or without a training.set. It is also computationally expensive due to the recalculation of seasonal periods for each estimated value.
plot, seasonal.plot and intervals
These are the plotting arguments, easily enabled or disabled with TRUE or FALSE. seasonal.plot = TRUE will not plot without plot = TRUE. If a seasonal analysis is all that is desired, NNS.seas is the function specifically suited for that task. intervals will plot the surrounding estimated values if and only if intervals = TRUE and seasonal.factor = FALSE.
If the user is so motivated, detailed arguments and proofs are provided within the following: