es() is part of the smooth package. It constructs Exponential Smoothing models (also known as ETS), selects the most appropriate one among 30 possible models, supports exogenous variables, and offers many other features.
In this vignette we will use data from the Mcomp package, so it is advised to install it.
Let’s load the necessary packages:
require(smooth)
require(Mcomp)
You may note that Mcomp depends on the forecast package, and if you load both forecast and smooth, you will get a message that the forecast() function is masked from the environment. There is nothing to worry about: smooth uses this function for consistency purposes and has exactly the same original forecast() as in the forecast package. The function was included in smooth only to avoid adding forecast to the dependencies of the package.
The simplest call of this function is:
es(M3$N2457$x, h=18, holdout=TRUE)
## Forming the pool of models based on... ANN, ANA, AAN, Estimation progress: 100%... Done!
## Time elapsed: 0.37 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.145
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.413
## Cost function type: MSE; Cost function value: 1288657
##
## Information criteria:
## AIC AICc BIC
## 1645.978 1646.236 1653.702
## Forecast errors:
## MPE: 26.3%; Bias: 87%; MAPE: 39.8%; SMAPE: 49.4%
## MASE: 2.944; sMAE: 120.1%; RelMAE: 1.258; sMSE: 242.7%
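In this case the function uses a branch and bound algorithm to form a pool of models to check, and then constructs the model with the lowest information criterion. As we can see, it also produces an output with brief information about the model, which contains: the time elapsed, the type of ETS selected, the persistence vector (smoothing parameters), the type of initialisation, the number of estimated parameters, the residuals standard deviation, the cost function type and value, the information criteria, and the forecast errors (because we have set holdout=TRUE).
The function has also produced a graph with actuals, fitted values and point forecasts.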
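To make the information criteria in the output less of a black box, here is a minimal base R sketch that uses the textbook formulas for AIC, AICc and BIC. The helper name infoCriteria is made up for this illustration, and n = 97 is an assumption (the number of in-sample observations that makes the reported AICc and BIC consistent with the reported AIC). It is not the code that es() runs internally.

```r
# Information criteria from a log-likelihood, using the textbook formulas.
# k = number of estimated parameters, n = number of in-sample observations.
infoCriteria <- function(logLik, k, n) {
  aic  <- 2 * k - 2 * logLik
  aicc <- aic + 2 * k * (k + 1) / (n - k - 1)
  bic  <- k * log(n) - 2 * logLik
  c(AIC = aic, AICc = aicc, BIC = bic)
}

# From the output above: AIC = 1645.978 with 3 estimated parameters.
# ASSUMPTION: n = 97 in-sample observations (not stated in the output).
k <- 3
n <- 97
logLik <- (2 * k - 1645.978) / 2   # log-likelihood implied by the reported AIC

ic <- infoCriteria(logLik, k, n)
print(round(ic, 3))

# Picking the model with the lowest criterion, using the AICc values
# reported in this vignette for ETS(MNN) and ETS(ANN).
candidates <- c(MNN = 1646.236, ANN = 1689.245)
best <- names(which.min(candidates))
print(best)
```

Under these assumptions the sketch reproduces the AICc and BIC values reported above up to rounding, and the comparison picks MNN, which agrees with the model that es() selected.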
If we need prediction intervals, then we run:
es(M3$N2457$x, h=18, holdout=TRUE, intervals=TRUE)
## Forming the pool of models based on... ANN, ANA, AAN, Estimation progress: 100%... Done!
## Time elapsed: 0.37 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.145
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.413
## Cost function type: MSE; Cost function value: 1288657
##
## Information criteria:
## AIC AICc BIC
## 1645.978 1646.236 1653.702
## 95% parametric prediction intervals were constructed
## 72% of values are in the prediction interval
## Forecast errors:
## MPE: 26.3%; Bias: 87%; MAPE: 39.8%; SMAPE: 49.4%
## MASE: 2.944; sMAE: 120.1%; RelMAE: 1.258; sMSE: 242.7%
Due to the multiplicative nature of the error term in the model, the intervals are asymmetric. This is the expected behaviour. The other thing to note is that the output now also provides the theoretical width of the prediction intervals and their actual coverage.
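The following base R sketch illustrates both points: why a multiplicative error leads to asymmetric bounds, and how the actual coverage can be counted. It assumes a log-normal multiplicative error, which is a simplification chosen for illustration and not necessarily what es() does internally. The function names and all numbers are hypothetical.

```r
# Sketch of why multiplicative errors give asymmetric intervals.
# ASSUMPTION: the multiplicative error is treated as log-normal, so the
# interval is symmetric on the log scale and skewed on the original scale.
multiplicativeInterval <- function(point, sigma, level = 0.95) {
  z <- qnorm(c((1 - level) / 2, (1 + level) / 2))
  bounds <- point * exp(z * sigma)
  c(lower = bounds[1], upper = bounds[2])
}

# Share of holdout values that fall inside the interval.
coverage <- function(actual, lower, upper) {
  mean(actual >= lower & actual <= upper)
}

# Hypothetical numbers, not taken from the M3 series.
point  <- 2000
bounds <- multiplicativeInterval(point, sigma = 0.4)
print(round(bounds, 1))

# Distance from the point forecast to each bound: the upper one is larger.
print(c(below = point - bounds[["lower"]], above = bounds[["upper"]] - point))

actual  <- c(1500, 2600, 900, 5200, 2100)
covered <- coverage(actual, bounds[["lower"]], bounds[["upper"]])
print(covered)
```

With an 18-step holdout, a reported coverage of 72% corresponds to 13 of the 18 holdout values lying inside the interval.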
If we save the model (and let’s say we want it to work silently):
ourModel <- es(M3$N2457$x, h=18, holdout=TRUE, silent="all")
we can then reuse it for different purposes:
es(M3$N2457$x, model=ourModel, h=18, holdout=FALSE, intervals="np", level=0.93)
## Time elapsed: 0.09 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.145
## Initial values were provided by user.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.434
## Cost function type: MSE; Cost function value: 1965686
##
## Information criteria:
## AIC AICc BIC
## 1998.861 1999.078 2007.096
## 93% nonparametric prediction intervals were constructed
Or we can just use the persistence or the initial values from one model to construct another one:
es(M3$N2457$x, model="MNN", h=18, holdout=FALSE, initial=ourModel$initial, silent="graph")
## Time elapsed: 0.06 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.151
## Initial values were provided by user.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.434
## Cost function type: MSE; Cost function value: 1965401
##
## Information criteria:
## AIC AICc BIC
## 1998.845 1999.061 2007.079
es(M3$N2457$x, model="MNN", h=18, holdout=FALSE, persistence=ourModel$persistence, silent="graph")
## Time elapsed: 0.03 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.145
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.434
## Cost function type: MSE; Cost function value: 1965686
##
## Information criteria:
## AIC AICc BIC
## 1998.861 1999.078 2007.096
or provide some arbitrary values:
es(M3$N2457$x, model="MNN", h=18, holdout=FALSE, initial=1500, silent="graph")
## Time elapsed: 0.03 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.15
## Initial values were provided by user.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.435
## Cost function type: MSE; Cost function value: 1968546
##
## Information criteria:
## AIC AICc BIC
## 1999.029 1999.245 2007.263
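To show what the persistence and initial arguments control, here is a minimal base R sketch of simple exponential smoothing with an additive error, ETS(A,N,N). It is a simplified stand-in for illustration only: the function sesFit, the synthetic series and the optimiser settings are assumptions, not the smooth internals.

```r
# Minimal ETS(A,N,N) (simple exponential smoothing) in base R.
# alpha is the single element of the persistence vector; initial is the
# starting level. All names here are illustrative.
sesFit <- function(y, alpha, initial) {
  n <- length(y)
  fitted <- numeric(n)
  level <- initial
  for (t in seq_len(n)) {
    fitted[t] <- level                        # one-step-ahead forecast
    level <- level + alpha * (y[t] - level)   # update the level with the error
  }
  list(fitted = fitted, level = level, mse = mean((y - fitted)^2))
}

set.seed(42)
y <- 1500 + cumsum(rnorm(60, sd = 50))  # synthetic series, not M3 data

# Initial value provided by the user, alpha optimised.
fixedInitial <- optimize(function(a) sesFit(y, a, initial = 1500)$mse,
                         interval = c(0, 1))

# Both alpha and the initial value optimised.
bothFree <- optim(c(0.5, y[1]),
                  function(p) sesFit(y, p[1], p[2])$mse,
                  method = "L-BFGS-B", lower = c(0, -Inf), upper = c(1, Inf))

print(c(alpha = fixedInitial$minimum, mse = fixedInitial$objective))
print(c(alpha = bothFree$par[1], initial = bothFree$par[2], mse = bothFree$value))
```

Fixing the initial level removes one degree of freedom from the optimisation, which is why the estimated alpha and the cost function value in the outputs above change slightly depending on what is supplied.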
Using some other parameters may lead to a completely different model and forecasts:
es(M3$N2457$x, h=18, holdout=TRUE, cfType="aMSTFE", bounds="a", ic="BIC", intervals=TRUE)
## Forming the pool of models based on... ANN, ANA, AAN, Estimation progress: 100%... Done!
## Time elapsed: 0.42 seconds
## Model estimated: ETS(MNN)
## Persistence vector g:
## alpha
## 0.08
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 0.42
## Cost function type: aMSTFE; Cost function value: 246
##
## Information criteria:
## AIC AICc BIC
## 25551.52 25556.16 25690.55
## 95% parametric prediction intervals were constructed
## 72% of values are in the prediction interval
## Forecast errors:
## MPE: 33.3%; Bias: 90.4%; MAPE: 43.3%; SMAPE: 56.3%
## MASE: 3.232; sMAE: 131.9%; RelMAE: 1.381; sMSE: 277.6%
You can play around with all the available parameters to see what their effect is on the final model.
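When comparing such runs, it helps to know roughly what the forecast error lines measure. The sketch below computes four of them with common textbook definitions. These are assumptions for illustration: the definitions used inside smooth may differ in sign or scaling, and the function name and data are hypothetical.

```r
# Common holdout accuracy measures, using textbook definitions.
# ASSUMPTION: details (sign, scaling) may differ from those used in smooth.
errorMeasures <- function(actual, forecast, insample) {
  e <- actual - forecast
  c(MPE   = mean(e / actual),
    MAPE  = mean(abs(e / actual)),
    SMAPE = mean(2 * abs(e) / (abs(actual) + abs(forecast))),
    # MASE scales the mean absolute error by the in-sample mean absolute
    # first difference (the error of the naive forecast).
    MASE  = mean(abs(e)) / mean(abs(diff(insample))))
}

# Hypothetical data for illustration only.
insample <- c(10, 12, 11, 13, 12)
actual   <- c(14, 15)
forecast <- c(12, 12)

measures <- errorMeasures(actual, forecast, insample)
print(round(measures, 3))
```

In this toy example the forecasts are always below the actuals, so MPE equals MAPE; a positive MPE close to MAPE, as in the outputs above, points to systematic underforecasting under this sign convention.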
Model selection from a specified pool and combination of forecasts are called, respectively, with:
es(M3$N2457$x, model=c("ANN","AAN","AAdN","ANA","AAA","AAdA"), h=18, holdout=TRUE, silent="graph")
## Estimation progress: 17%33%50%67%83%100%... Done!
## Time elapsed: 0.75 seconds
## Model estimated: ETS(ANN)
## Persistence vector g:
## alpha
## 0.158
## Initial values were optimised.
## 3 parameters were estimated in the process
## Residuals standard deviation: 1439.368
## Cost function type: MSE; Cost function value: 2007705
##
## Information criteria:
## AIC AICc BIC
## 1688.987 1689.245 1696.711
## Forecast errors:
## MPE: 25.3%; Bias: 86%; MAPE: 39.4%; SMAPE: 48.6%
## MASE: 2.909; sMAE: 118.7%; RelMAE: 1.243; sMSE: 238.1%
es(M3$N2457$x, model="CCN", h=18, holdout=TRUE, silent="graph")
## Estimation progress: 10%20%30%40%50%60%70%80%90%100%... Done!
## Time elapsed: 1.29 seconds
## Model estimated: ETS(CCN)
## Initial values were optimised.
## Residuals standard deviation: 1406.273
## Cost function type: MSE
##
## Information criteria:
## Combined AICc
## 1647.524
## Forecast errors:
## MPE: 27.2%; Bias: 88.1%; MAPE: 40.3%; SMAPE: 50.3%
## MASE: 2.982; sMAE: 121.7%; RelMAE: 1.274; sMSE: 247.3%
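The idea behind combining forecasts can be sketched with information-criterion weights (Akaike-type weights), where models with a lower criterion receive a larger share. This is a textbook scheme shown under the assumption that it resembles what happens for model="CCN"; the function name, the AICc values and the forecasts below are all hypothetical.

```r
# Sketch of combining forecasts with information-criterion weights.
# ASSUMPTION: textbook Akaike-type weights, used only to illustrate the idea.
icWeights <- function(ic) {
  delta <- ic - min(ic)      # distance from the best model
  w <- exp(-0.5 * delta)
  w / sum(w)                 # normalise so the weights sum to one
}

# Hypothetical AICc values and point forecasts (3 steps ahead) for 3 models.
aicc <- c(ANN = 1650, AAN = 1652, AAdN = 1655)
forecasts <- rbind(ANN  = c(2000, 2000, 2000),
                   AAN  = c(2010, 2020, 2030),
                   AAdN = c(2008, 2014, 2018))

w <- icWeights(aicc)
# Each row (model) is multiplied by its weight, then summed per horizon.
combined <- colSums(forecasts * w)

print(round(w, 3))
print(round(combined, 1))
```

Because the weights decay exponentially with the difference in the criterion, a model that is only a few points worse than the best one still contributes, while clearly inferior models are effectively ignored.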
Now let’s introduce some artificial exogenous variables:
x <- cbind(rnorm(length(M3$N2457$x),50,3),rnorm(length(M3$N2457$x),100,7))
and first fit a model with exogenous variables without the update:
es(M3$N2457$x, model="ZZZ", h=18, holdout=TRUE, xreg=x)
## Forming the pool of models based on... ANN, ANA, AAN, Estimation progress: 100%... Done!
## Time elapsed: 0.53 seconds
## Model estimated: ETSX(MNN)
## Persistence vector g:
## alpha
## 0.119
## Initial values were optimised.
## 5 parameters were estimated in the process
## Residuals standard deviation: 0.464
## Xreg coefficients were estimated in a normal style
## Cost function type: MSE; Cost function value: 1285044
##
## Information criteria:
## AIC AICc BIC
## 1649.706 1650.365 1662.579
## Forecast errors:
## MPE: 32.5%; Bias: 90%; MAPE: 43.1%; SMAPE: 55.7%
## MASE: 3.205; sMAE: 130.8%; RelMAE: 1.369; sMSE: 273.3%
and then with the update:
es(M3$N2457$x, model="ZZZ", h=18, holdout=TRUE, xreg=x, updateX=TRUE)
## Forming the pool of models based on... ANN, ANA, AAN, Estimation progress: 40%50%60%70%80%90%100%... Done!
## Time elapsed: 2.11 seconds
## Model estimated: ETSX(MAN)
## Persistence vector g:
## alpha beta
## 0 0
## Initial values were optimised.
## 13 parameters were estimated in the process
## Residuals standard deviation: NaN
## Xreg coefficients were estimated in a crazy style
## Cost function type: MSE; Cost function value: 1168910
##
## Information criteria:
## AIC AICc BIC
## 1656.518 1660.903 1689.989
## Forecast errors:
## MPE: 40.3%; Bias: 92.5%; MAPE: 47.2%; SMAPE: 64.1%
## MASE: 3.534; sMAE: 144.2%; RelMAE: 1.51; sMSE: 318.3%
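Be careful, however, when non-additive ETS models are used with exogenous variables: the results may be highly unsatisfactory and unstable. In the last output above, for example, both smoothing parameters are zero and the residuals standard deviation is NaN.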