Assignment2.Rmd

---
title: "**Assignment 2 \n**"
subtitle: "Applied Forecasting in Complex Systems 2021"
author: Emiel Steegh (14002558)
date: "University of Amsterdam \n &nbsp;  \n November, 21, 2021 "
output: pdf_document
fontsize: 11pt
highlight: tango
---

```{r setup, include=FALSE}

options(digits = 3)
library(fpp3)
library(latex2exp)
library(forecast)
library(formatR)
library(gridExtra)
library(gt)

knitr::opts_chunk$set(
  echo = FALSE,
  message = FALSE,
  warning = FALSE,
  cache = TRUE,
  dev.args = list(pointsize = 11)
)
knitr::opts_chunk$set(tidy.opts = list(width.cutoff = 60), tidy = TRUE)

```

_Note to the reader:\
All of the code can be found in the appendix, starting on page 10_

# Exercise 1

## 1.1)
```{r 11, fig.height = 3}
afg_pop <- global_economy %>%
  filter(Country == "Afghanistan") %>%
  mutate(Population = Population/1000000) %>%
  select(Year, Population)
sov_afg_war <- c(1979+( (1/12) * (11+(24/31)) ), 1989+( (1/12) * (1+(15/31)) )) # https://en.wikipedia.org/wiki/Soviet%E2%80%93Afghan_War

afg_pop %>%
  autoplot(Population) +
  labs(y = "Mln. population", title = "Afghan Population",subtitle = "The start and end of the Soviet-Afghan war are marked with a vertical line")  +
  geom_vline(xintercept = sov_afg_war, colour = "red", linetype = "longdash")
```

In the plot of Afghanistan's population we can clearly see a correlation between the period of the Soviet-Afghan war and the countries population growth.
The trend can be divided into three sections:\
1. 1960:1979 an upward trend until the year before the start of the war\
2. 1980:1988 a downward trend which ends a little over a year before the end of the war\
3. 1989:2017 an upward trend steeper than the pre-war period.

War frequently contributes to a higher death-rate and a lower birth-rate in a country, an effect that can be observed in section two. In the third section we can see a steeper growth, which might be explained by a post-war (economic) boom. It also looks like this section includes light cyclic behaviour. Other than that no cyclicity or seasonality can be observer

To make the plot easier to read, population has been scaled to millions.

## 1.2)

```{r 12a, fig.height = 3}
afg_pop_fit <- afg_pop %>%
        model(
          Linear = TSLM(Population ~ trend()),
          Piecewise = TSLM(Population ~ trend(knots=sov_afg_war))
        )

afg_pop %>% autoplot(Population) + #plot
        labs(y = "Mln. population", title = "Afghan Population + Predictions") +
        geom_line(data = fitted(afg_pop_fit),
                  aes(y = .fitted, colour = .model), size = 0.8)
afg_pop_fit %>% glance() %>% select(.model , r_squared , adj_r_squared , AICc, CV) %>% gt %>%
        tab_header(
                title = md("**Forecasting model accuracies**")
        ) %>%
        opt_align_table_header(align = "center")#table
```

Clearly the _linear model_ is a poor fit to the actual data. The _piecewise model_, however looks like a decent estimate. A plot of both models' residuals (found in Appendix 1.2) reveals that like expected the _linear model_'s residuals look nothing like white noise. There is a strong upward trend in the residuals from 1989 onwards introducing an increasing bias.

Although the _piecewise model_'s residuals look better, they still do no behave like white noise.

A Box-Ljung test with 2 DoF for the _linear model_ ($\beta_0$ and $\beta_1$) and another with 6 DoF for the _piecewise model_ ($\beta_0$ and $\beta_1$ for all three sections) result in p-values $< 0.05$ which means that we can reject the accompanying null hypothesis that the residuals are zero. Simply put neither model is able to describe all the information in the timeseries.

The accuracies table shows us that the _piecewise model_ does indeed do much better than the simpler _linear model_, because it has a higher $R^2$ (almost 1) and much lower AICc and CV scores.

```{r 12b, eval=F}
gg_tsresiduals(select(afg_pop_fit, Linear))
gg_tsresiduals(select(afg_pop_fit, Piecewise))

rbind ((augment(afg_pop_fit %>% select(Linear)) %>%
        features(.innov, ljung_box, lag=10, dof=2)),
      (augment(afg_pop_fit %>% select(Piecewise)) %>%
        features(.innov, ljung_box, lag=10, dof=6))) %>%
        gt() %>%
        tab_header(
                title = md("**Box-Ljung for the linear and piecewise model**"),
                subtitle = md("Linear with 2 DoF, Piecewise with 6 DoF")
        ) %>%
        opt_align_table_header(align = "center")
```

## 1.3)

```{r 13, fig.height = 3}
afg_pop_fc <- forecast(afg_pop_fit, h=5)
afg_pop %>%
        filter(Year >= 2002) %>%
        autoplot(Population) +
        geom_line(data = fitted(afg_pop_fit) %>% filter(Year >= 2002),
                  aes(y = .fitted, colour = .model), size = 0.8) +
        autolayer(afg_pop_fc, level = 95, size = 0.8) +
         labs(title = "Afghan Population", ylab = "Mln. population")
```

The prediction of the _piecewise model_ looks decent, when eyeballing the graph, it seems to follow the latest trend and it has a small confidence interval. The _linear model_ produces a terrible forecast, it starts a little over 5 million below the last known value, a very unlikely jump for the data to make and additionally has a confidence interval of over 10 million. The _piecewise model_ definitely outperforms the _linear model_, but from the information obtained in _(1.2)_ I would suggest neither.

---

# Exercise 2

## 2.1)

```{r 21a}
#get data
data <- aus_arrivals %>%
        filter(Origin == 'NZ',
               Quarter >= yearquarter('1981 Q1'),
               Quarter <= yearquarter('2012 Q3')) %>%
        select(!Origin)

#transform to stabilize variance
lambda <- data %>%
        features(Arrivals, features = guerrero) %>%
        pull(lambda_guerrero)
data <- data %>%
        mutate(Arrivals_transformed = BoxCox(Arrivals, lambda))

#plot
p1 <- data %>% autoplot(Arrivals_transformed)  +
        labs(y = "Transformed Arrivals (count)", title = latex2exp::TeX(paste0(
                "Transformed Arrivals New Zealand -> Australia with $\\;\\;\\lambda$ = ",
                round(lambda, 2))))

#test,train split
test <- data %>% filter(Quarter >= yearquarter('2010 Q3'))
train <- data %>% filter(Quarter < yearquarter('2010 Q3'))

#train model
fit <- train %>%
        model(multiplicative = ETS(Arrivals_transformed ~ error("M") + trend("A") + season("M")))
#fit model
fc <- fit %>%
         forecast(h = "2 years")
#plot fit
p2 <- fc %>% autoplot((data %>% filter(Quarter >= yearquarter('2008 Q1'))), level = NULL) +
        labs(y = "Transformed Arrivals (count)", title = latex2exp::TeX(paste0(
                "Holt-Winters Multiplicative prediction for Transformed Arrivals New Zealand -> Australia with $\\;\\;\\lambda$ = ",
                round(lambda, 2)))) +
        guides(colour = guide_legend(title = "Forecast"))

grid.arrange(p1,p2,ncol=1)
```

```{r 21b, eval=F}
fit %>% report()
data %>% select(Arrivals_transformed) %>% acf(lag.max = 4*6)
```

We observe an obvious seasonal pattern, The ACF (in appendix 2.1) helps to establish this. The seasonal period is a year long, starting low in Q1 then making a jump for Q2, Q3 & Q4, with Q3 usually being the highest out of them all. This is likely explained by the timing of holidays and nicer weather.

Furthermore there is a positive trend and growing variance over levels. Because this growing variance makes predicting harder a Box-Cox transformation was performed with $\lambda = 0.32$, this helps stabilize the variance a lot. The data contains cyclicity over levels, 4 of which can be seen in the sample.

A multiplicative method for Holt-Winters model makes sense because the seasonal variations are changing proportionally to the level of the series, i.e. we see the variance grow as the trendline rises, even with the transformation. Because we have quarterly data we set $m=4$.

A model report (Appendix 2.1) reveal the following smoothing parameters:\
- $\alpha = 0.636$ : level smoothing is applied so there is an update in the levels\
- $\beta\approx0$ : no trend smoothing required so we have a stable linear trend\
- $\gamma=0.221$ : a small amount of seasonal smoothing means that the seasonal component hardly changes.

## 2.2)

Next we fit the following four models to the data, without prior transformation this time.

| **Model name**              | **Description**                                                                                   |
|-----------------------------|---------------------------------------------------------------------------------------------------|
| AutoETS                     | ETS model where all the parameters are automatically chosen                                       |
| SNaive                      | Seasonal Naïve model with a 1 year lag                                                            |
| Additive log ETS            | ETS model that is fixed to additive applied to log transformed data                               |
| Seasonally adjusted AutoETS | An AutoETS model applied to the seasonally adjusted component of an STL decomposition of the data |

```{r 22a, fig.height = 3}
fit_tr <- train %>% model(
        "AutoETS" = ETS(Arrivals),
        "SNaive" = SNAIVE(Arrivals ~ lag("year")),
        "Additive log ETS" = ETS(log(Arrivals) ~ error("A") + trend("A") + season("A")),
        "Seasonally adjusted AutoETS" = decomposition_model(
                STL(log(Arrivals)),
                ETS(season_adjust)
        )
)

fc_tr <- fit_tr %>%
        forecast(h = "2 years")

fc_tr %>% autoplot((data %>% filter(Quarter >= yearquarter('2008 Q1'))), level = NULL) +
        labs(title="New Zealand -> Australia Flights",
             y="Arrivals (count)") +
        guides(colour = guide_legend(title = "Forecast"))

acc_fc_tr <- accuracy(fc_tr, test) %>%
        select(c(".model", "RMSE", "MAE", "MPE", "MAPE")) %>%
        arrange(RMSE)
acc_fit_tr <- accuracy(fit_tr) %>%
        select(c(".model", "RMSE", "MAE", "MPE", "MAPE")) %>%
        arrange(RMSE)
acc_fit_tr %>% gt() %>%
        tab_header(
                title = md("**Model _fit_ accuracies**"),
                subtitle = md("as fitted to the _New Zealand -> Australia_ flight arrivals (RMSE sorted)")
        ) %>%
        opt_align_table_header(align = "center")
acc_fc_tr %>% gt() %>%
        tab_header(
                title = md("**Model _forecasting_ accuracies**"),
                subtitle = md("as fitted to the _New Zealand -> Australia_ flight arrivals (RMSE sorted)")
        ) %>%
        opt_align_table_header(align = "center")
```

From the fit and forecast accuracy tables we can draw some inferences, they are ordered by RMSE. _Seasonally adjusted AutoETS_ comes out on top for model fit, however it comes in third with forecasting accuracy. _Auto ETS_ does pretty well in both, getting first place for forecasts and second for fit.

When reporting the parameters of these two best models (Appendix 2.2) you should note that the _AutoETS_ model picked near identical to the parameters of Holt-Winters multiplicative model. The residuals plots for the top ranking forecasting model (almost) resembles white noise. Although they have decreasing variance over time and a somewhat skewed histogram, all the correlations of the ACF are within the confidence interval.
A Box-Ljung test (Appendix 2.2) with DoF = 6 for all estimated parameters, and 10 lags results in $p\approx 0.168 > \alpha$ where $\alpha = 0.05$. This allows us to not reject the weak null-hypothesis that all residuals are zero. This means that the _AutoETS_ model captures most of available information. Because the _AutoETS model_ is essentially equivalent to the _Holt-Winters Multiplicative model_ from _(2.1)_ we conclude the same for both.

```{r 22b, eval=F}
fit_tr %>% select(AutoETS) %>% report()
fit_tr %>% select("Seasonally adjusted AutoETS") %>% report()

gg_tsresiduals(fit_tr %>% select("AutoETS"))

augment(fit_tr %>% select(AutoETS)) %>%
        features(.innov, ljung_box, lag=10, dof=6)
```


## 2.3)

```{r 23}
# Creating the sets for cross-validation
train_cv <- data %>%
        stretch_tsibble(.init = 4*10, .step = 1) %>%
        relocate(Quarter, Arrivals, .id)

# CV accuracy
fit_cv <- train_cv %>% model(
        "CV AutoETS" = ETS(Arrivals),
        "CV SNaive" = SNAIVE(Arrivals),
        "CV Additive log ETS" = ETS(log(Arrivals) ~ error("A") + trend("A") + season("A")),
        "CV Seasonally adjusted autoETS" = decomposition_model(
                STL(log(Arrivals) ~ trend() + season(window = 'periodic'), robust = T),
                ETS(season_adjust)))
fc_cv <- fit_cv %>% forecast(h = "2 years")
acc_cv <- accuracy(fc_cv, data) %>%
        select(c(".model", "RMSE", "MAE", "MPE", "MAPE"))
rbind(acc_cv, acc_fc_tr) %>%
        arrange(RMSE)%>%
        gt() %>%
        tab_header(
                title = md("**Forecasting model accuracies: Cross Validated (CV) vs the models from 2.2 **"),
                subtitle = md("as fitted to the _New Zealand -> Australia_ flight arrivals (RMSE sorted)")
        ) %>% opt_align_table_header(align = "center")
```

the initial amount of data-points for cross validation is selected as 40, 30% of the data rounded up to the next full seasonal period.

Error went up for each model when using cross validation, this is not surprising as the calculated accuracies are aggregates from the same model also trained on way fewer data, taking down the average.

Seasonally adjusted auto ETS would probably make the best forecasting model because it has the best accuracy on the multi-step forecast environment, meaning that it is likely to generalize the best out of all four.

---

# Exercise 3

## 3.1)

```{r 31a, fig.height = 3}
usgdp <- global_economy %>%
        filter(Country == "United States") %>%
        mutate(GDP = (GDP/Population)*100) %>%
        select(Year, GDP)

lambda <- usgdp %>%
        features(GDP, features = guerrero) %>%
        pull(lambda_guerrero)

usgdp %>% autoplot(BoxCox(GDP, lambda)) +
        labs(y = "US$", title = latex2exp::TeX(paste0(
                "GDP per capita per year for the United States with $\\;\\;\\lambda$ = ",
                round(lambda, 2))))

```

The original is transformed to per-capita, to account for population growth, an inflation transformation is left aside because that seems out of the scope of the question. After the per capita transformation, the data still has a non-linear (lightly exponential) trend, so a Box-Cox transformation is applied. Using the guerrero method we find $\lambda = 0.39$ which approximates a root transformation. The data now displays a more linear upward trend with a drop in 2008, likely correlated with the 2008 financial crisis. There is no seasonality present nor large enough variation in trend to speak of cyclicity.

```{r 31b}
usgdp_fit <- usgdp %>% model ("ARIMA" = ARIMA(box_cox(GDP, lambda)))
p2 <- usgdp_fit %>% residuals() %>% ACF() %>% autoplot() + labs(title = "ACF of auto ARIMA residuals")

us_recessions <- c(1960, 1970,1973,1980,1990,2001,2008) #https://www.thebalance.com/the-history-of-recessions-in-the-united-states-3306011

p1 <- usgdp %>% autoplot(GDP) + #plot
        labs(y = "US$", title = "auto ARIMA fitted to transformed US GDP", subtitle = "lambda = 0.39; vertical dashed lines indicate recessions") +
        geom_line(data = fitted(usgdp_fit),
                  aes(y = .fitted, colour = .model), size = 0.8) +
        geom_vline(xintercept = us_recessions, colour = "gray", linetype = "longdash")


grid.arrange(p1, p2, ncol = 1)
```

We fit an auto ARIMA model to the transformed data which results in an `ARIMA(1,1,0) with dirft` model. It has first order Auto-Regressive and first order differencing part, to reach a stationary result. The model follows the data closely but struggles a little around the hitches that coincide around the recessions (marked by vertical dashed lines) likely because they are unpredictable events (in the scope of this time series). But, the residuals of the model behave nicely like white noise.

## 3.2)

The _General process for forecasting using an ARIMA model_ flowchart in Chapter 9.7 from the Forecasting: Principles and Practice (3rd edition) will be used to identify a plausible ARIMA model. Steps one and two have been conducted in _(3.1)_: The data was plotted, inspected and transfomed with a Box-Cox because it was appropriate.

```{r 32a}
usgdp %>% gg_tsdisplay(difference(box_cox(GDP, lambda)), plot_type = 'partial') + labs(title = 'First order difference residuals transformed US GDP')
usgdp %>% features(difference(box_cox(GDP, lambda)), unitroot_kpss) %>% gt() %>%
        tab_header(
                title = md("**Unitroot KPSS test**")
        ) %>% opt_align_table_header(align = "center")
```

The next step is to inspect the pACF and ACF of the first order difference. We immedeatly see a positive spike for the first lag, which hints that the data is non-stationary. However, the unitroot KPSS test results in $p=0.1 > \alpha$ which does not let us reject the null hypothesis that the time series is stationary. Since the data is non-stationary we should try $d=1$, and because of the large positive $r_1$ we should either pick $p=1 \vee q=1$. So we try both `ARIMA(1,1,0)` and `ARIMA(0,1,1)`. Notice that the first on is the model obtained from running the auto ARIMA in _(3.3)_

```{r  32b}
usgdp_arima_fit <- usgdp %>% model ("ARIMA(1,1,0)" = ARIMA(box_cox(GDP, lambda) ~ pdq(1,1,0)),
                                    "ARIMA(0,1,1)" = ARIMA(box_cox(GDP, lambda) ~ pdq(0,1,1)))
usgdp_arima_fit %>% report() %>% select(-c(ar_roots,ma_roots)) %>% gt() %>%
        tab_header(
                title = md("**Manual ARIMA model comparison**")
        ) %>% opt_align_table_header(align = "center")

augment(usgdp_arima_fit %>% select("ARIMA(1,1,0)")) %>%
        features(.innov, ljung_box, lag=10, dof=2) %>%
        gt() %>%
        tab_header(
        title = md("**Box-Ljung for the `ARIMA(1,1,0)`**")
        ) %>%
        opt_align_table_header(align = "center")
```

From the model comparison we can conclude that `ARIMA(1,1,0)` outperforms `ARIMA(0,1,1)` in all four categories. Considering that the better model is exactly the same as the model from _(3,2)_ the same conclusion about it's residuals apply: they look like white noise. A Ljung-Box test with DoF=2 and 10 lags gives us $p=0.89 > \alpha$ and we cannot _reject_ the null hypothesis that the residuals are 0. In other words, the residuals behave like white noise and the model captures all (or enough) information in the time series.

## 3.3)

```{r 33, fig.height = 3}

fit <- usgdp %>% model(
        ARIMA = ARIMA(box_cox(GDP, lambda) ~ pdq(p=1, d=1, q=0)),
        ETS = ETS(GDP)
)
fc <- fit %>% forecast(h=20)

usgdp %>%
        autoplot(GDP) +
        geom_line(data = fitted(fit),
                  aes(y = .fitted, colour = .model), size = 0.8) +
        autolayer(fc, level = 90, size = 0.8) +
        labs(title = "US GDP", ylab = "$US")
```

Comparing the model derived in _(3.3)_ to an automatically generated ETS shows the ARIMA fitting the true data better than the ETS, even if only slightly. The bigger difference comes in the 20 year forecasts. The ETS makes a linear prediction, while the ARIMA model continues the exponential trend observed in the data. Next to that the confidence interval of the ETS is much larger and grows faster.

---

# Appendix

_Note: Due to an error in Knitr that I have not been able to fix, ggplot titles and labels are not being wrapped properly. However, you are able to read the contents in the graphs just fine._

## Setup code

```{r, ref.label = 'setup',eval=F,echo=T}
```

## 1.1 code

```{r, ref.label = '11',eval=F,echo=T}
```

## 1.2 code

```{r, ref.label = '12a',eval=F,echo=T}
```

```{r, ref.label = '12b',echo=T}
```

## 1.3 code

```{r, ref.label = '13',eval=F,echo=T}
```

---

## 2.1 code

```{r, ref.label = '21a',eval=F,echo=T}
```

```{r, ref.label = '21b',echo=T}
```

## 2.2 code

```{r, ref.label = '22a',eval=F,echo=T}
```

```{r, ref.label = '22b',echo=T}
```

## 2.3 code

```{r, ref.label = '23',eval=F,echo=T}
```

## 3.1 code

```{r, ref.label = '31a',eval=F,echo=T}
```

```{r, ref.label = '31b',eval=F,echo=T}
```

## 3.2 code

```{r, ref.label = '32a',eval=F,echo=T}
```

```{r, ref.label = '32b',eval=F,echo=T}
```

## 3.3 code

```{r, ref.label = '33',eval=F,echo=T}
```