Re-consider IC output for refitted models #269

Open
TimothyHyndman opened this issue May 18, 2020 · 6 comments

@TimothyHyndman
Contributor

Using the latest GitHub version of fable here (but the issue is also present in the CRAN version).

library(tsibble)
library(fable)
#> Loading required package: fabletools
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union

lung_deaths <- as_tsibble(mdeaths)

# Fit ARIMA on first part of timeseries.
ts1 <- lung_deaths %>% filter(index <= as.Date("1977-01-01"))
#> Warning in mask$eval_all_filter(dots, env_filter): Incompatible methods
#> ("<=.vctrs_vctr", "<=.Date") for "<="
fit <- ts1 %>%
  model(ARIMA(value ~ 1 + pdq(1,0,0) + PDQ(0,0,0)))

# Refit ARIMA with one more observation
ts2 <- lung_deaths %>% filter(index > as.Date("1977-01-01"))
#> Warning in mask$eval_all_filter(dots, env_filter): Incompatible methods
#> (">.vctrs_vctr", ">.Date") for ">"
fit %>%
  refit(ts2 %>% head(1)) %>%
  report()
#> Series: value 
#> Model: ARIMA(1,0,0) w/ mean 
#> 
#> Coefficients:
#>          ar1  constant
#>       0.7851  364.8801
#> s.e.  0.1030   43.9553
#> 
#> sigma^2 estimated as 1579:  log likelihood=-5.58
#> AIC=13.16   AICc=9.16   BIC=11.16

# Refit ARIMA with two more observations
fit %>%
  refit(ts2 %>% head(2)) %>%
  report()
#> Warning: It looks like you're trying to fully specify your ARIMA model but have not said if a constant should be included.
#> You can include a constant using `ARIMA(y~1)` to the formula or exclude it by adding `ARIMA(y~0)`.
#> Error: Problem with `mutate()` input `ARIMA(value ~ 1 + pdq(1, 0, 0) + PDQ(0, 0, 0))`.
#> x Could not find an appropriate ARIMA model.
#> This is likely because automatic selection does not select models with characteristic roots that may be numerically unstable.
#> For more details, refer to https://otexts.com/fpp3/arima-r.html#plotting-the-characteristic-roots
#> ℹ Input `ARIMA(value ~ 1 + pdq(1, 0, 0) + PDQ(0, 0, 0))` is `(function (object, ...) ...`.

Created on 2020-05-18 by the reprex package (v0.3.0)

@mitchelloharawild
Member

@robjhyndman
This is coming from here (which comes from forecast::auto.arima())

fable/R/arima.R

Line 153 in 64b36c6

npar <- length(new$coef[new$mask]) + 1

Any reason why npar is the number of estimated coefficients +1?

@mitchelloharawild
Member

Also @TimothyHyndman, when you say "Refit ARIMA with two more observations", you're refitting the model with only two observations (not two more observations). You'll need to refit the model with the historical data and the next two observations (or use stream.ARIMA once #251 is done).
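
For illustration, a refit on the historical data plus the next two observations might look like the sketch below. It assumes the index of `as_tsibble(mdeaths)` is a `yearmonth`, and the name `extended` is introduced here only for the example.

```r
library(tsibble)
library(fable)
library(dplyr)

lung_deaths <- as_tsibble(mdeaths)

# Training window: everything up to and including January 1977.
ts1 <- lung_deaths %>% filter(index <= yearmonth("1977 Jan"))
fit <- ts1 %>%
  model(ARIMA(value ~ 1 + pdq(1, 0, 0) + PDQ(0, 0, 0)))

# `extended` (hypothetical name): the training window plus the next two months.
extended <- lung_deaths %>% head(nrow(ts1) + 2)

# Apply the already-fitted model to the longer series.
fit %>%
  refit(extended) %>%
  report()
```

Comparing against `yearmonth("1977 Jan")` rather than a `Date` is a choice made for this sketch; it should select the same rows as the original filter without the "Incompatible methods" warning.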

@robjhyndman
Member

npar = number of coefficients + 1, where the extra 1 accounts for the residual variance. This is how AIC, AICc and BIC define the number of parameters in a model.

@mitchelloharawild
Member

Got it, thanks.

fable/R/arima.R

Line 159 in 64b36c6

new$aicc <- new$aic + 2 * npar * (npar + 1) / (nstar - npar - 1)

Then in this case (n=2, d=0, D=0), nstar = 2 giving division by (nstar - npar - 1) = 0. Is this correct?
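
As a rough check of the arithmetic, the sketch below recomputes the criteria from the one-observation `report()` output earlier in the thread. It assumes `npar = 1` on a refit (no coefficients re-estimated, so only the "+ 1" remains), which is inferred from the reported values rather than confirmed in the source, and it uses the textbook BIC definition.

```r
# Values taken from the one-observation refit report() output.
loglik <- -5.58
nstar  <- 1  # one observation with d = 0, D = 0
npar   <- 1  # ASSUMPTION: nothing re-estimated on refit, so npar = 0 + 1

aic  <- -2 * loglik + 2 * npar
aicc <- aic + 2 * npar * (npar + 1) / (nstar - npar - 1)
bic  <- -2 * loglik + npar * log(nstar)

# Compare with the reported AIC=13.16, AICc=9.16, BIC=11.16.
print(round(c(AIC = aic, AICc = aicc, BIC = bic), 2))

# With two observations the correction term divides by zero.
correction_n2 <- 2 * npar * (npar + 1) / (2 - npar - 1)
print(correction_n2)  # Inf in R, since 4 / 0 is Inf
```

Under that assumption the three values match the report, and the two-observation case gives an infinite correction term.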

@robjhyndman
Member

It is problematic computing AICc on a refit because the parameters were not estimated on that data set. If the data was used for estimation, then the equation is correct. But you would expect nstar to always be bigger than npar+1 or you would be over-fitting.

I'm not sure what we should return as the AICc value on a refit: possibly the original AICc on the original data, or perhaps NA or NULL.
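
One way the NA option could look is sketched below. `aicc_or_na` is a hypothetical helper written for this thread, not a function in fable or forecast, and returning NA whenever the denominator is non-positive is only one of the choices mentioned above.

```r
# Hypothetical helper (not part of fable or forecast): return NA instead of
# a meaningless or infinite AICc when nstar - npar - 1 is not positive.
aicc_or_na <- function(aic, npar, nstar) {
  denom <- nstar - npar - 1
  if (denom <= 0) {
    return(NA_real_)
  }
  aic + 2 * npar * (npar + 1) / denom
}

aicc_or_na(13.16, npar = 1, nstar = 1)   # denominator -1, returns NA
aicc_or_na(13.16, npar = 1, nstar = 2)   # denominator 0, returns NA
aicc_or_na(13.16, npar = 1, nstar = 10)  # denominator 8, returns 13.16 + 4 / 8
```

This would not address whether AIC and BIC on a refit are meaningful either; it only avoids the division by zero.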

@TimothyHyndman
Contributor Author

> Also @TimothyHyndman, when you say "Refit ARIMA with two more observations", you're refitting the model with only two observations (not two more observations). You'll need to refit the model with the historical data and the next two observations (or use stream.ARIMA once #251 is done).

Ah, I was thinking that refit used the data contained in fit in addition to the data passed in via new_data. Thanks for the correction.

@mitchelloharawild changed the title from "refit.ARIMA trying to perform model selection and sometimes fails when refitting with a small number of new observations" to "Re-consider IC output for refitted models" on Jun 11, 2020