Obtained different forecasts with same data and params #112
-
Are you generating the forecasts by rerunning the full AutoTS model search each time? It sounds like you are, and that search is randomized, so repeated runs can select different models. Once you've found a model you like, you can pull out the exact model template (see the extended_tutorial) and run it with model_forecast for repeatable results. Let me know if that helps!
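A minimal sketch of that workflow, using hypothetical placeholder data (a wide DataFrame with a datetime index) and an assumed filename; see the extended_tutorial for the full version:

```python
import numpy as np
import pandas as pd
from autots import AutoTS, model_forecast

# Hypothetical stand-in for your data: wide format, datetime index
idx = pd.date_range('2019-01-31', periods=36, freq='M')
df = pd.DataFrame({'store_1': np.random.rand(36) * 100}, index=idx)

model = AutoTS(forecast_length=12, frequency='infer', model_list='fast')
model = model.fit(df)

# Save just the single best model found by the search
model.export_template('best_model.csv', models='best', n=1)

# Rerun that exact model later, with no new (randomized) search
df_forecast = model_forecast(
    model_name=model.best_model_name,
    model_param_dict=model.best_model_params,
    model_transform_dict=model.best_model_transformation_params,
    df_train=df,
    forecast_length=12,
)
print(df_forecast.forecast.head())
```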
-
Hi Colin,
I set up a univariate time series with one store and ran the model with the following parameters. I asked the model to predict sales for the last three months and plotted the forecast against the actuals. The predictions were quite a bit lower than the actuals (see plot below). What parameters should I tweak here?
The model actually performed better at predicting 12 months forward without dropping any data, but then I have no way to validate against past actuals. Or do I need to split the data into training and test sets first?
Thanks,
Roger
From: Colin Catlin
Sent: Thursday, December 9, 2021 4:24 PM
To: winedarksea/AutoTS
Cc: rogerwzeng; Author
Subject: Re: [winedarksea/AutoTS] Obtained different forecasts with same data and params (Discussion #112)
Firstly, once you've found a model that you like, I do encourage you to extract the best_model and run it in model_forecast.
As for the data question, as long as the data is like you show, it shouldn't be a problem. AutoTS is built to handle NaN, and many series I work with look like yours above. It automatically tests different methods for handling the NaN and then uses the methods that lead to the highest accuracy forecasts in cross validation.
Where it can be a problem is when a store is currently closed but plans to reopen, so all of the most recent data is NaN. AutoTS should handle that automatically too, but not quite as well.
Something you can try is passing a future_regressor with historical store open hours and, for the future period, planned open hours (a rough sketch below).
Ultimately, if some of your stores are very different from the others, it might be worth modeling them separately in a different AutoTS run.
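A rough sketch of that regressor setup; the sales data, open_hours, and planned_hours here are hypothetical stand-ins you would build from your own schedule data, aligned to the training index and the forecast horizon respectively:

```python
import numpy as np
import pandas as pd
from autots import AutoTS

# Hypothetical example data: 36 months of sales plus store open hours
idx = pd.date_range('2019-01-31', periods=36, freq='M')
df = pd.DataFrame({'store_1': np.random.rand(36) * 100}, index=idx)
open_hours = pd.DataFrame({'open_hours': np.full(36, 300)}, index=idx)

# Planned open hours covering the 12-month forecast horizon
future_idx = pd.date_range(idx[-1] + pd.offsets.MonthEnd(), periods=12, freq='M')
planned_hours = pd.DataFrame({'open_hours': np.full(12, 300)}, index=future_idx)

model = AutoTS(forecast_length=12, frequency='infer', model_list='fast')
# Historical regressor must cover the training index;
# the future regressor must cover the forecast horizon.
model = model.fit(df, future_regressor=open_hours)
prediction = model.predict(future_regressor=planned_hours)
print(prediction.forecast.head())
```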
-
Hi Colin,
back_forecast() and future_regressor sound interesting. Good call!
I’ll look them up and give it a try.
Thanks again!
Roger
From: Colin Catlin
Sent: Saturday, December 11, 2021 12:22 AM
To: winedarksea/AutoTS
Cc: rogerwzeng; Author
Subject: Re: [winedarksea/AutoTS] Obtained different forecasts with same data and params (Discussion #112)
I don't see the plot, but still have some ideas.
1. Try adjusting metric_weighting to make sure it chooses a model based on your priorities (see the sketch after this list).
2. Try adjusting the cross validation settings. I like the "similarity" and "seasonal" methods especially. Having cross validation segments that represent your window of interest is critical: if your model is chosen, say, only on validation windows from the winter, it wouldn't be surprising if it then does poorly in the summer. And no, you don't need to split into train and test sets; that is handled automatically by AutoTS (num_validations + 1 times).
3. If you don't have much data, cross validation is going to be limited and that can make model selection difficult.
4. There is a back_forecast() function, with arguments to control it, if you want to view performance on past cross validation segments (it can be slow to run).
5. Ultimately, predicting the future is impossible to do perfectly. If you are evaluating on only one segment, there's a good chance that a model can 'get lucky'. Models can also only predict what they can see - although adding a future_regressor or two can help.
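A sketch pulling points 1, 2, and 4 together, on hypothetical placeholder data; the metric weights and validation counts are illustrative, not recommendations:

```python
import numpy as np
import pandas as pd
from autots import AutoTS

# Hypothetical monthly sales data, wide format
idx = pd.date_range('2019-01-31', periods=36, freq='M')
df = pd.DataFrame({'store_1': np.random.rand(36) * 100}, index=idx)

model = AutoTS(
    forecast_length=3,
    frequency='infer',
    model_list='fast',
    num_validations=3,               # point 2: more validation windows
    validation_method='similarity',  # point 2: or 'seasonal'
    metric_weighting={               # point 1: weight metrics toward your priorities
        'smape_weighting': 5,
        'mae_weighting': 2,
        'rmse_weighting': 2,
        'spl_weighting': 3,
    },
)
model = model.fit(df)

# point 4: view performance over past cross validation segments (can be slow)
back = model.back_forecast()
print(back.forecast.head())
```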
-
Hi,
The AutoTS package looks great and super simple to use. I just ran monthly sales for our 5 stores over the past 3 years through it and asked the model to predict monthly sales for the next 12 months. I ran the same training data with the same model parameters (mostly defaults) twice, but the results varied by about 20% between the two runs. I used multivariate modeling with the monthly sales of the 5 stores as covariates.
Any suggestion on where I should look, or what to change in my approach, to get more consistent predictions?
Thanks!
RZ
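For readers hitting the same issue, one thing worth trying is pinning the seed of the randomized search; a minimal sketch on hypothetical 5-store data follows. Per the AutoTS docs, random_seed makes results only somewhat more consistent, so the template plus model_forecast approach from the replies above is the more reliable fix:

```python
import numpy as np
import pandas as pd
from autots import AutoTS

# Hypothetical stand-in for the 5-store monthly sales data
idx = pd.date_range('2019-01-31', periods=36, freq='M')
df = pd.DataFrame(
    np.random.rand(36, 5) * 100,
    index=idx,
    columns=[f'store_{i}' for i in range(1, 6)],
)

model = AutoTS(
    forecast_length=12,
    frequency='infer',
    model_list='fast',
    random_seed=2022,  # pin the seed used by the randomized model search
)
model = model.fit(df)
print(model.predict().forecast)
```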