Playground of forecasting in R using Prophet.
Perform fine-grained forecasting in an efficient manner, leveraging the distributed computational power of the Databricks Platform.
Prophet is a forecasting tool that uses an additive model where non-linear trends are fit with yearly, weekly, and daily seasonality, plus holiday effects. It works best with time series that have strong seasonal effects and several seasons of historical data. The library is especially suited for business forecasts like forecasting website traffic, sales, and inventory.
The original paper from the FB team is in the Forecasting_at_Scale.pdf file.
For the tests the Store Item Demand Forecasting from Kaggle is used. The data are in the dataset
directory.
Keggle Dataset: https://www.kaggle.com/datasets/aswinkum/demand-forecasting-kernels-only
The dataset contains 5-years of store-item unit sales data for 50 items across 10 different stores (913,000 observations).
The source file in R for the forecasting is in Forecasting.R.
Read the comments for step-by-step execution of the forecast and plotting.
The DataBricks notebook for forecasting at scale is in the file OE_Demand_Forecasting.dbc.
500 forecasts are generated in the notebook.
Upload the ./dataset/train.csv
file to the catalog as a new table using the Data Ingestion
of the DataBricks main interface.
Change the notebook code below accordingly:
train_data = spark_read_csv(sc, name = "train_data", path = "/Volumes/industry_solutions/esg_scoring/data/train.csv")
The auto-generated DataBricks notebook with the best AutoML model is in the file AutoML-24-04-30-16_08-Prophet.dbc
With this file is possible to re-run the experiment on forecasting 500 time series and see the results.