
Commit 2b6d656

added-license-notebook
1 parent 99201e7 commit 2b6d656

17 files changed: +105 -49 lines changed

README.md
Lines changed: 1 addition & 1 deletion

@@ -186,7 +186,7 @@ We encourage you to read through [examples/global_daily.py](https://github.com/d
 
 ### Foundation Models
 
-Foundation time series models are transformer based models pretrained on millions or billions of time series. These models can produce analysis (i.e. forecasting, anomaly detection, classfication) on an unforeseen time series without training or tuning. We support open source models from multiple sources: [chronos](https://github.com/amazon-science/chronos-forecasting), [moirai](https://blog.salesforceairesearch.com/moirai/), and [moment](https://github.com/moment-timeseries-foundation-model/moment). Covariates (i.e. exogenous regressors) and fine-tuning are currently not yet supported. This is a rapidly changing field, and we are working on updating the supported models and new features as the field evolves.
+Foundation time series models are transformer based models pretrained on millions or billions of time points. These models can produce analysis (i.e. forecasting, anomaly detection, classification) on an unforeseen time series without training or tuning. We support open source models from multiple sources: [chronos](https://github.com/amazon-science/chronos-forecasting), [moirai](https://blog.salesforceairesearch.com/moirai/), and [moment](https://github.com/moment-timeseries-foundation-model/moment). Covariates (i.e. exogenous regressors) and fine-tuning are currently not yet supported. This is a rapidly changing field, and we are working on updating the supported models and new features as the field evolves.
 
 To get started, attach the [examples/foundation_daily.py](https://github.com/databricks-industry-solutions/many-model-forecasting/blob/main/examples/foundation_daily.py) notebook to a cluster running [DBR 14.3 LTS for ML](https://docs.databricks.com/en/release-notes/runtime/index.html) or later versions. We recommend using a single-node cluster with multiple GPU instances such as [g4dn.12xlarge [T4]](https://aws.amazon.com/ec2/instance-types/g4/) on AWS or [Standard_NC64as_T4_v3](https://learn.microsoft.com/en-us/azure/virtual-machines/nct4-v3-series) on Azure. Multi-node setup is currently not supported.
 
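A note on what "without training or tuning" means in practice: the sketch below shows zero-shot forecasting with chronos, one of the supported models. It is a hedged illustration based on the chronos-forecasting package's public API, not code from this repo; the checkpoint name and input values are placeholders.

    # Minimal zero-shot forecast with Chronos (assumed API from
    # https://github.com/amazon-science/chronos-forecasting; illustrative only).
    import torch
    from chronos import ChronosPipeline

    pipeline = ChronosPipeline.from_pretrained(
        "amazon/chronos-t5-small",   # placeholder checkpoint name
        device_map="cuda",           # assumes a GPU instance such as g4dn.12xlarge
        torch_dtype=torch.bfloat16,
    )

    # Historical values of one series; no training or fine-tuning required.
    context = torch.tensor([112.0, 118.0, 132.0, 129.0, 121.0, 135.0, 148.0, 148.0])
    forecast = pipeline.predict(context, prediction_length=12)  # [series, samples, horizon]
    median = forecast.quantile(0.5, dim=1)  # point forecast from the sampled paths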

examples/foundation_daily.py
Lines changed: 4 additions & 3 deletions

@@ -80,8 +80,9 @@ def transform_group(df):
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "m4" # Name of the schema we use to manage our assets (e.g. datasets)
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -147,7 +148,7 @@ def transform_group(df):
 dbutils.notebook.run(
     "run_daily",
     timeout_seconds=0,
-    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id})
+    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id, "user": user})
 
 # COMMAND ----------
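For context, the new "user" entry in arguments only takes effect if the called notebook reads it back. A minimal sketch of the receiving side, assuming run_daily follows the standard dbutils.widgets pattern (an assumption; the commit does not show that notebook):

    # Hypothetical first cell of run_daily: bind and read the passed arguments.
    dbutils.widgets.text("user", "")        # declare the widget so the argument binds
    catalog = dbutils.widgets.get("catalog")
    db = dbutils.widgets.get("db")
    user = dbutils.widgets.get("user")      # email address passed by the caller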

examples/foundation_monthly.py
Lines changed: 6 additions & 3 deletions

@@ -84,8 +84,11 @@ def transform_group(df):
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "m4" # Name of the schema we use to manage our assets (e.g. datasets)
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
+
+# COMMAND ----------
 
 # Making sure that the catalog and the schema exist
 _ = spark.sql(f"CREATE CATALOG IF NOT EXISTS {catalog}")
@@ -145,7 +148,7 @@ def transform_group(df):
 dbutils.notebook.run(
     "run_monthly",
     timeout_seconds=0,
-    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id})
+    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id, "user": user})
 
 # COMMAND ----------

examples/global_daily.py
Lines changed: 4 additions & 3 deletions

@@ -82,8 +82,9 @@ def transform_group(df):
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "m4" # Name of the schema we use to manage our assets (e.g. datasets)
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -152,7 +153,7 @@ def transform_group(df):
 dbutils.notebook.run(
     "run_daily",
     timeout_seconds=0,
-    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id})
+    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id, "user": user})
 
 # COMMAND ----------

examples/global_external_regressors_daily.py
Lines changed: 6 additions & 5 deletions

@@ -46,9 +46,10 @@
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
-volume = "rossmann" # Name of the volume where you have your rossmann dataset csv sotred
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "rossmann" # Name of the schema we use to manage our assets (e.g. datasets)
+volume = "csv" # Name of the volume where you have your rossmann dataset csv stored
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -65,7 +66,7 @@
 
 # Number of time series to sample
 sample = True
-size = 100
+size = 1000
 stores = sorted(random.sample(range(0, 1000), size))
 
 train = spark.read.csv(f"/Volumes/{catalog}/{db}/{volume}/train.csv", header=True, inferSchema=True)
@@ -136,7 +137,7 @@
 dbutils.notebook.run(
     "run_external_regressors_daily",
     timeout_seconds=0,
-    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id})
+    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id, "user": user})
 
 # COMMAND ----------
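The renamed catalog/schema/volume above imply a one-time Unity Catalog setup before train.csv can be read from /Volumes/mmf/rossmann/csv/. A hedged sketch of that setup using standard Unity Catalog SQL (the names come from the diff; the setup itself is not part of this commit):

    # One-time setup for the volume layout this notebook now expects.
    spark.sql("CREATE CATALOG IF NOT EXISTS mmf")
    spark.sql("CREATE SCHEMA IF NOT EXISTS mmf.rossmann")
    spark.sql("CREATE VOLUME IF NOT EXISTS mmf.rossmann.csv")
    # Upload train.csv to the volume so that
    # /Volumes/mmf/rossmann/csv/train.csv resolves for spark.read.csv.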

examples/global_monthly.py
Lines changed: 4 additions & 3 deletions

@@ -84,8 +84,9 @@ def transform_group(df):
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "m4" # Name of the schema we use to manage our assets (e.g. datasets)
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -148,7 +149,7 @@ def transform_group(df):
 dbutils.notebook.run(
     "run_monthly",
     timeout_seconds=0,
-    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id})
+    arguments={"catalog": catalog, "db": db, "model": model, "run_id": run_id, "user": user})
 
 # COMMAND ----------

examples/licenses.py
Lines changed: 34 additions & 0 deletions

@@ -0,0 +1,34 @@
+# Databricks notebook source
+# MAGIC %md
+# MAGIC © 2024 Databricks, Inc. All rights reserved.
+# MAGIC
+# MAGIC The sources in all notebooks in this directory and the sub-directories are provided subject to the Databricks License. All included or referenced third party libraries are subject to the licenses set forth below.
+# MAGIC
+# MAGIC | library | description | license | source |
+# MAGIC |----------------------------------------|-------------------------|------------|-----------------------------------------------------|
+# MAGIC | rpy2 | Python interface to the R language (embedded R) | GNU General Public License v2 or later | https://pypi.org/project/rpy2/
+# MAGIC | kaleido | Static image export for web-based visualization libraries with zero dependencies | MIT | https://pypi.org/project/kaleido/
+# MAGIC | fugue | An abstraction layer for distributed computation | Apache 2.0 | https://pypi.org/project/fugue/
+# MAGIC | Jinja2 | A very fast and expressive template engine | BSD | https://pypi.org/project/Jinja2/
+# MAGIC | omegaconf | A flexible configuration library | BSD | https://pypi.org/project/omegaconf/
+# MAGIC | missingno | Missing data visualization module for Python | MIT | https://pypi.org/project/missingno/
+# MAGIC | datasetsforecast | Datasets for Time series forecasting | MIT | https://pypi.org/project/datasetsforecast/
+# MAGIC | statsforecast | Time series forecasting suite using statistical models | Apache 2.0 | https://pypi.org/project/statsforecast/
+# MAGIC | neuralforecast | Time series forecasting suite using deep learning models | Apache 2.0 | https://pypi.org/project/neuralforecast/
+# MAGIC | fable | Forecasting Models for Tidy Time Series | GPL-3 | https://cran.r-project.org/web/packages/fable/index.html
+# MAGIC | fabletools | Core Tools for Packages in the 'fable' Framework | GPL-3 | https://cran.r-project.org/web/packages/fabletools/index.html
+# MAGIC | feasts | Feature Extraction and Statistics for Time Series | GPL-3 | https://cran.r-project.org/web/packages/feasts/index.html
+# MAGIC | lazyeval | Lazy (Non-Standard) Evaluation | GPL-3 | https://cran.r-project.org/web/packages/lazyeval/index.html
+# MAGIC | tsibble | Tidy Temporal Data Frames and Tools | GPL-3 | https://cran.r-project.org/web/packages/tsibble/index.html
+# MAGIC | urca | Unit Root and Cointegration Tests for Time Series Data | GPL-3 | https://cran.r-project.org/web/packages/urca/index.html
+# MAGIC | sktime | A unified framework for machine learning with time series | BSD 3-Clause | https://pypi.org/project/sktime/
+# MAGIC | tbats | BATS and TBATS for time series forecasting | MIT | https://pypi.org/project/tbats/
+# MAGIC | lightgbm | LightGBM Python Package | MIT | https://pypi.org/project/lightgbm/
+# MAGIC | Chronos | Pretrained (Language) Models for Probabilistic Time Series Forecasting | Apache 2.0 | https://github.com/amazon-science/chronos-forecasting
+# MAGIC | Moirai | Unified Training of Universal Time Series Forecasting Transformers | Apache 2.0 | https://github.com/SalesforceAIResearch/uni2ts
+# MAGIC | Moment | A Family of Open Time-series Foundation Models | MIT | https://github.com/moment-timeseries-foundation-model/moment
+# MAGIC | TimesFM | A pretrained time-series foundation model developed by Google Research for time-series forecasting | Apache 2.0 | https://github.com/google-research/timesfm
+
+# COMMAND ----------
+

examples/local_univariate_daily.py
Lines changed: 5 additions & 4 deletions

@@ -86,8 +86,9 @@ def transform_group(df):
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "m4" # Name of the schema we use to manage our assets (e.g. datasets)
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -183,10 +184,10 @@ def transform_group(df):
     stride=10,
     metric="smape",
     train_predict_ratio=1,
-    data_quality_check=False,
+    data_quality_check=True,
     resample=False,
     active_models=active_models,
-    experiment_path=f"/Shared/mmf_experiment",
+    experiment_path=f"/Users/{user}/mmf/m4_daily",
     use_case_name="m4_daily",
 )
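The experiment_path change moves tracking from a shared workspace folder to a per-user one, which avoids permission clashes when several users run the example. A sketch of what the new path amounts to, assuming MMF hands experiment_path to MLflow (an assumption about the framework's internals, not shown in this commit):

    # Per-user MLflow experiment path, writable by the current user.
    import mlflow

    user = spark.sql("select current_user() as user").collect()[0]["user"]
    mlflow.set_experiment(f"/Users/{user}/mmf/m4_daily")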

examples/local_univariate_external_regressors_daily.py
Lines changed: 9 additions & 4 deletions

@@ -38,9 +38,10 @@
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
-volume = "rossmann" # Name of the volume where you have your rossmann dataset csv sotred
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "rossmann" # Name of the schema we use to manage our assets (e.g. datasets)
+volume = "csv" # Name of the volume where you have your rossmann dataset csv stored
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -156,7 +157,7 @@
     active_models=active_models,
     data_quality_check=False,
     resample=False,
-    experiment_path=f"/Shared/mmf_rossmann",
+    experiment_path=f"/Users/{user}/mmf/rossmann_daily",
     use_case_name="rossmann_daily",
 )
 
@@ -192,3 +193,7 @@
 # COMMAND ----------
 
 display(spark.sql(f"delete from {catalog}.{db}.rossmann_daily_scoring_output"))
+
+# COMMAND ----------
+
+

examples/local_univariate_monthly.py
Lines changed: 5 additions & 4 deletions

@@ -91,8 +91,9 @@ def transform_group(df):
 
 # COMMAND ----------
 
-catalog = "solacc_uc" # Name of the catalog we use to manage our assets
-db = "mmf" # Name of the schema we use to manage our assets (e.g. datasets)
+catalog = "mmf" # Name of the catalog we use to manage our assets
+db = "m4" # Name of the schema we use to manage our assets (e.g. datasets)
+user = spark.sql('select current_user() as user').collect()[0]['user'] # User email address
 
 # COMMAND ----------
 
@@ -181,10 +182,10 @@ def transform_group(df):
     stride=1,
     metric="smape",
     train_predict_ratio=1,
-    data_quality_check=False,
+    data_quality_check=True,
     resample=False,
     active_models=active_models,
-    experiment_path=f"/Shared/mmf_experiment_monthly",
+    experiment_path=f"/Users/{user}/mmf/m4_monthly",
     use_case_name="m4_monthly",
 )
