Commit
Merge pull request #71 from winedarksea/dev
0.3.2
winedarksea authored Jul 1, 2021
2 parents ec48749 + a10ade6 commit baed432
Showing 59 changed files with 1,293 additions and 333 deletions.
49 changes: 31 additions & 18 deletions .github/workflows/codeql-analysis.yml
Original file line number Diff line number Diff line change
@@ -1,38 +1,51 @@
name: "Code scanning - action"
# For most projects, this workflow file will not need changing; you simply need
# to commit it to your repository.
#
# You may wish to alter this file to override the set of languages analyzed,
# or to provide custom queries or build logic.
#
# ******** NOTE ********
# We have attempted to detect the languages in your repository. Please check
# the `language` matrix defined below to confirm you have the correct set of
# supported CodeQL languages.
#
name: "CodeQL"

on:
push:
branches: [master, ]
branches: [master]
pull_request:
# The branches below must be a subset of the branches above
branches: [master]
branches: [master, dev]
schedule:
- cron: '0 6 * * 1'
- cron: '23 1 * * 4'

jobs:
CodeQL-Build:

analyze:
name: Analyze
runs-on: ubuntu-latest

strategy:
fail-fast: false
matrix:
language: [ 'python' ]
# CodeQL supports [ 'cpp', 'csharp', 'go', 'java', 'javascript', 'python' ]
# Learn more:
# https://docs.github.com/en/free-pro-team@latest/github/finding-security-vulnerabilities-and-errors-in-your-code/configuring-code-scanning#changing-the-languages-that-are-analyzed

steps:
- name: Checkout repository
uses: actions/checkout@v2
with:
# We must fetch at least the immediate parents so that if this is
# a pull request then we can checkout the head.
fetch-depth: 2

# If this run was triggered by a pull request event, then checkout
# the head of the pull request instead of the merge commit.
- run: git checkout HEAD^2
if: ${{ github.event_name == 'pull_request' }}

# Initializes the CodeQL tools for scanning.
- name: Initialize CodeQL
uses: github/codeql-action/init@v1
# Override language selection by uncommenting this and choosing your languages
# with:
# languages: go, javascript, csharp, python, cpp, java
with:
languages: ${{ matrix.language }}
# If you wish to specify custom queries, you can do so here or in a config file.
# By default, queries listed here will override any specified in a config file.
# Prefix the list here with "+" to use these queries and those in the config file.
# queries: ./path/to/local/query, your-org/your-repo/queries@main

# Autobuild attempts to build any compiled languages (C/C++, C#, or Java).
# If this step fails, then you should remove it and run the build manually (see below)
14 changes: 12 additions & 2 deletions README.md
@@ -8,6 +8,14 @@ AutoML for forecasting with open-source time series implementations.

For other time series needs, check out the list [here](https://github.com/MaxBenChrist/awesome_time_series_in_python).

## Table of Contents
* [Features](https://github.com/winedarksea/AutoTS#features)
* [Installation](https://github.com/winedarksea/AutoTS#installation)
* [Basic Use](https://github.com/winedarksea/AutoTS#basic-use)
* [Tips for Speed and Large Data](https://github.com/winedarksea/AutoTS#tips-for-speed-and-large-data)
* Extended Tutorial [GitHub](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md) or [Docs](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html)
* [Production Example](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)

## Features
* Finds optimal time series forecasting model and data transformations by genetic programming optimization
* Handles univariate and multivariate/parallel time series
@@ -31,7 +39,7 @@ For other time series needs, check out the list [here](https://github.com/MaxBen
```
pip install autots
```
This includes dependencies for basic models, but additional packages are required for some models and methods.
This includes dependencies for basic models, but [additional packages](https://github.com/winedarksea/AutoTS/blob/master/extended_tutorial.md#installation-and-dependency-versioning) are required for some models and methods.

## Basic Use

@@ -91,11 +99,13 @@ The lower-level API, in particular the large section of time series transformers

Check out [extended_tutorial.md](https://winedarksea.github.io/AutoTS/build/html/source/tutorial.html) for a more detailed guide to features!

Also take a look at the [production_example.py](https://github.com/winedarksea/AutoTS/blob/master/production_example.py)


## Tips for Speed and Large Data:
* Use appropriate model lists, especially the predefined lists:
* `superfast` (simple naive models) and `fast` (more complex but still faster models)
* `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given mave many CPU cores are available
* `fast_parallel` (a combination of `fast` and `parallel`) or `parallel`, given many CPU cores are available
* `n_jobs` usually gets pretty close with `='auto'` but adjust as necessary for the environment
* see a dict of predefined lists (some defined for internal use) with `from autots.models.model_list import model_lists`
* Use the `subset` parameter when there are many similar series, `subset=100` will often generalize well for tens of thousands of similar series.
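The `subset` tip above amounts to evaluating candidate models on a random sample of columns and then generalizing the winners to the full set. A rough pandas-only sketch of that idea, using synthetic data (this is an illustration, not AutoTS internals):

```python
import numpy as np
import pandas as pd

# Synthetic stand-in: 5,000 similar daily series in wide format.
rng = np.random.default_rng(0)
dates = pd.date_range("2021-01-01", periods=90, freq="D")
wide = pd.DataFrame(
    rng.normal(size=(90, 5000)),
    index=dates,
    columns=[f"series_{i}" for i in range(5000)],
)

# subset=100 corresponds roughly to fitting on a random 100-column
# sample; the selected models are then applied to all columns.
sample_cols = rng.choice(wide.columns, size=100, replace=False)
subset_df = wide[sample_cols]
print(subset_df.shape)  # (90, 100)
```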
22 changes: 10 additions & 12 deletions TODO.md
@@ -15,18 +15,16 @@
* Forecasts are desired for the future immediately following the most recent data.

# Latest
* Additional models to GluonTS
* GeneralTransformer transformation_params - now handle None or empty dict
* cleaning up of the appropriately named 'ModelMonster'
* improving MotifSimulation
* better error message for all models
* enable histgradientboost regressor, left it out before thinking it wouldn't stay experimental this long
* import_template now has slightly better `method` input style
* allow `ensemble` parameter to be a list
* NumericTransformer
* add .fit_transform method
* generally more options and speed improvement
* added NumericTransformer to future_regressors, should now coerce if they have different dtypes
* Table of Contents to Extended Tutorial/Readme.md
* Production Example
* add weights="mean"/median/min/max
* UnivariateRegression
* fix check_pickle error for ETS
* fix error in Prophet with latest version
* VisibleDeprecation warning for hidden_layers random choice in sklearn fixed
* prefill_na option added to allow quick filling of NaNs if desired (with zeroes for say, sales forecasting)
* made horizontal generalization more stable
* fixed bug in VAR where failing on data with negatives

# Known Errors:
DynamicFactor holidays Exceptions 'numpy.ndarray' object has no attribute 'values'
2 changes: 1 addition & 1 deletion autots/__init__.py
@@ -16,7 +16,7 @@
from autots.tools.transform import GeneralTransformer, RandomTransform
from autots.tools.shaping import long_to_wide

__version__ = '0.3.1'
__version__ = '0.3.2'

TransformTS = GeneralTransformer

61 changes: 37 additions & 24 deletions autots/datasets/fred.py
@@ -14,23 +14,23 @@
_has_fred = True


def get_fred_data(fredkey: str, SeriesNameDict: dict = {'SeriesID': 'SeriesName'}):
"""
Imports Data from Federal Reserve
def get_fred_data(fredkey: str, SeriesNameDict: dict = None, long=True, **kwargs):
"""Imports Data from Federal Reserve.
For simplest results, make sure requested series are all of the same frequency.
args:
fredkey - an API key from FRED
SeriesNameDict, pairs of FRED Series IDs and Series Names
fredkey (str): an API key from FRED
SeriesNameDict (dict): pairs of FRED Series IDs and Series Names like: {'SeriesID': 'SeriesName'} or a list of FRED IDs.
Series id must match Fred IDs, but name can be anything
if default is use, several default samples are returned
if None, several default series are returned
long (bool): if True, return long style data, else return wide style data with dt index
"""
if not _has_fred:
raise ImportError("Package fredapi is required")

fred = Fred(api_key=fredkey)

if SeriesNameDict == {'SeriesID': 'SeriesName'}:
if SeriesNameDict is None:
SeriesNameDict = {
'T10Y2Y': '10 Year Treasury Constant Maturity Minus 2 Year Treasury Constant Maturity',
'DGS10': '10 Year Treasury Constant Maturity Rate',
@@ -44,29 +44,42 @@ def get_fred_data(fredkey: str, SeriesNameDict: dict = {'SeriesID': 'SeriesName'
'USEPUINDXD': 'Economic Policy Uncertainty Index for United States', # also very irregular
}

series_desired = list(SeriesNameDict.keys())
if isinstance(SeriesNameDict, dict):
series_desired = list(SeriesNameDict.keys())
else:
series_desired = list(SeriesNameDict)

fred_timeseries = pd.DataFrame(
columns=['date', 'value', 'series_id', 'series_name']
)
if long:
fred_timeseries = pd.DataFrame(
columns=['date', 'value', 'series_id', 'series_name']
)
else:
fred_timeseries = pd.DataFrame()

for series in series_desired:
data = fred.get_series(series)
try:
series_name = SeriesNameDict[series]
except Exception:
series_name = series
data_df = pd.DataFrame(
{
'date': data.index,
'value': data,
'series_id': series,
'series_name': series_name,
}
)
data_df.reset_index(drop=True, inplace=True)
fred_timeseries = pd.concat(
[fred_timeseries, data_df], axis=0, ignore_index=True
)

if long:
data_df = pd.DataFrame(
{
'date': data.index,
'value': data,
'series_id': series,
'series_name': series_name,
}
)
data_df.reset_index(drop=True, inplace=True)
fred_timeseries = pd.concat(
[fred_timeseries, data_df], axis=0, ignore_index=True
)
else:
data.name = series_name
fred_timeseries = fred_timeseries.merge(
data, how="outer", left_index=True, right_index=True
)

return fred_timeseries
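The updated `get_fred_data` assembles either a long frame (`date`/`value`/`series_id` rows) or a wide frame (series outer-merged on the datetime index). A minimal stand-in for that assembly logic with synthetic series, so no FRED key is needed (the values below are made up):

```python
import pandas as pd

idx = pd.date_range("2021-01-01", periods=4, freq="D")
series_data = {  # stand-ins for fred.get_series() results
    "T10Y2Y": pd.Series([1.20, 1.30, 1.25, 1.28], index=idx),
    "DGS10": pd.Series([1.50, 1.52, 1.49, 1.51], index=idx),
}

# Wide style (long=False): outer-merge each named series on the dt index.
wide_df = None
for sid, data in series_data.items():
    named = data.rename(sid).to_frame()
    wide_df = named if wide_df is None else wide_df.merge(
        named, how="outer", left_index=True, right_index=True
    )

# Long style (long=True): stack (date, value, series_id) rows.
long_df = pd.concat(
    [
        pd.DataFrame({"date": s.index, "value": s.values, "series_id": sid})
        for sid, s in series_data.items()
    ],
    ignore_index=True,
)

print(wide_df.shape)  # (4, 2)
print(long_df.shape)  # (8, 3)
```

The outer merge keeps all dates even when series have different frequencies, which is why the docstring recommends requesting series of the same frequency for the cleanest wide output.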
51 changes: 46 additions & 5 deletions autots/evaluator/auto_model.py
@@ -8,7 +8,12 @@
from autots.evaluator.metrics import PredictionEval
from autots.tools.transform import RandomTransform, GeneralTransformer, shared_trans
from autots.models.ensemble import EnsembleForecast, generalize_horizontal
from autots.models.model_list import no_params, recombination_approved, no_shared
from autots.models.model_list import (
no_params,
recombination_approved,
no_shared,
superfast,
)
from itertools import zip_longest
from autots.models.basics import (
MotifSimulation,
@@ -146,6 +151,20 @@ def ModelMonster(
**parameters,
)
return model
elif model == 'UnivariateRegression':
from autots.models.sklearn import UnivariateRegression

model = UnivariateRegression(
frequency=frequency,
prediction_interval=prediction_interval,
holiday_country=holiday_country,
random_seed=random_seed,
verbose=verbose,
n_jobs=n_jobs,
forecast_length=forecast_length,
**parameters,
)
return model

elif model == 'UnobservedComponents':
model = UnobservedComponents(
@@ -658,6 +677,7 @@ def PredictWitch(
if isinstance(template, pd.Series):
template = pd.DataFrame(template).transpose()
template = template.head(1)
full_model_created = False  # make at least one full model, for horizontal generalization only
for index_upper, row_upper in template.iterrows():
# if an ensemble
if row_upper['Model'] == 'Ensemble':
@@ -750,18 +770,25 @@ model_str = row_upper['Model']
model_str = row_upper['Model']
parameter_dict = json.loads(row_upper['ModelParameters'])
transformation_dict = json.loads(row_upper['TransformationParameters'])
# horizontal generalization needs at least one full model run on all series, in case any models failed
if model_str in superfast and not full_model_created:
make_full_flag = True
else:
make_full_flag = False
if (
horizontal_subset is not None
and model_str in no_shared
and all(
trs not in shared_trans
for trs in list(transformation_dict['transformations'].values())
)
and not make_full_flag
):
df_train_low = df_train.reindex(copy=True, columns=horizontal_subset)
# print(f"Reducing to subset for {model_str} with {df_train_low.columns}")
else:
df_train_low = df_train.copy()
full_model_created = True

df_forecast = ModelPrediction(
df_train_low,
@@ -816,6 +843,7 @@ def TemplateWizard(
'TransformationParameters',
'Ensemble',
],
traceback: bool = False,
):
"""
Take Template, returns Results.
@@ -844,6 +872,7 @@ max_generations (int): info to pass to print statements
max_generations (int): info to pass to print statements
model_interrupt (bool): if True, keyboard interrupts are caught and only break current model eval.
template_cols (list): column names of columns used as model template
traceback (bool): if True, print the full traceback rather than just the error representation
Returns:
TemplateEvalObject
@@ -1030,11 +1059,23 @@ raise KeyboardInterrupt
raise KeyboardInterrupt
except Exception as e:
if verbose >= 0:
print(
'Template Eval Error: {} in model {}: {}'.format(
(repr(e)), template_result.model_count, model_str
if traceback:
import traceback as tb

print(
'Template Eval Error: {} in model {}: {}'.format(
''.join(tb.format_exception(None, e, e.__traceback__)),
template_result.model_count,
model_str,
)
)
)
else:
print(
'Template Eval Error: {} in model {}: {}'.format(
(repr(e)), template_result.model_count, model_str
)
)

result = pd.DataFrame(
{
'ID': create_model_id(
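The new `traceback` flag in `TemplateWizard` switches the error printout from `repr(e)` to the full stack produced by `traceback.format_exception`. A minimal illustration of the difference, using a hypothetical failing model function:

```python
import traceback as tb

def failing_model():  # hypothetical model evaluation that raises
    raise ValueError("bad hyperparameters")

try:
    failing_model()
except Exception as e:
    short_msg = repr(e)  # what the verbose print shows by default
    full_msg = "".join(tb.format_exception(None, e, e.__traceback__))

print(short_msg)  # ValueError('bad hyperparameters')
# full_msg additionally names the file, line, and frame (failing_model),
# which is what makes debugging template evaluation errors easier.
```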