Features for multivariate time series, especially with a mixture of categorical and continuous values #716

Open
Sandy4321 opened this issue Jun 17, 2020 · 1 comment

Comments

@Sandy4321

What about features for multivariate time series, especially with a mixture of categorical and continuous values?
Can you share such a dataset (train and test) together with the performance of your code on it?

For example, a multivariate time series with one table per label,
e.g. where the target is YES:
date f1 f2 f3
dec 0.1 a 234
jan -0.5 a 456
feb 3.4 b 123
march 0.6 b 678

and where the target is NO:
date f1 f2 f3
dec -0.1 c 1234
jan 0.5 a 4456
feb 2.4 g 2123
march 1.6 b 6678

@nils-braun
Collaborator

Hello @Sandy4321
At the moment, tsfresh can only handle numerical values.
Some of our feature extractors might also work well with categorical columns (such as everything related to counting values), but our full pipeline was really built for numerical values.

What you could try is to transform the categorical values into numbers and apply the feature extraction to them, but many extractors will produce nonsense (what is the mean of that column?). It would be interesting to see anyway.
I do not have any performance numbers to show here, also because the real performance heavily depends on the ML method you use after that.
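
A minimal sketch of the encoding idea above, using only pandas and the example rows from the question. The long-format layout, the `id` and `time` column names, and the `f2_code`, `f2_count_*` and `f2_n_unique` feature names are assumptions made for illustration; they are not part of tsfresh. The sketch only shows the two options mentioned: integer codes for the categorical column, and counting-style features that stay meaningful for categories.

```python
import pandas as pd

# Example rows from the question in long format: one row per (sample id, time step).
# "id" and "time" are hypothetical column names chosen for this sketch.
df = pd.DataFrame({
    "id":   [1, 1, 1, 1, 2, 2, 2, 2],
    "time": [0, 1, 2, 3, 0, 1, 2, 3],   # dec, jan, feb, march
    "f1":   [0.1, -0.5, 3.4, 0.6, -0.1, 0.5, 2.4, 1.6],
    "f2":   ["a", "a", "b", "b", "c", "a", "g", "b"],
    "f3":   [234, 456, 123, 678, 1234, 4456, 2123, 6678],
})
# One target label per sample id.
y = pd.Series({1: "YES", 2: "NO"})

# Option 1: map each category to an integer code.
# The codes are arbitrary labels, so order-based features
# (mean, trend, ...) computed on this column carry no real meaning.
df["f2_code"] = df["f2"].astype("category").cat.codes

# Option 2: counting-style features, which remain meaningful for categories.
f2_counts = pd.crosstab(df["id"], df["f2"]).add_prefix("f2_count_")
f2_unique = df.groupby("id")["f2"].nunique().rename("f2_n_unique")
cat_features = f2_counts.join(f2_unique)

print(cat_features)
```

After dropping the string column `f2`, the remaining numeric frame should be in a shape that could be passed to `tsfresh.extract_features(..., column_id="id", column_sort="time")`, keeping in mind that most features computed on `f2_code` will not be interpretable.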
