No rolling statistics #180

Open
NoahSlv opened this issue Aug 2, 2016 · 4 comments

Comments

@NoahSlv

NoahSlv commented Aug 2, 2016

I have a simple time series of a single column. About 20,000 measurements, once every 5 minutes.

Using featurize.featurize_time_series with some built-in features (median, minimum, maximum, etc.), I get back ONE SINGLE measurement for the entire time series.

Most other time series libraries I've used will generate a series of measurements, often with a user-specified lag, so that a predictive model can be built. Does cesium have similar functionality?

Thanks!

@profjsb
Contributor

profjsb commented Aug 2, 2016

@NoahSlv It might be helpful to learn more about your use case. We've not explicitly built a time series forecasting engine, which is what your question implies. Instead, the featurization part of the codebase is used to transform the input time series into an array of features, which in turn can be used to train a (supervised) classifier. That is:

 f_n(ts) -> Real  (for each feature n)

This leads to a feature vector of size n. What you are suggesting is something like

f_n(ts) -> Real^m (where m is the length of the output vector, say a smoothed version of the original time series)

While in principle one could create a feature vector of size m x n, this isn't normally how time series are featurized for classification.

If you can tell us a bit more about your use case we can take it from there. Thanks!
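
The distinction between the two mappings above can be illustrated with a minimal sketch. It uses pandas and NumPy rather than cesium, and the synthetic series, the one-hour window, and all variable names are assumptions made for illustration; it is not how cesium computes its features.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the series in the question:
# 20,000 samples taken 5 minutes apart (values are random).
rng = np.random.default_rng(0)
index = pd.date_range("2016-01-01", periods=20_000, freq="5min")
ts = pd.Series(rng.normal(size=len(index)), index=index)

# f_n(ts) -> Real: each feature collapses the whole series to one number.
scalar_features = {
    "median": float(ts.median()),
    "minimum": float(ts.min()),
    "maximum": float(ts.max()),
}

# f_n(ts) -> Real^m: each feature is evaluated over a sliding window,
# so it yields one value per time step (NaN until the window is full).
window = 12  # 12 samples x 5 minutes = 1 hour (illustrative choice)
rolling_features = pd.DataFrame({
    "median": ts.rolling(window).median(),
    "minimum": ts.rolling(window).min(),
    "maximum": ts.rolling(window).max(),
})
```

Here `scalar_features` holds one value per feature, while `rolling_features` has one row per time step and one column per feature, which is the m x n layout mentioned above.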

@stefanv
Contributor

stefanv commented Oct 6, 2016

@NoahSlv Please let us know your thoughts on the above. Thanks!

@sfrodrigues

Hey guys,

many thanks for this very nice tool!

I have a similar case to what NoahSlv was describing: I have a time series dataset like:

Index, Feature_1, ...., Feature_n, Label
2015-01-01, 2.4, ..., 2.7, 3
2015-01-02, 2.2, ..., 2.2, 4
2015-01-03, 2.3, ..., 2.5, 2

And I would like to extract features from it. However, I'm not looking for a single array of features for the entire dataset. I want to extract features for each of the rows while only using info that is known at each row, i.e. only using that row and the previous ones (time dependency).

So that after I would have something like:

Index, Feature_1, ...., Feature_n, New_Feature_1, ...., New_Feature_n, Label
2015-01-01, 2.4, ..., 2.7, 3.4, ..., 1.7, 3
2015-01-02, 2.2, ..., 2.2, 3.2, ..., 7.7, 4
2015-01-03, 2.3, ..., 2.5, 2.5, ..., 2.8, 2

Is it possible to do this with cesium? Or are you planning on expanding the tool to allow it?

Cheers
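
A per-row extraction like the one requested above can be approximated with pandas alone. The following is a minimal sketch, not a cesium feature: the two feature columns, the window length of 2, and the suffixes on the new column names are hypothetical choices made for illustration.

```python
import pandas as pd

# Hypothetical data in the layout from the comment above
# (only two feature columns are shown; names are illustrative).
df = pd.DataFrame(
    {
        "Feature_1": [2.4, 2.2, 2.3],
        "Feature_2": [2.7, 2.2, 2.5],
        "Label": [3, 4, 2],
    },
    index=pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
)
feature_cols = ["Feature_1", "Feature_2"]

# expanding() at row t covers rows 0..t, so no future information leaks in.
expanding_mean = df[feature_cols].expanding().mean().add_suffix("_expanding_mean")

# rolling(2) at row t covers rows t-1..t; min_periods=1 avoids NaN on the first row.
rolling_max = df[feature_cols].rolling(2, min_periods=1).max().add_suffix("_rolling_max")

# shift(1) exposes the previous row's value (a lag-1 feature);
# the first row has no predecessor, so it is NaN.
lag_1 = df[feature_cols].shift(1).add_suffix("_lag_1")

# Original features, then the new causal features, then the label.
result = pd.concat(
    [df[feature_cols], expanding_mean, rolling_max, lag_1, df["Label"]],
    axis=1,
)
```

Each new column at a given row depends only on that row and earlier ones, which matches the time-dependency constraint described in the comment.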

@bnaul
Contributor

bnaul commented Mar 21, 2017

@sfragosorodrig this is not something we currently support but we have considered adding something along those lines; it's not yet being actively developed, though.
