No rolling statistics #180

NoahSlv · 2016-08-02T04:28:10Z

I have a simple time series of a single column. About 20,000 measurements, once every 5 minutes.

Using featurize.featurize_time_series with some built-in features (median, minimum, maximum, etc.), I get back ONE SINGLE measurement for the entire time series.

Most other time series libraries I've used will generate a series of measurements, often with user specified lag, so that prediction can be built. Does cesium have similar functionality?

Thanks!

profjsb · 2016-08-02T18:32:48Z

@NoahSlv It might be helpful to learn more about your use case. We've not explicitly built a time series forecast engine, which is what your question implies. Instead, the featurization part of the codebase is used to transform the input time series into an array of features, which in turn can be used to learn a (supervised) classifier. That is:

 f_n(ts) -> Real  (for each feature n)

This leads to a feature vector of size n. What you are suggesting is something like

f_n(ts) -> Real^m (where m is the output vector, say a smoothed version of the original time series)

While in principle one could create a feature vector of size m x n, this isn't normally how time series are featurized for classification.
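
To make the distinction concrete, here is a minimal sketch in plain Python (no cesium calls; the helper names `whole_series_features` and `rolling_feature` are hypothetical, made up for illustration). The first function maps a whole series to one number per feature, the second returns one value per time step from a trailing window:

```python
from statistics import median


def whole_series_features(values):
    """One number per feature for the entire series: f_n(ts) -> Real."""
    return {
        "median": median(values),
        "minimum": min(values),
        "maximum": max(values),
    }


def rolling_feature(values, window, func):
    """One number per time step: f_n(ts) -> Real^m.

    Each output uses only the current point and up to `window - 1`
    points before it (a trailing window).
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        out.append(func(values[start:i + 1]))
    return out


ts = [2.0, 4.0, 1.0, 3.0, 5.0]
summary = whole_series_features(ts)                   # three scalars in total
rolled_max = rolling_feature(ts, window=3, func=max)  # one value per point
```

With n features and a series of length m, the second form yields the m x n array mentioned above rather than a vector of size n.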

If you can tell us a bit more about your use case we can take it from there. Thanks!

stefanv · 2016-10-06T21:13:30Z

@NoahSlv Please let us know your thoughts on the above. Thanks!

sfrodrigues · 2017-03-19T17:35:24Z

Hey guys,

many thanks for this very nice tool!

I have a similar case to what NoahSlv was describing: I have a time series dataset like:

Index, Feature_1, ...., Feature_n, Label
2015-01-01, 2.4, ..., 2.7, 3
2015-01-02, 2.2, ..., 2.2, 4
2015-01-03, 2.3, ..., 2.5, 2

And I would like to extract features from it. However, I'm not looking for a single array of features for the entire dataset. I want to extract features for each of the rows while only using info that is known at each row, i.e. only using that row and the previous ones (time dependency).

So that after I would have something like:

Index, Feature_1, ...., Feature_n, New_Feature_1, ...., New_Feature_n, Label
2015-01-01, 2.4, ..., 2.7, 3.4, ..., 1.7, 3
2015-01-02, 2.2, ..., 2.2, 3.2, ..., 7.7, 4
2015-01-03, 2.3, ..., 2.5, 2.5, ..., 2.8, 2

Is it possible to do this with cesium? Or are you planning on expanding the tool to allow it?

Cheers

bnaul · 2017-03-21T20:30:00Z

@sfragosorodrig this is not something we currently support but we have considered adding something along those lines; it's not yet being actively developed, though.
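
In the meantime, per-row features of this kind can be computed outside cesium. The following is only a sketch, assuming the data fits in a pandas DataFrame; the function name `add_trailing_features`, the `_roll_mean` / `_roll_max` column suffixes, and the window size are illustrative choices, not part of cesium:

```python
import pandas as pd


def add_trailing_features(df, columns, window):
    """Append rolling mean/max columns for each column in `columns`.

    pandas rolling windows are right-aligned by default, so each row
    sees only itself and earlier rows. min_periods=1 lets the first
    rows use shorter windows instead of producing NaN.
    """
    out = df.copy()
    for col in columns:
        rolled = out[col].rolling(window=window, min_periods=1)
        out[f"{col}_roll_mean"] = rolled.mean()
        out[f"{col}_roll_max"] = rolled.max()
    return out


# Toy data shaped like the example above (values are illustrative).
df = pd.DataFrame(
    {"Feature_1": [2.4, 2.2, 2.3], "Label": [3, 4, 2]},
    index=pd.to_datetime(["2015-01-01", "2015-01-02", "2015-01-03"]),
)
result = add_trailing_features(df, ["Feature_1"], window=2)
```

If a row's features must exclude that row's own values (for example when predicting its label), applying `.shift(1)` to the rolled columns restricts them to strictly earlier rows.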
