Python Clocks Not returning proper intervals. #58

Open
ydesai-exos opened this issue Nov 20, 2018 · 3 comments

ydesai-exos commented Nov 20, 2018

@icexelloss

The clocks function for Flint in Python is returning incorrect intervals.

The time intervals are far larger than what I specify in the function call.

For example:

from ts.flint import clocks
clock = clocks.uniform(sqlContext, frequency="1s", offset="0ns")
clock.show()

returns

time:timestamp
+-------------------+
|               time|
+-------------------+
|1970-01-01 00:00:00|
|1970-01-01 00:16:40|
|1970-01-01 00:33:20|
|1970-01-01 00:50:00|
|1970-01-01 01:06:40|
|1970-01-01 01:23:20|
|1970-01-01 01:40:00|
|1970-01-01 01:56:40|
|1970-01-01 02:13:20|
|1970-01-01 02:30:00|
|1970-01-01 02:46:40|
|1970-01-01 03:03:20|
|1970-01-01 03:20:00|
|1970-01-01 03:36:40|
|1970-01-01 03:53:20|
|1970-01-01 04:10:00|
|1970-01-01 04:26:40|
|1970-01-01 04:43:20|
|1970-01-01 05:00:00|
|1970-01-01 05:16:40|
+-------------------+
only showing top 20 rows

The intervals should be 1 second, but the clock returns intervals of 16 minutes 40 seconds (1000 seconds).

Similarly, a frequency of 1 day returns intervals of about 2.7 years (1000 days).

from ts.flint import clocks
clock = clocks.uniform(sqlContext, frequency="1d", offset="0ns")
clock.show()
+-------------------+
|               time|
+-------------------+
|1970-01-01 00:00:00|
|1972-09-27 00:00:00|
|1975-06-24 00:00:00|
|1978-03-20 00:00:00|
|1980-12-14 00:00:00|
|1983-09-10 00:00:00|
|1986-06-06 00:00:00|
|1989-03-02 00:00:00|
|1991-11-27 00:00:00|
|1994-08-23 00:00:00|
|1997-05-19 00:00:00|
|2000-02-13 00:00:00|
|2002-11-09 00:00:00|
|2005-08-05 00:00:00|
|2008-05-01 00:00:00|
|2011-01-26 00:00:00|
|2013-10-22 00:00:00|
|2016-07-18 00:00:00|
|2019-04-14 00:00:00|
|2022-01-08 00:00:00|
+-------------------+
only showing top 20 rows
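
Both outputs above are consistent with a step that is exactly 1000 times the requested frequency. The following is a minimal sketch of that arithmetic using only the first two rows of each output; it does not call Flint, and the helper name `observed_ratio` is made up for illustration.

```python
from datetime import datetime, timedelta

# First two rows of each clock.show() output above.
one_second_rows = [datetime(1970, 1, 1, 0, 0, 0), datetime(1970, 1, 1, 0, 16, 40)]
one_day_rows = [datetime(1970, 1, 1), datetime(1972, 9, 27)]


def observed_ratio(rows, requested):
    """Ratio of the observed step between two rows to the requested step."""
    return (rows[1] - rows[0]) / requested


print(observed_ratio(one_second_rows, timedelta(seconds=1)))  # 1000.0
print(observed_ratio(one_day_rows, timedelta(days=1)))        # 1000.0
```

A factor of exactly 1000 in both cases points to a unit mix-up (for example seconds versus milliseconds) rather than a random error, although where the mix-up happens is not established here.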

Also, when I supply custom start and end times, the returned years are far out of range.

from ts.flint import clocks
clock = clocks.uniform(sqlContext, frequency="1d", offset="0ns", begin_date_time="2014-04-23", end_date_time="2015-04-23")
clock.show()
time:timestamp
+--------------------+
|                time|
+--------------------+
|46277-07-20 00:00...|
|46280-04-15 00:00...|
|46283-01-10 00:00...|
|46285-10-06 00:00...|
|46288-07-02 00:00...|
|46291-03-29 00:00...|
|46293-12-23 00:00...|
|46296-09-18 00:00...|
|46299-06-15 00:00...|
|46302-03-12 00:00...|
|46304-12-06 00:00...|
|46307-09-02 00:00...|
|46310-05-29 00:00...|
|46313-02-22 00:00...|
|46315-11-19 00:00...|
|46318-08-15 00:00...|
|46321-05-11 00:00...|
|46324-02-05 00:00...|
|46326-11-01 00:00...|
|46329-07-28 00:00...|
+--------------------+
only showing top 20 rows
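
The out-of-range years look like the same factor of 1000 applied to the start timestamp itself. Here is a rough check, assuming (this is a guess, not confirmed from the Flint source) that the begin time's offset from the Unix epoch is multiplied by 1000. Python's `datetime` cannot represent years beyond 9999, so the year is estimated from the mean Gregorian year length.

```python
from datetime import date

EPOCH = date(1970, 1, 1)
MEAN_GREGORIAN_YEAR_DAYS = 365.2425  # average year length; an approximation

# Days between the Unix epoch and the requested begin_date_time.
begin_days = (date(2014, 4, 23) - EPOCH).days

# Assumption: the offset from the epoch is scaled by 1000, like the frequency.
scaled_days = begin_days * 1000

# Estimate the calendar year that this scaled offset lands in.
approx_year = EPOCH.year + int(scaled_days / MEAN_GREGORIAN_YEAR_DAYS)

print(begin_days)    # 16183
print(approx_year)   # 46277
```

The estimate matches the first row of the output above (year 46277), which suggests the start time and the frequency are mis-scaled in the same way.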

5mdd commented Jan 24, 2019

I have exactly the same problem. It seems the clock confuses seconds with milliseconds. A workaround ("hack") is to use ms instead of s in the code; in your case, for a 1s interval:
from ts.flint import clocks
clock = clocks.uniform(sqlContext, frequency="1ms", offset="0ns")
clock.show()

or for a one-day interval:
from ts.flint import clocks
clock = clocks.uniform(sqlContext, frequency="86400ms", offset="0ns")
clock.show()
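
The two substitutions above follow one rule: express the interval you want in seconds and pass that number with the `ms` suffix. A small sketch of that rule follows; `workaround_frequency` is a hypothetical helper, not part of Flint, and it only makes sense on a build that shows the 1000x behaviour described in this issue.

```python
def workaround_frequency(desired_seconds: int) -> str:
    """Hypothetical helper: frequency string to pass on an affected build.

    Assumes the clock step comes out 1000x too large, so asking for N
    milliseconds yields an actual step of N seconds.
    """
    if desired_seconds <= 0:
        raise ValueError("desired_seconds must be positive")
    return f"{desired_seconds}ms"


print(workaround_frequency(1))             # 1ms    (for a 1 second step)
print(workaround_frequency(24 * 60 * 60))  # 86400ms (for a 1 day step)
```

The result would then be passed as `frequency=` to `clocks.uniform`. On a build where the bug is fixed, this would produce steps 1000 times too small, so it should be removed once the underlying issue is resolved.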

@LeoDashTM

Thanks for the workaround @5mdd - I'll try it out.
But of course, TwoSigma just needs to fix this!
@icexelloss ?


5mdd commented Jan 28, 2019

I forgot to mention that I am using Databricks Runtime 5.2 ML (Spark 2.4.0/Scala 2.11) with the Databricks Flint JAR:
flint_0_6_0_databricks.jar
