@vringar vringar commented Apr 9, 2021

This parameter filters out VisitIds that are part of
incompleted_visits or that had a command with a command_status other than
"ok", since users probably shouldn't consider them for analysis.

This filtering functionality is extracted into the TableFilter class to
be reused by other Datasets.
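
For illustration only, here is a minimal sketch of the kind of filtering described above. It is not the code from this PR: the class name `TableFilterSketch`, the `clean_table` method, the `crawl_history` table name, and the `visit_id` column name are all assumptions made for the example, and rows are modelled as plain dicts rather than whatever table type the Datasets actually use.

```python
from typing import Dict, Iterable, List, Set

Row = Dict[str, object]


class TableFilterSketch:
    """Hypothetical stand-in for the TableFilter described above."""

    def __init__(
        self, incomplete_visits: Iterable[Row], crawl_history: Iterable[Row]
    ) -> None:
        # visit_ids that were recorded as incomplete (assumed column name)
        incomplete: Set[object] = {row["visit_id"] for row in incomplete_visits}
        # visit_ids with at least one command whose status is not "ok"
        failed: Set[object] = {
            row["visit_id"]
            for row in crawl_history
            if row["command_status"] != "ok"
        }
        self.excluded: Set[object] = incomplete | failed

    def clean_table(self, rows: Iterable[Row]) -> List[Row]:
        # Keep only rows whose visit_id was not excluded
        return [row for row in rows if row["visit_id"] not in self.excluded]


# Toy data: visit 2 is incomplete, visit 3 had a failing command
incomplete_visits = [{"visit_id": 2}]
crawl_history = [
    {"visit_id": 1, "command_status": "ok"},
    {"visit_id": 2, "command_status": "ok"},
    {"visit_id": 3, "command_status": "timeout"},
]
http_requests = [
    {"visit_id": 1, "url": "https://example.com/a"},
    {"visit_id": 2, "url": "https://example.com/b"},
    {"visit_id": 3, "url": "https://example.com/c"},
]

table_filter = TableFilterSketch(incomplete_visits, crawl_history)
kept = table_filter.clean_table(http_requests)
```

Building the set of excluded ids once and reusing it for every table is what would make such a filter shareable between Datasets; only the row-level check differs per table.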

Stefan Zabka and others added 2 commits April 9, 2021 12:38
Downloading files via the SparkContext was much slower than
downloading via boto (which is what S3Dataset does).
Both classes now use the same method, since PySparkS3Dataset
inherits from S3Dataset.
This parameter filters out VisitIds that are part of
`incompleted_visits` or that had a command with a command_status other than
"ok", since users probably shouldn't consider them for analysis.

This filtering functionality is extracted into the TableFilter class to
be reused by other Datasets.
@vringar vringar requested a review from englehardt April 9, 2021 10:56