Add uploading of datasets. #52
Comments
Comments/Suggestions
Hi @smartcaveman, thanks for your comments. Let me answer along your quotes:
Indeed, that's why the data source specification demands the output to be a
Will do. Thanks.
Some of our datasets, like the census, won't change, so it was already considered (although not written in the issue) that some data sources shouldn't run at every execution. However, I will make sure that these, let's call them "static", data_sources are also executed when their code is updated.
This is an interesting remark. Yesterday I had a call with Anton regarding this issue, and I have had that in mind while thinking about the design of the solution.
I will check them in more detail during the weekend if I have time. They definitely look really interesting.
@ManuelAlvarezC sorry, my comment was ambiguous. Re: "To simplify this comparison, store the hash for source files.", I was referring to source code and data source files. So, if you download a large dataset that's expensive to process, and the hash of the dataset is the same as it was last time it was processed, then it doesn't need to be reprocessed.
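A minimal sketch of that idea, not taken from the project's code: it hashes each tracked file (source code or downloaded dataset) with SHA-256 and compares the result against hashes saved from the previous run. The function names `file_sha256`, `needs_processing` and `save_hashes`, the JSON cache file, and the example file names are all hypothetical.

```python
import hashlib
import json
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Chunked reads keep memory use flat for large datasets.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_processing(paths, cache_file):
    """Return (changed, hashes) for the tracked files.

    `changed` is True when any hash differs from the cached one,
    or when no cache exists yet.
    """
    cache_file = Path(cache_file)
    previous = json.loads(cache_file.read_text()) if cache_file.exists() else {}
    current = {str(p): file_sha256(p) for p in paths}
    return current != previous, current


def save_hashes(hashes, cache_file):
    """Persist the hashes so the next run can compare against them."""
    Path(cache_file).write_text(json.dumps(hashes, indent=2, sort_keys=True))


# Hypothetical usage inside a data source run:
#   changed, hashes = needs_processing(["census.csv", "census_source.py"], "hashes.json")
#   if changed:
#       ...  # run the expensive processing step
#       save_hashes(hashes, "hashes.json")
```

Saving the hashes only after processing succeeds means a failed run is retried next time instead of being skipped.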
Description
Trello card: https://trello.com/c/tb08vrGi
We need to upload the datasets generated by our data sources to make them easily accessible to other teams.
To do so, we need to: