-
Notifications
You must be signed in to change notification settings - Fork 390
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[MRG] Add a repo provider for CKAN datasets #1833
Conversation
Thanks for submitting your first pull request! You are awesome! 🤗 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love to see the binderhub set up! Yay! Apologies for the late response.
This is a companion review to jupyterhub/repo2docker#1336 (review). And like that one, there're a couple minor stylistic issues.
But the fundamental question is one of versions and 'ref'. Should 'version' be a concept here directly specifyable, similar to 'ref' in git? I think you have enough knowledge of CKAN to help answer that.
Regardless, excited to find time to get this PR merged :)
binderhub/repoproviders.py
Outdated
client = AsyncHTTPClient() | ||
|
||
api = parsed_repo._replace( | ||
path=re.sub(self.url_regex, "/api/3/action/", parsed_repo.path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same comment about using regexes as in jupyterhub/repo2docker#1336 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks. I have made changes according to your suggestion.
binderhub/repoproviders.py
Outdated
except HTTPError: | ||
return None | ||
|
||
def parse_date(json_body): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that this function is used only once, let's just inline that here. Otherwise the function gets redefined each time the parent function is called. Also adds another layer of indirection that's otherwise not needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. I will fix this as soon as possible.
Thank you again for the review!
Please also find my comments at jupyterhub/repo2docker#1336 (comment). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apologies for the delay! I've left a review again here and on repo2docker.
"https://demo.ckan.org/dataset/sample-dataset-1", | ||
"https://demo.ckan.org/dataset/sample-dataset-1", | ||
"sample-dataset-1.v", | ||
"https://demo.ckan.org/dataset/sample-dataset-1", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some additional tests here for /history/
as well as ?activity_id=
URLs?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, added.
binderhub/repoproviders.py
Outdated
async def get_resolved_ref(self): | ||
parsed_repo = urlparse(self.repo) | ||
|
||
url_parts_1 = parsed_repo.path.split("/history/") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update this to use the parsing changes in jupyterhub/repo2docker#1336 (comment)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure and updated.
@yuvipanda Thank you for the review. I have made some modifications according to your suggestions. Please have a look. |
Steps to merge this:
No action currently needed from you, @u10313335! |
Tested this locally (on https://demo.ckan.org/dataset/sample-dataset-1) and it works! Thank you very much for your patience in getting this through, @u10313335! Look forward to more contributions from you :) |
I'll deploy this on mybinder.org shortly |
jupyterhub/binderhub#1833 Merge pull request #1833 from depositar/provider-ckan
Deployment to mybinder.org now complete, @u10313335! Thank you for your contribution :) |
Thank you @yuvipanda! |
This PR adds a repository provider for datasets on repositories built upon CKAN, which is an open-source DMS (data management system) for powering data hubs and data portals.
This PR requires a content provider for
repo2docker
: jupyterhub/repo2docker#1336.We have adopted this PR in our Binder service (as the
CKAN dataset
repository): https://binder.depositar.io/, e.g.: