Supporting data access to Hugging Face datasets #964
Comments
This writeup does not give me enough to act on or prioritize. Please provide specifics. Here are some things you may want to address (your writeup does not have to be limited to those): the use case you want to enable, which dataset you need to access, whether you need to write data back to HF, which other methods you have used to access (read or write) the data, and why what you are proposing will work better than the methods you tried before.
I'll spell out the most important use cases for Open Trusted Data Initiative (OTDI):
I would think it is obvious that users need direct R/W access to the world's most important AI dataset repository. Without this feature, the Alliance will either have to fork DPK to use it or adopt an alternative. cc: @nirmdesai
@deanwampler Thanks. There is ample documentation out there showing how you can do all 3 using the Hugging Face APIs. A few questions so we can add clarity on what you are trying to do: 1- Where have you used the HF APIs in your recipe, and where did they fail to let you realize your objectives for ingesting the initial data and then writing it back at the end? Any clarification on how you tried in the past to solve the problem using readily available tools would be helpful.
Search before asking
Component
Library/core
Feature
Currently, DPK supports two data location options: the local file system and S3-compatible storage. At the same time, one of the largest collections of public datasets is the HF Hub. Natively supporting data there opens up many capabilities for users and is also quite important for the AI Alliance.
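To make the request more concrete, here is a minimal sketch of how a third location type could sit alongside the two existing ones. It is not DPK code: `DataLocation` and `resolve_location` are hypothetical names invented for this illustration, and the repository names in the usage note are made up. The only outside fact it relies on is that `huggingface_hub` exposes an fsspec filesystem under the `hf://` protocol, with dataset paths of the form `datasets/<repo_id>/<path>`.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass(frozen=True)
class DataLocation:
    """Hypothetical description of where a dataset lives (not a DPK type)."""
    backend: str  # one of "local", "s3", "hf"
    path: str     # backend-specific path, without the URI scheme


def resolve_location(uri: str) -> DataLocation:
    """Classify a location string by its URI scheme.

    Assumption: plain paths and file:// URIs mean the local file system,
    s3:// means S3-compatible storage, and hf:// means the HF Hub.
    Windows drive paths such as C:\\data are not handled by this sketch.
    """
    parsed = urlparse(uri)
    if parsed.scheme in ("", "file"):
        return DataLocation("local", parsed.path)
    if parsed.scheme == "s3":
        # netloc is the bucket; path is the key prefix
        return DataLocation("s3", f"{parsed.netloc}{parsed.path}")
    if parsed.scheme == "hf":
        # netloc is the repo type (e.g. "datasets"); path is repo id + file
        return DataLocation("hf", f"{parsed.netloc}{parsed.path}")
    raise ValueError(f"Unsupported data location scheme: {parsed.scheme!r}")
```

With this shape, `resolve_location("hf://datasets/some-org/some-dataset/train.parquet")` would be routed to an HF-backed reader or writer, while existing local and `s3://` locations keep working unchanged. The actual I/O (for example via `huggingface_hub.HfFileSystem`) is deliberately left out of the sketch.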
Are you willing to submit a PR?