[FR] Loading/uploading dataset with CVAT docker container's "share" mounted drive #5461

Open
lejrn opened this issue Feb 4, 2025 · 1 comment
Labels
feature Work on a feature request

lejrn commented Feb 4, 2025

I would like to request an official way for CVAT to load images directly from the "share" mounted drive of the CVAT Docker container when using FiftyOne for annotation tasks, rather than uploading (copying) them from the local drive to the container each time.

Motivation

Use Case: When working with large datasets, uploading files from the local drive to the CVAT Docker container every time an annotation task is created can be time-consuming and redundant, especially if the data already exists in a shared mounted directory.

Value to FiftyOne Users:
This would significantly reduce the time and resources required for setting up annotation tasks, improving workflow efficiency for users handling large datasets.

Value to My Project/Organization:
My project has dozens of GB of data that I would not need to re-upload every time (which takes a long time), since the dataset could be accessed in seconds through the mounted shared drive.

Current Difficulty:
Currently, the files are uploaded to CVAT even if they are already available in the Docker container via a shared mount. There is no documented way to instruct CVAT to reference these files directly, which leads to unnecessary data transfer.

What areas of FiftyOne does this feature affect?
The annotate() method, for example.

Details

I came across a related issue here: voxel51/fiftyone#1235, which discusses a similar challenge.

The solution proposed in that thread involves modifying the fiftyone.utils.cvat module to allow CVAT to use server_files when referencing images from the shared directory.
However, I am not sure which of the implementations discussed there would work best, and there may already be a newer built-in solution that resolves this exact issue.
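
To make the server_files idea concrete, here is a minimal sketch of how local file paths might be translated into share-relative paths before being sent to CVAT. It is not FiftyOne's implementation: the helper names `to_share_paths` and `build_task_data_payload`, the example mount point, and the default `image_quality` value are all assumptions for illustration. It also assumes that the local dataset paths and the container's share point at the same directory, and that `server_files` expects paths relative to the share root.

```python
from pathlib import PurePosixPath


def to_share_paths(filepaths, share_root):
    """Map absolute paths to paths relative to the share root (hypothetical helper).

    Raises ValueError if any file lies outside share_root, since such a file
    could not be referenced from the share and would still need an upload.
    """
    root = PurePosixPath(share_root)
    return [str(PurePosixPath(p).relative_to(root)) for p in filepaths]


def build_task_data_payload(filepaths, share_root, image_quality=75):
    """Build a request body that references files by path instead of uploading them.

    Assumption: the task data request accepts a `server_files` list of
    share-relative paths alongside `image_quality`.
    """
    return {
        "image_quality": image_quality,
        "server_files": to_share_paths(filepaths, share_root),
    }


if __name__ == "__main__":
    # Example mount point; replace with the directory mounted as the CVAT share.
    payload = build_task_data_payload(
        ["/mnt/share/dataset/img_001.jpg", "/mnt/share/dataset/img_002.jpg"],
        share_root="/mnt/share",
    )
    print(payload)
```

Failing loudly on files outside the share is a deliberate choice in this sketch: silently falling back to an upload would hide exactly the cost this request is trying to avoid.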

Willingness to contribute

  • Yes. I would be willing to contribute this feature with guidance from the FiftyOne community
@lejrn lejrn added the feature Work on a feature request label Feb 4, 2025

lejrn commented Feb 12, 2025

Alright, just a thought:
FiftyOne seems to rely mostly on the CVAT API directly, whereas the CVAT SDK already offers a streamlined way to choose the resource type, such as the shared drive. Perhaps this feature should be considered as part of a broader change to have FiftyOne use the CVAT SDK instead of the raw CVAT API, which I believe would also simplify more of FiftyOne's backend.

Example of using CVAT SDK:

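    from cvat_sdk.core.proxies.tasks import ResourceType

    # client, task_spec and image_files are assumed to be defined elsewhere
    task = client.tasks.create_from_data(
        spec=task_spec,
        resources=image_files,
        resource_type=ResourceType.SHARE,  # Reference shared files
    )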
