In my case, my image files are stored on disk in a content-addressable manner, mimicking how Git stores and names files. E.g. a JPEG file would typically be stored as /var/misc/images/1f/ec4f5cee029f96c1e9eddd09821a51c0a9f80a.
Problem
The problem lies in the CVAT engine's MIME type detection, which is based on file extensions:
    def _is_image(path):
        mime = mimetypes.guess_type(path)
        # Exclude vector graphic images because Pillow cannot work with them
        return mime[0] is not None and mime[0].startswith('image') and \
            not mime[0].startswith('image/svg')
tl;dr
In my case, all the uploaded image files get ignored.
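To make the failure concrete, here is a small check that uses only the Python standard library. The first path is the example from this issue; the `.jpg` variant is added purely for contrast and is not a real file.

```python
import mimetypes

# Content-addressed name from the example above: no extension at all.
hashed_path = "/var/misc/images/1f/ec4f5cee029f96c1e9eddd09821a51c0a9f80a"
# Same name with an extension appended (illustrative only).
renamed_path = hashed_path + ".jpg"

# guess_type() only inspects the name; it never opens the file.
hashed_guess = mimetypes.guess_type(hashed_path)
renamed_guess = mimetypes.guess_type(renamed_path)

print(hashed_guess)   # (None, None)
print(renamed_guess)  # an image type is reported once the extension is present
```

Since the first element is None for the hashed name, a check of the form `mime[0] is not None and ...` rejects the file before its content is ever looked at.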
Describe the solution you'd like
I think it would be great if MIME type detection could be expanded to support magic detection (file headers), e.g. using https://github.com/ahupp/python-magic or anything equivalent. In other words, detection should not be limited to file extensions (.jpg, etc.).
NB: I am talking about images, but the same could be done for other media types, of course.
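As a rough sketch of what header-based detection could look like, the snippet below keeps the extension check as the fast path and only reads the first few bytes when the extension gives no answer. The names `sniff_image_mime` and `is_image_with_fallback` are hypothetical and not part of CVAT, and the signature table covers just a handful of formats; it is an assumption-laden illustration, not a proposed patch.

```python
import mimetypes

# Leading byte signatures ("magic numbers") for a few common raster formats.
# This table is deliberately small; a real detector would cover more types.
_SIGNATURES = (
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
)


def sniff_image_mime(header):
    """Return an image MIME type for the given leading bytes, or None."""
    for signature, mime in _SIGNATURES:
        if header.startswith(signature):
            return mime
    # WebP is a RIFF container: "RIFF", 4 size bytes, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "image/webp"
    return None


def is_image_with_fallback(path):
    """Try the file extension first, then fall back to the file header."""
    mime = mimetypes.guess_type(path)[0]
    if mime is None:
        # Only files without a recognised extension pay the cost of a read.
        with open(path, "rb") as f:
            mime = sniff_image_mime(f.read(12))
    # Same SVG exclusion as the existing extension-based check.
    return mime is not None and mime.startswith("image") \
        and not mime.startswith("image/svg")
```

Because the header is read only when the extension lookup returns nothing, files that already carry an extension would take the same path as today.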
Describe alternatives you've considered
I am forced to rename the files (add an extension) at upload time as a workaround.
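For illustration, the workaround can be scripted roughly as follows. `stage_with_extension` is a hypothetical helper, and it assumes the caller already knows the real format of each file (here `.jpg` by default); it copies rather than renames so the content-addressed store stays untouched.

```python
import shutil
from pathlib import Path


def stage_with_extension(src, staging_dir, extension=".jpg"):
    """Copy src into staging_dir under its original name plus an extension.

    The extension is an assumption supplied by the caller; nothing here
    verifies that the file content matches it.
    """
    staging_dir = Path(staging_dir)
    staging_dir.mkdir(parents=True, exist_ok=True)
    dst = staging_dir / (Path(src).name + extension)
    shutil.copyfile(src, dst)
    return dst
```

The returned path can then be passed to the upload call in place of the original one.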
Additional context
No response
python-magic is significantly slower. We used it in the past, but it was decided to work with extensions.
Right, that's a drawback.
Additionally, it will not work well with cloud storage, as CVAT would need to download the file content, which is much, much slower.
True (perhaps the Content-Type (HTTP header) and/or HEAD requests could be leveraged here - not sure how it's being handled right now).
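A minimal sketch of the Content-Type idea, using only the standard library. Both function names are hypothetical, and whether a given storage backend returns a meaningful Content-Type for a HEAD request is an assumption (many objects are served as a generic binary type), so this would at best complement the extension check.

```python
import urllib.request


def is_image_content_type(content_type):
    """Apply the same image / non-SVG rule to a Content-Type header value."""
    if not content_type:
        return False
    # Drop parameters such as "; charset=..." and normalise case.
    media_type = content_type.split(";", 1)[0].strip().lower()
    return media_type.startswith("image/") \
        and not media_type.startswith("image/svg")


def head_content_type(url, timeout=10.0):
    """Fetch only the response headers of url and return Content-Type."""
    request = urllib.request.Request(url, method="HEAD")
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.headers.get("Content-Type")
```

A HEAD request avoids downloading the body, which addresses the download cost raised above, though it still adds one round trip per file.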
For context: when using the FiftyOne built-in CVAT integration, this even turns into a bug as _get_job_ids polls forever (and no job is ever returned).
Context
I am uploading image files via https://app.cvat.ai/api/docs/#tag/tasks/operation/tasks_create_data (using the client_files parameter).
The extension-based MIME type detection is in cvat/cvat/apps/engine/task.py (lines 215 to 231 in f93d58c) and cvat/cvat/apps/engine/media_extractors.py (lines 859 to 863 in f93d58c). E.g. is_image builds upon https://docs.python.org/3/library/mimetypes.html#mimetypes.guess_type.