Implement hints for filetype detection #1

Open
PeterKraus opened this issue May 14, 2024 · 0 comments

@PeterKraus
Contributor

This issue is a follow-up of marda-alliance/metadata_extractors_schema#45.

In marda-alliance/metadata_extractors_schema#48, we have implemented the associated_file_extensions slot in the FileType schema, to specify some metadata that can be used to match files to FileTypes.

However, further hints could be included, such as common MIME types or magic bytes (file signatures). This idea needs a bit of planning work.
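
To make the idea concrete, here is a minimal sketch of how such hints might be combined when matching a file. Only `associated_file_extensions` exists in the schema today; the `mime_types` and `magic_bytes` fields, the `FileTypeHints` class, and the `match_score` function are hypothetical names used for illustration, not a proposed design.

```python
import mimetypes
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FileTypeHints:
    """Hypothetical hint container for one FileType."""
    id: str
    # Existing slot in the FileType schema.
    associated_file_extensions: list[str] = field(default_factory=list)
    # Assumed slots, not part of the schema.
    mime_types: list[str] = field(default_factory=list)
    magic_bytes: list[bytes] = field(default_factory=list)  # expected file prefixes


def match_score(filename: str, head: bytes, hints: FileTypeHints) -> int:
    """Count how many kinds of hint agree with the file (0 to 3)."""
    score = 0

    # Extension hint: compare without the leading dot, case-insensitively.
    suffix = Path(filename).suffix.lower().lstrip(".")
    known = {e.lower().lstrip(".") for e in hints.associated_file_extensions}
    if suffix and suffix in known:
        score += 1

    # MIME hint: guessed from the file name by the standard library.
    guessed_mime, _ = mimetypes.guess_type(filename)
    if guessed_mime is not None and guessed_mime in hints.mime_types:
        score += 1

    # Magic-bytes hint: the first bytes of the file start with a known signature.
    if any(head.startswith(magic) for magic in hints.magic_bytes):
        score += 1

    return score


# Example: a generic HDF5 container, identified by its 8-byte signature.
hdf5 = FileTypeHints(
    id="hdf5-container",  # hypothetical identifier
    associated_file_extensions=["h5", "hdf5"],
    mime_types=["application/x-hdf5"],
    magic_bytes=[b"\x89HDF\r\n\x1a\n"],
)
```

A caller would read the first few bytes of the uploaded file, compute `match_score` against every registered FileType, and keep the highest-scoring candidates. Whether hints should be weighted or merely counted is one of the things that would need planning.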

See also here:

This is my use case. If someone uploads an arbitrary file to my ELN and I have a whole registry of tools to process it, the ELN still needs to figure out which tool to use. Identifying the FileType would give you the connection. Otherwise, I need to rely on the source (e.g. user) to tell me the type.

To apply a tool, the ELN needs to figure out the FileType one way or another. This is why you ask for a FileType identifier, right? Maybe it is difficult, but if you agree that it is a valid use case, why wait for the next MaRDA WG to figure it out? I am not sure how additional information would reduce the usefulness.

Let's say we are not using the registry to identify FileTypes. The tools in the registry still need to somehow tell what their intended input FileType is. And it ought to be more specific than JSON, HDF5, csv, etc. Why not describe the FileType by characteristics that would help to identify a file's type?

Originally posted by @markus1978 in marda-alliance/metadata_extractors_schema#9 (comment)

@PeterKraus PeterKraus added this to the 2.0 milestone May 14, 2024