add path.mimeType() and path.charset() to node:path #54595

Open
aralroca opened this issue Aug 27, 2024 · 6 comments
Labels
feature request Issues that request new features to be added to Node.js. path Issues and PRs related to the path subsystem.

Comments

@aralroca

What is the problem this feature will solve?

Adding path.mimeType() and path.charset() would remove the need for an external library, mime-types, to get the content type of a file based on its extension. Node.js could do this itself, just as Bun.js does with Bun.file(file).type.

What is the feature you are proposing to solve the problem?

I am proposing to add two new methods to the path module:

  • mimeType - returns the MIME type of a file based on its extension.
  • charset - returns the charset of a MIME type.

path.mimeType('file.js'); // returns 'application/javascript'
// or:
path.mimeType(path.extname('file.js')); // returns 'application/javascript'

And:

path.charset('file.js'); // returns 'utf-8'
// or:
path.charset(path.mimeType('file.js')); // returns 'utf-8'

What alternatives have you considered?

Alternatively, instead of living in node:path, this could be part of fs.stat. However, I think the information can be derived from the path without having to parse the file. Separate functions are also preferable, because the work is done only when each one is actually called.

CC: @vdeturckheim

@aralroca aralroca added the feature request Issues that request new features to be added to Node.js. label Aug 27, 2024
@github-project-automation github-project-automation bot moved this to Awaiting Triage in Node.js feature requests Aug 27, 2024
@aralroca aralroca changed the title provide mime-type and charset information of a file add path.mimeType() and path.charset() to node:path Aug 27, 2024
@jasnell
Member

jasnell commented Aug 27, 2024

For path.mimeType(...) a couple of questions...

  1. Would you expect the actual path to be resolved and for the mime type to be sniffed from the contents or based entirely on the file extension?
  2. Which master list of mime types to file extensions do you propose we use and what is the source of that? How is it maintained?

For path.charset('...') ... the only way to accomplish this is to actually try reading the file. I don't believe that belongs in the path module but possibly in fs. Even then it can be a bit problematic to reliably determine the encoding of a file. Would it be based entirely on the presence of a BOM? Would it be limited to just the charset encodings we already know of (e.g. utf8, utf16le, etc)? Given cases where the charset is ambiguous (e.g. utf8 without a bom that only contains ascii characters could be utf8, ascii, or latin1), what would be the expected result?
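To make the ambiguity concrete, here is a minimal sketch of BOM-based detection. The helper name sniffBom is hypothetical and is not part of any Node.js module; it only looks at the first few bytes of a buffer and returns null when no BOM is present, which is the case the question above is about.

```javascript
// Hypothetical helper for illustration; Node.js has no built-in sniffBom().
function sniffBom(bytes) {
  if (bytes.length >= 3 && bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) {
    return 'utf8';
  }
  if (bytes.length >= 2 && bytes[0] === 0xff && bytes[1] === 0xfe) {
    return 'utf16le'; // simplification: a UTF-32LE BOM also starts with FF FE
  }
  if (bytes.length >= 2 && bytes[0] === 0xfe && bytes[1] === 0xff) {
    return 'utf16be';
  }
  return null; // no BOM: the encoding cannot be determined this way
}

const withBom = Buffer.from([0xef, 0xbb, 0xbf, 0x68, 0x69]); // UTF-8 BOM + "hi"
const asciiOnly = Buffer.from('hello', 'utf8'); // no BOM, ASCII-only content

console.log(sniffBom(withBom)); // 'utf8'
console.log(sniffBom(asciiOnly)); // null

// ASCII-only bytes decode to the same string under utf8, latin1, and ascii,
// so the content alone cannot tell these encodings apart.
const sameText =
  asciiOnly.toString('utf8') === asciiOnly.toString('latin1') &&
  asciiOnly.toString('latin1') === asciiOnly.toString('ascii');
console.log(sameText); // true
```

Even this small sketch has to read file contents, which is why it would sit closer to fs than to path.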

@avivkeller avivkeller added the path Issues and PRs related to the path subsystem. label Aug 27, 2024
@avivkeller
Member

avivkeller commented Aug 27, 2024

Personally, I'm -1 on adding this to core.

Mimetypes can be easily done in userland using a Map<extension, mime-type>

Charsets require reading the file, which is not something the path module does, and I don't think it is something the path module should.
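As a rough illustration of the userland approach, a lookup like the following would cover the extension-to-type case. The table contents and the function name mimeTypeFromPath are assumptions made for this sketch, and the '.js' entry simply mirrors the value used in the issue's example; a real table would be far larger.

```javascript
const path = require('node:path');

// Tiny illustrative table (assumed values); real databases hold many more entries.
const MIME_BY_EXT = new Map([
  ['.html', 'text/html'],
  ['.css', 'text/css'],
  ['.js', 'application/javascript'], // mirrors the example in the issue text
  ['.json', 'application/json'],
  ['.png', 'image/png'],
]);

// Hypothetical helper: looks only at the extension, never at file contents.
function mimeTypeFromPath(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  return MIME_BY_EXT.get(ext) ?? null; // null when the extension is unknown or absent
}

console.log(mimeTypeFromPath('file.js')); // 'application/javascript'
console.log(mimeTypeFromPath('INDEX.HTML')); // 'text/html'
console.log(mimeTypeFromPath('README')); // null
```

The lookup itself is trivial; the open question in this thread is who maintains the table.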

@aralroca
Author

> 1. Would you expect the actual path to be resolved and for the mime type to be sniffed from the contents or based entirely on the file extension?
> 2. Which master list of mime types to file extensions do you propose we use and what is the source of that? How is it maintained?
>
> For path.charset('...') ... the only way to accomplish this is to actually try reading the file. I don't believe that belongs in the path module but possibly in fs. Even then it can be a bit problematic to reliably determine the encoding of a file. Would it be based entirely on the presence of a BOM? Would it be limited to just the charset encodings we already know of (e.g. utf8, utf16le, etc)? Given cases where the charset is ambiguous (e.g. utf8 without a bom that only contains ascii characters could be utf8, ascii, or latin1), what would be the expected result?

@jasnell It is not clear to me that it is necessary to look at the contents of the file. The charset can be derived from the MIME type, and the MIME type from the file extension. The libraries that support this today rely on this JSON reference: https://github.com/jshttp/mime-db/blob/master/db.json

> Personally, I'm -1 on adding this to core.
>
> Mimetypes can be easily done in userland using a Map<extension, mime-type>
>
> Charsets require reading the file, which is not something the path module does, and I don't think it is something the path module should.

@redyetidev As I commented, the current libraries derive the charset from the MIME type without looking at the file's contents. This is very useful for frameworks that serve dynamic assets and need to set the Content-Type quickly. If the file had to be read first, serving it as a stream would become less efficient.

As for the mapping: yes, it can be done with a Map, but this Map is huge, and it doesn't make sense to depend on a library to maintain this JSON. IMO it is better for it to be part of the runtime itself; otherwise you end up adding dependencies for basic things.
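A minimal sketch of what "charset from the MIME type" can mean in practice, with no file reads involved. The rule used here (an explicit table first, then a UTF-8 default for text/* types) is an assumption made for illustration, not a description of any particular library, and charsetFromMimeType and contentTypeHeader are hypothetical names.

```javascript
// Assumed, illustrative table of types with a known default charset.
const CHARSET_BY_TYPE = new Map([
  ['application/json', 'UTF-8'],
  ['application/javascript', 'UTF-8'],
]);

// Hypothetical helper: derives a charset from the type string alone.
function charsetFromMimeType(type) {
  const known = CHARSET_BY_TYPE.get(type);
  if (known) return known;
  // Assumed fallback: treat any text/* type as UTF-8, everything else as unknown.
  return type.startsWith('text/') ? 'UTF-8' : null;
}

// Builds a Content-Type header value without touching the file.
function contentTypeHeader(type) {
  const charset = charsetFromMimeType(type);
  return charset ? `${type}; charset=${charset.toLowerCase()}` : type;
}

console.log(contentTypeHeader('text/html')); // 'text/html; charset=utf-8'
console.log(contentTypeHeader('application/json')); // 'application/json; charset=utf-8'
console.log(contentTypeHeader('image/png')); // 'image/png'
```

Note that this yields a conventional default for the type, not the actual encoding of a given file, which is the distinction raised earlier in the thread.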

@avivkeller
Member

avivkeller commented Aug 30, 2024

(CC @nodejs/path)


(While I am not a core collaborator) I still strongly disagree with the addition of this API into the core of Node.js. While I do greatly appreciate this issue being opened, I have some major concerns with this API. Here are a few of them:

1. Maintainability

If Node.js were to implement this feature, the project would need to take on the significant responsibility of maintaining a constantly evolving and extensive map from file extensions to character sets and MIME types. Additionally, the project would have to handle edge cases, conflicts, and potential ambiguities that may arise due to overlapping file extensions or varying interpretations of MIME type specifications across different platforms and software.

2. Redundancy with Existing Ecosystem

The Node.js ecosystem already includes well-established modules like mime and mime-types that effectively manage MIME type and charset detection. These modules are actively maintained by the community and provide flexibility and extensibility beyond what a core implementation could offer. Introducing path.mimeType() and path.charset() would duplicate functionality that is already readily available and well-supported, leading to redundancy and possibly fragmenting the ecosystem.

3. Performance Concerns

Implementing and maintaining a reliable MIME type and charset detection mechanism within the Node.js core could introduce performance overhead, particularly in cases where the detection process is complex or requires frequent updates to a large mapping table. Even if the performance impact is minimal, the added complexity in the core could have unintended side effects on the overall performance of the Node.js runtime.

Contributor

There has been no activity on this feature request for 5 months. To help maintain relevant open issues, please add the never-stale label or close this issue if it should be closed. If not, the issue will be automatically closed 6 months after the last non-automated comment.
For more information on how the project manages feature requests, please consult the feature request management document.

@github-actions github-actions bot added the stale label Feb 27, 2025
@bjohansebas

By the way, if it helps in the future when this conversation comes up again, mime-db doesn't strictly follow semver since the records can change (see jshttp/mime-db#331, jshttp/mime-db#330) and break certain applications.

Just my two cents, but even though other runtimes have it implemented, I don't think Node.js should implement it.

@github-actions github-actions bot removed the stale label Mar 6, 2025