Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add ->has_encoding & ->is_encoded_content to files #263

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

madsen
Copy link
Contributor

@madsen madsen commented Jan 5, 2014

This is a followup to #254. In this version, has_encoding acts as a normal Moose predicate (except for FromCode files, where it always returns true because the encoding attribute is read-only). The new method is_encoded_content indicates whether the "canonical" version is encoded_content or content.

If you're happy with this, I'll write up some docs.

The idea is that an EncodingProvider can use has_encoding to determine whether it's possible to set the encoding, and then use is_encoded_content to decide whether it can examine the file to determine an encoding.

@karenetheridge
Copy link
Contributor

The name "is_encoded_content" is a bit confusing - it sounds like it means "isn't decodable", but doesn't it really mean "has decodable content"?

@madsen
Copy link
Contributor Author

madsen commented Jan 5, 2014

Not exactly. There's 2 ways to populate a File object. Either you give it bytes, or you give it characters. is_encoded_content is true if the file was given bytes, and false if it was given characters. (In the case of a MutableFile, this can change when you set the content.)

I'm open to suggestions for a better name.

@karenetheridge
Copy link
Contributor

No, it looks like is_encoded_content is true if the file encoding eq 'bytes', which is not the same thing as being provided bytes that are, say, iso-8859-5 encoded characters. In the latter case, those bytes are still decodable.

I think has_decodable_content is marginally better, but I think the main thing needed here is more documentation to be explicitly clear about what these new methods provide (and can be used to interpret) -- such as "is it possible to determine an encoding for this file".

@madsen
Copy link
Contributor Author

madsen commented Jan 5, 2014

I'm not sure what you're talking about. Are you reading code_return_type as encoding? is_encoded_content is independent of the encoding. It simply means the File object was populated from bytes, not characters. Whether those bytes represent characters in some encoding or a JPEG file is irrelevant.

@karenetheridge
Copy link
Contributor

Are you reading code_return_type as encoding? is_encoded_content is independent of the encoding.

Yes, I was. Oops!

Still, I suppose my original point was made... ENEEDMOREDOCS ;)

@madsen
Copy link
Contributor Author

madsen commented Jan 5, 2014

As I said in the request, I'll write docs if @rjbs thinks the approach is acceptable.

@madsen
Copy link
Contributor Author

madsen commented Jan 5, 2014

Maybe is_from_bytes would be a better name for is_encoded_content. But that could be confused with is_bytes, which does mean encoding eq 'bytes'.

@rjbs
Copy link
Owner

rjbs commented Apr 21, 2016

I have put this PR off because I've never felt good about the names, nor did I have any better ones to suggest. Now I suggest this:

File will require:

  • has_content
  • has_encoded_content
  • has_encoding

An auto-detector of encodings can look for:

if ($file->has_encoded_content && ! $file->has_content && ! $file->has_encoding) { ... }

I will code that all up myself if you agree this is still useful and sane, @madsen.

@madsen
Copy link
Contributor Author

madsen commented Apr 21, 2016

That seems like a viable approach to me. Looks like I'll finally get to release the SmartEncoding plugin I've been sitting on for 2 years 😕. Thanks. 😃

@madsen
Copy link
Contributor Author

madsen commented Apr 25, 2016

Can we get this into v6 before it leaves trial status? This sort of API change seems tailor-made for a major-version bump.

@rjbs rjbs force-pushed the main branch 10 times, most recently from 9361f28 to f825e3b Compare June 3, 2024 00:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants