Skip to content

Support encoding to file-like object #754

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 20 commits into
base: main
Choose a base branch
from

Conversation

NicolasHug
Copy link
Member

@NicolasHug NicolasHug commented Jul 5, 2025

This PR adds the to_file_like() method to the AudioEncoder. This allows users to encode samples into a custom file-like object that supports seek and write methods.

Marking this as draft because there are unresolved points (see below), but this is still ready for a solid first review round.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jul 5, 2025
@NicolasHug NicolasHug marked this pull request as draft July 5, 2025 14:49

# This check is useless but it's critical to keep it to ensures that samples
# is still alive during the call to encode_audio_to_file_like.
assert samples.is_contiguous()
Copy link
Member Author

@NicolasHug NicolasHug Jul 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate that we have to do this but I do not see any other obvious way to keep the input samples alive for the duration of the call.
Claude is saying that we could just pass samples as a py::object. We won't be able to turn it back to a tensor (as mentioned in the code comment above), but claude claims that passing it as a parameter will ensure that pybind will keep it alive. I cannot verify this.

@scotts, any thoughts?

class AVIOFileLikeContext : public AVIOContextHolder {
// TODO: explain this. We probably don't want it.
class __attribute__((visibility("default"))) AVIOFileLikeContext
: public AVIOContextHolder {
Copy link
Member Author

@NicolasHug NicolasHug Jul 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, this has been causing some headaches for me.

Forget about the __attribute__((visibility("default"))) for a second.
In this PR, the AVIOFileLikeContext class is now a direct dependency of the Encoder [1]. It means that this file must be compiled along with the libtorchcodec_decoderN library. Before (i.e. in main), this file was only compiled with libtorchcodec_pybind_opsN.

And we know from https://pybind11.readthedocs.io/en/stable/faq.html#someclass-declared-with-greater-visibility-than-the-type-of-its-field-someclass-member-wattributes that this class should be compiled with "hidden" visibility: it holds a py::object fileLike_ attribute, which itself is set to "hidden" by pybind.

But if we were to set AVIOFileLikeContext to "hidden", then we'd have another problem: we'd get the same kind of warning, but for the Encoder, because the Encoder would not be "hidden":

Encoder declared with greater visibility than the type of its field Encoder::avioFileLikeContext_

We can't set the Encoder as hidden, that would make it invisible to custom ops and other things. It just wouldn't work.

[1] See comment below in this PR about the need for 2 different constructors: one for AVIOFileLikeContext, one for AVIOToTensorContext.


So here with __attribute__((visibility("default"))) we are explicitly setting the visibility of AVIOFileLikeContext to "default".
"default" actually means public, i.e. NOT hidden. This override pybind's desire of having the py::object fileLike_ to be hidden, and avoid the related warnings/errors. I do not think this is a great solution though, because IIUC, we're not actually solving the issue, we're mainly silencing a warning by forcing something to be "default"/public when it shouldn't. I think we are thus exposing ourselves to symbol conflicts:

As to why -fvisibility=hidden is necessary, because pybind modules could have been compiled under different versions of pybind itself, it is also important that the symbols defined in one module do not clash with the potentially-incompatible symbols defined in another. While Python extension modules are usually loaded with localized symbols (under POSIX systems typically using dlopen with the RTLD_LOCAL flag), this Python default can be changed, but even if it isn’t it is not always enough to guarantee complete independence of the symbols involved when not using -fvisibility=hidden.


If we don't want to do this (by this I mean setting AVIOFileLikeContext to "default"), then I think we'll need to avoid relying on AVIOFileLikeContext within the Encoder. That means we will only want to interact with the base AVIOContextHolder class, and I think that means we'll need to expose the getOutputTensor as virtual, even know it will only make sense for AVIOToTensorContext.

I hope this makes somewhat sense. I'm still wrapping my head around all this.

${pybind_ops_library_name}
PUBLIC
"-fvisibility=hidden"
)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed now because the new __attribute__((visibility("default"))) has priority over any compilation flag. See other big comment above.

@NicolasHug NicolasHug marked this pull request as ready for review July 6, 2025 14:51
@NicolasHug NicolasHug marked this pull request as draft July 7, 2025 09:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants