Overview
Hi. First, apologies if this is a duplicate. I did some cursory searching and found this thread, which suggests a 'roll it yourself' point of view.
I'd like to propose an intermediate solution that exposes the transform to clients of PyAV but leaves it to the user to do anything with it.
Use case:
As a client of PyAV building a robust video processing system that consumes arbitrary videos, it's a requirement to handle track transforms, since we decode video within PyAV for ML / deep learning tasks. If we don't have the transform, we may produce incorrect predictions, thumbnails and the like.
See #570 for an earlier discussion
Existing FFmpeg API
Expected PyAV API
Being able to introspect a video track and extract its rotation would mean adding a method to the av.video.stream.VideoStream class,
something like getDisplayRotation() or, more Pythonically, get_display_rotation(), which would return an np.array containing the rotation matrix.
Should a track not have a preferred transform, I see two possible solutions: return None, or return an identity matrix.
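To make the two options concrete, here is a minimal sketch of the behaviour I have in mind. It is a standalone function, not existing PyAV API: the names decode_display_matrix and default_identity are hypothetical, and it assumes the transform arrives as nine row-major fixed-point integers (16.16 for the first two columns, 2.30 for the third), which is my understanding of FFmpeg's display matrix layout. It uses plain lists to stay dependency-free; a real implementation could return an np.array instead.

```python
from typing import List, Optional, Sequence

Matrix = List[List[float]]


def identity() -> Matrix:
    """Return a fresh 3x3 identity matrix."""
    return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]


def decode_display_matrix(raw: Sequence[int]) -> Matrix:
    """Convert nine row-major fixed-point integers to a 3x3 float matrix.

    Assumed layout (not verified against PyAV): columns 0 and 1 are
    16.16 fixed point, column 2 is 2.30 fixed point.
    """
    if len(raw) != 9:
        raise ValueError("expected 9 values, got %d" % len(raw))
    scales = (1 << 16, 1 << 16, 1 << 30)
    return [[raw[3 * r + c] / scales[c] for c in range(3)] for r in range(3)]


def get_display_rotation(
    raw: Optional[Sequence[int]], default_identity: bool = False
) -> Optional[Matrix]:
    """Hypothetical stand-in for the proposed VideoStream method.

    raw is None when the stream carries no transform. The caller then
    gets either None or an identity matrix, depending on default_identity.
    """
    if raw is None:
        return identity() if default_identity else None
    return decode_display_matrix(raw)


# Example input: a quarter-turn matrix in the assumed fixed-point encoding.
quarter_turn = [0, 1 << 16, 0, -(1 << 16), 0, 0, 0, 0, 1 << 30]
matrix = get_display_rotation(quarter_turn)
```

Returning None lets callers distinguish "no transform present" from "transform is the identity", while returning an identity matrix lets them apply the result unconditionally. Either would work for our use case.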
Other media processing APIs take a similar approach. For example, AVFoundation, which is Apple's media processing API, exposes a preferredTransform property on an asset track (the equivalent of a stream in FFmpeg parlance).
Thank you for your consideration, and apologies if I missed anything in the existing API. Time permitting, I may open a PR for this feature.