Allow access to an affine transform of a particular video track's display rotation. #1047

@vade

Overview

Hi. Firstly, apologies if this is a dupe. I did some cursory searching and found this thread which indicates a 'roll it yourself' POV.

#570

I'd like to propose an intermediary solution that exposes the transform to clients of PyAV but leaves the onus on the user for doing anything with it.

Use case:

As a client of PyAV building a robust video processing system that consumes arbitrary videos, it's a requirement to handle track transforms as we decode video within PyAV for ML / deep learning tasks. If we don't have the transform, we may produce incorrect predictions, thumbnails and the like.

See #570 for an earlier discussion

Existing FFmpeg API

av_display_rotation_get
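
For context, here is a rough Python sketch of what `av_display_rotation_get` computes, as I understand it from FFmpeg's libavutil: it takes the 9-element display matrix (the first two columns in 16.16 fixed point) and returns a rotation angle in degrees, or NaN for a singular matrix. The function name `display_rotation_degrees` and the sample matrix are illustrative only and not part of PyAV or FFmpeg; the sign convention in particular should be checked against the FFmpeg source before relying on it.

```python
import math


def display_rotation_degrees(matrix):
    """Approximate port of FFmpeg's av_display_rotation_get (an assumption,
    written from memory; not an existing PyAV function).

    `matrix` is a flat sequence of 9 integers in row-major order.
    """
    def conv_fp(x):
        # Convert a 16.16 fixed-point integer to a float.
        return x / (1 << 16)

    scale0 = math.hypot(conv_fp(matrix[0]), conv_fp(matrix[3]))
    scale1 = math.hypot(conv_fp(matrix[1]), conv_fp(matrix[4]))
    if scale0 == 0.0 or scale1 == 0.0:
        return math.nan  # singular matrix: no meaningful rotation

    rotation = math.degrees(
        math.atan2(conv_fp(matrix[1]) / scale1, conv_fp(matrix[0]) / scale0)
    )
    return -rotation


# Sample matrix of the kind commonly seen on portrait phone recordings.
ONE = 1 << 16
sample = [0, ONE, 0, -ONE, 0, 0, 0, 0, 1 << 30]
angle = display_rotation_degrees(sample)
```

For the sample above the sketch yields an angle of -90 degrees; an all-zero matrix yields NaN.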

Expected PyAV API

Being able to introspect a video track and extract its rotation would mean adding a method to av.video.stream.VideoStream.

Something like getDisplayRotation(), or perhaps a more Pythonic get_display_rotation(), which would return a NumPy array containing the rotation matrix.

Should a track not have a preferred transform, I see two possible solutions: return None or return an identity matrix.
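
As a rough illustration of the identity-matrix option, the sketch below builds a 3x3 affine matrix from a rotation angle and falls back to the identity when no angle is available. The name `rotation_to_affine` is hypothetical, nothing here exists in PyAV today, and the counterclockwise sign convention is an assumption. It uses plain nested lists to stay dependency-free; a real implementation could wrap the result in a NumPy array.

```python
import math


def rotation_to_affine(angle_degrees=None):
    """Hypothetical helper: 3x3 affine matrix for a rotation.

    Returns the identity matrix when `angle_degrees` is None, i.e. when the
    track carries no preferred transform. Assumes a counterclockwise angle.
    """
    if angle_degrees is None:
        return [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

    theta = math.radians(angle_degrees)
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


no_transform = rotation_to_affine()      # identity matrix
quarter_turn = rotation_to_affine(90.0)  # 90 degree rotation
```

Returning the identity lets callers apply the matrix unconditionally, while returning None would let them distinguish "no transform present" from "transform present but trivial".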

Other media processing APIs take a similar approach. For example, AVFoundation, Apple's media processing API, exposes a preferredTransform property on an asset track (the equivalent of a stream in FFmpeg parlance).

https://developer.apple.com/documentation/avfoundation/avpartialasyncproperty/3816142-preferredtransform

Thank you for your consideration, and apologies if I missed anything in the existing API. Time permitting, I may open a PR for this feature.
