VideoStreamMetadata#
- class torchcodec.decoders.VideoStreamMetadata(duration_seconds_from_header: float | None, begin_stream_seconds_from_header: float | None, bit_rate: float | None, codec: str | None, stream_index: int, duration_seconds: float | None, begin_stream_seconds: float | None, begin_stream_seconds_from_content: float | None, end_stream_seconds_from_content: float | None, width: int | None, height: int | None, num_frames_from_header: int | None, num_frames_from_content: int | None, average_fps_from_header: float | None, pixel_aspect_ratio: Fraction | None, rotation: float | None, color_primaries: str | None, color_space: str | None, color_transfer_characteristic: str | None, pixel_format: str | None, end_stream_seconds: float | None, num_frames: int | None, average_fps: float | None)#
Metadata of a single video stream.
- average_fps: float | None#
Average fps of the stream. If a scan was performed, this is computed from the number of frames and the duration of the stream. Otherwise we fall back to average_fps_from_header.
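The fallback described above can be sketched as a small standalone function. This is only an illustration of the documented behavior, not TorchCodec's actual implementation: the name resolve_average_fps is hypothetical, and the scanned duration is assumed to be the difference between the scanned end and begin times.

```python
from typing import Optional


def resolve_average_fps(
    num_frames_from_content: Optional[int],
    begin_stream_seconds_from_content: Optional[float],
    end_stream_seconds_from_content: Optional[float],
    average_fps_from_header: Optional[float],
) -> Optional[float]:
    """Hypothetical helper mirroring the documented average_fps fallback."""
    scanned = (
        num_frames_from_content is not None
        and begin_stream_seconds_from_content is not None
        and end_stream_seconds_from_content is not None
    )
    if scanned:
        # Assumption: the scanned duration is end minus begin.
        duration = end_stream_seconds_from_content - begin_stream_seconds_from_content
        if duration > 0:
            return num_frames_from_content / duration
    # No scan (or a degenerate duration): use the header value, which may be None.
    return average_fps_from_header
```

For example, 300 scanned frames over a 10 second span give 30.0 regardless of what the header reports.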
- average_fps_from_header: float | None#
Average fps of the stream, obtained from the header (float or None). We recommend using the average_fps attribute instead.
- begin_stream_seconds: float | None#
Beginning of the stream, in seconds (float). Conceptually, this corresponds to the first frame’s pts. If a scan was performed and begin_stream_seconds_from_content is not None, then it is returned. Otherwise, this value is 0.
- begin_stream_seconds_from_content: float | None#
Beginning of the stream, in seconds (float or None). Conceptually, this corresponds to the first frame’s pts. It is only computed when a scan is done, as min(frame.pts) across all frames in the stream. Usually, this is equal to 0.
- begin_stream_seconds_from_header: float | None#
Beginning of the stream, in seconds, obtained from the header (float or None). Usually, this is equal to 0.
- color_transfer_characteristic: str | None#
Color transfer characteristic as reported by FFmpeg. E.g. "bt709", "smpte2084" (PQ), "arib-std-b67" (HLG).
- duration_seconds: float | None#
Duration of the stream in seconds. We try to calculate the duration from the actual frames if a scan was performed. Otherwise we fall back to duration_seconds_from_header. If that value is also None, we instead calculate the duration from num_frames_from_header and average_fps_from_header. If all of those are unavailable, we fall back to the container-level duration_seconds_from_header.
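The order of these fallbacks may be easier to follow as code. The sketch below is an illustration only, not the library's implementation: resolve_duration_seconds and its parameter names are hypothetical, and "the duration from the actual frames" is assumed to mean the scanned end time minus the scanned begin time.

```python
from typing import Optional


def resolve_duration_seconds(
    begin_from_content: Optional[float],
    end_from_content: Optional[float],
    stream_duration_from_header: Optional[float],
    num_frames_from_header: Optional[int],
    average_fps_from_header: Optional[float],
    container_duration_from_header: Optional[float],
) -> Optional[float]:
    """Hypothetical helper following the documented duration fallback chain."""
    # 1. A scan was performed: use the frames themselves (assumed end - begin).
    if begin_from_content is not None and end_from_content is not None:
        return end_from_content - begin_from_content
    # 2. Stream-level duration from the header.
    if stream_duration_from_header is not None:
        return stream_duration_from_header
    # 3. Derive it from the header frame count and fps (skipped if fps is None or 0).
    if num_frames_from_header is not None and average_fps_from_header:
        return num_frames_from_header / average_fps_from_header
    # 4. Last resort: container-level duration, which may itself be None.
    return container_duration_from_header
```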
- duration_seconds_from_header: float | None#
Duration of the stream, in seconds, obtained from the header (float or None). This could be inaccurate.
- end_stream_seconds: float | None#
End of the stream, in seconds (float or None). Conceptually, this corresponds to last_frame.pts + last_frame.duration. If a scan was performed and end_stream_seconds_from_content is not None, then that value is returned. Otherwise, returns duration_seconds.
- end_stream_seconds_from_content: float | None#
End of the stream, in seconds (float or None). Conceptually, this corresponds to last_frame.pts + last_frame.duration. It is only computed when a scan is done, as max(frame.pts + frame.duration) across all frames in the stream. Note that no frame is played at this time value, so calling get_frame_played_at() with this value would result in an error. Retrieving the last frame is best done by simply indexing the VideoDecoder object with [-1].
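One way to read this is that playable timestamps form a half-open interval: the begin time is included and the end time is excluded. The sketch below illustrates that reading with made-up frame timings; the frames list and the is_playable_timestamp helper are hypothetical and do not call TorchCodec.

```python
# Hypothetical (pts, duration) pairs in seconds, standing in for scanned frames.
frames = [(0.0, 0.5), (0.5, 0.5), (1.0, 0.5)]

# Same formulas as in the descriptions above.
begin = min(pts for pts, _ in frames)
end = max(pts + duration for pts, duration in frames)


def is_playable_timestamp(seconds: float, begin: float, end: float) -> bool:
    """Hypothetical check: some frame is played at `seconds` only if begin <= seconds < end."""
    return begin <= seconds < end
```

With these sample values the last frame starts at 1.0 and the stream ends at 1.5, so 1.0 is a playable timestamp while 1.5 is not.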
- num_frames: int | None#
Number of frames in the stream (int or None). This corresponds to num_frames_from_content if a scan was made, otherwise it corresponds to num_frames_from_header. If that value is also None, the number of frames is calculated from the duration and the average fps.
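A minimal sketch of this resolution order follows. The name resolve_num_frames is hypothetical, and rounding the computed count to the nearest integer is an assumption; the text above does not say how the library converts the product to an int.

```python
from typing import Optional


def resolve_num_frames(
    num_frames_from_content: Optional[int],
    num_frames_from_header: Optional[int],
    duration_seconds: Optional[float],
    average_fps: Optional[float],
) -> Optional[int]:
    """Hypothetical helper following the documented num_frames fallback order."""
    if num_frames_from_content is not None:
        return num_frames_from_content  # a scan was made
    if num_frames_from_header is not None:
        return num_frames_from_header
    if duration_seconds is not None and average_fps is not None:
        # Assumption: round to the nearest whole frame.
        return int(round(duration_seconds * average_fps))
    return None
```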
- num_frames_from_content: int | None#
Number of frames (int or None) computed by TorchCodec by scanning the stream’s content (the scan doesn’t involve decoding). This is more accurate than num_frames_from_header. We recommend using the num_frames attribute instead.
- num_frames_from_header: int | None#
Number of frames (int or None), from the stream’s metadata. This is potentially inaccurate. We recommend using the num_frames attribute instead.
- pixel_aspect_ratio: Fraction | None#
Pixel Aspect Ratio (PAR), also known as Sample Aspect Ratio (SAR, not to be confused with Storage Aspect Ratio, which is also abbreviated SAR), is the ratio between the width and height of each pixel (fractions.Fraction or None).
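As an illustration of what this ratio is used for, the display aspect ratio of a frame is the storage aspect ratio (width divided by height, in pixels) multiplied by the pixel aspect ratio. The helper below is a hypothetical sketch, not part of TorchCodec, and assumes the width and height describe the same orientation as the pixel aspect ratio (i.e. no rotation).

```python
from fractions import Fraction


def display_aspect_ratio(width: int, height: int, pixel_aspect_ratio: Fraction) -> Fraction:
    """Hypothetical helper: storage aspect ratio times pixel aspect ratio."""
    return Fraction(width, height) * pixel_aspect_ratio


# A 720x480 frame with non-square 8:9 pixels is meant to be displayed at 4:3.
dar = display_aspect_ratio(720, 480, Fraction(8, 9))
```

Keeping the value as a Fraction avoids floating-point error when comparing against common ratios such as 4:3 or 16:9.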
- pixel_format: str | None#
The source pixel format of the video as reported by FFmpeg. E.g. 'yuv420p', 'yuv444p', etc.
- rotation: float | None#
Rotation angle in degrees (counter-clockwise, rounded to the nearest multiple of 90 degrees) from the display matrix metadata. This indicates how the video should be rotated for correct display. TorchCodec automatically applies this rotation during decoding, so the returned frames are in the correct orientation (float or None).
Note
The width and height attributes report the post-rotation dimensions, i.e., the dimensions of frames as they will be returned by TorchCodec’s decoding methods. For videos with 90 or -90 degree rotation, this means width and height are swapped compared to the raw encoded dimensions in the container.
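The relationship between encoded and reported dimensions can be sketched as follows. The function post_rotation_size is hypothetical and only illustrates the swap described in the note; it assumes the rotation is already a multiple of 90 degrees, as stated above, and treats 270 degrees like -90.

```python
from typing import Optional, Tuple


def post_rotation_size(
    encoded_width: int, encoded_height: int, rotation: Optional[float]
) -> Tuple[int, int]:
    """Hypothetical helper: (width, height) after applying the rotation metadata."""
    # Quarter turns (90, -90, 270, ...) swap the two dimensions;
    # 0, 180, or missing rotation metadata leave them unchanged.
    if rotation is not None and round(rotation) % 180 == 90:
        return encoded_height, encoded_width
    return encoded_width, encoded_height
```

For instance, a stream encoded as 1920x1080 with a 90 degree rotation would be reported as 1080x1920 under this reading.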