VideoEncoder

class torchcodec.encoders.VideoEncoder(frames: Tensor, *, frame_rate: float)

A video encoder.

Parameters:
  • frames (torch.Tensor) – The frames to encode. This must be a 4D tensor of shape (N, C, H, W) where N is the number of frames, C is 3 channels (RGB), H is height, and W is width. Values must be uint8 in the range [0, 255].

  • frame_rate (float) – The frame rate of the input frames. Also defines the encoded output frame rate.
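
Frames often arrive as floats in a channels-last layout rather than in the layout described above. The following is a minimal sketch of the conversion, assuming a hypothetical input tensor `frames_nhwc` of shape (N, H, W, C) with values in [0, 1]; the variable names and sizes are illustrative only.

```python
import torch

# Hypothetical input: 8 frames in channels-last layout (N, H, W, C),
# stored as floats in [0, 1]. Real data would come from your own pipeline.
frames_nhwc = torch.rand(8, 48, 64, 3)

# Reorder the axes to (N, C, H, W), the layout the encoder expects.
frames = frames_nhwc.permute(0, 3, 1, 2)

# Clamp to [0, 1], scale to [0, 255], and convert to uint8.
frames = (frames.clamp(0.0, 1.0) * 255.0).round().to(torch.uint8)
```

A tensor prepared this way matches the documented input contract and could then be passed as `VideoEncoder(frames, frame_rate=30)`.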

Examples using VideoEncoder:

Encoding video frames with VideoEncoder

to_file(dest: Union[str, Path], *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[Dict[str, Any]] = None) -> None

Encode frames into a file.

Parameters:
  • dest (str or pathlib.Path) – The path to the output file, e.g. video.mp4. The extension of the file determines the video container format.

  • codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.

  • pixel_format (str, optional) – The pixel format for encoding (e.g., “yuv420p”, “yuv444p”). If not specified, uses codec’s default format. See Pixel Format for details.

  • crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use encoder’s default). See CRF (Constant Rate Factor) for details.

  • preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use encoder’s default). See Preset for details.

  • extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
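
To show how these keyword arguments combine, here is a small sketch of a wrapper around `to_file`. The helper `encode_h264` and its default values are hypothetical, not part of torchcodec. It assumes an FFmpeg build that provides libx264, and it takes any object exposing the `to_file` method documented above.

```python
from pathlib import Path


def encode_h264(encoder, dest, *, crf=23, preset="medium"):
    """Hypothetical helper: write the encoder's frames to `dest` with libx264.

    `encoder` is expected to be a VideoEncoder (or any object with the same
    to_file method). The defaults here are illustrative choices.
    """
    # The documentation gives 0-51 as the valid CRF range for libx264.
    if not 0 <= crf <= 51:
        raise ValueError(f"crf must be in [0, 51] for libx264, got {crf}")

    dest = Path(dest)
    # The file extension of `dest` selects the container format.
    encoder.to_file(
        dest,
        codec="libx264",
        pixel_format="yuv420p",
        crf=crf,
        preset=preset,
    )
    return dest


# Usage (requires torchcodec and an FFmpeg build with libx264):
#   encoder = VideoEncoder(frames, frame_rate=30)
#   encode_h264(encoder, "video.mp4", crf=18, preset="slow")
```

Passing a lower `crf` with a slower `preset` trades encoding time for quality and smaller output; leaving both as None would defer to the encoder's own defaults instead.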

to_file_like(file_like, format: str, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[Dict[str, Any]] = None) -> None

Encode frames into a file-like object.

Parameters:
  • file_like – A file-like object that supports write() and seek() methods, such as io.BytesIO(), an open file in binary write mode, etc. Methods must have the following signature: write(data: bytes) -> int and seek(offset: int, whence: int = 0) -> int.

  • format (str) – The container format of the encoded frames, e.g. “mp4”, “mov”, “mkv”, “avi”, “webm”, “flv”, etc.

  • codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.

  • pixel_format (str, optional) – The pixel format for encoding (e.g., “yuv420p”, “yuv444p”). If not specified, uses codec’s default format. See Pixel Format for details.

  • crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use encoder’s default). See CRF (Constant Rate Factor) for details.

  • preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use encoder’s default). See Preset for details.

  • extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
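
The `write()` and `seek()` contract described for `file_like` can be illustrated with a custom in-memory sink. This is a sketch only: the class `InMemorySink` and its `getvalue()` method are hypothetical names, and in most cases `io.BytesIO()` already satisfies the same contract.

```python
import io


class InMemorySink:
    """Hypothetical file-like object with the write/seek signatures above."""

    def __init__(self):
        self._buffer = bytearray()
        self._pos = 0

    def write(self, data: bytes) -> int:
        # If a previous seek moved past the end, pad the gap with zeros.
        if self._pos > len(self._buffer):
            self._buffer.extend(b"\x00" * (self._pos - len(self._buffer)))
        end = self._pos + len(data)
        # Overwrite existing bytes in place and grow the buffer if needed.
        self._buffer[self._pos:end] = data
        self._pos = end
        return len(data)

    def seek(self, offset: int, whence: int = 0) -> int:
        if whence == io.SEEK_SET:
            new_pos = offset
        elif whence == io.SEEK_CUR:
            new_pos = self._pos + offset
        elif whence == io.SEEK_END:
            new_pos = len(self._buffer) + offset
        else:
            raise ValueError(f"invalid whence: {whence}")
        if new_pos < 0:
            raise ValueError("negative seek position")
        self._pos = new_pos
        return new_pos

    def getvalue(self) -> bytes:
        return bytes(self._buffer)


# Usage (requires torchcodec):
#   sink = InMemorySink()
#   encoder.to_file_like(sink, format="mp4")
#   encoded_bytes = sink.getvalue()
```

Supporting `seek()` matters because a writer may move back and overwrite bytes it emitted earlier, which is why the sink overwrites in place rather than only appending.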

to_tensor(format: str, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[Dict[str, Any]] = None) -> Tensor

Encode frames into raw bytes, as a 1D uint8 Tensor.

Parameters:
  • format (str) – The container format of the encoded frames, e.g. “mp4”, “mov”, “mkv”, “avi”, “webm”, “flv”, etc.

  • codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.

  • pixel_format (str, optional) – The pixel format to encode frames into (e.g., “yuv420p”, “yuv444p”). If not specified, uses codec’s default format. See Pixel Format for details.

  • crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use encoder’s default). See CRF (Constant Rate Factor) for details.

  • preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use encoder’s default). See Preset for details.

  • extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.

Returns:

The raw encoded bytes as a 1D uint8 Tensor.

Return type:

Tensor
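
Because the result is a 1D uint8 tensor rather than a `bytes` object, a conversion step is usually needed before sending it over a network or writing it to disk. The sketch below assumes NumPy is installed alongside PyTorch. The helper `tensor_to_bytes` is hypothetical, and `fake_encoded` is a stand-in for the output of `to_tensor`, not real encoded video.

```python
import torch


def tensor_to_bytes(encoded: torch.Tensor) -> bytes:
    """Hypothetical helper: convert a 1D uint8 tensor into a bytes object."""
    if encoded.dim() != 1 or encoded.dtype != torch.uint8:
        raise ValueError("expected a 1D uint8 tensor")
    # Move to CPU if needed, then copy the underlying buffer into bytes.
    return encoded.cpu().numpy().tobytes()


# Stand-in for `encoder.to_tensor(format="mp4")`; these values are made up.
fake_encoded = torch.tensor([0, 0, 0, 32, 102, 116, 121, 112], dtype=torch.uint8)
raw = tensor_to_bytes(fake_encoded)
```

With a real encoder, `raw` could be written out with `Path("video.mp4").write_bytes(raw)`, although `to_file` is the more direct route when the destination is a file.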
