VideoEncoder
- class torchcodec.encoders.VideoEncoder(frames: Tensor, *, frame_rate: float)
A video encoder that runs on CPU or CUDA.
- Parameters:
frames (torch.Tensor) – The frames to encode. This must be a 4D tensor of shape (N, C, H, W), where N is the number of frames, C is 3 channels (RGB), H is height, and W is width. Values must be uint8 in the range [0, 255]. The tensor can be on CPU or CUDA. The device of the tensor determines which encoder is used (CPU or GPU).
frame_rate (float) – The frame rate of the input frames. Also defines the encoded output frame rate.
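The constructor requirements above can be sketched as follows. This is a minimal illustration, assuming `torch` and `torchcodec` are installed; `validate_frames_shape` and `build_encoder` are hypothetical helpers written for this example and are not part of torchcodec.

```python
def validate_frames_shape(shape):
    """Check that `shape` is (N, C, H, W) with C == 3 (RGB)."""
    if len(shape) != 4:
        raise ValueError(f"expected a 4D shape (N, C, H, W), got {len(shape)}D")
    n, c, h, w = shape
    if c != 3:
        raise ValueError(f"expected 3 RGB channels, got C={c}")
    return n, c, h, w


def build_encoder(num_frames=30, height=64, width=64, frame_rate=30.0):
    """Build a VideoEncoder from random uint8 frames on CPU.

    Imports are deferred so the shape helper above stays usable without torch.
    """
    import torch
    from torchcodec.encoders import VideoEncoder

    # Random RGB frames with uint8 values in [0, 255]; the upper bound is exclusive.
    frames = torch.randint(
        0, 256, (num_frames, 3, height, width), dtype=torch.uint8
    )
    validate_frames_shape(tuple(frames.shape))
    return VideoEncoder(frames, frame_rate=frame_rate)
```

Moving `frames` to a CUDA device before constructing the encoder would select the GPU encoder instead, since the device of the tensor determines which encoder is used.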
- to_file(dest: str | pathlib.Path, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[dict[str, Any]] = None) -> None
Encode frames into a file.
- Parameters:
dest (str or pathlib.Path) – The path to the output file, e.g. video.mp4. The extension of the file determines the video container format.
codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.
pixel_format (str, optional) – The pixel format for encoding (e.g., “yuv420p”, “yuv444p”). If not specified, uses the codec’s default format. Must be left as None when encoding CUDA tensors. See Pixel Format for details.
crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use the encoder’s default). See CRF (Constant Rate Factor) for details.
preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use the encoder’s default). See Preset for details.
extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
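As an illustration of how the `to_file` parameters fit together, here is a hedged sketch. `to_file_options` and `encode_to_file` are hypothetical helpers, and the specific values (libx264, yuv420p, CRF 23, the "medium" preset) are assumptions that require an FFmpeg build with libx264. For CUDA frames the sketch passes no options at all, which keeps `pixel_format` at None as required.

```python
def to_file_options(device_type):
    """Return keyword arguments for to_file() based on the frames' device type."""
    if device_type == "cuda":
        # pixel_format must stay None for CUDA tensors; this sketch also
        # leaves every other option at the encoder's default.
        return {}
    # Assumed CPU settings; valid values depend on the encoder in use.
    return {
        "codec": "libx264",
        "pixel_format": "yuv420p",
        "crf": 23,
        "preset": "medium",
    }


def encode_to_file(frames, dest, frame_rate=30.0):
    """Encode `frames` (N, C, H, W uint8) into the file at `dest`."""
    from torchcodec.encoders import VideoEncoder  # deferred import

    encoder = VideoEncoder(frames, frame_rate=frame_rate)
    # The extension of `dest` (e.g. ".mp4") selects the container format.
    encoder.to_file(dest, **to_file_options(frames.device.type))
```

A call such as `encode_to_file(frames, "video.mp4")` would then write an MP4 file, assuming the chosen codec is available.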
- to_file_like(file_like, format: str, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[dict[str, Any]] = None) -> None
Encode frames into a file-like object.
- Parameters:
file_like – A file-like object that supports write() and seek() methods, such as io.BytesIO(), an open file in binary write mode, etc. The methods must have the following signatures: write(data: bytes) -> int and seek(offset: int, whence: int = 0) -> int.
format (str) – The container format of the encoded frames, e.g. “mp4”, “mov”, “mkv”, “avi”, “webm”, “flv”, etc.
codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.
pixel_format (str, optional) – The pixel format for encoding (e.g., “yuv420p”, “yuv444p”). If not specified, uses the codec’s default format. Must be left as None when encoding CUDA tensors. See Pixel Format for details.
crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use the encoder’s default). See CRF (Constant Rate Factor) for details.
preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use the encoder’s default). See Preset for details.
extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
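The file-like protocol described above can be sketched with a small wrapper class. `CountingWriter` and `encode_to_file_like` are hypothetical names used only for this illustration; the wrapper delegates to `io.BytesIO` and exposes `write()` and `seek()` with the documented signatures.

```python
import io


class CountingWriter:
    """Minimal file-like object: write() and seek(), plus a byte counter."""

    def __init__(self):
        self._buffer = io.BytesIO()
        self.bytes_written = 0

    def write(self, data: bytes) -> int:
        written = self._buffer.write(data)
        self.bytes_written += written
        return written

    def seek(self, offset: int, whence: int = 0) -> int:
        # Returns the new absolute position, as io.BytesIO.seek() does.
        return self._buffer.seek(offset, whence)

    def getvalue(self) -> bytes:
        return self._buffer.getvalue()


def encode_to_file_like(frames, frame_rate=30.0, container="mp4"):
    """Encode `frames` into a CountingWriter and return it."""
    from torchcodec.encoders import VideoEncoder  # deferred import

    sink = CountingWriter()
    VideoEncoder(frames, frame_rate=frame_rate).to_file_like(sink, format=container)
    return sink
```

Because a file-like object has no file extension, the container must be given explicitly through `format`. Note that `bytes_written` counts every write, including bytes that overwrite earlier data after a seek, so it can exceed the final size of the encoded stream.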
- to_tensor(format: str, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[dict[str, Any]] = None) -> Tensor
Encode frames into raw bytes, as a 1D uint8 Tensor.
- Parameters:
format (str) – The container format of the encoded frames, e.g. “mp4”, “mov”, “mkv”, “avi”, “webm”, “flv”, etc.
codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.
pixel_format (str, optional) – The pixel format to encode frames into (e.g., “yuv420p”, “yuv444p”). If not specified, uses the codec’s default format. Must be left as None when encoding CUDA tensors. See Pixel Format for details.
crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use the encoder’s default). See CRF (Constant Rate Factor) for details.
preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use the encoder’s default). See Preset for details.
extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
- Returns:
The raw encoded bytes as a 1D uint8 Tensor on CPU, regardless of the device of the input frames.
- Return type:
Tensor
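A short sketch of in-memory encoding with `to_tensor`, assuming `torch` and `torchcodec` are installed. `encode_to_bytes` and `looks_like_mp4` are hypothetical helpers. The MP4 check is only a heuristic, based on the fact that MP4 files typically start with an `ftyp` box whose type tag sits at byte offsets 4 to 8.

```python
def looks_like_mp4(data: bytes) -> bool:
    """Heuristic: MP4 data typically carries the tag b'ftyp' at offsets 4..8."""
    return len(data) >= 8 and data[4:8] == b"ftyp"


def encode_to_bytes(frames, frame_rate=30.0, container="mp4"):
    """Encode `frames` in memory and return the result as Python bytes."""
    from torchcodec.encoders import VideoEncoder  # deferred import

    encoder = VideoEncoder(frames, frame_rate=frame_rate)
    # to_tensor() returns a 1D uint8 tensor on CPU, even for CUDA input frames.
    encoded = encoder.to_tensor(format=container)
    return encoded.numpy().tobytes()
```

The returned bytes can be written to disk, sent over a network, or inspected with `looks_like_mp4` as a quick sanity check when the container is "mp4".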