VideoEncoder
- class torchcodec.encoders.VideoEncoder(frames: Tensor, *, frame_rate: float)
A video encoder.
- Parameters:
frames (torch.Tensor) – The frames to encode. This must be a 4D tensor of shape (N, C, H, W), where N is the number of frames, C is 3 channels (RGB), H is height, and W is width. Values must be uint8 in the range [0, 255].
frame_rate (float) – The frame rate of the input frames. Also defines the encoded output frame rate.
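Frames often arrive as floating-point values in a channels-last layout, so they need converting before they meet the requirements above. The following is a minimal sketch of that conversion using plain PyTorch; the helper name `to_encoder_layout` and the assumed input layout (N, H, W, C) with values in [0, 1] are illustrative and not part of torchcodec.

```python
import torch


def to_encoder_layout(frames_nhwc: torch.Tensor) -> torch.Tensor:
    """Convert float RGB frames in [0, 1], shape (N, H, W, 3), to uint8 (N, 3, H, W)."""
    if frames_nhwc.ndim != 4 or frames_nhwc.shape[-1] != 3:
        raise ValueError("expected a 4D tensor of shape (N, H, W, 3)")
    # Clamp to [0, 1], scale to [0, 255], then round before the integer cast.
    scaled = (frames_nhwc.clamp(0.0, 1.0) * 255.0).round().to(torch.uint8)
    # Move the channel axis from last to second position: (N, H, W, C) -> (N, C, H, W).
    return scaled.permute(0, 3, 1, 2).contiguous()


# Ten random 64x48 RGB frames as stand-in input data.
frames = to_encoder_layout(torch.rand(10, 64, 48, 3))
```

The resulting `frames` tensor has the shape and dtype described for the `frames` parameter and could then be passed as `VideoEncoder(frames, frame_rate=30)`.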
- to_file(dest: Union[str, Path], *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[Dict[str, Any]] = None) -> None
Encode frames into a file.
- Parameters:
dest (str or pathlib.Path) – The path to the output file, e.g. video.mp4. The extension of the file determines the video container format.
codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.
pixel_format (str, optional) – The pixel format for encoding (e.g., “yuv420p”, “yuv444p”). If not specified, uses codec’s default format. See Pixel Format for details.
crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use encoder’s default). See CRF (Constant Rate Factor) for details.
preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use encoder’s default). See Preset for details.
extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
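As an illustration of how these parameters combine, the sketch below keeps one set of encoding options in a dictionary and forwards it to `to_file`. The helper name `encode_h264` and the specific values are assumptions chosen for the example, not library defaults, and the call only succeeds if the installed FFmpeg build provides the `libx264` encoder.

```python
from pathlib import Path
from typing import Any, Dict, Union

# Illustrative settings; the values are assumptions, not torchcodec defaults.
H264_SETTINGS: Dict[str, Any] = {
    "codec": "libx264",
    "pixel_format": "yuv420p",
    "crf": 23,  # within the 0-51 range documented for libx264
    "preset": "medium",
    "extra_options": {"tune": "film"},
}


def encode_h264(frames, dest: Union[str, Path], frame_rate: float = 30.0) -> None:
    """Encode uint8 frames of shape (N, C, H, W) to `dest` using H264_SETTINGS."""
    # Imported lazily so this module can be imported without torchcodec installed.
    from torchcodec.encoders import VideoEncoder

    encoder = VideoEncoder(frames, frame_rate=frame_rate)
    # The file extension of `dest` selects the container format.
    encoder.to_file(dest, **H264_SETTINGS)
```

A call such as `encode_h264(frames, "video.mp4")` would then write an MP4 file, assuming `frames` already satisfies the constructor's shape and dtype requirements.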
- to_file_like(file_like, format: str, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[Dict[str, Any]] = None) -> None
Encode frames into a file-like object.
- Parameters:
file_like – A file-like object that supports write() and seek() methods, such as io.BytesIO(), an open file in binary write mode, etc. Methods must have the following signatures: write(data: bytes) -> int and seek(offset: int, whence: int = 0) -> int.
format (str) – The container format of the encoded frames, e.g. “mp4”, “mov”, “mkv”, “avi”, “webm”, “flv”, etc.
codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.
pixel_format (str, optional) – The pixel format for encoding (e.g., “yuv420p”, “yuv444p”). If not specified, uses codec’s default format. See Pixel Format for details.
crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use encoder’s default). See CRF (Constant Rate Factor) for details.
preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use encoder’s default). See Preset for details.
extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
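Any object with the two documented methods can serve as `file_like`. The sketch below shows a small custom sink that wraps `io.BytesIO` and counts the bytes passed to `write()`. The class name `CountingSink` and its `bytes_written` attribute are hypothetical, and the final lines exercise the sink by hand rather than through an encoder.

```python
import io


class CountingSink:
    """File-like object exposing write() and seek() with the documented signatures."""

    def __init__(self) -> None:
        self._buffer = io.BytesIO()
        self.bytes_written = 0  # total bytes passed to write(), including overwrites

    def write(self, data: bytes) -> int:
        written = self._buffer.write(data)
        self.bytes_written += written
        return written

    def seek(self, offset: int, whence: int = 0) -> int:
        # Returns the new absolute position, as io.BytesIO.seek does.
        return self._buffer.seek(offset, whence)

    def getvalue(self) -> bytes:
        return self._buffer.getvalue()


# Manual exercise of the protocol: write, seek back, then overwrite the start.
sink = CountingSink()
sink.write(b"header")
sink.seek(0)
sink.write(b"HEAD")
```

Given an existing encoder, the sink would be used as `encoder.to_file_like(sink, format="mp4")`. Seeking matters because some container formats go back and patch earlier bytes once encoding finishes.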
- to_tensor(format: str, *, codec: Optional[str] = None, pixel_format: Optional[str] = None, crf: Optional[Union[int, float]] = None, preset: Optional[Union[str, int]] = None, extra_options: Optional[Dict[str, Any]] = None) -> Tensor
Encode frames into raw bytes, as a 1D uint8 Tensor.
- Parameters:
format (str) – The container format of the encoded frames, e.g. “mp4”, “mov”, “mkv”, “avi”, “webm”, “flv”, etc.
codec (str, optional) – The codec to use for encoding (e.g., “libx264”, “h264”). If not specified, the default codec for the container format will be used. See Codec Selection for details.
pixel_format (str, optional) – The pixel format to encode frames into (e.g., “yuv420p”, “yuv444p”). If not specified, uses codec’s default format. See Pixel Format for details.
crf (int or float, optional) – Constant Rate Factor for encoding quality. Lower values mean better quality. Valid range depends on the encoder (e.g. 0-51 for libx264). Defaults to None (which will use encoder’s default). See CRF (Constant Rate Factor) for details.
preset (str or int, optional) – Encoder option that controls the tradeoff between encoding speed and compression (output size). Valid values depend on the encoder (commonly a string: “fast”, “medium”, “slow”). Defaults to None (which will use encoder’s default). See Preset for details.
extra_options (dict[str, Any], optional) – A dictionary of additional encoder options to pass, e.g. {"qp": 5, "tune": "film"}. See Extra Options for details.
- Returns:
The raw encoded bytes as a 1D uint8 Tensor.
- Return type:
Tensor
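Because the result is a 1D uint8 tensor rather than a Python `bytes` object, a conversion step is usually needed before sending it over a network or storing it in a database. The following is a minimal sketch of that step; the helper name `tensor_to_bytes` is hypothetical, and the `fake_encoded` tensor is a hand-written stand-in for the return value of `to_tensor`, not real video data.

```python
import torch


def tensor_to_bytes(encoded: torch.Tensor) -> bytes:
    """Convert a 1D uint8 tensor of encoded data into a Python bytes object."""
    if encoded.ndim != 1 or encoded.dtype != torch.uint8:
        raise ValueError("expected a 1D uint8 tensor")
    # Move to CPU first in case the tensor lives on another device.
    return encoded.cpu().numpy().tobytes()


# Stand-in for the result of encoder.to_tensor(format="mp4"); not real video data.
fake_encoded = torch.tensor([0, 0, 0, 24, 102, 116, 121, 112], dtype=torch.uint8)
payload = tensor_to_bytes(fake_encoded)
```

With a real encoder, the call would look like `tensor_to_bytes(encoder.to_tensor(format="mp4"))`.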