AudioEncoder
- class torchcodec.encoders.AudioEncoder(samples: Tensor, *, sample_rate: int)
An audio encoder.
- Parameters:
samples (torch.Tensor) – The samples to encode. This must be a 2D tensor of shape (num_channels, num_samples), or a 1D tensor, in which case num_channels = 1 is assumed. Values must be floats in [-1, 1].
sample_rate (int) – The sample rate of the input samples. The sample rate of the encoded output can be specified using the encoding methods (to_file, etc.).
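The input contract above can be sketched as follows. `infer_num_channels` and `build_encoder` are hypothetical helpers, not part of torchcodec, and the 16 kHz rate and 440 Hz tone are arbitrary illustrative values. The sketch assumes `torch` and `torchcodec` are installed; they are imported lazily so the shape helper can be used on its own.

```python
import math


def infer_num_channels(shape):
    """Return the channel count implied by a samples shape (hypothetical helper)."""
    if len(shape) == 1:
        return 1  # a 1D tensor is treated as a single channel
    if len(shape) == 2:
        return shape[0]  # layout is (num_channels, num_samples)
    raise ValueError(f"expected a 1D or 2D shape, got {tuple(shape)}")


def build_encoder(sample_rate=16_000, duration_s=1.0, freq_hz=440.0):
    """Build an AudioEncoder for a mono sine tone (illustrative values)."""
    # Imported lazily so the helper above works without these packages.
    import torch
    from torchcodec.encoders import AudioEncoder

    num_samples = int(sample_rate * duration_s)
    t = torch.arange(num_samples, dtype=torch.float32) / sample_rate
    # A sine wave scaled by 0.5 stays well inside the required [-1, 1] range.
    samples = 0.5 * torch.sin(2 * math.pi * freq_hz * t)
    samples = samples.unsqueeze(0)  # shape (1, num_samples)
    assert infer_num_channels(samples.shape) == 1
    return AudioEncoder(samples, sample_rate=sample_rate)
```

Calling `build_encoder()` returns an encoder whose samples can then be written with any of the encoding methods documented on this page.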
- to_file(dest: Union[str, Path], *, bit_rate: Optional[int] = None, num_channels: Optional[int] = None, sample_rate: Optional[int] = None) -> None
Encode samples into a file.
- Parameters:
dest (str or pathlib.Path) – The path to the output file, e.g. audio.mp3. The extension of the file determines the audio format and container.
bit_rate (int, optional) – The output bit rate. Encoders typically support a finite set of bit rate values, so bit_rate will be matched to one of those supported values. The default is chosen by FFmpeg.
num_channels (int, optional) – The number of channels of the encoded output samples. By default, the number of channels of the input samples is used.
sample_rate (int, optional) – The sample rate of the encoded output. By default, the sample rate of the input samples is used.
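A minimal sketch of a to_file call, assuming `encoder` is an existing AudioEncoder instance. `dest_with_extension` and `encode_mono_mp3` are hypothetical helpers, and the bit rate, channel count, and sample rate are arbitrary illustrative values. The bit rate is assumed to be expressed in bits per second, which this page does not state explicitly.

```python
from pathlib import Path


def dest_with_extension(directory, stem, extension):
    """Build an output path; the extension selects the format (hypothetical helper)."""
    return Path(directory) / f"{stem}.{extension.lstrip('.')}"


def encode_mono_mp3(encoder, directory):
    """Write the encoder's samples to <directory>/clip.mp3 and return the path.

    `encoder` is assumed to be a torchcodec.encoders.AudioEncoder.
    """
    dest = dest_with_extension(directory, "clip", "mp3")
    encoder.to_file(
        dest,
        bit_rate=128_000,    # assumed bits per second; matched to a supported value
        num_channels=1,      # request a single output channel
        sample_rate=16_000,  # sample rate of the encoded output
    )
    return dest
```

Omitting the three keyword arguments keeps the input's channel count and sample rate and leaves the bit rate to FFmpeg's default.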
- to_file_like(file_like, format: str, *, bit_rate: Optional[int] = None, num_channels: Optional[int] = None, sample_rate: Optional[int] = None) -> None
Encode samples into a file-like object.
- Parameters:
file_like – A file-like object that supports write() and seek() methods, such as io.BytesIO(), an open file in binary write mode, etc. Methods must have the following signatures: write(data: bytes) -> int and seek(offset: int, whence: int = 0) -> int.
format (str) – The format of the encoded samples, e.g. “mp3”, “wav” or “flac”.
bit_rate (int, optional) – The output bit rate. Encoders typically support a finite set of bit rate values, so bit_rate will be matched to one of those supported values. The default is chosen by FFmpeg.
num_channels (int, optional) – The number of channels of the encoded output samples. By default, the number of channels of the input samples is used.
sample_rate (int, optional) – The sample rate of the encoded output. By default, the sample rate of the input samples is used.
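The file-like contract can be illustrated with a small wrapper that exposes exactly the two required methods. `CountingWriter` and `encode_into` are hypothetical names used only for this sketch, and `encoder` is assumed to be an existing AudioEncoder instance.

```python
import io


class CountingWriter:
    """Minimal file-like object with the write() and seek() signatures listed above."""

    def __init__(self):
        self._buffer = io.BytesIO()
        # Total bytes passed to write(); this can exceed the final size
        # if earlier bytes are overwritten after a seek().
        self.bytes_written = 0

    def write(self, data: bytes) -> int:
        self.bytes_written += len(data)
        return self._buffer.write(data)

    def seek(self, offset: int, whence: int = 0) -> int:
        return self._buffer.seek(offset, whence)

    def getvalue(self) -> bytes:
        return self._buffer.getvalue()


def encode_into(encoder, fmt="wav"):
    """Encode into a CountingWriter and return it.

    `encoder` is assumed to be a torchcodec.encoders.AudioEncoder.
    """
    writer = CountingWriter()
    encoder.to_file_like(writer, fmt)
    return writer
```

A plain `io.BytesIO()` works just as well when no bookkeeping is needed; the wrapper only makes the required method signatures explicit.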
- to_tensor(format: str, *, bit_rate: Optional[int] = None, num_channels: Optional[int] = None, sample_rate: Optional[int] = None) -> Tensor
Encode samples into raw bytes, as a 1D uint8 Tensor.
- Parameters:
format (str) – The format of the encoded samples, e.g. “mp3”, “wav” or “flac”.
bit_rate (int, optional) – The output bit rate. Encoders typically support a finite set of bit rate values, so bit_rate will be matched to one of those supported values. The default is chosen by FFmpeg.
num_channels (int, optional) – The number of channels of the encoded output samples. By default, the number of channels of the input samples is used.
sample_rate (int, optional) – The sample rate of the encoded output. By default, the sample rate of the input samples is used.
- Returns:
The raw encoded bytes as a 1D uint8 Tensor.
- Return type:
Tensor
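Because to_tensor returns the encoded stream as a 1D uint8 tensor, converting it to a Python `bytes` object is straightforward. The sketch below assumes `encoder` is an existing AudioEncoder instance; `encoded_bytes` is a hypothetical helper, not part of torchcodec.

```python
def encoded_bytes(encoder, fmt="flac"):
    """Return the encoded audio as a bytes object.

    `encoder` is assumed to be a torchcodec.encoders.AudioEncoder.
    """
    encoded = encoder.to_tensor(fmt)  # 1D uint8 tensor of raw encoded bytes
    # Each element is an integer in [0, 255], so the list maps directly to bytes.
    # This goes through a Python list, which is simple but slow for long clips.
    return bytes(encoded.tolist())
```

The resulting bytes can be sent over a network or stored in a database without touching the filesystem.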