.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "generated_examples/decoding/audio_decoding.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_generated_examples_decoding_audio_decoding.py:


========================================
Decoding audio streams with AudioDecoder
========================================

In this example, we'll learn how to decode an audio file using the
:class:`~torchcodec.decoders.AudioDecoder` class.

.. GENERATED FROM PYTHON SOURCE LINES 18-21

First, a bit of boilerplate: we'll download an audio file from the web and
define an audio playing utility. You can skip this part and jump straight to
:ref:`creating_decoder_audio`.

.. GENERATED FROM PYTHON SOURCE LINES 21-38

.. code-block:: Python


    import requests
    from IPython.display import Audio


    def play_audio(samples):
        # Wrap the decoded samples in an IPython audio player widget.
        return Audio(samples.data, rate=samples.sample_rate)


    # Audio source is CC0: https://opengameart.org/content/town-theme-rpg
    # Attribution: cynicmusic.com pixelsphere.org
    url = "https://opengameart.org/sites/default/files/TownTheme.mp3"
    response = requests.get(url, headers={"User-Agent": ""})
    if response.status_code != 200:
        raise RuntimeError(f"Failed to download audio. {response.status_code = }.")

    raw_audio_bytes = response.content

.. GENERATED FROM PYTHON SOURCE LINES 40-48

.. _creating_decoder_audio:

Creating a decoder
------------------

We can now create a decoder from the raw (encoded) audio bytes. You can of
course use a local audio file and pass the path as input. You can also decode
audio streams from videos!

.. GENERATED FROM PYTHON SOURCE LINES 48-53

.. code-block:: Python


    from torchcodec.decoders import AudioDecoder

    decoder = AudioDecoder(raw_audio_bytes)

.. GENERATED FROM PYTHON SOURCE LINES 54-57

The audio has not yet been decoded by the decoder, but we already have access
to some metadata via the ``metadata`` attribute, which is an
:class:`~torchcodec.decoders.AudioStreamMetadata` object.

.. GENERATED FROM PYTHON SOURCE LINES 57-59

.. code-block:: Python

    print(decoder.metadata)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    AudioStreamMetadata:
      duration_seconds_from_header: 97.48897959183674
      begin_stream_seconds_from_header: 0.02505668934240363
      bit_rate: 108039
      codec: mp3
      stream_index: 0
      duration_seconds: 97.48897959183674
      begin_stream_seconds: 0.02505668934240363
      sample_rate: 44100
      num_channels: 2
      sample_format: fltp

.. GENERATED FROM PYTHON SOURCE LINES 60-66

Decoding samples
----------------

To get decoded samples, we just need to call the
:meth:`~torchcodec.decoders.AudioDecoder.get_all_samples` method, which
returns an :class:`~torchcodec.AudioSamples` object:

.. GENERATED FROM PYTHON SOURCE LINES 66-72

.. code-block:: Python


    samples = decoder.get_all_samples()

    print(samples)
    play_audio(samples)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    AudioSamples:
      data (shape): torch.Size([2, 4297722])
      pts_seconds: 0.02505668934240363
      duration_seconds: 97.45401360544217
      sample_rate: 44100
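The fields printed above are tied together by simple arithmetic: the number of
samples per channel divided by the sample rate gives the duration in seconds.
The following is a minimal sketch that only reuses the numbers from the output
above. The helper name ``duration_from_num_samples`` is hypothetical and is not
part of TorchCodec.

```python
def duration_from_num_samples(num_samples: int, sample_rate: int) -> float:
    """Return the duration, in seconds, of ``num_samples`` samples per channel."""
    return num_samples / sample_rate


# Values copied from the printed AudioSamples output above.
shape = (2, 4297722)  # read here as (num_channels, num_samples)
sample_rate = 44100

num_channels, num_samples = shape
duration = duration_from_num_samples(num_samples, sample_rate)
print(f"{num_channels} channels, {duration:.3f} seconds")
```

The result agrees with the ``duration_seconds`` value reported by the decoder.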

.. GENERATED FROM PYTHON SOURCE LINES 73-87

The ``.data`` field is a tensor of shape ``(num_channels, num_samples)`` and of
float dtype with values in [-1, 1].

The ``.pts_seconds`` field indicates the starting time of the output samples.
Here it's 0.025 seconds, even though we asked for samples starting from 0. Not
all streams start exactly at 0! This is not a bug in TorchCodec: it is a
property of the file, set when it was encoded.

Specifying a range
------------------

If we don't need all the samples, we can use
:meth:`~torchcodec.decoders.AudioDecoder.get_samples_played_in_range` to
decode the samples within a custom range:

.. GENERATED FROM PYTHON SOURCE LINES 87-93

.. code-block:: Python


    samples = decoder.get_samples_played_in_range(start_seconds=10, stop_seconds=70)

    print(samples)
    play_audio(samples)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    AudioSamples:
      data (shape): torch.Size([2, 2646000])
      pts_seconds: 10.0
      duration_seconds: 60.0
      sample_rate: 44100
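As a sanity check on the range output above, the number of samples per channel
can be estimated from the range length and the sample rate. This is only a
rough sketch: ``expected_num_samples`` is a hypothetical helper, not a
TorchCodec function, and it does not model how the decoder treats range bounds
that fall between two samples.

```python
def expected_num_samples(start_seconds: float, stop_seconds: float, sample_rate: int) -> int:
    """Estimate how many samples per channel a time range covers."""
    if stop_seconds < start_seconds:
        raise ValueError("stop_seconds must not be smaller than start_seconds")
    return round((stop_seconds - start_seconds) * sample_rate)


# Range and sample rate taken from the example above.
n = expected_num_samples(start_seconds=10, stop_seconds=70, sample_rate=44100)
print(n)
```

For the 10 to 70 second range at 44100 Hz, the estimate matches the second
dimension of the printed ``data`` shape.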

.. GENERATED FROM PYTHON SOURCE LINES 94-101

Custom sample rate
------------------

We can also decode the samples into a desired sample rate using the
``sample_rate`` parameter of :class:`~torchcodec.decoders.AudioDecoder`. The
output will sound similar, but note that the number of samples per channel
drops substantially compared with decoding all samples at the native 44100 Hz
rate:

.. GENERATED FROM PYTHON SOURCE LINES 101-107

.. code-block:: Python


    decoder = AudioDecoder(raw_audio_bytes, sample_rate=16_000)
    samples = decoder.get_all_samples()

    print(samples)
    play_audio(samples)

.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    AudioSamples:
      data (shape): torch.Size([2, 1559264])
      pts_seconds: 0.02505668934240363
      duration_seconds: 97.454
      sample_rate: 16000
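The drop in sample count follows from the ratio of the two sample rates. The
sketch below estimates the resampled length from the native length. It is an
approximation under stated assumptions: ``approx_resampled_num_samples`` is a
hypothetical helper, and the exact rounding applied by the decoder's resampler
is not described in this example, so the estimate may differ from the real
output by about one sample.

```python
def approx_resampled_num_samples(num_samples: int, source_rate: int, target_rate: int) -> float:
    """Estimate the number of samples per channel after resampling."""
    return num_samples * target_rate / source_rate


# 4297722 samples per channel at 44100 Hz, as printed by get_all_samples()
# before resampling; 16000 Hz is the rate requested above.
approx = approx_resampled_num_samples(4297722, source_rate=44100, target_rate=16000)
print(f"about {approx:.1f} samples per channel")
```

The estimate lands within one sample of the second dimension of the printed
``data`` shape.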
