.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "generated_examples/decoding/custom_frame_mappings.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. rst-class:: sphx-glr-example-title

.. _sphx_glr_generated_examples_decoding_custom_frame_mappings.py:


====================================
Decoding with custom frame mappings
====================================

In this example, we will describe the ``custom_frame_mappings`` parameter of the
:class:`~torchcodec.decoders.VideoDecoder` class.
This parameter allows you to provide pre-computed frame mapping information to
speed up :class:`~torchcodec.decoders.VideoDecoder` instantiation, while
maintaining the frame seeking accuracy of ``seek_mode="exact"``.

This makes it ideal for workflows where:

1. Frame accuracy is critical, so :doc:`approximate mode ` cannot be used
2. Videos can be preprocessed once and then decoded many times

.. GENERATED FROM PYTHON SOURCE LINES 25-29

First, some boilerplate: we'll download a short video from the web, and
use ffmpeg to create a longer version by repeating it multiple times. We'll end up
with two videos: a short one of approximately 14 seconds and a long one of about 12 minutes.
You can ignore this part and skip below to :ref:`frame_mappings_creation`.

.. GENERATED FROM PYTHON SOURCE LINES 29-62

.. code-block:: Python


    import tempfile
    from pathlib import Path
    import subprocess
    import requests

    # Video source: https://www.pexels.com/video/dog-eating-854132/
    # License: CC0. Author: Coverr.
    url = "https://videos.pexels.com/video-files/854132/854132-sd_640_360_25fps.mp4"
    response = requests.get(url, headers={"User-Agent": ""})
    if response.status_code != 200:
        raise RuntimeError(f"Failed to download video. {response.status_code = }.")

    temp_dir = tempfile.mkdtemp()
    short_video_path = Path(temp_dir) / "short_video.mp4"
    with open(short_video_path, 'wb') as f:
        for chunk in response.iter_content():
            f.write(chunk)

    long_video_path = Path(temp_dir) / "long_video.mp4"
    ffmpeg_command = [
        "ffmpeg",
        "-stream_loop", "50",  # loop the input 50 extra times (51 plays) to get a ~12 min video
        "-i", f"{short_video_path}",
        "-c", "copy",
        f"{long_video_path}"
    ]
    subprocess.run(ffmpeg_command, check=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

    from torchcodec.decoders import VideoDecoder
    print(f"Short video duration: {VideoDecoder(short_video_path).metadata.duration_seconds} seconds")
    print(f"Long video duration: {VideoDecoder(long_video_path).metadata.duration_seconds / 60} minutes")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Short video duration: 13.8 seconds
    Long video duration: 11.729999999999999 minutes


.. GENERATED FROM PYTHON SOURCE LINES 63-75

.. _frame_mappings_creation:

Creating custom frame mappings with ffprobe
-------------------------------------------

To generate JSON files containing the required video metadata, we recommend using ffprobe.
The following frame metadata fields are needed
(the ``pkt_`` prefix is needed for older versions of FFmpeg):

- ``pts`` / ``pkt_pts``: Presentation timestamps for each frame
- ``duration`` / ``pkt_duration``: Duration of each frame
- ``key_frame``: Boolean indicating which frames are key frames

.. GENERATED FROM PYTHON SOURCE LINES 75-104

.. code-block:: Python


    from pathlib import Path
    import subprocess
    import tempfile
    from time import perf_counter_ns
    import json


    # Let's define a simple function that runs ffprobe on one stream of a video
    # (the first stream in this example) and writes the results to output_json_path.
    def generate_frame_mappings(video_path, output_json_path, stream_index):
        ffprobe_cmd = [
            "ffprobe",
            "-i", f"{video_path}",
            "-select_streams", f"{stream_index}",
            "-show_frames",
            "-show_entries", "frame=pts,duration,key_frame",
            "-of", "json",
        ]
        print(f"Running ffprobe:\n{' '.join(ffprobe_cmd)}\n")
        ffprobe_result = subprocess.run(ffprobe_cmd, check=True, capture_output=True, text=True)
        with open(output_json_path, "w") as f:
            f.write(ffprobe_result.stdout)


    stream_index = 0
    long_json_path = Path(temp_dir) / "long_custom_frame_mappings.json"
    short_json_path = Path(temp_dir) / "short_custom_frame_mappings.json"

    generate_frame_mappings(long_video_path, long_json_path, stream_index)
    generate_frame_mappings(short_video_path, short_json_path, stream_index)
    with open(short_json_path) as f:
        sample_data = json.loads(f.read())
        print("Sample of fields in custom frame mappings:")
        for frame in sample_data["frames"][:3]:
            print(f"{frame['key_frame'] = }, {frame['pts'] = }, {frame['duration'] = }")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Running ffprobe:
    ffprobe -i /tmp/tmpx4qs2oi7/long_video.mp4 -select_streams 0 -show_frames -show_entries frame=pts,duration,key_frame -of json

    Running ffprobe:
    ffprobe -i /tmp/tmpx4qs2oi7/short_video.mp4 -select_streams 0 -show_frames -show_entries frame=pts,duration,key_frame -of json

    Sample of fields in custom frame mappings:
    frame['key_frame'] = 1, frame['pts'] = 0, frame['duration'] = 1
    frame['key_frame'] = 0, frame['pts'] = 1, frame['duration'] = 1
    frame['key_frame'] = 0, frame['pts'] = 2, frame['duration'] = 1


.. GENERATED FROM PYTHON SOURCE LINES 105-113

.. _custom_frame_mappings_perf_creation:

Performance: ``VideoDecoder`` creation
--------------------------------------

Custom frame mappings affect the **creation** of a :class:`~torchcodec.decoders.VideoDecoder`
object. As video length or resolution increases, the performance gain compared to exact mode increases.

.. GENERATED FROM PYTHON SOURCE LINES 113-150

.. code-block:: Python


    import torch


    # Here, we define a benchmarking function, with the option to seek to the start of a file_like.
    def bench(f, file_like=False, average_over=50, warmup=2, **f_kwargs):
        for _ in range(warmup):
            f(**f_kwargs)
            if file_like:
                f_kwargs["custom_frame_mappings"].seek(0)

        times = []
        for _ in range(average_over):
            start = perf_counter_ns()
            f(**f_kwargs)
            end = perf_counter_ns()
            times.append(end - start)
            if file_like:
                f_kwargs["custom_frame_mappings"].seek(0)
        times = torch.tensor(times) * 1e-6  # ns to ms
        std = times.std().item()
        med = times.median().item()
        print(f"{med = :.2f}ms +- {std:.2f}")


    for video_path, json_path in ((short_video_path, short_json_path), (long_video_path, long_json_path)):
        print(f"\nRunning benchmarks on {Path(video_path).name}")

        print("Creating a VideoDecoder object with custom_frame_mappings:")
        with open(json_path, "r") as f:
            bench(VideoDecoder, file_like=True, source=video_path, stream_index=stream_index, custom_frame_mappings=f)

        # Compare against exact seek_mode
        print("Creating a VideoDecoder object with seek_mode='exact':")
        bench(VideoDecoder, source=video_path, stream_index=stream_index, seek_mode="exact")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Running benchmarks on short_video.mp4
    Creating a VideoDecoder object with custom_frame_mappings:
    med = 7.67ms +- 0.02
    Creating a VideoDecoder object with seek_mode='exact':
    med = 8.00ms +- 0.03

    Running benchmarks on long_video.mp4
    Creating a VideoDecoder object with custom_frame_mappings:
    med = 33.59ms +- 0.31
    Creating a VideoDecoder object with seek_mode='exact':
    med = 59.50ms +- 0.73


.. GENERATED FROM PYTHON SOURCE LINES 151-158

Performance: Frame decoding with custom frame mappings
------------------------------------------------------

Although ``custom_frame_mappings`` only affects the initialization speed of
:class:`~torchcodec.decoders.VideoDecoder`, every decoding workflow
creates a :class:`~torchcodec.decoders.VideoDecoder` instance,
so the speed-up carries over to end-to-end decoding time.

.. GENERATED FROM PYTHON SOURCE LINES 158-178

.. code-block:: Python



    def decode_frames(video_path, seek_mode="exact", custom_frame_mappings=None):
        decoder = VideoDecoder(
            source=video_path,
            seek_mode=seek_mode,
            custom_frame_mappings=custom_frame_mappings
        )
        decoder.get_frames_in_range(start=0, stop=10)


    for video_path, json_path in ((short_video_path, short_json_path), (long_video_path, long_json_path)):
        print(f"\nRunning benchmarks on {Path(video_path).name}")
        print("Decoding frames with custom_frame_mappings:")
        with open(json_path, "r") as f:
            bench(decode_frames, file_like=True, video_path=video_path, custom_frame_mappings=f)

        print("Decoding frames with seek_mode='exact':")
        bench(decode_frames, video_path=video_path, seek_mode="exact")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Running benchmarks on short_video.mp4
    Decoding frames with custom_frame_mappings:
    med = 23.32ms +- 0.04
    Decoding frames with seek_mode='exact':
    med = 23.63ms +- 0.04

    Running benchmarks on long_video.mp4
    Decoding frames with custom_frame_mappings:
    med = 49.27ms +- 0.12
    Decoding frames with seek_mode='exact':
    med = 75.91ms +- 0.47


.. GENERATED FROM PYTHON SOURCE LINES 179-185

Accuracy: Metadata and frame retrieval
--------------------------------------

In addition to speeding up instantiation compared to ``seek_mode="exact"``,
custom frame mappings retain the benefits of exact metadata and exact frame seeking.

.. GENERATED FROM PYTHON SOURCE LINES 185-203

.. code-block:: Python


    print("Metadata of short video with custom_frame_mappings:")
    with open(short_json_path, "r") as f:
        print(VideoDecoder(short_video_path, custom_frame_mappings=f).metadata)
    print("Metadata of short video with seek_mode='exact':")
    print(VideoDecoder(short_video_path, seek_mode="exact").metadata)

    with open(short_json_path, "r") as f:
        custom_frame_mappings_decoder = VideoDecoder(short_video_path, custom_frame_mappings=f)
        exact_decoder = VideoDecoder(short_video_path, seek_mode="exact")
        for i in range(len(exact_decoder)):
            torch.testing.assert_close(
                exact_decoder.get_frame_at(i).data,
                custom_frame_mappings_decoder.get_frame_at(i).data,
                atol=0, rtol=0,
            )
    print("Frame seeking is the same for this video!")


.. rst-class:: sphx-glr-script-out

 .. code-block:: none

    Metadata of short video with custom_frame_mappings:
    VideoStreamMetadata:
      duration_seconds_from_header: 13.8
      begin_stream_seconds_from_header: 0.0
      bit_rate: 505790.0
      codec: h264
      stream_index: 0
      begin_stream_seconds_from_content: 0.0
      end_stream_seconds_from_content: 13.8
      width: 640
      height: 360
      num_frames_from_header: 345
      num_frames_from_content: 345
      average_fps_from_header: 25.0
      pixel_aspect_ratio: 1
      duration_seconds: 13.8
      begin_stream_seconds: 0.0
      end_stream_seconds: 13.8
      num_frames: 345
      average_fps: 25.0

    Metadata of short video with seek_mode='exact':
    VideoStreamMetadata:
      duration_seconds_from_header: 13.8
      begin_stream_seconds_from_header: 0.0
      bit_rate: 505790.0
      codec: h264
      stream_index: 0
      begin_stream_seconds_from_content: 0.0
      end_stream_seconds_from_content: 13.8
      width: 640
      height: 360
      num_frames_from_header: 345
      num_frames_from_content: 345
      average_fps_from_header: 25.0
      pixel_aspect_ratio: 1
      duration_seconds: 13.8
      begin_stream_seconds: 0.0
      end_stream_seconds: 13.8
      num_frames: 345
      average_fps: 25.0

    Frame seeking is the same for this video!


.. GENERATED FROM PYTHON SOURCE LINES 204-223

How do custom_frame_mappings help?
----------------------------------

Custom frame mappings contain the same frame index information
that would normally be computed during the :term:`scan` operation in exact mode.
Providing this information to the :class:`~torchcodec.decoders.VideoDecoder`
as JSON eliminates the need for the expensive scan while preserving the
accuracy benefits of exact mode.

Which mode should I use?
------------------------

- For the fastest decoding, when speed matters more than exact seeking accuracy,
  "approximate" mode is recommended.

- For exact frame seeking, custom frame mappings benefit workflows where the
  same videos are decoded repeatedly and some preprocessing work can be done up front.

- For exact frame seeking without preprocessing, use "exact" mode.


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** (0 minutes 31.497 seconds)


.. _sphx_glr_download_generated_examples_decoding_custom_frame_mappings.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: custom_frame_mappings.ipynb `

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: custom_frame_mappings.py `

    .. container:: sphx-glr-download sphx-glr-download-zip

      :download:`Download zipped: custom_frame_mappings.zip `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_
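
As a supplementary illustration of the frame index information discussed in
"How do custom_frame_mappings help?", here is a minimal sketch of why ``pts``,
``duration`` and ``key_frame`` are enough to locate a frame without scanning the file.
It is not TorchCodec's implementation: ``normalize_frames`` and ``locate`` are
hypothetical helper names, and the sample data is made up to mirror the shape of the
ffprobe output shown earlier, including one frame that uses the older ``pkt_`` prefix.

```python
from bisect import bisect_right


def normalize_frames(ffprobe_frames):
    """Return (pts, duration, is_key_frame) tuples sorted by pts.

    Hypothetical helper: accepts both the current field names (pts, duration)
    and the pkt_-prefixed names emitted by older FFmpeg versions.
    """
    rows = []
    for frame in ffprobe_frames:
        pts = frame["pts"] if "pts" in frame else frame["pkt_pts"]
        duration = frame["duration"] if "duration" in frame else frame["pkt_duration"]
        rows.append((int(pts), int(duration), bool(int(frame["key_frame"]))))
    rows.sort(key=lambda row: row[0])
    return rows


def locate(rows, target_pts):
    """Return (frame_index, key_frame_index) for the frame shown at target_pts.

    frame_index is the last frame whose pts is <= target_pts.
    key_frame_index is the closest key frame at or before that frame, which is
    where a decoder would have to start decoding from. If no earlier key frame
    is flagged, index 0 is returned as a fallback.
    """
    all_pts = [row[0] for row in rows]
    frame_index = bisect_right(all_pts, target_pts) - 1
    if frame_index < 0:
        raise ValueError("target_pts precedes the first frame")
    key_frame_index = frame_index
    while key_frame_index > 0 and not rows[key_frame_index][2]:
        key_frame_index -= 1
    return frame_index, key_frame_index


# Made-up sample in the shape of the "frames" list from the ffprobe JSON.
frames = [
    {"key_frame": 1, "pts": 0, "duration": 1},
    {"key_frame": 0, "pts": 1, "duration": 1},
    {"key_frame": 0, "pkt_pts": 2, "pkt_duration": 1},  # older FFmpeg naming
    {"key_frame": 1, "pts": 3, "duration": 1},
    {"key_frame": 0, "pts": 4, "duration": 1},
]
rows = normalize_frames(frames)
print(locate(rows, 2))  # (2, 0)
print(locate(rows, 4))  # (4, 3)
```

With the index built once from the JSON, each lookup is a binary search plus a short
backward walk to the previous key frame, which is the kind of information the scan
would otherwise have to collect by reading the whole file.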