@dvrogozh dvrogozh commented Aug 22, 2025

Background

This PR illustrates how filter graphs can be used for multi-gpu support, using the ffmpeg vaapi backend as the example. It differs from #558 in that it uses ffmpeg filters for color space conversion. Note that the XpuDeviceInterface implemented in the last commit of this PR can be generalized further to support any hw (and cpu) ffmpeg backend, given these 3 backend-specific pieces:

  1. Function to create ffmpeg HW device context from the torch device (should return AVBufferRef)
  2. Settings to select and tune HW backend specific filters (scale vs. scale_vaapi vs. scale_cuda)
  3. Function to convert the output AVFrame (from filters) to a tensor
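
To make item 2 concrete, here is a minimal sketch of how a backend-specific filter description might be selected. It is not the code from this PR: the `SCALE_FILTERS` table and `build_filter_description` helper are hypothetical names, and the mapping from torch device type to filter is an assumption. Only the filter names (`scale`, `scale_vaapi`, `scale_cuda`) come from the list above. Whether a given hardware filter accepts a particular `format` value depends on the FFmpeg build and driver.

```python
# Hypothetical mapping from torch device type to the FFmpeg scale filter
# that would run on that device (assumption, not taken from the PR).
SCALE_FILTERS = {
    "cpu": "scale",
    "xpu": "scale_vaapi",
    "cuda": "scale_cuda",
}


def build_filter_description(device_type: str, width: int, height: int,
                             pix_fmt: str = "rgba") -> str:
    """Return a filter graph description string for the given device type."""
    try:
        name = SCALE_FILTERS[device_type]
    except KeyError:
        raise ValueError(f"no scale filter known for device type {device_type!r}")
    if device_type == "cpu":
        # The software scaler resizes; a separate format filter pins the
        # output pixel format.
        return f"{name}=w={width}:h={height},format=pix_fmts={pix_fmt}"
    # Hardware scalers take the output format as an option of the filter.
    return f"{name}=w={width}:h={height}:format={pix_fmt}"
```

For example, `build_filter_description("xpu", 640, 480)` yields a `scale_vaapi` description, while the same call with `"cpu"` yields a `scale` plus `format` chain. Items 1 and 3 (device context creation and frame-to-tensor conversion) would remain per-backend functions.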

This PR is a draft because the full (dec+filters) ffmpeg-vaapi pipeline currently produces lower-quality output than the pipeline in #558. The cause is how color standards are handled in the scale_vaapi filter and the Intel media driver. I believe these components have issues that need to be reported and fixed. I will follow up.

Details

This commit enables support for Intel GPUs in torchcodec. It adds:

  • ffmpeg-vaapi for decoding and color space conversion from decoding
    output to RGBA
  • RGBA surface import as torch tensor (on torch xpu device)
  • RGBA to RGB24 tensor slicing
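
As an illustration of the last bullet, the sketch below drops the alpha byte from a packed RGBA buffer using only the Python standard library. It is not the PR's implementation, which slices a torch tensor on the xpu device. The `rgba_to_rgb24` helper is a hypothetical name, and the buffer is assumed to be tightly packed (no row padding).

```python
def rgba_to_rgb24(rgba: bytes, width: int, height: int) -> bytes:
    """Drop the alpha byte of every packed RGBA pixel (hypothetical helper)."""
    if len(rgba) != width * height * 4:
        raise ValueError("buffer size does not match width * height * 4")
    rgb = bytearray(rgba)
    # Bytes 3, 7, 11, ... hold the A channel; delete them in one step.
    del rgb[3::4]
    return bytes(rgb)
```

With a tensor of shape (H, W, 4), the same idea would be a slice over the last dimension, such as `frame[:, :, :3]`.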

To build torchcodec with Intel GPU support:

  • Install pytorch with XPU backend support. For example, with:
pip3 install torch --index-url https://download.pytorch.org/whl/xpu

CC: @scotts @NicolasHug @eromomon

FFmpeg filter graphs cover many use cases, on both cpu and gpu.
This commit moves filter graph support out of the CPU device
interface so that other device interfaces can reuse it.

Signed-off-by: Dmitry Rogozhkin <[email protected]>
Dmitry Rogozhkin and others added 2 commits August 22, 2025 13:07
This commit enables support for Intel GPUs in torchcodec. It adds:
* ffmpeg-vaapi for decoding and color space conversion from decoding
  output to RGBA
* RGBA surface import as torch tensor (on torch xpu device)
* RGBA to RGB24 tensor slicing

To build torchcodec with Intel GPU support:
* Install pytorch with XPU backend support. For example, with:
```
pip3 install torch --index-url https://download.pytorch.org/whl/xpu
```
* Install oneAPI development environment following
  https://github.com/pytorch/pytorch?tab=readme-ov-file#intel-gpu-support
* Build and install FFmpeg with `--enable-vaapi`
* Install torcheval (for tests): `pip3 install torcheval`
* Build torchcodec with: `ENABLE_XPU=1 python3 setup.py devel`

Notes:
* RGB24 is not a supported color format on current Intel GPUs (it
  is considered suboptimal due to odd alignments)
* Intel media and compute APIs can't seamlessly work with each
  other's memory. For example, Intel compute's Unified Shared
  Memory pointers are not recognized by media APIs. Thus,
  lower level sharing via dma fds is needed. This also makes this
  part of the solution OS dependent.
* Color space conversion algorithms can differ considerably between
  implementations, as is the case for Intel. This requires checking
  PSNR values instead of per-pixel atol/rtol differences.
* Installing the oneAPI environment is needed due to
  pytorch/pytorch#149075
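
For reference, the PSNR comparison mentioned in the notes can be sketched as follows. This is a plain-Python illustration, not the test code from this commit (which uses torcheval). The `psnr` helper is a hypothetical name and 8-bit samples are assumed. The acceptance threshold is left to the caller.

```python
import math


def psnr(reference, candidate, max_value: int = 255) -> float:
    """Peak signal-to-noise ratio in dB between two equally sized sample sequences."""
    if len(reference) != len(candidate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical inputs
    return 10 * math.log10(max_value ** 2 / mse)
```

Unlike an atol/rtol check, which fails as soon as any single pixel is out of tolerance, PSNR summarizes the error over the whole frame, so two conversion algorithms that round differently can still compare as close.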

This commit was primarily verified on Intel Battlemage G21 (0xe20b) and
Intel Data Center GPU Flex (0x56c0).

Co-authored-by: Edgar Romo Montiel <[email protected]>
Signed-off-by: Edgar Romo Montiel <[email protected]>
Signed-off-by: Dmitry Rogozhkin <[email protected]>

Use ffmpeg-vaapi filters for color conversion in XPU interface

Signed-off-by: Dmitry Rogozhkin <[email protected]>