Skip to content

Support CPU fallback for videos that don't get decoded by nvdec #792

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Jul 30, 2025

Conversation

NicolasHug
Copy link
Member

@NicolasHug NicolasHug commented Jul 27, 2025

This PR enables support for some 10bit videos. It addresses part of #776.

A bit of context, rehashed from #776:
Sometimes, NVDEC cannot decode the video, so it actually relies on the CPU to do the decoding. For those videos, all that's left to do is the color-conversion step. We can either rely on our existing CPU implementation for that, or move the decoded frame to the GPU and run the color-conversion there.This PR implements the former as it's easier, but we should try to do the second option eventually. This is left as a TODO.

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jul 27, 2025
@NicolasHug NicolasHug changed the title Support CPU fallaback for videos that don't get decoded by nvdec Support CPU fallback for videos that don't get decoded by nvdec Jul 28, 2025
@NicolasHug NicolasHug marked this pull request as ready for review July 28, 2025 18:50
frame_gpu = decoder_gpu.get_frame_at(frame_index).data
assert frame_gpu.device.type == "cuda"
frame_cpu = decoder_cpu.get_frame_at(frame_index).data
assert_frames_equal(frame_gpu.cpu(), frame_cpu)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the case where NVDEC is unable to decode a frame stops occurring, will this test still pass?
In other words, would we expect the frames to not be equal when decoded by GPU vs CPU?

This testing approach looks good overall. I wonder if we could use pytest patch / mocks to check that cpuInterface->convertAVFrameToFrameOutput gets called, but it's not urgent at the moment.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the case where NVDEC is unable to decode a frame stops occurring, will this test still pass?

If NVDEC is suddenly able to decode this frame on the GPU, then we'd be failing this other check a bit later:

  TORCH_CHECK(
      actualFormat == AV_PIX_FMT_NV12,

That's because this video is 10bits, so it's not in AV_PIX_FMT_NV12 format.

That's only for the current state of this PR though. In #790 which I'm working on, we'll be able to remove this actualFormat == AV_PIX_FMT_NV12 check.

I wonder if we could use pytest patch / mocks to check that cpuInterface->convertAVFrameToFrameOutput gets called

You're right, it'd be great if we could do that. Right now we're only testing at a high level that this video can be decoded, but we don't know for sure that it's going through the cpu path. Well, we know it does because of our previous tests, but it's true that this is a weak guarantee.

I don't know how we could easily test this from python/pytest, because this is a very low-level C++ path and I'm not sure how to mock it. But that's a great point, ideally this is what we should test.

@NicolasHug NicolasHug merged commit 5becf10 into pytorch:main Jul 30, 2025
45 checks passed
@NicolasHug NicolasHug deleted the enable10bitscudaAV_PIX_FMT_CUDA branch July 30, 2025 07:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Meta Open Source bot.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants