deps/media-playback and plugins/win-dshow: Prioritize CUDA for media source decode #10607
Description
8K60 files did not play back at full rate in the media source on Windows with Nvidia cards, because the D3D11VA and DXVA2 decoders could not decode fast enough. This pull request gives the CUDA decoder (in practice, NVDEC) the highest priority so that high-resolution, high-frame-rate files decode at full speed. This does NOT use any additional CUDA cores; it only uses the NVDEC block on the Nvidia GPU.
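As an illustration of what "highest priority" means here, the following is a minimal, self-contained sketch of a priority-ordered decoder selection with fallback. It is not the code in this pull request: the `hw_decoder` enum, the `hw_priority` table, `pick_hw_decoder`, and the probe functions are all hypothetical names invented for this example. In a real FFmpeg-based build, the entries would correspond to hardware device types such as `AV_HWDEVICE_TYPE_CUDA`, `AV_HWDEVICE_TYPE_D3D11VA`, and `AV_HWDEVICE_TYPE_DXVA2`, and the probe would attempt to open the device.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for decoder back ends; these are NOT OBS or
 * FFmpeg identifiers. */
enum hw_decoder {
	HW_DECODER_NONE = 0, /* no hardware decoder available (software fallback) */
	HW_DECODER_CUDA,     /* NVDEC, exposed through the CUDA hwaccel */
	HW_DECODER_D3D11VA,
	HW_DECODER_DXVA2,
};

/* Highest priority first: CUDA is tried before D3D11VA and DXVA2. */
static const enum hw_decoder hw_priority[] = {
	HW_DECODER_CUDA,
	HW_DECODER_D3D11VA,
	HW_DECODER_DXVA2,
};

/* Caller-supplied probe (assumption): returns true if the given back end
 * can be opened on this machine. */
typedef bool (*hw_probe_fn)(enum hw_decoder type);

/* Walk the priority table and return the first back end the probe accepts. */
enum hw_decoder pick_hw_decoder(hw_probe_fn probe)
{
	assert(probe != NULL);

	for (size_t i = 0; i < sizeof(hw_priority) / sizeof(hw_priority[0]); i++) {
		if (probe(hw_priority[i]))
			return hw_priority[i];
	}
	return HW_DECODER_NONE;
}

/* Example probes simulating three kinds of machine. */
bool probe_all_available(enum hw_decoder type)
{
	(void)type;
	return true; /* e.g. an Nvidia GPU where every back end opens */
}

bool probe_no_cuda(enum hw_decoder type)
{
	return type != HW_DECODER_CUDA; /* e.g. a non-Nvidia GPU */
}

bool probe_nothing(enum hw_decoder type)
{
	(void)type;
	return false; /* no hardware decode at all */
}
```

With this ordering, a machine where CUDA opens successfully selects it first, while a machine without CUDA falls through to D3D11VA and then DXVA2, so behavior on non-Nvidia hardware would be unchanged under these assumptions.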
Motivation and Context
I have only been able to play back 8K HEVC video files at full speed on macOS; this change brings the Windows version to feature parity. I am making this change because VR video playback and re-streaming require 8K decode in HEVC and possibly AV1.
How Has This Been Tested?
I tested the media source with hardware decoding on two Windows machines, one with an Nvidia L4 and one with an Nvidia ADA 6000. This does not appear to affect any other code. With the D3D11VA decoder, the media source plays the 8K60 file at approximately 20 fps (1/3 speed). With the CUDA decoder, it plays the file at 60 fps (full speed).
I gathered detailed stats on the L4 machine through Task Manager. The CUDA decoder actually uses considerably fewer GPU resources than the D3D11VA decoder: 3D usage averaged between 86% and 90% with D3D11VA and between 42% and 46% with CUDA. GPU memory usage was the same, and decoder block usage was slightly lower with CUDA (11% vs 15%). There was no change in CPU or RAM usage.
This appears to fix the video playback issue with no impact on anything else within OBS.
Types of changes
Bug fix (non-breaking change which fixes an issue)
Performance enhancement (non-breaking change which improves efficiency)
Checklist: