Demo for Drafting: On-GPU Data Access (CUDA, CuPy, PyTorch, DLPack) #2429

Open · wants to merge 5 commits into main
Conversation

@xloem commented Nov 8, 2022

This is scratch work showcasing concepts for GPU data access coming together. It's a quick, half-working thing.

#1986 #1985 #210 #2391 #120 (note: #120 includes OpenCL work!) #2426 #57

Demo (watch the text output: it attaches a vertex buffer to a program, then exports it to PyTorch and CuPy in-place):

pip3 install torch cupy-cuda11x # or as appropriate
python3 -m vispy.gloo.dlpack.cuda_
# the pauses are from importing torch and cupy during the data export
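For context, the zero-copy hand-off the demo performs on the GL buffer is the same DLPack exchange that torch and cupy already support between each other. A minimal sketch of that exchange (using a plain CuPy array rather than the GL buffer this PR wraps) shows the aliasing behaviour the demo aims for:

import cupy
import torch

# A CuPy array standing in for the GPU-resident vertex data.
position = cupy.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]],
                      dtype=cupy.float32)

# CuPy arrays implement __dlpack__/__dlpack_device__, so torch can wrap
# the same device memory without copying (torch >= 1.10, cupy >= 10).
tensor = torch.from_dlpack(position)

assert tensor.data_ptr() == position.data.ptr  # same CUDA allocation
tensor[0, 0] = 42.0
print(position[0, 0])  # 42.0 -- the write is visible through both views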

I copied from the dlpack repository to try to write a basic DLPack wrapper for vispy's GL buffers. I spent maybe six hours on this and am sharing it now that it runs at all. I'm not sure whether I will return to it, but I expect others would find it helpful (as well as frustrating and undocumented).
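For anyone else poking at this: a DLPack producer essentially fills in the C structs from dlpack.h and hands them to consumers inside a PyCapsule named "dltensor" (renamed to "used_dltensor" once consumed). A rough ctypes transcription of those structs, sketched from the spec rather than copied from this branch:

import ctypes

# Device descriptor; kDLCUDA == 2 in the DLPack spec.
class DLDevice(ctypes.Structure):
    _fields_ = [("device_type", ctypes.c_int32),
                ("device_id", ctypes.c_int32)]

# Element type; float32 is (code=2 /* kDLFloat */, bits=32, lanes=1).
class DLDataType(ctypes.Structure):
    _fields_ = [("code", ctypes.c_uint8),
                ("bits", ctypes.c_uint8),
                ("lanes", ctypes.c_uint16)]

# The tensor view: a raw device pointer plus shape/stride metadata.
class DLTensor(ctypes.Structure):
    _fields_ = [("data", ctypes.c_void_p),
                ("device", DLDevice),
                ("ndim", ctypes.c_int32),
                ("dtype", DLDataType),
                ("shape", ctypes.POINTER(ctypes.c_int64)),
                ("strides", ctypes.POINTER(ctypes.c_int64)),  # NULL => C-contiguous
                ("byte_offset", ctypes.c_uint64)]

# What the capsule actually carries; the consumer invokes `deleter`
# when it is finished with the memory.
class DLManagedTensor(ctypes.Structure):
    pass

DLManagedTensor._fields_ = [
    ("dl_tensor", DLTensor),
    ("manager_ctx", ctypes.c_void_p),
    ("deleter", ctypes.CFUNCTYPE(None, ctypes.POINTER(DLManagedTensor)))]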

EDIT: I first posted this without accounting for the stride of the merged program data. The data is now manually de-strided, and the output is correct.

EDIT: I first posted this with an outstanding cupy crash. I've now addressed byte-offset quirks for both torch and cupy, and passed the device type as a Python int rather than a ctypes value. cupy now loads the data correctly.
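For reference, the device-type half of that fix lives in the __dlpack_device__ end of the protocol, which must return a pair of plain Python ints; a ctypes value there trips up the consumers' enum checks. A minimal sketch (the class here is hypothetical, the method name is the standard protocol, and neither is necessarily what this branch uses):

kDLCUDA = 2  # DLDeviceType value for CUDA in the DLPack spec

class SharedGLBuffer:
    """Hypothetical handle for a GL buffer mapped into CUDA."""

    def __init__(self, device_id=0):
        self.device_id = device_id

    def __dlpack_device__(self):
        # Plain Python ints, not ctypes.c_int(2): torch and cupy compare
        # these values directly against their own device-type enums.
        return (kDLCUDA, self.device_id)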

EDIT: The next remaining issue is integrating the code with vispy's GL pipeline. I'm not sure how to flush GL commands from the client without breaking the pipeline (which I haven't studied). A similar but separate issue may be synchronizing the CUDA stream with the GLIR queue.
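For what it's worth, the usual GL/CUDA interop pattern is a barrier on each side: make sure every GL command that writes the buffer has executed before CUDA touches it, and make sure the CUDA stream has drained before GL uses the buffer again. A rough sketch of those two barriers, assuming the GLIR queue has already been delivered to the driver (which is exactly the open question above):

from OpenGL.GL import glFinish
import cupy

# 1. Block until the GL driver has executed everything submitted so far,
#    so the vertex buffer contents really are in device memory. This only
#    covers commands already sent to GL; anything still sitting in vispy's
#    GLIR queue has not reached the driver yet.
glFinish()

# ... export the buffer via DLPack and run the CUDA-side work here ...

# 2. Before GL touches the buffer again, wait for the CUDA side to finish.
cupy.cuda.get_current_stream().synchronize()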

But once I got this far, it was pretty exciting to see the same data come out of the torch tensor as I had passed in to vispy's program.

EDIT: Current output is:

Sending position buffer to vispy:
position buffer = [[-1. -1.]
 [-1.  1.]
 [ 1. -1.]
 [ 1.  1.]]
Pulling position out:
position torch tensor =  tensor([[-1., -1.],
        [-1.,  1.],
        [ 1., -1.],
        [ 1.,  1.]], device='cuda:0')
position cupy array =  [[-1. -1.]
 [-1.  1.]
 [ 1. -1.]
 [ 1.  1.]]
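Rather than eyeballing the printout, the round trip can also be checked mechanically; a small sketch, assuming `tensor` is the torch view exported from the GL buffer as in the output above:

import numpy as np
import torch

def check_roundtrip(tensor):
    # The values originally uploaded to the vispy program.
    expected = np.array([[-1., -1.], [-1., 1.], [1., -1.], [1., 1.]],
                        dtype=np.float32)
    # Compare on the CPU to avoid device-placement surprises.
    assert torch.equal(tensor.detach().cpu(), torch.from_numpy(expected))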

@jakirkham commented

Thanks for sharing! 🙏

But once I got this far, it was pretty exciting to see the same data come out of the torch tensor as I had passed in to vispy's program.

Could you please share specifically what was run and the error seen?

@xloem (Author) commented Nov 9, 2022

I've updated the post to include the output. The command run is the first code block.

@xloem (Author) commented Nov 9, 2022

It runs without errors now. As noted in my edits above, the remaining issues are integrating the code with vispy's GL pipeline and synchronizing the CUDA stream with the GLIR queue.
