Now that we stay at the head and vendor CCCL, it is time to revisit the struggle we had in #2629 (cc: @emcastillo for vis, given your heroic efforts back then 😄).
The vendored Thrust complex headers are currently used at both build time and run time:
At build time, we use them to build cupy/cuda/{cub,thrust}.pyx.
At run time, we use complex<T> in the runtime-JIT kernels. Users do that too, via including <cupy/complex.cuh> in the code compiled by RawKernel/RawModule, which is documented here.
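For context, the user-facing pattern that must keep working looks roughly like the sketch below. It follows the shape of the documented `RawKernel` usage. The kernel name `my_func`, the helper `run_example`, and the array sizes are illustrative only. Running the helper needs a CUDA-capable GPU and an installed CuPy; defining it does not.

```python
# Kernel source as a user would write it: complex<T> comes from the
# vendored header, so no Thrust namespace appears in user code.
KERNEL_SOURCE = r'''
#include <cupy/complex.cuh>
extern "C" __global__
void my_func(const complex<float>* x1, const complex<float>* x2,
             complex<float>* y, float a) {
    int tid = blockDim.x * blockIdx.x + threadIdx.x;
    y[tid] = x1[tid] + a * x2[tid];
}
'''


def run_example():
    """Compile the kernel at run time and launch it (requires a GPU)."""
    import cupy as cp  # imported lazily so this file loads without CuPy

    kernel = cp.RawKernel(KERNEL_SOURCE, 'my_func')
    x1 = cp.arange(25, dtype=cp.complex64)
    x2 = 1j * cp.arange(25, dtype=cp.complex64)
    y = cp.zeros(25, dtype=cp.complex64)
    # 5 blocks of 5 threads cover the 25 elements exactly.
    kernel((5,), (5,), (x1, x2, y, cp.float32(2.0)))
    return y
```

Any migration of the vendored headers should leave a kernel like this compiling unchanged, with the unqualified `complex<float>` still resolving after `#include <cupy/complex.cuh>`.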
Therefore, below are the obvious requirements:
No code change needed in user land (it's acceptable if changes in CuPy internal code are needed).
No performance regression.
Support NVRTC.
This also helps set the stage for future support (e.g., the upcoming support in NVIDIA/cccl#1140 will help #3370).
Right now, the immediate challenge seems to be the recent Thrust namespace change in v2.3.1, which keeps me from easily updating the submodule and investigating the migration approach (xref: NVIDIA/cccl#1493). @miscco is helping with the investigation (#8221).