
Revisit: Use upstream Thrust complex headers and drop the vendored ones #8222

Open
leofang opened this issue Mar 6, 2024 · 1 comment
Labels: cat:enhancement, prio:medium


leofang commented Mar 6, 2024

Now that we track the head of CCCL and vendor it, it is time to revisit the struggle we had in #2629 (cc: @emcastillo for visibility, given your heroic efforts back then 😄).

The vendored Thrust complex headers are currently used at both build time and run time:

  • At build time, we use them to build cupy/cuda/{cub,thrust}.pyx.
  • At run time, we use complex<T> in the runtime-JIT kernels. Users do that too, by including <cupy/complex.cuh> in the code compiled by RawKernel/RawModule, as documented here (a minimal sketch follows this list).
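For concreteness, here is a minimal sketch of the user-facing pattern that must keep compiling unchanged after any switch to upstream headers; the kernel name, launch configuration, and scaling factor are made up for illustration.

```python
import cupy as cp

# User-side kernel relying on <cupy/complex.cuh>; this include line is the
# interface that must keep working if we move to upstream Thrust headers.
# (Kernel name and launch configuration are hypothetical.)
code = r'''
#include <cupy/complex.cuh>

extern "C" __global__
void scale_complex(const complex<float>* x, complex<float>* y,
                   float a, int n) {
    int i = blockDim.x * blockIdx.x + threadIdx.x;
    if (i < n) {
        y[i] = a * x[i];
    }
}
'''

kernel = cp.RawKernel(code, 'scale_complex')
n = 1024
x = cp.arange(n, dtype=cp.complex64)
y = cp.empty_like(x)
kernel(((n + 255) // 256,), (256,), (x, y, cp.float32(2.0), cp.int32(n)))
assert cp.allclose(y, 2.0 * x)
```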

Therefore, below are the obvious requirements:

  1. No code changes needed in user land (changes to CuPy-internal code are acceptable).
  2. No performance regression.
  3. Support NVRTC.

This also helps set the stage for future support (e.g., the upcoming NVIDIA/cccl#1140 will help #3370).

Right now, the immediate challenge is the recent Thrust namespace change in v2.3.1, which prevents me from easily updating the submodule and investigating the migration approach (xref: NVIDIA/cccl#1493). @miscco is helping investigate (#8221).

leofang self-assigned this Mar 6, 2024

leofang commented Mar 6, 2024

cc: @jrhemstad for vis
