
math.nextafter for cuda #9541

Open
s-m-e wants to merge 8 commits into main
Conversation

s-m-e commented Apr 23, 2024

This PR adds support for math.nextafter for CUDA. It partially fixes #9435 (numpy.nextafter remains unsupported) and is related to #9424 and #9438.
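For illustration, a minimal sketch of how the added support would be used (assuming a CUDA-capable device and a Numba build that includes this branch; the kernel and variable names are placeholders):

import math
import numpy as np
from numba import cuda

@cuda.jit
def nextafter_kernel(out, x, y):
    # each thread nudges x[i] by one ULP in the direction of y[i]
    i = cuda.grid(1)
    if i < out.size:
        out[i] = math.nextafter(x[i], y[i])

x = np.zeros(4, dtype=np.float64)   # FP32 operands are also supported by this PR
y = np.ones(4, dtype=np.float64)
out = np.empty_like(x)
nextafter_kernel[1, 32](out, x, y)
print(out)  # smallest positive subnormal double, 5e-324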

s-m-e requested a review from gmarkall as a code owner on April 23, 2024 at 23:43
s-m-e (Author) commented Apr 24, 2024

Caveat: this PR intentionally implements support only for FP32 and FP64, not FP16. I could not work out how to implement FP16 correctly, and I do not need it for my use case at the moment.

s-m-e (Author) commented Apr 26, 2024

@esc It appears I forgot the release notes - please re-run CI.

s-m-e (Author) commented Apr 30, 2024

Ping @gmarkall

gmarkall added this to the 0.61.0-rc1 milestone on Apr 30, 2024

# mapping of binary math functions to their fastmath counterparts
binarys_fastmath = {}
binarys_fastmath['powf'] = 'fast_powf'
binarys_fastmath['nextafterf'] = 'fast_nextafterf'
Member

Looks like there is a fast version of nextafter for 64-bit operands too: https://docs.nvidia.com/cuda/libdevice-users-guide/__nv_nextafter.html#__nv_nextafter - I'm not 100% sure this will work (it's been a while since I thought about these implementations), but maybe:

Suggested change:
- binarys_fastmath['nextafterf'] = 'fast_nextafterf'
+ binarys_fastmath['nextafterf'] = 'fast_nextafterf'
+ binarys_fastmath['nextafter'] = 'fast_nextafter'

Member

Additionally - if there are fastmath implementations of these functions, they should also be tested in numba/cuda/tests/cudapy/test_fastmath.py - you should be able to create a test by following the pattern used in the tests for other functions. If you're having trouble working out what patterns to check for, let me know and I'll see if I can help work out something appropriate.
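As a rough, hedged sketch of the kind of check such a test might make (this does not mirror the exact structure of test_fastmath.py, and the PTX pattern to assert on would need to be confirmed against real compiler output):

import math
from numba import cuda, float32

def nextafter_f32(r, x, y):
    i = cuda.grid(1)
    if i < r.size:
        r[i] = math.nextafter(x[i], y[i])

sig = (float32[:], float32[:], float32[:])

# compile_ptx returns (ptx_source, return_type); compiling the same kernel
# with and without fastmath lets the two code paths be compared
ptx_fast, _ = cuda.compile_ptx(nextafter_f32, sig, fastmath=True)
ptx_prec, _ = cuda.compile_ptx(nextafter_f32, sig, fastmath=False)

# a real test would assert on a specific pattern (e.g. a fast libdevice call
# surviving into the PTX); here we only report whether the outputs differ
print(ptx_fast != ptx_prec)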

gmarkall (Member) left a comment

Thanks for the PR! On the whole this looks good; I just have a comment on the diff about the fast implementation and how to test it.

I think omitting float16 support for now is fine - I didn't see an implementation in the cuda_fp16.{h,hpp} headers. I did see something that may be applicable in libcu++, but the route to using it (or whether it is applicable at all) is not obvious, so I'm happy with leaving it for now.

gmarkall added the "4 - Waiting on author" (waiting for author to respond to review) and "CUDA" (CUDA related issue/PR) labels and removed the "3 - Ready for Review" label on May 16, 2024
Labels
4 - Waiting on author, CUDA

Projects
None yet

Development
Successfully merging this pull request may close these issues:
nextafter (via both math and numpy) missing for CUDA

3 participants