
math.nextafter for cuda #9541

Open
s-m-e wants to merge 8 commits into main
Conversation

s-m-e commented Apr 23, 2024

This PR adds support for math.nextafter for CUDA. It partially fixes #9435 (numpy.nextafter remains unsupported) and is related to #9424 and #9438.
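For illustration, a minimal sketch of how the added support would be used (assuming a CUDA-capable device and a Numba build that includes this branch; the kernel and variable names are placeholders):

import math
import numpy as np
from numba import cuda

@cuda.jit
def nextafter_kernel(out, x, y):
    # each thread nudges x[i] by one ULP in the direction of y[i]
    i = cuda.grid(1)
    if i < out.size:
        out[i] = math.nextafter(x[i], y[i])

x = np.zeros(4, dtype=np.float64)   # FP32 operands are also supported by this PR
y = np.ones(4, dtype=np.float64)
out = np.empty_like(x)
nextafter_kernel[1, 32](out, x, y)
print(out)  # smallest positive subnormal double, 5e-324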

s-m-e requested a review from gmarkall as a code owner on April 23, 2024 at 23:43
s-m-e (Author) commented Apr 24, 2024

Caveat: this PR intentionally implements support only for FP32 and FP64, not FP16. I could not work out how to implement FP16 correctly, and I do not need it for my use case at the moment.

s-m-e (Author) commented Apr 26, 2024

@esc It appears I forgot the release notes - please re-run CI.

s-m-e (Author) commented Apr 30, 2024

Ping @gmarkall

gmarkall added this to the 0.61.0-rc1 milestone on Apr 30, 2024

# mapping of binary math functions to their fastmath counterparts
binarys_fastmath = {}
binarys_fastmath['powf'] = 'fast_powf'
binarys_fastmath['nextafterf'] = 'fast_nextafterf'
Member

Looks like there is a fast version of nextafter for 64-bit operands too: https://docs.nvidia.com/cuda/libdevice-users-guide/__nv_nextafter.html#__nv_nextafter - I'm not 100% sure this will work (it's been a while since I thought about these implementations), but maybe:

Suggested change:
- binarys_fastmath['nextafterf'] = 'fast_nextafterf'
+ binarys_fastmath['nextafterf'] = 'fast_nextafterf'
+ binarys_fastmath['nextafter'] = 'fast_nextafter'

Member

Additionally - if there are fastmath implementations of these functions, they should also be tested in numba/cuda/tests/cudapy/test_fastmath.py - you should be able to create a test by following the pattern used in the tests for other functions. If you're having trouble working out what patterns to check for, let me know and I'll see if I can help work out something appropriate.
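As a rough, hedged sketch of the kind of check such a test might make (this does not mirror the exact structure of test_fastmath.py, and the PTX pattern to assert on would need to be confirmed against real compiler output):

import math
from numba import cuda, float32

def nextafter_f32(r, x, y):
    i = cuda.grid(1)
    if i < r.size:
        r[i] = math.nextafter(x[i], y[i])

sig = (float32[:], float32[:], float32[:])

# compile_ptx returns (ptx_source, return_type); compiling the same kernel
# with and without fastmath lets the two code paths be compared
ptx_fast, _ = cuda.compile_ptx(nextafter_f32, sig, fastmath=True)
ptx_prec, _ = cuda.compile_ptx(nextafter_f32, sig, fastmath=False)

# a real test would assert on a specific pattern (e.g. a fast libdevice call
# surviving into the PTX); here we only report whether the outputs differ
print(ptx_fast != ptx_prec)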

gmarkall (Member) left a comment

Thanks for the PR! On the whole this looks good; I just have a comment on the diff about the fast implementation and how to test it.

I think omitting float16 support for now is fine - I didn't see an implementation in the cuda_fp16.{h,hpp} headers. I did see something that may be applicable in libcu++, but the route to using it (or whether it is applicable at all) is not obvious, so I'm happy with leaving it for now.

gmarkall added the "4 - Waiting on author" (waiting for author to respond to review) and "CUDA" (CUDA related issue/PR) labels and removed the "3 - Ready for Review" label on May 16, 2024
Labels
4 - Waiting on author, CUDA

Projects
None yet

Development
Successfully merging this pull request may close these issues:
nextafter (via both math and numpy) missing for CUDA

3 participants