New CUDA changes completely break rwkv.cpp #272
Comments
OK, it looks like the API usage has just changed. You have to manually change the
Thanks to @JohannesGaessler for giving us this info~
-Emily
Reposting this here with no changes because we're too upset to perfect it, sorry.
It looks like the latest GPU changes have completely broken rwkv.cpp inference - here is a pull request that seems to reproduce the issue: RWKV/rwkv.cpp#103
without cuBLAS:
with cuBLAS:
Removing the calls to ggml_cuda_assign_buffers fixes the issue... but of course then it might not actually be doing anything with cuBLAS~
(In practice, I know it probably is, because the precision seems slightly messed up, but I don't know if this is making use of the full acceleration or not.)
AIUI, the usage contract for cuBLAS acceleration has changed, but I can't seem to figure out how it has changed.
Any help would be much appreciated~
-Emily