
Update GGML #103

Merged · 6 commits merged into RWKV:master · Jun 26, 2023

Conversation

LoganDark
Contributor

This updates GGML to the latest version, with Metal support and improved CUDA support. A lot changed, including some fundamental operations, so we had to rework the memory estimation again (sorry!). The new one should be more readable, though.

Except now enabling cuBLAS produces total nonsense output, so either ggml broke something, or we are not properly transforming every single operation we perform on any tensor that touches the GPU.

We don't have time to fix this immediately, but decided to open this draft PR since we reworked the memory estimation system (again) and everything runs as long as you don't enable cuBLAS.

-Emily

Of course we forgot why we did this, and broke the build again, in
the exact same way, a second time.
Properly set the backend and then call ggml_cuda_transform_tensor
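
For context, a rough sketch of the pattern that commit message describes, assuming the ggml-cuda API of this era; the exact signature of ggml_cuda_transform_tensor and the backend enum names may differ between ggml revisions, so treat this as illustrative only:

```cpp
// Hedged illustration only: GGML_BACKEND_GPU and ggml_cuda_transform_tensor
// follow the ggml version vendored around mid-2023 and may not match other
// revisions exactly.
#include "ggml.h"
#ifdef GGML_USE_CUBLAS
#include "ggml-cuda.h"
#endif

static void offload_tensor_to_gpu(struct ggml_tensor * tensor) {
#ifdef GGML_USE_CUBLAS
    // Mark the tensor as GPU-resident so later graph ops take the CUDA path...
    tensor->backend = GGML_BACKEND_GPU;
    // ...then copy its existing CPU data into a device buffer managed by ggml-cuda.
    ggml_cuda_transform_tensor(tensor->data, tensor);
#else
    (void) tensor; // no-op in CPU-only builds
#endif
}
```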
@saharNooby
Collaborator

saharNooby commented Jun 21, 2023

@LoganDark Will it be much work to add operators to rwkv_future_tensor (add, mul, etc.), so that we can have unified code that constructs the graph in terms of "future tensors"?

In Discord, you/Emily mentioned C++ templates, but the current approach with rwkv_future_tensor does not use templates and does not look too complicated, so I wonder if we can just extend it.

(As a side note, I just realized how much tedious work we do just because ggml did not separate graph building from tensor allocation... such a simple idea, but for some reason they did not.)
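
As a rough illustration of the "future tensor" idea (not the actual rwkv_future_tensor code in this repository): track only type and shape, so the graph can be walked for size estimation without allocating anything inside a ggml context.

```cpp
// Hedged illustration only; the real rwkv_future_tensor in rwkv.cpp may be
// shaped differently. The point is that size estimation needs only type and
// shape, not an actual allocation.
#include <cstddef>
#include <cstdint>
#include "ggml.h"

struct future_tensor_sketch {
    enum ggml_type type;
    uint64_t ne0;
    uint64_t ne1; // element counts per dimension (2D is enough here)

    // Bytes this tensor would occupy once really created by ggml:
    // data (at block granularity) plus the per-tensor struct overhead.
    size_t bytes() const {
        return ggml_type_size(type) * (ne0 / ggml_blck_size(type)) * ne1
             + sizeof(struct ggml_tensor);
    }

    // A "future" elementwise op: shapes propagate, nothing is allocated.
    future_tensor_sketch mul(const future_tensor_sketch & other) const {
        (void) other; // ggml_mul requires matching shapes, so the result keeps this shape
        return *this;
    }
};
```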

@LoganDark
Contributor Author

@LoganDark Will it be much work to add operators to rwkv_future_tensor (add, mul, etc.), so that we can have unified code that constructs the graph in terms of "future tensors"?

Yes, that would require creating a cgraph again from scratch, or creating some other kind of graph data structure.

In Discord, you/Emily mentioned C++ templates, but the current approach with rwkv_future_tensor does not use templates

It doesn't use templates exactly because it does not do multiple things. If you wanted it to do multiple things, it would have to use templates, and I don't want to use templates.
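
A sketch of what the templated "unified" version being debated here might look like; all names below are hypothetical illustrations, not code from this PR:

```cpp
// One builder parameterized over an Ops policy: one instantiation would wrap
// real ggml_* calls (actual allocation), another would only do future-tensor
// shape arithmetic (size estimation). Hypothetical names throughout.
#include <cstdint>

struct future_ops {
    struct tensor { uint64_t ne0, ne1; };                  // shape only
    tensor layer_norm(tensor x) { return x; }               // shape-preserving
    tensor mul_mat(tensor w, tensor x) { return { w.ne1, x.ne1 }; } // ggml mul_mat shape rule
};

template <typename Ops>
typename Ops::tensor ffn_like(Ops & ops, typename Ops::tensor x, typename Ops::tensor w) {
    // The same code path would build a real cgraph when Ops wraps ggml,
    // or just propagate shapes when Ops is future_ops.
    return ops.mul_mat(w, ops.layer_norm(x));
}

// Example instantiation that only propagates shapes:
//   future_ops f;
//   future_ops::tensor x{1024, 1}, w{1024, 1024};
//   future_ops::tensor y = ffn_like(f, x, w); // y.ne0 == 1024, y.ne1 == 1
```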

LoganDark marked this pull request as ready for review June 21, 2023 22:13
probably should slip this in now before we forget it's a thing.
@saharNooby
Collaborator

@LoganDark Hi again! Is the PR ready for review (cuBLAS working)?

@LoganDark
Contributor Author

@LoganDark Hi again! Is the PR ready for review (cuBLAS working)?

Yes it is, I thought that much was obvious when I marked it as non-draft, but now I feel kind of bad that it took me so long to see this comment.

saharNooby merged commit ffc085c into RWKV:master Jun 26, 2023
12 checks passed
@saharNooby
Collaborator

Yes it is, I thought that much was obvious when I marked it as non-draft

It was not obvious to me :) But I will review non-draft PRs in the future, then.

LoganDark deleted the update-ggml branch June 26, 2023 21:12