Update GGML #103

LoganDark · 2023-06-20T14:47:31Z

This updates GGML to the latest version, which brings Metal support (among other things) and improved CUDA support. A lot changed, including some fundamental operations, so we had to rework the memory estimation again (sorry!). The new one should be more readable, though.

Except now enabling cuBLAS creates total nonsense output, so ggml probably broke something, or maybe we are not properly transforming every single operation we perform on any tensor that touches the GPU.

We don't have time to fix this immediately but decided to open this draft PR since we reworked the memory estimation system (again) and everything runs as long as you don't enable cuBLAS.

-Emily


saharNooby · 2023-06-21T09:42:41Z

@LoganDark Will it be much work to add operators to rwkv_future_tensor (add, mul, etc.), so that we can have unified code, that constructs the graph in terms of "future tensors"?

In Discord, you/Emily mentioned C++ templates, but the current approach with rwkv_future_tensor does not use templates and does not look too complicated, so I wonder if we can just extend it.

(as a side note, I just realized what a stupid kind of work we do just because ggml did not separate graph building and tensor allocation... Such a simple idea, but for some reason they did not)
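To make the "future tensor" idea concrete, here is a minimal sketch of shape-only bookkeeping with operators, where the graph is described first and the total memory is known before anything is allocated. This is not the rwkv.cpp implementation: `future_tensor`, `future_ctx`, and all of their members are hypothetical names made up for illustration, and the real rwkv_future_tensor may track different fields.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a "future tensor": it records only a shape and
// an element size, and never allocates any data.
struct future_tensor {
    std::size_t elem_size; // bytes per element, e.g. 4 for FP32
    std::size_t width;
    std::size_t height;

    std::size_t data_bytes() const { return elem_size * width * height; }
};

// Hypothetical accumulator: sums what a later, real allocation pass would need.
struct future_ctx {
    std::size_t tensors = 0; // number of tensors described so far
    std::size_t bytes = 0;   // total data bytes those tensors would occupy

    future_tensor alloc(std::size_t elem_size, std::size_t width, std::size_t height = 1) {
        assert(elem_size > 0);
        future_tensor t{elem_size, width, height};
        tensors += 1;
        bytes += t.data_bytes();
        return t;
    }

    // Element-wise operators: the result takes the shape of the left operand.
    // (Assumption: no broadcasting and no per-tensor overhead are modelled.)
    future_tensor add(const future_tensor & a, const future_tensor &) {
        return alloc(a.elem_size, a.width, a.height);
    }

    future_tensor mul(const future_tensor & a, const future_tensor &) {
        return alloc(a.elem_size, a.width, a.height);
    }
};
```

With something like this, `ctx.mul(ctx.add(x, w), w)` both describes the computation and grows `ctx.bytes`, so the same graph-building code could in principle drive estimation and then real allocation.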

LoganDark · 2023-06-21T15:22:05Z

> @LoganDark Will it be much work to add operators to rwkv_future_tensor (add, mul, etc.), so that we can have unified code, that constructs the graph in terms of "future tensors"?

yes that would require creating cgraph again from scratch or creating some other kind of graph data structure.

> In Discord, you/Emily mentioned C++ templates, but current approach with rwkv_future_tensor does not use templates

it doesn't use templates exactly BECAUSE it does not do multiple things. if you wanted it to do multiple things it would have to use templates. I don't want to use templates


saharNooby · 2023-06-25T06:14:47Z

@LoganDark Hi again! Is the PR ready for review (cuBLAS working)?

LoganDark · 2023-06-26T01:41:18Z

> @LoganDark Hi again! Is the PR ready for review (cuBLAS working)?

Yes it is, I thought that much was obvious when I marked it as non-draft, but now I feel kind of bad that it took me so long to see this comment

saharNooby · 2023-06-26T11:23:08Z

> Yes it is, I thought that much was obvious when I marked it as non-draft

Was not obvious to me :) But I will then review non-draft PRs in the future.

LoganDark added 2 commits June 20, 2023 07:24

Update GGML

f243f94

Fix linux build

4c7c74c

Of course we forgot why we did this, and broke the build again, in the exact same way, a second time.

LoganDark mentioned this pull request Jun 20, 2023

CUDA out of memory - but there's plenty of memory ggerganov/llama.cpp#1866

Closed

JohannesGaessler reviewed Jun 20, 2023

rwkv.cpp (review comment, outdated and resolved)

LoganDark mentioned this pull request Jun 20, 2023

New CUDA changes completely break rwkv.cpp ggerganov/ggml#272

Closed

LoganDark force-pushed the update-ggml branch from f9ad712 to eae7100 Compare June 20, 2023 18:13

Fix cuBLAS

013ce1b

Properly set the backend and then call ggml_cuda_transform_tensor

LoganDark force-pushed the update-ggml branch from eae7100 to 013ce1b Compare June 20, 2023 18:14

Merge remote-tracking branch 'upstream' into update-ggml

e077496

LoganDark marked this pull request as ready for review June 21, 2023 22:13

LoganDark added 2 commits June 21, 2023 15:23

Rename xx to x_prev

01f2907

probably should slip this in now before we forget it's a thing.

See how easy updates are now? (update GGML)

b72abcd

LoganDark force-pushed the update-ggml branch from 12277f2 to b72abcd Compare June 24, 2023 21:51

LoganDark mentioned this pull request Jun 25, 2023

Elide logits if the logits pointer parameter is NULL #107

Merged

saharNooby approved these changes Jun 26, 2023


saharNooby merged commit ffc085c into RWKV:master Jun 26, 2023
12 checks passed

LoganDark deleted the update-ggml branch June 26, 2023 21:12
