For the 30B LLaMA model, can the server be supported by configuring mesh_dims on TPU v3-8 (128G)? I tried 8,1 and 4,1, but they don't seem to work. #35
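For context, in JAX-based trainers a `mesh_dims` value is typically parsed into a device mesh, and the product of the mesh axes must equal the number of available devices (8 cores on a TPU v3-8). A minimal sketch of that constraint, assuming this is how the repo interprets `mesh_dims` (the helper name here is hypothetical, not from the codebase):

```python
import math

def check_mesh_dims(mesh_dims, n_devices):
    """Return True if mesh_dims can tile exactly n_devices.

    A device mesh of shape (d0, d1, ...) assigns one device per
    mesh coordinate, so the product of the axes must equal the
    total device count.
    """
    return math.prod(mesh_dims) == n_devices

# TPU v3-8 exposes 8 cores:
print(check_mesh_dims((8, 1), 8))  # True  -> valid mesh shape
print(check_mesh_dims((4, 1), 8))  # False -> only tiles 4 of the 8 cores
```

Note that on a v3-256 both 1,64,4 and 1,32,8 do multiply to 256, so a failure there would point at memory or sharding constraints rather than the mesh shape itself.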
Comments
joytianya edited the title on Jun 16, 2023 to instead ask about TPU v3-256 with mesh_dims 1,64,4 and 1,32,8, which didn't seem to work either, then reverted to the original title the same day.
Any luck training the 30B on a single TPU v3-8 so far? Does it even fit? The 7B needs 84 GB of VRAM, so I would expect the 30B to need at least 4 times that.

Not yet; it still doesn't work.
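The back-of-the-envelope estimate above can be made explicit. Assuming the quoted 84 GB figure for the 7B model, that is roughly 12 bytes per parameter (e.g. parameters plus optimizer state); scaling linearly, 30B parameters would need around 360 GB, well above the 128 GB of HBM on a v3-8. The numbers below are illustrative, not measured:

```python
def estimated_memory_gb(n_params_billions, bytes_per_param):
    """Linear memory estimate in GB: 1e9 params * N bytes = N GB per billion."""
    return n_params_billions * bytes_per_param

# The 7B model reportedly needs ~84 GB, i.e. ~12 bytes per parameter.
bytes_per_param = 84 / 7  # 12.0

print(estimated_memory_gb(30, bytes_per_param))        # 360.0 GB estimated
print(estimated_memory_gb(30, bytes_per_param) > 128)  # True: exceeds v3-8 HBM
```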