For the 30B LLaMA model, can the server be supported by configuring mesh_dims on TPU v3-8 (128G)? I tried 8,1 and 4,1, but they don't seem to work. #35
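For context, in JAX-based trainers a `mesh_dims` value is typically parsed into a device mesh, and the product of the mesh axes must equal the number of available devices (8 cores on a TPU v3-8). A minimal sketch of that constraint, assuming this is how the repo interprets `mesh_dims` (the helper name here is hypothetical, not from the codebase):

```python
import math

def check_mesh_dims(mesh_dims, n_devices):
    """Return True if mesh_dims can tile exactly n_devices.

    A device mesh of shape (d0, d1, ...) assigns one device per
    mesh coordinate, so the product of the axes must equal the
    total device count.
    """
    return math.prod(mesh_dims) == n_devices

# TPU v3-8 exposes 8 cores:
print(check_mesh_dims((8, 1), 8))  # True  -> valid mesh shape
print(check_mesh_dims((4, 1), 8))  # False -> only tiles 4 of the 8 cores
```

Note that on a v3-256 both 1,64,4 and 1,32,8 do multiply to 256, so a failure there would point at memory or sharding constraints rather than the mesh shape itself.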
Comments
joytianya edited the title on Jun 16, 2023 to instead ask about TPU v3-256 with mesh_dims 1,64,4 and 1,32,8, which didn't seem to work either, then reverted to the original title the same day.
Any luck training the 30B on a single TPU v3-8 so far? Does it even fit? The 7B needs 84 GB of VRAM, so I would expect the 30B to need at least 4 times that.

Not yet; it still doesn't work.
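The back-of-the-envelope estimate above can be made explicit. Assuming the quoted 84 GB figure for the 7B model, that is roughly 12 bytes per parameter (e.g. parameters plus optimizer state); scaling linearly, 30B parameters would need around 360 GB, well above the 128 GB of HBM on a v3-8. The numbers below are illustrative, not measured:

```python
def estimated_memory_gb(n_params_billions, bytes_per_param):
    """Linear memory estimate in GB: 1e9 params * N bytes = N GB per billion."""
    return n_params_billions * bytes_per_param

# The 7B model reportedly needs ~84 GB, i.e. ~12 bytes per parameter.
bytes_per_param = 84 / 7  # 12.0

print(estimated_memory_gb(30, bytes_per_param))        # 360.0 GB estimated
print(estimated_memory_gb(30, bytes_per_param) > 128)  # True: exceeds v3-8 HBM
```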