Add support for fp8 (H100) #387

tgaddair · 2024-04-04T19:31:42Z

No description provided.

tgaddair · 2024-04-04T19:35:29Z

Native support in PyTorch is experimental:

https://github.com/pytorch-labs/float8_experimental

We could consider adding this, or wait for official support.

tgaddair · 2024-04-09T21:51:51Z

Initial results using the PyTorch codebase are not good: throughput is about 10x lower than fp16 on H100.

https://github.com/predibase/lorax/tree/fp8

Will need to investigate Transformer Engine or dig into the PyTorch implementation in more detail. It appears that there is too much conversion between types happening at the moment, as opposed to everything running natively in fp8.
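To make the "conversion between types" concrete, here is a minimal pure-Python sketch of the scale, cast, and rescale round trip that typically surrounds an fp8 matmul. It simulates E4M3 rounding in software and is not the float8_experimental or Transformer Engine API. The names `round_to_e4m3`, `quantize`, and `dequantize` are hypothetical, and per-tensor amax scaling is an assumption about how the scaling is done.

```python
import math

E4M3_MAX = 448.0  # largest finite value in the fp8 E4M3 format


def round_to_e4m3(x: float) -> float:
    """Simulate rounding x to the nearest E4M3 value (saturating at +/-448)."""
    if x == 0.0 or math.isnan(x):
        return x
    sign = -1.0 if x < 0 else 1.0
    mag = min(abs(x), E4M3_MAX)
    # frexp gives mag = m * 2**e with 0.5 <= m < 1.
    _, e = math.frexp(mag)
    # Smallest normal is 2**-6 (e == -5 here); below that the spacing is fixed.
    e = max(e, -5)
    # 1 implicit + 3 explicit mantissa bits -> spacing of 2**(e - 4).
    step = 2.0 ** (e - 4)
    return sign * min(round(mag / step) * step, E4M3_MAX)


def quantize(values):
    """Hypothetical per-tensor scaling: map the largest magnitude to E4M3_MAX."""
    amax = max(abs(v) for v in values)
    scale = E4M3_MAX / amax if amax > 0 else 1.0
    return [round_to_e4m3(v * scale) for v in values], scale


def dequantize(quantized, scale):
    """Undo the scaling to get back to the higher-precision range."""
    return [q / scale for q in quantized]


if __name__ == "__main__":
    weights = [0.1, -0.5, 2.0]
    q, scale = quantize(weights)
    print(q, scale)
    print(dequantize(q, scale))
```

If each layer performs an amax reduction, a scale, a cast down, and a cast back up as separate steps around the matmul, those extra passes over the data could plausibly account for part of the slowdown. That is a hypothesis to check with a profiler, not a measured result.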

tgaddair added the enhancement (New feature or request) label on Apr 4, 2024
