fxmarty changed the title from "Use pre-built FA2, vllm, quantization kernels in the docker image" to "Use pre-built FA2, vllm, quantization kernels in the dockerfiles" on May 7, 2024
Feature request
We could just use
which would tremendously speed up the build, given that the target is always Hopper or Ampere.
This would involve, for example, hosting our own wheel index, rebuilding the wheels whenever we want to upgrade, and installing the pre-built wheels (.whl) in the TGI Dockerfiles.
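As a rough illustration of the "own wheel index" idea, here is a minimal sketch that generates static index pages in the PEP 503 "simple" layout, which pip can consume. The function name `build_simple_index`, the wheel filename, and the `example.com` URL are hypothetical and only serve the example. This is not how TGI currently handles its builds.

```python
import html
import re
from collections import defaultdict
from urllib.parse import quote


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def build_simple_index(wheel_filenames, base_url):
    """Return {normalized project name: HTML page linking to its wheels}.

    Hypothetical helper: each page would be served at
    <index root>/<project>/index.html on a static host.
    """
    projects = defaultdict(list)
    for filename in wheel_filenames:
        if not filename.endswith(".whl"):
            continue  # ignore anything that is not a wheel
        # Wheel filenames start with "<distribution>-<version>-..."
        distribution = filename.split("-", 1)[0]
        projects[normalize(distribution)].append(filename)

    root = base_url.rstrip("/")
    pages = {}
    for project, files in projects.items():
        links = []
        for filename in sorted(files):
            # Percent-encode the filename so characters such as "+" in
            # local version tags survive in the URL.
            url = html.escape(f"{root}/{quote(filename)}")
            links.append(f'<a href="{url}">{html.escape(filename)}</a><br>')
        body = "\n".join(links)
        pages[project] = f"<!DOCTYPE html>\n<html><body>\n{body}\n</body></html>\n"
    return pages


if __name__ == "__main__":
    # Hypothetical wheel name and URL, for illustration only.
    pages = build_simple_index(
        ["flash_attn-2.5.8-cp310-cp310-linux_x86_64.whl"],
        "https://example.com/wheels",
    )
    for project, page in pages.items():
        print(project)
        print(page)
```

With pages like these hosted somewhere, a Dockerfile could install the kernels with pip's `--extra-index-url` option pointing at the index root instead of compiling them during the image build.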
Motivation
Building TGI is prohibitively slow for developers.
Your contribution
I could work on that sometime.