Support cross-host model deployment for inference #1472
Comments
Implementing this in xinference would not be hard, but I'm skeptical about the efficiency of multi-node tensor parallelism (TP).
Right, the benefit is being able to validate large models at low cost, which helps inform hardware purchasing decisions; it would also be a good selling point for the product itself.
Does xinference currently support deploying a single model across multiple machines?
The official vLLM documentation states that Ray can be used for multi-node model inference.
When running `xinference launch` for a model, only GPUs on the local machine can be selected. It would be great to support deploying a model for inference across multiple hosts.
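For context, a rough sketch of what the vLLM + Ray multi-node setup mentioned above looks like. The IP address, port, model name, and GPU count below are placeholders for illustration, not values taken from this issue:

```shell
# Sketch only: multi-node tensor parallelism with vLLM on a Ray cluster.
# Assumes vLLM and Ray are installed on every node and all nodes can
# reach each other over the network.

# On the head node, start the Ray cluster:
ray start --head --port=6379

# On each additional node, join the cluster (head node IP is a placeholder):
ray start --address=192.168.1.10:6379

# Back on the head node, launch vLLM's OpenAI-compatible server.
# When --tensor-parallel-size exceeds the local GPU count, vLLM
# distributes its workers across the Ray cluster:
python -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-2-70b-hf \
    --tensor-parallel-size 8
```

The open question in this thread is whether xinference could orchestrate such a cluster itself, rather than requiring users to set up Ray manually.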