Combining Gradio with vLLM #331
Hello, and thank you for the excellent examples this community provides. I noticed that the Gradio demo and the vLLM distributed accelerated-inference demo are given in two separate places. If I want to use Gradio as the front-end interface to the large model, while also using vLLM to accelerate the deployed model, how should I solve this? The approach I came up with is to launch two services separately and have the Gradio service call the vLLM service's API from its handler function. Is this the right way to do it, and what is the standard pattern for combining the two?
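The two-service approach described above can be sketched roughly as follows. This is a minimal illustration, not a confirmed standard pattern from the maintainers: it assumes a vLLM server has already been launched separately with an OpenAI-compatible completions endpoint, and the URL, port, and model name below are placeholders you would replace with your own deployment's values.

```python
# Sketch of a Gradio front end forwarding prompts to a separately running
# vLLM server. Assumes vLLM exposes an OpenAI-compatible /v1/completions
# route (e.g. launched via its OpenAI-compatible API server entrypoint).
import json
import urllib.request

VLLM_URL = "http://localhost:8000/v1/completions"  # placeholder endpoint
MODEL_NAME = "your-model"                          # placeholder model name


def build_payload(prompt: str, max_tokens: int = 256) -> dict:
    """Build the JSON body for an OpenAI-style completions request."""
    return {"model": MODEL_NAME, "prompt": prompt, "max_tokens": max_tokens}


def query_vllm(prompt: str) -> str:
    """POST the prompt to the vLLM service and return the generated text."""
    data = json.dumps(build_payload(prompt)).encode("utf-8")
    req = urllib.request.Request(
        VLLM_URL, data=data, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the generated text under choices[0].text.
    return body["choices"][0]["text"]


def launch_ui():
    """Start the Gradio interface; it only forwards requests, vLLM does
    the accelerated inference in its own process."""
    import gradio as gr  # imported lazily so the helpers above work offline
    gr.Interface(fn=query_vllm, inputs="text", outputs="text").launch()
```

With both processes running, calling `launch_ui()` gives you a web UI whose handler is just a thin HTTP client around the vLLM service, which matches the "two services, Gradio calls vLLM's API" idea in the question.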