Request Limits parameter cannot limit concurrency when stream=true #1467

Open
gggauss opened this issue May 10, 2024 · 0 comments
Labels
question Further information is requested
gggauss commented May 10, 2024

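When calling the /v1/chat/completions API with streaming output, the backend debug log shows `Leave wrapped_func, elapsed time: 0 s`. The elapsed time is always 0, so every request appears to complete instantly, and the request counter also stays at 0 (`current serve request count: 0`). As a result, the Request Limits setting cannot effectively control the model's concurrency.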
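One possible explanation, not verified against the Xinference source: if the limiter counts a request only while the wrapped coroutine runs, then with stream=true the coroutine returns a generator object immediately, and the counter drops back to 0 before any chunk is consumed. The sketch below reproduces that behavior and shows one way to hold the count until the stream is exhausted. All names here (`Counter`, `naive_limit`, `stream_aware_limit`, `fake_chat`) are hypothetical and are not Xinference APIs.

```python
import asyncio
import functools


class Counter:
    """Hypothetical stand-in for the 'current serve request count'."""

    def __init__(self):
        self.count = 0


def naive_limit(counter):
    """Counts a request only while the wrapped coroutine itself runs."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapped(*args, **kwargs):
            counter.count += 1
            try:
                # With stream=True this returns a generator object right away,
                # so the finally block runs before any chunk is produced.
                return await fn(*args, **kwargs)
            finally:
                counter.count -= 1
        return wrapped
    return decorator


def stream_aware_limit(counter):
    """Keeps the request counted until a streamed result is fully consumed."""
    def decorator(fn):
        @functools.wraps(fn)
        async def wrapped(*args, **kwargs):
            counter.count += 1
            try:
                result = await fn(*args, **kwargs)
            except BaseException:
                counter.count -= 1
                raise
            if not hasattr(result, "__aiter__"):
                # Non-streaming result: the request is already finished.
                counter.count -= 1
                return result

            async def guarded():
                try:
                    async for chunk in result:
                        yield chunk
                finally:
                    # Runs when the stream ends or is closed after starting.
                    counter.count -= 1

            return guarded()
        return wrapped
    return decorator


async def fake_chat(stream: bool):
    """Hypothetical model call: returns a string or an async generator."""
    async def chunks():
        for i in range(3):
            await asyncio.sleep(0)
            yield f"chunk-{i}"
    return chunks() if stream else "full response"


async def observe(limit_factory):
    """Records the counter value seen while each chunk is consumed."""
    counter = Counter()
    handler = limit_factory(counter)(fake_chat)
    stream = await handler(stream=True)
    seen = []
    async for _ in stream:
        seen.append(counter.count)
    return seen, counter.count


naive_seen, naive_after = asyncio.run(observe(naive_limit))
aware_seen, aware_after = asyncio.run(observe(stream_aware_limit))
print("naive:", naive_seen, naive_after)
print("stream-aware:", aware_seen, aware_after)
```

In the naive variant the counter reads 0 for every chunk, which matches the log lines quoted above. One caveat of the stream-aware variant: if the caller never starts iterating the returned generator, its `finally` block never runs, so a real implementation would also need to release the slot on client disconnect or timeout.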

@gggauss gggauss added the question Further information is requested label May 10, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.0, v0.11.1 May 10, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.1, v0.11.2 May 17, 2024
@XprobeBot XprobeBot modified the milestones: v0.11.2, v0.11.3 May 24, 2024