
Runtime error about concurrency #905

Open
meanwo opened this issue Feb 16, 2024 · 0 comments

meanwo commented Feb 16, 2024

I used Apache Benchmark to test how many API calls OpenLLM can handle at the same time.

I set the concurrency to 4 users and the total number of requests to 40 (10 requests per user).
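For reference, the ab invocation was along these lines (the endpoint path, port, and payload file shown here are illustrative, not the exact values from my run):

ab -n 40 -c 4 -p payload.json -T 'application/json' http://localhost:3000/v1/generate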

Of the 40 requests, 35 returned 200 (success) and the remaining 5 returned runtime errors.

This is the error log:

File "/usr/local/lib/python3.8/dist-packages/openllm_core/_schemas.py", line 165, in from_runner
structured = orjson.loads(data)
orjson.JSONDecodeError: unexpected character: line 1 column 1 (char 0)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/openllm/_llm.py", line 116, in generate_iterator
generated = GenerationOutput.from_runner(out).with_options(prompt=prompt)
File "/usr/local/lib/python3.8/dist-packages/openllm_core/_schemas.py", line 167, in from_runner
raise ValueError(f'Failed to parse JSON from SSE message: {data!r}') from e
ValueError: Failed to parse JSON from SSE message: 'Service Busy'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/usr/local/lib/python3.8/dist-packages/bentoml/_internal/server/http_app.py", line 341, in api_func
output = await api.func(*args)
File "/usr/local/lib/python3.8/dist-packages/openllm/_service.py", line 23, in generate_v1
return (await llm.generate(**llm_model_class(**input_dict).model_dump())).model_dump()
File "/usr/local/lib/python3.8/dist-packages/openllm/_llm.py", line 55, in generate
async for result in self.generate_iterator(
File "/usr/local/lib/python3.8/dist-packages/openllm/_llm.py", line 125, in generate_iterator
raise RuntimeError(f'Exception caught during generation: {err}') from err
RuntimeError: Exception caught during generation: Failed to parse JSON from SSE message: 'Service Busy'

Do I have to use a load balancer like nginx or gobetween to solve this?
Or can this be solved within OpenLLM itself?
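
For illustration, one thing I can do on the client side is cap the number of in-flight requests with a semaphore, roughly like the sketch below (the endpoint URL, payload, and limit are assumptions, not values from my setup), but that only helps when all traffic comes from a single client:

import asyncio
import httpx  # third-party async HTTP client, used here instead of ab

URL = "http://localhost:3000/v1/generate"  # assumed OpenLLM endpoint and port
PAYLOAD = {"prompt": "Hello", "llm_config": {"max_new_tokens": 32}}  # assumed request body

async def one_call(client: httpx.AsyncClient, sem: asyncio.Semaphore, i: int) -> int:
    async with sem:  # wait for a free slot instead of hitting the server all at once
        resp = await client.post(URL, json=PAYLOAD, timeout=120)
        print(f"request {i}: HTTP {resp.status_code}")
        return resp.status_code

async def main() -> None:
    sem = asyncio.Semaphore(2)  # keep at most 2 requests in flight
    async with httpx.AsyncClient() as client:
        codes = await asyncio.gather(*(one_call(client, sem, i) for i in range(40)))
    print(f"{codes.count(200)}/40 returned 200")

asyncio.run(main())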
