
Inference shouldn't have a timeout #502

Closed
PlanetMacro opened this issue Apr 27, 2024 · 1 comment

Comments

@PlanetMacro

With larger local models on OLLAMA, inference can take quite some time, especially on the initial loading of the model. Currently devika throws a timeout, basically rendering it useless for such a setup.


DGdev91 commented May 1, 2024

Seems like the timeout is hardcoded in src/llm/llm.py

if int(elapsed_time) == 30:
    emit_agent("inference", {"type": "warning", "message": "Inference is taking longer than expected"})
if elapsed_time > 60:
    raise concurrent.futures.TimeoutError
time.sleep(1)

response = future.result(timeout=60).strip()

As a quick hack to make it work you can increase those values, or just comment out the first four lines and remove the timeout argument from the last one, as shown below.
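
For example (just a sketch, assuming the surrounding polling loop and the future / emit_agent objects from src/llm/llm.py stay as they are), the block could be reduced to:

# Keep the 30-second progress warning, but never abort the inference.
if int(elapsed_time) == 30:
    emit_agent("inference", {"type": "warning", "message": "Inference is taking longer than expected"})
time.sleep(1)

# No timeout argument: block until the model actually responds.
response = future.result().strip()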

Anyway, I agree, it shouldn't have a timeout, or at least it should be easily configurable so it can be increased or disabled if needed. A sketch of what that could look like is below.
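
A minimal sketch of a configurable timeout, assuming a hypothetical get_timeout_inference() accessor on devika's config object (the real setting name and API may differ) and the existing future / start_time / emit_agent variables from src/llm/llm.py:

# Hypothetical setting: None disables the timeout entirely.
timeout = config.get_timeout_inference()

while not future.done():
    elapsed_time = time.time() - start_time
    if int(elapsed_time) == 30:
        emit_agent("inference", {"type": "warning", "message": "Inference is taking longer than expected"})
    if timeout is not None and elapsed_time > timeout:
        raise concurrent.futures.TimeoutError
    time.sleep(1)

response = future.result(timeout=timeout).strip()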

ARajgor added a commit that referenced this issue May 2, 2024
* chore: minor updates

* Add: send live inference time to frontend

* add: timeout in settings

* Improve: response parsing, add temperature to models

* patches

close #510, #507, #502, #468
ARajgor closed this as completed May 2, 2024