
Docker image is no longer starting #856

Closed
ainub opened this issue Nov 21, 2023 · 10 comments


ainub commented Nov 21, 2023

Describe the bug
I haven't used it for some time and decided to update the image and give it a shot.
(traps: tabby[382782] trap invalid opcode ip:55b5f1164829 sp:7ffd27c1fb20 error:0 in tabby[55b5f0133000+1067000])
The executable no longer starts, with or without CUDA. I suspect it is something related to AVX/AVX2. Are you using such flags when compiling it?
Oobabooga seems to maintain a different list of requirements because of that issue.

Information about your version
tabby 0.5.5

Information about your GPU
0 NVIDIA GeForce GTX 1070
1 NVIDIA GeForce GTX 1070

It used to work at some point.
CPU flags: AVX, not AVX2
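The trap above is the classic symptom of AVX2 instructions executing on an AVX-only CPU. A quick way to confirm what the CPU advertises is to check the kernel's flags list (a Linux-only sketch; the word matches on /proc/cpuinfo are my own illustration):

```shell
# Check which SIMD extensions the CPU advertises (Linux).
# "trap invalid opcode" is the typical symptom of running AVX2
# code on a CPU that only supports AVX.
for ext in avx avx2; do
  if grep -qw "$ext" /proc/cpuinfo; then
    echo "$ext: supported"
  else
    echo "$ext: not supported"
  fi
done
```

On an AVX-only machine like the one in this report, this should print `avx: supported` and `avx2: not supported`.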

@wsxiaoys (Member)

Could you provide the command you used to start tabby? And confirm if 0.4.0 works for you?

ainub (Author) commented Nov 22, 2023

Yes. 0.4.0 is working as intended.
The command to run it is:
docker run -it -e TABBY_DISABLE_USAGE_COLLECTION=1 --gpus '"device=1"' -p 8080:8080 -v $HOME/.tabby:/data tabbyml/tabby:0.4.0 serve --model TabbyML/StarCoder-3B --device cuda
Also this seems to be working fine:
docker run -it --gpus all -p 8080:8080 -e RUST_BACKTRACE=1 -e TABBY_DISABLE_USAGE_COLLECTION=1 -v $HOME/.tabby:/data tabbyml/tabby:0.4.0 serve --model TabbyML/StarCoder-3B --device cuda --device-indices 1

wsxiaoys (Member) commented Nov 22, 2023

Potentially related: ggerganov/llama.cpp#1583
A fixed PR has been merged in ggerganov/llama.cpp@c41ea36
which synced to tabby in cde3602

Could you try with the nightly docker tag to see if it already fixes the issue for you?

wsxiaoys added the bug (Something isn't working) label and removed the bug-unconfirmed label Nov 22, 2023
ainub (Author) commented Nov 22, 2023

No, unfortunately, the problem persists. Maybe the image is not yet updated:
Digest: sha256:b2be09111141a26ae309e788c1449f4c67e241657729ffc7bf05f8952e89ee35
Status: Downloaded newer image for tabbyml/tabby:nightly

docker run -it --gpus all -p 8080:8080 -e RUST_BACKTRACE=1 -e TABBY_DISABLE_USAGE_COLLECTION=1 -v $HOME/.tabby:/data tabbyml/tabby:nightly serve --model TabbyML/StarCoder-3B --device cuda

traps: tabby[509035] trap invalid opcode ip:557fe6681569 sp:7ffd0e961010 error:0 in tabby[557fe52a4000+1415000]

@wsxiaoys (Member)

After taking a second look at ggerganov/llama.cpp@c41ea36, it seems the AVX2 check happens only at compile time; there is no runtime detection to select the AVX or AVX2 branch.

ainub (Author) commented Nov 22, 2023

That is probably why oobabooga provides two different requirements files for llama: one with AVX2 and one without. The optimization relates to CPU processing, though; it shouldn't affect the GPU, and yet...

wsxiaoys self-assigned this Nov 22, 2023
yayitazale commented Dec 3, 2023

Same problem: only 0.4.0 starts. There is no error in the container logs; it just doesn't start.

Information about your version
tabby 0.5.5 (both the CUDA and non-CUDA images)

Information about your GPU
0 NVIDIA GeForce GTX 1060

CPU flags: not AVX, not AVX2 (Intel® Pentium® Gold G5600 CPU)

rowanfuchs commented Dec 15, 2023

Same problem: the Docker container is running but tabby doesn't start. Using:

version: '3.5'

services:
  tabby:
    container_name: tabby
    restart: always
    image: tabbyml/tabby
    command: serve --model TabbyML/StarCoder-1B
    volumes:
      - /tabby:/data
    ports:
      - 8181:8080

0.4.0 launches; I didn't check a higher version.

@justaCasualCoder

After compiling from source, it works with just AVX.
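For anyone wanting to try that route, a hedged sketch of a from-source build: the clone and `cargo build` steps follow the usual Rust workflow, the `LLAMA_AVX2` option is llama.cpp's own CMake switch, but whether tabby's build script forwards `CMAKE_ARGS` to it is an assumption on my part, not something confirmed in this thread.

```shell
# Hypothetical from-source build with AVX2 disabled, so the
# bundled llama.cpp falls back to AVX-only code paths.
# CMAKE_ARGS forwarding is an assumption, not a documented flag.
git clone --recurse-submodules https://github.com/TabbyML/tabby
cd tabby
CMAKE_ARGS="-DLLAMA_AVX2=OFF" cargo build --release
```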

@wsxiaoys wsxiaoys removed the bug Something isn't working label Dec 29, 2023
@wsxiaoys (Member)

Filing #1142 to track the issue

wsxiaoys closed this as not planned (won't fix, can't repro, duplicate, stale) Dec 29, 2023

6 participants