[Bug] llama2 7b android compilation is giving "Can only handle constant size stack allocation for now" error #2282
Comments
I am not sure what was happening in this case; perhaps it is related to some stale variant of the compiler. We recently updated our Android SDK: https://llm.mlc.ai/docs/deploy/android.html. Please try following the new instructions.
This is not just happening with the Android SDK; I am hitting it on a CUDA device as well. When I use an mlc_llm built from source, the error occurs, but with the pre-built mlc_llm it goes away. It is hard to locate the problem, can anyone give more hints @tqchen? In my case, I had already installed the pre-built mlc_llm and tvm-unity packages in my environment; I then git-cloned the newest version of mlc_llm, created a virtual environment, and built it from source. I tested a Qwen2-0.5B model with these two versions of mlc_llm:

mlc-ai-nightly-cu122 0.15.dev297
mlc_llm 0.1.dev1231+gbc6e3edd /xxx/mlc-llm/python
mlc-llm-nightly-cu122 0.1.dev1145

--------------- Update ----------------

mlc-ai-nightly-cu122 0.15.dev364

I guess mlc_llm and tvm are closely entangled; there was some mismatch between the new version of mlc and the old version of tvm.
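Since the problem above came down to mismatched mlc_llm and tvm wheels coexisting in one environment, it can help to list exactly which distributions are installed before compiling. A minimal sketch (the function name `installed_versions` is hypothetical, not part of any MLC or TVM API):

```python
from importlib import metadata

def installed_versions(prefixes=("mlc", "tvm")):
    """Return {distribution_name: version} for installed packages whose
    name starts with one of the given prefixes (case-insensitive)."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"] or ""
        if any(name.lower().startswith(p) for p in prefixes):
            found[name] = dist.version
    return found

# Prints e.g. {'mlc-ai-nightly-cu122': '0.15.dev364', ...} when the
# nightly wheels are installed; an empty dict otherwise.
print(installed_versions())
```

If both a pre-built nightly and a from-source editable install show up for the same project, that duplication is a likely source of the stale-compiler behavior described in this thread.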
Glad this is resolved; it was likely due to an old version of TVM.
Model: https://huggingface.co/mlc-ai/Llama-2-7b-chat-hf-q4f16_1-MLC
Compiling the above model with the command:
mlc_llm compile ./dist/Llama-2-7b-chat-hf-q4f16_1-MLC/mlc-chat-config.json --device android -o ./dist/Llama-2-7b-chat-hf-q4f16_1-MLC/llama-2-7b-chat-hf-q4f16_1-android.tar
gives the errors below:
File "/home/test/Ramees/relax/src/target/source/codegen_c.h", line 104, in tvm::codegen::CodeGenC::PrintStmt(tvm::tir::Stmt const&)
void PrintStmt(const Stmt& n) { VisitStmt(n); }
File "/home/test/Ramees/relax/src/target/source/codegen_c.cc", line 989, in tvm::codegen::CodeGenC::VisitStmt_(tvm::tir::AllocateNode const*)
ICHECK_GT(constant_size, 0) << "Can only handle constant size stack allocation for now";
tvm.error.InternalError: Traceback (most recent call last):
6: operator()
at /home/test/Ramees/relax/src/driver/driver_api.cc:531
5: tvm::TIRToRuntime(tvm::runtime::Map<tvm::Target, tvm::IRModule, void, void> const&, tvm::Target const&)
at /home/test/Ramees/relax/src/driver/driver_api.cc:514
4: tvm::codegen::Build(tvm::IRModule, tvm::Target)
at /home/test/Ramees/relax/src/target/codegen.cc:73
3: tvm::codegen::BuildOpenCL(tvm::IRModule, tvm::Target)
at /home/test/Ramees/relax/src/target/source/codegen_opencl.cc:619
2: tvm::codegen::CodeGenC::AddFunction(tvm::GlobalVar const&, tvm::tir::PrimFunc const&)
at /home/test/Ramees/relax/src/target/source/codegen_c.cc:167
1: tvm::codegen::CodeGenC::PrintStmt(tvm::tir::Stmt const&)
at /home/test/Ramees/relax/src/target/source/codegen_c.h:104
0: tvm::codegen::CodeGenC::VisitStmt_(tvm::tir::AllocateNode const*)
at /home/test/Ramees/relax/src/target/source/codegen_c.cc:989
File "/home/test/Ramees/relax/src/target/source/codegen_c.cc", line 989
InternalError: Check failed: constant_size > 0 (0 vs. 0) : Can only handle constant size stack allocation for now
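For context on what this check means: the `ICHECK_GT(constant_size, 0)` in `codegen_c.cc` fires when the C-family codegen (here, OpenCL) visits an Allocate node whose size does not fold to a positive compile-time constant, since it can only emit fixed-size stack arrays. A rough Python sketch of that guard (illustrative only; `constant_allocation_size` and `emit_stack_allocation` are hypothetical names, not TVM API):

```python
def constant_allocation_size(extents):
    """Fold the allocation extents into a constant element count.
    Returns 0 if any extent is symbolic, mirroring how a non-constant
    size fails the codegen check."""
    size = 1
    for e in extents:
        if not isinstance(e, int):  # symbolic extent -> not a constant
            return 0
        size *= e
    return size

def emit_stack_allocation(dtype, extents):
    """Emit a C-style fixed-size array, or fail like the TVM check does."""
    constant_size = constant_allocation_size(extents)
    if constant_size <= 0:
        raise RuntimeError(
            "Can only handle constant size stack allocation for now")
    return f"{dtype} buf[{constant_size}];"

print(emit_stack_allocation("float", [4, 8]))  # prints: float buf[32];
```

So the error generally indicates that some lowered kernel ended up with a dynamic (or zero) allocation extent, which is consistent with the mismatched mlc_llm/tvm versions discussed above producing TIR the older codegen could not handle.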
Any idea why this is happening? @tqchen