[Bug] mlc_llm chat throws errors for model mlc-ai/Qwen1.5-1.8B-Chat-q4f16_1-MLC #2254

Labels: bug, Confirmed bugs
🐛 Bug
Hello,

HF://mlc-ai/Qwen1.5-1.8B-Chat-q4f16_1-MLC seems to be incomplete: `max_batch_size` is missing from `mlc-chat-config.json`, and no tokenizers are found under `Qwen1.5-1.8B-Chat-q4f16_1-MLC/`. These two missing pieces cause `mlc_llm chat ...` to throw errors.

To Reproduce
Steps to reproduce the behavior:
Expected behavior
Environment
- How you installed MLC-LLM (conda, source): yes
- How you installed TVM-Unity (pip, source):
- TVM Unity Hash Tag (python -c "import tvm; print('\n'.join(f'{k}: {v}' for k, v in tvm.support.libinfo().items()))", applicable if you compile models):

Additional context