Alibaba Cloud DSW environment keeps reporting: The model's max seq len (8192) is larger than the maximum number of tokens that can be stored in KV cache (5392). Try increasing gpu_memory_utilization or decreasing max_model_len when initializing the engine. #38

Open
leeeex opened this issue Feb 16, 2024 · 1 comment

Comments


leeeex commented Feb 16, 2024

KwaiKEG/kagentlms_qwen_7b_mat
Qwen/Qwen-7B-Chat
Both models produce the same error.
Trying the newer qwen1.5 release instead fails with a KeyError: qwen2.
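Context (not from the original thread): vLLM raises this error when max_model_len exceeds the number of tokens its pre-allocated KV cache can hold at the current gpu_memory_utilization, so the two knobs named in the message are the fix. A minimal sketch using the vLLM Python API, assuming vLLM is what backs the failing service; the concrete values below are illustrative, not taken from this issue:

from vllm import LLM, SamplingParams

# Either cap the context length below the reported KV-cache budget (5392 tokens)
# or hand vLLM a larger share of GPU memory -- either change avoids the error.
llm = LLM(
    model="KwaiKEG/kagentlms_qwen_7b_mat",
    trust_remote_code=True,        # Qwen checkpoints ship custom modeling code
    max_model_len=4096,            # illustrative: below 5392 instead of the default 8192
    gpu_memory_utilization=0.95,   # illustrative: default is 0.9
)

out = llm.generate(["你好"], SamplingParams(max_tokens=32))
print(out[0].outputs[0].text)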


leeeex commented Feb 16, 2024

python3 -m fastchat.serve.model_worker --model-path KwaiKEG/kagentlms_qwen_7b_mat --controller http://localhost:21001 --port 31000 --worker http://localhost:31000
Checked the FastChat docs; launching the second service (the model worker) with the command above resolves the issue.
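Note (not from the original thread): fastchat.serve.model_worker serves the model through plain Hugging Face Transformers rather than the vLLM engine, so the KV-cache size check that produced the error above never runs. If you would rather keep a vLLM-backed worker, the knobs named in the error message should apply; recent FastChat/vLLM versions typically expose them as the --max-model-len and --gpu-memory-utilization engine flags, though the exact flag names depend on the installed versions.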
