Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 demo encounters RuntimeError: CUDA error: invalid configuration argument #385
Comments
From @bozheng-hit: huggingface/transformers@304c6a1 breaks GPTQ for Qwen1.5-MoE. Please try an earlier snapshot of the
Please also see the related issue in transformers: huggingface/transformers#30515
Thank you so much, @jklj077. I reverted that PR and it worked. This is the response to just a "hello":
It took about 2.7 s to generate 52 tokens on an RTX 4090. Is this right? It seems a little slow to me.
This is a response from Qwen1.5-4B-Chat:
It took about 1.2 s to generate 59 tokens on an RTX 4090. That is twice as fast as the Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4 model.
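For reference, the two timings above can be converted to tokens per second so that the different output lengths (52 vs. 59 tokens) do not skew the comparison. This is only a minimal sketch: the helper `tokens_per_second` is a hypothetical name introduced here, and the inputs are the single-run figures quoted in this thread, so they include whatever warm-up and overhead those runs had.

```python
def tokens_per_second(num_tokens: int, seconds: float) -> float:
    """Return generation throughput in tokens per second."""
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    return num_tokens / seconds


# Figures reported in this thread (single runs on an RTX 4090).
moe_tps = tokens_per_second(52, 2.7)    # Qwen1.5-MoE-A2.7B-Chat-GPTQ-Int4
dense_tps = tokens_per_second(59, 1.2)  # Qwen1.5-4B-Chat

print(f"MoE GPTQ-Int4: {moe_tps:.1f} tokens/s")          # 19.3 tokens/s
print(f"4B dense:      {dense_tps:.1f} tokens/s")        # 49.2 tokens/s
print(f"ratio:         {dense_tps / moe_tps:.2f}x")      # 2.55x
```

Normalized this way, the gap in these two runs is roughly 2.5x rather than 2x. A single short generation is a noisy measurement, so averaging several runs after a warm-up call would likely give a fairer number.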
Hi, thanks for sharing your perf results! We believe that evaluating performance through metrics such as speed can prove challenging due to the multitude of influencing factors.
For a performance assessment, we recommend referring to our results at https://qwenlm.github.io/blog/qwen-moe/#costs-and-efficiency. Here, we employ While it provides a foundation for understanding performance, real-world applications often necessitate nuanced decision-making. Factors such as the specific hardware configuration and the availability of optimized software implementations can significantly influence the actual performance of a model. Users are encouraged to make informed decisions tailored to their unique requirements and constraints, ensuring optimal performance in their specific reality.
Thank you for your reply :) I will check this out on my own.
Hi :)
I'm running the Qwen MoE demo code from the Qwen blog, but I get this error:
Some of my device information:
I have searched for a long time, but had no luck. Could anyone help? Thanks a lot :)
code: