Llama3 with LlamaForSequenceClassification - Shape mismatch error #30548
Comments
Hey! Pretty sure that is expected: there are no pretrained checkpoints for sequence classification, no?
You're right -- that was a warning, not an error. The warning is then followed by this error.
It looks like there are similar open issues. Something to do with tied weights not being detected correctly.
cc @SunMarc
Hi @parasurama, thanks for reporting! I'll have a look asap.
Hi @parasurama, this happens because you changed the config.
@SunMarc the error is raised even if I use the default config. Here's a minimal example:
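The original snippet was not preserved in this thread. A hedged reconstruction: the reporter's call was presumably `LlamaForSequenceClassification.from_pretrained("meta-llama/Meta-Llama-3-8B", config=LlamaConfig())`, which requires access to the gated checkpoint. The sketch below reproduces the same size mismatch offline, using tiny illustrative dimensions (the small config values are not from the issue):

```python
# Offline demonstration of the failure mode: weights saved with
# vocab_size=128256 (Llama 3) cannot be loaded into a model built
# from a config with the default vocab_size=32000 (Llama 2).
from transformers import LlamaConfig, LlamaForSequenceClassification

# Tiny dimensions so the demo runs quickly; only vocab_size matters here.
tiny = dict(hidden_size=16, intermediate_size=32, num_hidden_layers=1,
            num_attention_heads=2, num_key_value_heads=2)

ckpt = LlamaForSequenceClassification(
    LlamaConfig(vocab_size=128256, **tiny)).state_dict()   # "checkpoint" side
model = LlamaForSequenceClassification(
    LlamaConfig(vocab_size=32000, **tiny))                 # default-config side

mismatch = False
try:
    model.load_state_dict(ckpt)
except RuntimeError as err:
    # PyTorch reports "size mismatch for model.embed_tokens.weight: ..."
    mismatch = "size mismatch" in str(err)
print(mismatch)
```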
This happens because the default vocab_size of LlamaConfig is 32000: Llama 2 checkpoints have a vocab_size of 32000, but Llama 3 checkpoints have a vocab_size of 128256. So by passing LlamaConfig() with "meta-llama/Meta-Llama-3-8B", you are modifying the model.
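In other words, the fix suggested by the explanation above is to let `from_pretrained` read the checkpoint's own config.json rather than overriding it with a freshly constructed `LlamaConfig`, whose defaults match Llama 2. A small sketch of the difference:

```python
# The default LlamaConfig matches Llama 2 (vocab_size=32000),
# not Llama 3 (vocab_size=128256).
from transformers import LlamaConfig

default_vocab = LlamaConfig().vocab_size
print(default_vocab)

# Instead of passing config=LlamaConfig(), omit the config argument so the
# checkpoint's config is used (requires access to the gated repo):
# model = LlamaForSequenceClassification.from_pretrained(
#     "meta-llama/Meta-Llama-3-8B")
```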
System Info
transformers version: 4.40.0

Who can help?
@ArthurZucker @younesbelkada
Information
Tasks
examples folder (such as GLUE/SQuAD, ...)

Reproduction
I'm getting a shape mismatch error when loading the Llama 3 model with LlamaForSequenceClassification.
Expected behavior
Returns the following error