Exception: untagged enum PyPreTokenizerTypeWrapper while loading the fine-tuned model for evaluation #520
Comments
I solved the error by updating the tokenizers and transformers libraries with pip install -U.
@jmatzat - which versions of tokenizers and transformers should I use? As you can see above, I am using the transformers version below. One more thing: where should I run the pip -U command to update the versions, during fine-tuning or during validation? We use separate instances for each.
I encountered the problem while loading the SetFit model with from_pretrained. tokenizers version: 0.19.1, transformers version: 4.40.2. You might have to update scikit-learn as well, after updating tokenizers and transformers.
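Before upgrading blindly, it can help to confirm which versions are actually installed in each environment (training vs. validation). A minimal sketch using only the standard library; the minimum versions below are simply the ones reported to work in this thread, and the version comparison is simplified (no pre-release or build-tag handling):

```python
from importlib.metadata import version, PackageNotFoundError


def installed_version(package: str):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def meets_minimum(installed: str, minimum: str) -> bool:
    """Compare dotted version strings numerically (simplified: no pre-releases)."""
    to_tuple = lambda v: tuple(int(part) for part in v.split("."))
    return to_tuple(installed) >= to_tuple(minimum)


# Versions reported as working earlier in this thread (an assumption, not
# an official compatibility matrix):
MINIMUMS = {"tokenizers": "0.19.1", "transformers": "4.40.2"}

for pkg, minimum in MINIMUMS.items():
    found = installed_version(pkg)
    if found is None or not meets_minimum(found, minimum):
        print(f"{pkg}: found {found}, want >= {minimum}; try: pip install -U {pkg}")
```

Running this in both the fine-tuning and the validation instance makes version drift between the two environments visible before loading the model.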
@jmatzat I tried to update the tokenizers and transformers versions, but ended up with the below error.
Have you tried updating setfit as well? Updating tokenizers and transformers might also require you to update other packages that depend on them.
@jmatzat - yes, I tried two setfit versions, setfit==0.7.0 and 1.0.3. But as you can see, the transformers version itself is not compatible with the tokenizers version in the first place.
Hi - we are currently fine-tuning the model "paraphrase-multilingual-MiniLM-L12-v2" for our use case. Our pipeline has a model-validation step where we load the trained model with:
model = SetFitModel.from_pretrained(model_dir)
but unfortunately we are getting the exception below:
Exception: data did not match any variant of untagged enum PyPreTokenizerTypeWrapper at line 83 column 3.
Note: I am using the Amazon SageMaker platform for fine-tuning with the below configuration:
for training:
instance_type: "ml.g5.2xlarge"
instance_count: 1
transformers_version: "4.28.1"
pytorch_version: "2.0.0"
setfit_version: "0.7.0"
py_version: "py310"
for validation:
instance_type: "ml.t3.xlarge"
instance_count: 1
It was working fine with the above configuration, but for the last couple of days we have been getting the above-mentioned exception. It would be great if anyone could help us fix the issue.
Let me know if any other information is required from our side.
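The "data did not match any variant of untagged enum PyPreTokenizerTypeWrapper" error typically surfaces when the saved tokenizer.json was written by a newer tokenizers release than the one trying to read it, so the Rust deserializer rejects an unrecognized pre_tokenizer variant. A minimal sketch to inspect that section of the file directly with the standard library, sidestepping the failing deserializer (the field names "pre_tokenizer", "type", and "pretokenizers" reflect my understanding of the tokenizer.json layout and should be verified against your actual file):

```python
import json
from pathlib import Path


def inspect_pre_tokenizer(model_dir: str):
    """Report the pre_tokenizer type stored in tokenizer.json without
    invoking the tokenizers library that raises the enum error."""
    path = Path(model_dir) / "tokenizer.json"
    data = json.loads(path.read_text(encoding="utf-8"))
    pre = data.get("pre_tokenizer")
    if pre is None:
        return None
    # A Sequence pre_tokenizer nests its members under "pretokenizers".
    if pre.get("type") == "Sequence":
        inner = [p.get("type") for p in pre.get("pretokenizers", [])]
        return "Sequence(" + ", ".join(inner) + ")"
    return pre.get("type")
```

If the reported type is one your installed tokenizers version does not know about, that points to the version mismatch discussed above; upgrading tokenizers (and transformers/setfit alongside it) in the validation instance is the usual fix.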