UnicodeDecodeError at tag method #106
Comments
Hello, I have exactly the same issue. I am able to train my model with bytes, but when I use the tagger and the output is bytes, there is an internal error (same as above) which prevents me from getting the tags. The only solution I have for the moment is to use crfsuite instead, which is able to output non-ASCII tags...
Currently I base my code on this tutorial, and I have some problems with the `tag` method after the training section. I catch the `UnicodeDecodeError` exception like this:

The output looks like this:

I tried to decode my `X_test` before calling `tag` using `decode('utf-8')`, but it does not seem to work.

Just in case: I had some `UnicodeEncodeError` problems with the `trainer` object, as shown below, but it seems to work if I use `encode('utf-8')` on every substring. With this method I'm forcing manual encoding before appending objects to the trainer. This issue is mentioned in #96, and this solution works for me.

NOTE: Sorry for my deficient English. I hope I've been clear enough. If not, please tell me! :)
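For anyone reading along, the manual-encoding workaround described above could look roughly like the sketch below. It is only an illustration under stated assumptions: it is written for Python 3 (where text is `str` and encoded data is `bytes`), the helper name `to_utf8` and the sample features and labels are hypothetical, and the final `trainer.append(...)` call is shown only as a comment because the exact trainer setup is not given in this thread.

```python
def to_utf8(value):
    """Recursively encode every text string in a nested list/tuple to UTF-8 bytes."""
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, (list, tuple)):
        return [to_utf8(item) for item in value]
    # Values that are already bytes (or are not strings at all) are left unchanged.
    return value


# Hypothetical data: one sequence of two tokens, each with a list of feature strings.
xseq = [["word=niño", "suffix=ño"], ["word=está", "suffix=tá"]]
yseq = ["SUSTANTIVO", "VERBO"]

# Encode every substring manually before handing the data to the trainer.
encoded_xseq = to_utf8(xseq)
encoded_yseq = to_utf8(yseq)

# Assumed usage with an existing trainer object (not executed here):
# trainer.append(encoded_xseq, encoded_yseq)
```

The helper is recursive so that it works on both a flat list of labels and a list of per-token feature lists. Whether the tagger side then accepts or returns the same byte strings is a separate question, which is what this issue is about.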