Invalid Tagger args are silently ignored #53

Open
polm opened this issue Dec 23, 2021 · 4 comments
Labels
bug Something isn't working

Comments

Owner

polm commented Dec 23, 2021

If you do something like fugashi.Tagger("d /asdf") (note the missing hyphen before d), the invalid arguments will be ignored and you'll get a working tagger. That's weird and unhelpful. It seems to be the way the MeCab API works, but the actual MeCab command line behaves reasonably (gives an error), so there should be a way to modify the behavior.
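
Until the underlying behavior changes, one possible stopgap is to sanity-check the argument string in Python before handing it to the tagger. The sketch below is only an illustration: find_stray_args and checked_tagger_args are hypothetical helpers, not part of fugashi or MeCab, and the rule they apply (a token must either start with a hyphen or directly follow a token that does) is an assumed heuristic, not MeCab's real option parser. It also assumes shlex-style tokenization, which may not match how MeCab splits its arguments.

```python
import shlex
from typing import List


def find_stray_args(arg: str) -> List[str]:
    """Return tokens that neither look like options nor follow one.

    Heuristic only: a token is accepted if it starts with "-" or if it
    directly follows such a token (and is then assumed to be its value).
    """
    stray = []
    prev_was_option = False
    for token in shlex.split(arg):
        if token.startswith("-"):
            prev_was_option = True
        elif prev_was_option:
            # Assume this token is the value of the preceding option.
            prev_was_option = False
        else:
            stray.append(token)
    return stray


def checked_tagger_args(arg: str) -> str:
    """Raise ValueError instead of letting stray tokens pass silently."""
    stray = find_stray_args(arg)
    if stray:
        raise ValueError(f"unrecognized tagger arguments: {stray}")
    return arg


# Intended use (requires fugashi, so it is left commented out here):
# import fugashi
# tagger = fugashi.Tagger(checked_tagger_args("-d /asdf"))
```

With this check, the "d /asdf" example raises a ValueError before a tagger is ever created. Because the heuristic does not know which options take a value, a stray token right after a flag that takes none would still slip through.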

Contributor

lambdadog commented Dec 24, 2021

Confused why this is the case, studying the codepath in mecab.

mecab_new creates its tagger using createTagger, which calls TaggerImpl::open, which calls ModelImpl::open, which in turn calls Param::open and load_dictionary_resource. That is the same chain mecab_do (the entry point for the executable) goes through, with (as far as I can tell) the exact same arguments.

A param parse failure should go up the chain and cause mecab_new to return 0 instead of the tagger, as far as I can tell reading the code, but that's clearly not the case.

Owner Author

polm commented Dec 24, 2021

For reference, while the code paths do look the same, this isn't the first inconsistency like this to come up - by default errors in the Tagger don't give output when called via the API because the error message is cleared somewhere, and there's a ridiculous workaround involving the Model class to make that work at the moment.

Another note since several issues have come up: my good arm is in a cast right now, and while I have use of my fingers I have less bandwidth than usual.


aehlke commented Jul 12, 2022

fyi also, if the path has whitespace in it, mecab will not accept it

Owner Author

polm commented Jul 13, 2022

@aehlke That isn't the same issue, and you can get around it with quoting, see here.

polm added the bug label Nov 17, 2022
3 participants