Packaging mecab-ko for easier use #398
Comments
Hello, @polm!
Thanks for the heads-up on
I never got around to doing this with fugashi, but someone made a package called pymecab-ko that's like mecab-python3 for mecab-ko, so that might be useful to anyone who was waiting on it. Like mecab-python3, it allows you to install a working dictionary and MeCab using just pip. In spaCy we're considering switching to this package, but are concerned it might be disruptive for existing users (PR here). Do you have any idea how common it is to use a customized dictionary with mecab-ko? Are there alternatives to mecab-ko-dic in use? (In Japanese there's ipadic, different UniDics, and NEologd, for example, but I'm not aware of anything in Korean.)
@polm I made pymecab-ko for people who misuse MeCab (e.g. using mecab, not mecab-ko, with mecab-ko-dic), and it was heavily influenced by your great work. Thank you. I'm also working on applying pymecab-ko to KoNLPy and am planning to make a PR within this week. General users rarely use custom dictionaries, but I have heard that some companies have their own custom dictionaries built from their in-house corpora. Recently, a dictionary trained on a new Korean corpus was released. I will upload it to PyPI this week.
@NoUnique Thank you for making pymecab-ko, it's a great project to have! Thank you also for the extra information about custom dictionary usage and the new dictionary release. It's great to have these resources for Korean NLP and for things to be easier to use in general.
Unfortunately, KoNLPy cannot switch to pymecab-ko until its Python 2 support has completely ended, because pymecab-ko only supports Python 3.6 or higher.
Hello. I'm a spaCy core developer, and we currently use mecab-ko for our Korean language support, but we're not entirely satisfied with it because it requires mecab-ko to be installed outside of Python, which is inconvenient for many users.
Separately from my work on spaCy I also maintain MeCab-related packages for Japanese, mecab-python3 and fugashi. These packages use wheels so that MeCab is included in them and doesn't have to be installed outside of Python - you can get a working setup with just pip install.
Would there be any interest in creating a similar package for mecab-ko? I don't speak Korean so I can't check if results are correct or not on my own beyond a very basic level, but I'd be glad to help with packaging or getting started. I could even just reproduce fugashi and replace MeCab with mecab-ko and set everything up if someone could check things and (better) take over the project from me after that.
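For anyone unfamiliar with what a pip-only setup looks like in practice, here is a minimal sketch, not an official example from either project. It assumes mecab-python3 is installed together with a dictionary package (for example `pip install mecab-python3 unidic-lite`) and that the tagger emits MeCab's default output format: one `surface<TAB>comma-separated features` line per token, ending with `EOS`. The helper names `parse_mecab_output` and `tokenize` are hypothetical, introduced only for illustration.

```python
def parse_mecab_output(raw):
    """Split MeCab's default text output into (surface, features) pairs.

    Assumes the default format: "surface<TAB>feat1,feat2,..." per token,
    terminated by a line containing only "EOS".
    """
    tokens = []
    for line in raw.splitlines():
        if not line or line == "EOS":
            continue  # skip blank lines and the end-of-sentence marker
        surface, _, features = line.partition("\t")
        # Naive split: a feature field containing a quoted comma would
        # need the csv module instead.
        tokens.append((surface, features.split(",")))
    return tokens


def tokenize(text):
    """Tokenize text with whichever MeCab dictionary pip installed.

    Assumption: the `MeCab` module comes from mecab-python3 and a
    dictionary package is available, so nothing outside Python is needed.
    """
    import MeCab  # imported lazily so the parser above works without it

    tagger = MeCab.Tagger()
    return parse_mecab_output(tagger.parse(text))
```

The parser is kept separate from the tagger call so it can be checked on a hand-written string without MeCab installed. A package for mecab-ko built the same way would presumably only need the import changed, but that depends on how the package names its module.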