Consider using the spaCy tokenizer #114

Open
koaning opened this issue Apr 6, 2021 · 2 comments

@koaning
Contributor

koaning commented Apr 6, 2021

When looking at the config.yml I read:

  - name: "SpacyEntityExtractor"
    # Note: It is not possible to use the SpacyTokenizer + SpacyFeaturizer in 
    #       combination with the WhitespaceTokenizer, and as a result the
    #       PERSON extraction by Spacy is not very robust.
    #       Because of this, the nlu training data is annotated as well, and the
    #       DIETClassifier will also extract PERSON entities.

This reads as a fair warning, but it also raises the question: why aren't we using the spaCy tokenizer here? I'll gladly make a PR, but I'm curious whether there's something I'm not considering.

@ArjaanBuijk
Contributor

@koaning,
Great question!
When using the spaCy tokenizer, the performance of Duckling's time/duration entity extraction goes down.
That was the reason for not switching over to the spaCy tokenizer, but it definitely deserves a re-evaluation.
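
For anyone picking up that re-evaluation, here is a rough sketch of what a spaCy-tokenizer variant of the pipeline might look like. The component names are Rasa 2.x built-ins, but the spaCy model, the Duckling URL, the dimensions and the epoch count are assumptions for illustration, not values taken from this project's config.yml.

```yaml
# Sketch only: every value below is an assumption, not this repo's actual config.
language: en
pipeline:
  - name: "SpacyNLP"
    model: "en_core_web_md"          # assumed spaCy model
  - name: "SpacyTokenizer"           # replaces the WhitespaceTokenizer
  - name: "SpacyFeaturizer"
  - name: "SpacyEntityExtractor"
    dimensions: ["PERSON"]
  - name: "DucklingEntityExtractor"
    url: "http://localhost:8000"     # assumed local Duckling server
    dimensions: ["time", "duration"]
  - name: "DIETClassifier"
    epochs: 100                      # assumed
```

Comparing the time/duration extraction results of this variant against the current WhitespaceTokenizer setup on the same test data would show whether the Duckling regression still holds.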

@koaning
Contributor Author

koaning commented Jun 21, 2021

Ah yeah. This was brought up during a research meeting a while ago.

I ended up making an experimental alternative to Duckling, should you be interested. It's part of rasa-nlu-examples. If you're exploring alternatives to Duckling, I'm all ears to any feedback.
