Detect duplicates of "the the" #188

Open
cmichi opened this issue Jul 13, 2021 · 1 comment


cmichi commented Jul 13, 2021

Is your feature request related to a particular use-case?

I just discovered that the ink! codebase has quite a number of "the the" occurrences: use-ink/ink@4ac7691.

Describe the solution you'd like to implement/see implemented

AFAIK there is no legitimate use of "the the", hence cargo-spellcheck could flag those.
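To make the request concrete, here is a minimal sketch of such a check in plain Rust. It is not how cargo-spellcheck is implemented; the names `Duplicate`, `words` and `find_repeated` are made up for illustration, and the check only looks at alphabetic words separated by whitespace.

```rust
/// Byte range of a repeated word pair within the input text.
/// (Hypothetical type, not part of cargo-spellcheck.)
#[derive(Debug, PartialEq, Eq)]
pub struct Duplicate {
    /// Byte offset where the first word of the pair starts.
    pub start: usize,
    /// Byte offset one past the end of the second word.
    pub end: usize,
}

/// Splits `text` into alphabetic words together with their byte offsets.
fn words(text: &str) -> Vec<(usize, &str)> {
    let mut out = Vec::new();
    let mut start = None;
    for (idx, ch) in text.char_indices() {
        if ch.is_alphabetic() {
            if start.is_none() {
                start = Some(idx);
            }
        } else if let Some(s) = start.take() {
            out.push((s, &text[s..idx]));
        }
    }
    if let Some(s) = start {
        out.push((s, &text[s..]));
    }
    out
}

/// Finds every place where `word` occurs twice in a row (ASCII case is
/// ignored) with nothing but whitespace in between, so "the. The" is skipped.
pub fn find_repeated(text: &str, word: &str) -> Vec<Duplicate> {
    let ws = words(text);
    ws.windows(2)
        .filter(|pair| {
            let (a_start, a) = pair[0];
            let (b_start, b) = pair[1];
            let gap = &text[a_start + a.len()..b_start];
            a.eq_ignore_ascii_case(word)
                && b.eq_ignore_ascii_case(word)
                && gap.chars().all(char::is_whitespace)
        })
        .map(|pair| Duplicate {
            start: pair[0].0,
            end: pair[1].0 + pair[1].1.len(),
        })
        .collect()
}
```

Calling `find_repeated(doc_comment, "the")` returns the byte spans that a tool could highlight. A real check would also have to cope with comment markers such as `///` sitting between the two words when the repetition spans a line break.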

drahnr (Owner) commented Jul 19, 2021

The difficulty with such special rules is that plenty of similar patterns could be added, and I think that would require a systematic approach that allows multi-token patterns to be represented. That in turn would quickly require the token structure provided by nlprule, so I think providing custom rules and documenting that properly is the only way to make this feasible without introducing tons of special cases.

The basic functionality of adding a custom nlprule ruleset is already there. It's mostly a matter of documenting how to write one or expand it with custom rules, and of making that easy to do.
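To illustrate what "multi-token" means here, the following is a rough sketch of a generic sequence rule in plain Rust. It does not use nlprule or its rule format; `SequenceRule` and `check` are hypothetical names, and tokenization is a naive whitespace split.

```rust
/// Hypothetical custom rule: a named sequence of lowercase words to flag.
/// This is an illustration only and not nlprule's rule representation.
pub struct SequenceRule {
    pub id: &'static str,
    pub pattern: Vec<&'static str>,
}

/// Returns `(rule id, index of the first matching token)` for every match.
///
/// Tokens are produced by splitting on whitespace and lowercasing, so
/// punctuation stays attached to its word: "the. The" does not match a
/// `["the", "the"]` rule, but neither does a sentence-final "the the.".
pub fn check(rules: &[SequenceRule], text: &str) -> Vec<(&'static str, usize)> {
    let tokens: Vec<String> = text
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect();

    let mut hits = Vec::new();
    for rule in rules {
        // `windows(0)` would panic, so rules without tokens are skipped.
        if rule.pattern.is_empty() {
            continue;
        }
        for (idx, window) in tokens.windows(rule.pattern.len()).enumerate() {
            let matches = window
                .iter()
                .zip(&rule.pattern)
                .all(|(tok, pat)| tok.as_str() == *pat);
            if matches {
                hits.push((rule.id, idx));
            }
        }
    }
    hits
}
```

With a rule like `SequenceRule { id: "the-the", pattern: vec!["the", "the"] }`, other patterns become data rather than new special cases. The punctuation handling shows where a naive tokenizer falls short and why a proper token structure is needed.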
