Tags normalization #401

Open
FrancescoManfredi opened this issue Apr 16, 2024 · 1 comment
Comments

@FrancescoManfredi
Contributor

Many tags refer to the same concept with different wording, or use different casing or styling for the same words.
It might be a good idea to add a normalization pipeline for each company's tags.
Here is a mapping from original to normalized tags, written as a Python dict (easily converted to any other format), that might be useful as a starting point: https://github.com/FrancescoManfredi/AIRV-analysis/blob/main/tags_repl.py
I'm the author of that mapping, and you're welcome to use it in any way you prefer.
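
To make the idea concrete, here is a minimal sketch of how such a mapping could be applied to one company's tag list. The `SAMPLE_TAG_MAP` entries and the `normalize_tags` helper are hypothetical and only for illustration; the real entries live in `tags_repl.py` and may look different.

```python
# Hypothetical excerpt for illustration only: the real mapping lives in
# tags_repl.py, and its entries and variable name may differ.
SAMPLE_TAG_MAP = {
    "Machine Learning": "machine-learning",
    "ML": "machine-learning",
    "golang": "go",
}


def normalize_tags(tags, mapping):
    """Replace each tag with its normalized form and drop duplicates, keeping order."""
    seen = set()
    result = []
    for tag in tags:
        cleaned = tag.strip()
        # Tags without an entry in the mapping pass through unchanged.
        normalized = mapping.get(cleaned, cleaned)
        if normalized not in seen:
            seen.add(normalized)
            result.append(normalized)
    return result


company_tags = ["ML", "Machine Learning", " golang ", "rust"]
print(normalize_tags(company_tags, SAMPLE_TAG_MAP))
# → ['machine-learning', 'go', 'rust']
```

Unmapped tags are left as they are, so the mapping can grow incrementally without breaking existing entries.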

@edoardocostantinidev
Collaborator

Hi @FrancescoManfredi, first of all thanks for your input and the blog post! Super fascinating.
I agree this is an issue that can be fixed relatively easily. We'll probably convert your mapping and integrate it into our Golang validator/generator so we stick to a single language.

I personally don't have much time these days to tackle the issue, but if no one picks it up by mid-May I'll try to tackle it myself.
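
One possible route for the conversion, sketched under assumptions: dump the dict to JSON, which a Go program could then read with its standard `encoding/json` package. The `SAMPLE_TAG_MAP` entries and the `mapping_to_json` helper below are hypothetical stand-ins, not the project's actual code or the real contents of `tags_repl.py`.

```python
import json

# Hypothetical stand-in for the dict defined in tags_repl.py.
SAMPLE_TAG_MAP = {
    "Machine Learning": "machine-learning",
    "ML": "machine-learning",
    "golang": "go",
}


def mapping_to_json(mapping):
    """Serialize the tag mapping as a flat JSON object with stable key order."""
    # sort_keys keeps diffs small when entries are added later;
    # ensure_ascii=False preserves any non-ASCII tag text as written.
    return json.dumps(mapping, indent=2, sort_keys=True, ensure_ascii=False)


print(mapping_to_json(SAMPLE_TAG_MAP))
```

A flat string-to-string JSON object like this would decode into a `map[string]string` on the Go side, so the mapping itself would not need to be retyped by hand.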
