Comparison to factorization methods (e.g. SVD) #139

Open
amy12xx opened this issue Oct 4, 2020 · 1 comment · May be fixed by #685


amy12xx commented Oct 4, 2020

I used the notebook 02_fit_predict_plot_employee_salaries.html to compare the existing encoders against an SVD encoding (with the same number of components as MinHashEncoder). This could be added to the current notebook or to a separate one, if you agree.

one-hot encoding
r2 score: mean: 0.856; std: 0.034

target encoding
r2 score: mean: 0.774; std: 0.033

similarity encoding
r2 score: mean: 0.915; std: 0.012

minhash encoding
r2 score: mean: 0.753; std: 0.025

svd encoding
r2 score: mean: 0.856; std: 0.014
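
For reference, an SVD encoding of this kind could be sketched roughly as below. This is only an illustration under assumptions, not the code behind the numbers above: it assumes the SVD is applied on top of a one-hot encoding via scikit-learn's `OneHotEncoder` and `TruncatedSVD`, and the toy `job_titles` column and the value of `n_components` are hypothetical.

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy column standing in for a dirty categorical feature.
job_titles = np.array([
    ["Police Officer III"],
    ["Police Officer II"],
    ["Firefighter/Rescuer III"],
    ["Bus Operator"],
    ["Police Officer III"],
    ["Bus Operator"],
    ["Police Sergeant"],
    ["Firefighter/Rescuer III"],
])

# Assumption: chosen to match the n_components given to MinHashEncoder.
n_components = 3

# One-hot encode the categories, then project the sparse indicator
# matrix onto its leading singular vectors.
svd_encoder = make_pipeline(
    OneHotEncoder(handle_unknown="ignore"),
    TruncatedSVD(n_components=n_components, random_state=0),
)

encoded = svd_encoder.fit_transform(job_titles)
print(encoded.shape)  # one row per sample, n_components columns
```

In the notebook, a pipeline like this would take the place of the encoder for the dirty column, with the downstream regressor left unchanged so the r2 scores stay comparable.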


GaelVaroquaux commented Oct 26, 2020 via email
